Browser Robotization & Automation

  • This concept is not for the faint hearted or those of a nervous disposition.

    In its simplest sense this is a sort of anti-JavaScript, given that JavaScript was originally meant for input validation and other light work.

    Before anything else can be mentioned, it is important to clear the air and say 'Security'. I will attempt to mitigate the impact of security later in my suggestion.

    The goal is to create a scripting language and its API cousin to allow automation of repetitive online tasks. Many of us at work find ourselves visiting the same site every day, scraping, downloading or updating the same something!

    Already, password managers make logging into a website trivial. Let's add to that. Create a simple scripting language to perform activities such as clicking buttons and check-boxes, following links, selecting from drop-down boxes, scraping tables and saving them as CSVs, and automatically downloading files to a specified location. It could even repetitively fill in forms after ingesting a CSV file that meets a specified condition (file created in a directory, file exists, file name matches a pattern, file name matches a pattern for date/sequence). Importantly, the script would also have a logging function. To make this composable and modular, activities would be able to trigger and call other activities.

    The activity script builder, like the regular developer mode of the browser, needs access to the DOM. Dynamic content is no longer a new concept but is still challenging to process. Therefore, it should be possible to define an action on all elements that belong to a parent in the DOM. There is a strong need for security, so the activity script would need to be explicitly granted permission to access credentials in the password manager.
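
    To make "an action on all elements that belong to a parent" a little more concrete, here is a small Python sketch. It assumes the page is available as well-formed markup and uses the standard library's `xml.etree.ElementTree` in place of a real DOM; `apply_to_descendants` is a hypothetical helper name. The point is that the rule names the parent, so it keeps working when the number of children changes.

```python
import xml.etree.ElementTree as ET


def apply_to_descendants(root, parent_id, action):
    """Run `action` on every element below the element whose id is `parent_id`.

    Returns how many elements were visited (0 if the parent is not found).
    Note: the search looks below `root`, not at `root` itself.
    """
    parent = root.find(f".//*[@id='{parent_id}']")
    if parent is None:
        return 0
    visited = 0
    for element in parent.iter():
        if element is parent:
            continue  # skip the container itself
        action(element)
        visited += 1
    return visited
```

    An activity could pass an action that ticks every check-box inside one container while leaving the rest of the form untouched.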

    The API provides access to trigger activities, retrieve log data, pull downloads, pull scraped CSV files, pull ingest files and read run-time stats. The API would not be able to inject activities or direct commands. All of these activities and the API run in a headless browser. Some API calls may also require authentication.
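
    One way to picture an API that can trigger activities and read logs but cannot inject new ones is a thin facade over the browser-side store. This is only a sketch under my own assumptions: `ActivityStore` and `AutomationApi` are invented names, and Python's underscore privacy is a convention rather than a real security boundary.

```python
class ActivityStore:
    """Browser-side store; only the user adds activities here (hypothetical)."""

    def __init__(self):
        self._activities = {}
        self._log = []

    def add(self, name, func):
        self._activities[name] = func

    def run(self, name):
        result = self._activities[name]()  # raises KeyError for unknown names
        self._log.append(f"{name}: {result}")
        return result

    def log_lines(self):
        return tuple(self._log)  # read-only copy


class AutomationApi:
    """External surface: trigger by name and read logs, nothing else."""

    def __init__(self, store):
        self._store = store

    def trigger(self, name):
        return self._store.run(name)

    def logs(self):
        return self._store.log_lines()
```

    The facade deliberately exposes no `add` method, so a caller can only run activities that already exist by name.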

    The non-scripting and non-API features that are important are scheduling, concurrency and security, as well as distribution and dependency management.

    To allow activities to run, the user profile selected to run the activity must have the 'Runner' role attached. It would be recommended that the profile be password protected.
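
    The two security gates described so far (the 'Runner' role on the profile, and the explicit grant before an activity may touch stored credentials) could be checked together before anything runs. A minimal sketch, with `Profile`, `Activity` and `may_run` as purely illustrative names:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    name: str
    roles: frozenset = frozenset()


@dataclass(frozen=True)
class Activity:
    name: str
    needs_credentials: bool = False
    credentials_granted: bool = False  # assumed to be set only by an explicit user action


def may_run(profile: Profile, activity: Activity) -> bool:
    """Allow a run only for 'Runner' profiles, and only with a credential grant if one is needed."""
    if "Runner" not in profile.roles:
        return False
    if activity.needs_credentials and not activity.credentials_granted:
        return False
    return True
```

    Denying by default keeps the failure mode safe: an activity that forgot to ask for a grant simply does not run.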

    This sounds like a lot of effort and not really within the scope of a web browser, right? So how do I justify this? Other than just saying that Vivaldi is a browser for our friends... And we've all got one or two repetitive little web-based tasks (and maybe one biggy) which could easily be swept under the carpet with a little help. Also, Robotic Process Automation is a huge and growing market. Including automation features, and the ability for RPA tools to integrate with Vivaldi, creates a new way for Vivaldi to serve the web market. These tools, if implemented correctly and with security in mind, would be greatly appreciated by many, many users!

  • As this is a scripting language and there is a great need to process dynamic content, the scripts should include flow control structures and the ability to respond to changes on the page.

    For language style I would suggest a declarative style of language. With concurrency in mind, wherever you would normally iterate over a set of data because you can only handle one item at a time, you should be able to say that the script can fork here and manage its own time.
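
    In Python terms, "fork here" might look like handing each item to a worker pool instead of writing the loop yourself. This sketch uses the standard library's `concurrent.futures`; `for_each` and its `max_workers` default are my own invention, and a real scripting language would need to decide how errors and ordering are reported.

```python
from concurrent.futures import ThreadPoolExecutor


def for_each(items, step, max_workers=4):
    """Apply `step` to every item concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(step, items))
```

    The script author only states what happens per item; how many run at once is left to the runtime.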

    Admittedly this has become more complex than I first imagined. The justifications remain, for example a lot of business process applications are now web-based. Look at ticket tracking applications and others.

  • Hmmm.... Well, I actually have such a thing and have been developing it and improving it for the past 9 years. It's a state machine plus page evaluation operations. It's trainable right on the page and handles very complex, multi-page automation. For the first vertical I automate adding coupon codes to more than 50k (already trained) shopping carts.

    [I plan to share a beta of this working in a Vivaldi Web Panel before the end of the month.]

  • This is a demonstration of revolutionary thinking. You think you're being clever and original whereas in fact someone else already had the same idea nine-plus years ago. Viva la revolution.

  • Moderator

    Unfortunately Vivaldi has no Selenium driver for automated testing.

    @dies_felices And I guess it will take some years before your API wishes may be added, and the Vivaldi team would need more developers and financial resources to add this.

  • @Gwen-Dragon That is essentially the kind of thing which I'm suggesting. However, this is a feature which I believe would work hand in hand with the developer tools feature. As for a time scale, my thought is that if you plan for it at design time then when you come to implement the feature a lot of the supporting infrastructure will already be in place. So, I'm not saying overnight or even that this ought to be a priority, but somewhere on the road-map after the Email client but before Vivaldi OS 🙂

  • When I wrote the feature request above, I had been going through some Robotic Process Automation training using an application which made me think this process needs automation!

    My task on that day was navigating a web-form and downloading a file from a website. I thought, why can't this be automated natively in the browser. After all, this is exactly the sort of thing that people who use business process web applications do every day, all day long. So wouldn't it be nice if you could just get the browser to do that for you?

  • Why the browser, though? If you don't actually need to see the website, you don't need the rendering engine ... I don't know about now (though it is hard to imagine they've disappeared) but download managers and spiders are designed to download files with minimal interaction. (A spider is designed to download all public content on a given site so as to have a local copy.)

  • @sgunhouse The best answer that I can offer is that a regular web browser user doesn't look very far beyond their mouse pointer or touch screen for a tool. They are given a task with instructions on how to complete it and instructions for the fringe cases and then they get inducted into completing other esoteric tasks with their own idiosyncrasies and their lives become full of complexities.

    Thus if the browser can provide a simple shortcut to overcome the repetitive tasks then that might be within reach. More importantly, the user is restricted to a narrow tool set that includes a browser, and they're not allowed to download, install and develop automated processes, which in turn may not be maintainable when they leave.

    Download managers don't strike me as being adequate to the task of reliably retrieving a file from a dynamically created web page though I may be wrong about that in some specific cases. I don't download a lot of files so I've just stuck my finger in the air and caught an opinion.

    When you talk about a spider downloading all content of a given web site, I accept that if you're scraping a site that you have legitimate access to, you'd at least in theory be able to traverse authentication barriers in some way. But this isn't targeted, and the user would have a job locating the details of the desired content.

    Once I had a daily task to download the previous day's invoices in a report from a trading partner. This meant logging into the partner's site, navigating to the invoices page, providing the criteria for the query (telling it I wanted to search based on dates, then selecting a date range of yesterday from two calendar picker controls), clicking the generate report button and then downloading it. After that I had to process the report but that's beyond the scope of this topic.
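
    Written down as data, that daily routine is just a short list of steps with yesterday's date filled in. The sketch below is only meant to show the shape such an activity could take; the step names, the "partner-site" label and the function names are all made up for illustration.

```python
from datetime import date, timedelta


def yesterday_range(today: date):
    """Return (start, end) covering the single day before `today`."""
    yesterday = today - timedelta(days=1)
    return yesterday, yesterday


def invoice_report_steps(today: date):
    """Describe the daily invoice download as an ordered list of steps."""
    start, end = yesterday_range(today)
    return [
        ("login", "partner-site"),            # credentials from the password manager
        ("open", "invoices page"),
        ("select", "search by", "date"),
        ("set_date", "from", start.isoformat()),
        ("set_date", "to", end.isoformat()),
        ("click", "generate report"),
        ("download", "report"),
    ]
```

    Because the dates are computed rather than typed, the same activity can be scheduled to run every morning without edits.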

    The browser is able to collect the user's data and authentication details in a password manager or certificate store. Vivaldi already has these features, and that gets you through the door. Then you need a mechanism that reads the controls' properties rather than OCRing the rendered controls. At this point a programmatic interface can become as complex as you'd like but the goal remains to automate an otherwise repetitive task.

    One story I heard about the reasoning behind JavaScript in the beginning was that it was meant to achieve simple input validation, simple compared to modelling a 6502, which is one application it has today. Importantly though, JavaScript still has the ability to perform input validation on your web pages.

    And if you didn't think I've been going on long enough: this seemed like such a simple question all those long hard hours ago when I started writing this response. Consistency! That's another important thing. If you or I were to somehow string together a spider and/or download manager and a couple of other tools needed to scrape the right page, we wouldn't do it in the same way. Even using the same tool set we'd differ, but with this functionality supported in the browser that can all be abstracted away and the task becomes much more repeatable.

    I don't mean to suggest that this is an easy to accomplish goal for the Vivaldi Devs but this would go a long way to making a lot of people's lives better. This is better inside the browser than outside.

  • @dies_felices said in Browser Robotization & Automation:

    Viva la revolution.

    Thinking about it and doing it are very different things as is the exact application.

    How about this as a shortened way to describe what you're talking about?


    That would be super-cool as a built-in feature and if you could add in a little bit of programming logic then this would be incredibly powerful, unique and amazing!

    @sgunhouse asked in Browser Robotization & Automation:

    Why the browser, though?

    And @Gwen-Dragon mentioned Selenium... And I told you in a PM about Puppeteer.

    We all think like programmers and it just finally clicked with me how cool this is as an end-user feature, one where you can push a button and watch the browser (your bot / macro) take over to perform your repetitive task.

    I don't really have experience watching someone else do my work but can imagine how nice it would be to sit back with a cocktail and watch my tasks get completed (without any of my effort).

    Watching automation is actually entertaining, liberating and ....

    PS You can get an idea of what I've implemented with my First Vivaldi Coupon Code Panel Extension. The power of running complex, multi-page macros will become even more clear when I publish the best price search feature next month.

  • Moderator

    @qpongo said in Browser Robotization & Automation:


    Already planned, but no timeline.

  • @qpongo


    The M word is just one letter away from being a four-letter word, but worse than that, in a programming context its meaning has become too broad. Compare C macros to Lisp macros, very different animals. I imagine that you might be referring to the story of Mac?

    Thinking about it and doing it are very different things as is the exact application.

    Preparation is key. That's all I'd ask at this time, plan ahead.

    The tools that you and @Gwen-Dragon have mentioned are very worth while, useful, powerful and go a long way to achieving what's being discussed here.

    The broader issue here is that front end tools don't lend themselves to programmatic extension in the same way that maybe a library or a framework does. Excuse the 'M' but macros appear in image manipulation programs, word processors, spreadsheets. What they lack is standardisation and forethought. As far as I can tell, with the exception of an Office suite, they're not designed to work together. Therein the applications really act as islands of functionality rather than a spectrum of expressiveness.

    Perhaps today, in view of this discussion, I need to re-evaluate my appreciation of macros on front end applications. They don't have to be clunky ill-fitting implementations of Basic. They can be elegant and beautiful.

  • @Gwen-Dragon

    Already planned, but no timeline.

    Of course, I'm going to ask that you consider putting forward a timeline.

    I would also like to re-iterate some of the features being requested:

    • Openness, in the sense an external application or event may trigger a macro or talk to an API.
    • Security, the counterpoint to expressiveness and ease of use, but a challenge to allow real power to be released safely.
    • Simplicity, a low entry barrier for new users and developers.
    • Expressiveness, tools that don't get in your way and meet your needs. The ability to shoot yourself in the foot is very different to getting shot in the foot by a misfire.
    • Distribution, the ability to create a macro once and re-use it elsewhere.

    Lastly I'd ask that in this endeavour Vivaldi is made the standard and model for front end macros.

    In the vein of repetition, I will say that I believe a feature set like this would put Vivaldi in front of a whole set of new users.

