| Guidepup looks like it's a decent stab in that direction: https://www.guidepup.dev/
It only supports Windows and macOS, though, which is a problem for build pipelines. I too would very much like the page descriptions and the accessibility inputs to be the primary way of driving a page. It would make accessibility the default, rather than something you have to argue for. |
| Spawn it in a dedicated network namespace (to contain the TCP socket and make it unreachable from any other namespace) and use `socat` to convert it to a UNIX socket. |
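A rough sketch of what that could look like, written as a Python helper that only builds the command lines (it does not run them). The namespace name, port, socket path and browser binary are all made-up placeholders, and the commands need root. The browser command is long-running, so in practice it would be started in the background before `socat`.

```python
import shlex


def namespace_commands(ns: str, port: int, sock_path: str,
                       browser: str = "chromium") -> list[list[str]]:
    """Build the argv lists for the namespace + socat setup, in order.

    All names here are illustrative assumptions; nothing is executed.
    """
    in_ns = ["ip", "netns", "exec", ns]
    return [
        # 1. Create an empty network namespace.
        ["ip", "netns", "add", ns],
        # 2. Bring up loopback inside it so 127.0.0.1 works there.
        in_ns + ["ip", "link", "set", "lo", "up"],
        # 3. Start the browser inside the namespace; its debugging port
        #    is only reachable from within that namespace.
        in_ns + [browser, "--headless", f"--remote-debugging-port={port}"],
        # 4. Bridge a path-based UNIX socket to the TCP port. The socket
        #    file lives on the shared filesystem, so it is reachable from
        #    outside the namespace.
        in_ns + ["socat", f"UNIX-LISTEN:{sock_path},fork",
                 f"TCP:127.0.0.1:{port}"],
    ]


if __name__ == "__main__":
    for argv in namespace_commands("browser-ns", 9222,
                                   "/run/browser-devtools.sock"):
        print(shlex.join(argv))
```

Access control then comes down to ordinary file permissions on the socket path instead of "anyone who can reach 127.0.0.1".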
| I have dreamed about a swappable engine.
Like, a wrapper that does my history and tabs and bookmarks, but lets me move from rendering in Chrome or Gecko or Servo or whatever. |
| It's the same idea as the Internet Explorer mode built into Microsoft Edge, where you can switch to Internet Explorer mode and open a website that only works correctly in Internet Explorer. |
| Additionally, Playwright has some nice ergonomics in the API, though Puppeteer has since implemented a lot of it as well. Downloads and video capturing are nicer in Playwright. |
| That is a huge oversimplification if I ever saw one. If you look at the early commits, you can see that it isn't just a simple fork. For starters, the initial commit [1] is already using TypeScript. As far as I am aware, Puppeteer is not and is written in vanilla JavaScript.
The license notice you mention is indeed there [2], but it also isn't surprising that they wouldn't reinvent the wheel for things they wrote earlier that simply work. Even if they didn't directly use code, Microsoft would be silly not to add it given their previous involvement with Puppeteer. Even if it was originally a fork, they are such different products at this point that at best you could say that Playwright started out as a fork (which, again, it did not, as far as I can tell).
[1] https://github.com/microsoft/playwright/commit/9ba375c063448...
[2] https://github.com/microsoft/playwright/blob/3d2b5e680147577... |
| I'm not convinced. It looks like v0.10.0 contains ~half of Google's Puppeteer code, and even in the latest release [0] the core package references Google's copyright several hundred times. Conceptually, the core (the bridge between a Node server and the injected Chrome DevTools Protocol scripts) is the same. It looks like Playwright started as a fork and evolved into a wrapper around Puppeteer that eventually included APIs for Python and Java. At the core there is a ton of code still used from Puppeteer.
[0] https://github.com/microsoft/playwright/tree/48627ad48405583... |
| It works for me with stock Chromium and Chrome on Linux. But for Firefox, I apparently need a custom patched build, which isn't available for the distro I run, so I haven't confirmed that. |
| What’s the relationship between Selenium, Puppeteer and WebDriver BiDi? I’m a happy user of Playwright. Is there any reason why I should consider Selenium or Puppeteer? |
| Not to make this an ad for my project, but I'm starting to document it more here: https://valetnet.dev/
The Raspberry Pi is configured to use the USB HID protocol to look and act like a mouse and keyboard when plugged into a phone. (Android and iOS now support mouse and keyboard inputs.) For video, we have two models:
- "Valet Link" uses an HDMI capture card (and a multi-port dongle) to pull the video signal directly from the phone if available. (This applies to all iPhones and high-end Samsung phones.)
- "Valet Vision" uses the Raspberry Pi V3 camera positioned 200mm above the phone to grab the video that way. Kinda crazy, but it works when HDMI output is not available. The whole thing is also enclosed in a black box so light from the environment doesn't affect the video capture.

Then once we have an image, yes, you use whatever library you want to process and understand what's in the image. I currently use OpenCV and Tesseract (with Python). Could probably write a book about the lessons learned getting a "vision first" approach to automation working (as opposed to the lower-level Puppeteer/Playwright/Selenium/Appium way to do it). |
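For anyone curious what "act like a keyboard" means on the wire, here is a minimal sketch of the standard 8-byte USB HID boot keyboard report (modifier byte, reserved byte, up to six key usage IDs). This is not the Valet code. The `/dev/hidg0` device path assumes a Linux USB gadget setup and is a guess, the helper names are invented for illustration, and only a tiny US-layout subset of characters is mapped.

```python
LEFT_SHIFT = 0x02  # Left Shift bit in the report's modifier byte


def usage_id(ch: str) -> tuple[int, int]:
    """Map one character to (modifier, HID keyboard usage ID), US layout."""
    if "a" <= ch <= "z":
        return 0, 0x04 + ord(ch) - ord("a")
    if "A" <= ch <= "Z":
        return LEFT_SHIFT, 0x04 + ord(ch) - ord("A")
    if "1" <= ch <= "9":
        return 0, 0x1E + ord(ch) - ord("1")
    special = {"0": 0x27, "\n": 0x28, " ": 0x2C}  # zero, Enter, Space
    if ch in special:
        return 0, special[ch]
    raise ValueError(f"character not mapped in this sketch: {ch!r}")


def key_reports(text: str) -> list[bytes]:
    """Return a key-down report followed by an all-released report per char."""
    reports = []
    for ch in text:
        modifier, code = usage_id(ch)
        reports.append(bytes([modifier, 0, code, 0, 0, 0, 0, 0]))  # key down
        reports.append(bytes(8))                                   # key up
    return reports


def type_text(text: str, device: str = "/dev/hidg0") -> None:
    """Write the reports to a HID gadget device (path is an assumption)."""
    with open(device, "wb", buffering=0) as hid:
        for report in key_reports(text):
            hid.write(report)
```

Something like `type_text("hello\n")` would then show up on the phone as ordinary keystrokes, which is why no app or driver is needed on the phone side.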
| If it's a single file you could just make it a download.
There are also the newer file system APIs (though in Safari you'll be missing features and need to put some things in a Web Worker). |
I have yet to see a browser automation tool that does not use localhost-bound TCP sockets. Apart from that, most tools do not offer strong authentication: a browser is spawned and listens on a socket, and when the controlling application connects to the browser management socket, no authentication is required by default, which creates hidden vulnerabilities.
While existing browser sessions can only be controlled by knowing their random UUIDs, creating new sessions is usually possible for anyone on 127.0.0.1.
I don't know really, it's quite possible I'm just spreading lies here, please correct me and expand on this topic a bit.
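To make the concern concrete, here is a small sketch that checks whether an unauthenticated DevTools endpoint answers on a local port. It assumes a Chromium-based browser started with `--remote-debugging-port`, which serves a `/json/version` HTTP endpoint. The port 9222 and the function name are just illustrative choices, and other tools (WebDriver servers, for example) expose different endpoints.

```python
import http.client
import json
import urllib.request


def probe_devtools(port: int, host: str = "127.0.0.1",
                   timeout: float = 1.0):
    """Return the parsed /json/version payload, or None if nothing answers.

    No credentials are sent: if this returns data, any local process
    could have asked the same question.
    """
    url = f"http://{host}:{port}/json/version"
    # Bypass any configured HTTP proxy so we really talk to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error, or a non-JSON reply.
        return None


if __name__ == "__main__":
    info = probe_devtools(9222)
    if info is None:
        print("nothing answering on 127.0.0.1:9222")
    else:
        print("open DevTools endpoint:", info.get("Browser"))
        print("websocket URL:", info.get("webSocketDebuggerUrl"))
```

Run as any unprivileged local user, this is enough to discover a debuggable browser on the same machine.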