simon_luv_pho's comments

WebMCP doesn’t seem to be available for use inside webpages or extensions.


I'm 2 years too late for that one...


Everything happens at runtime, on the HTML level.

It uses a process similar to `browser-use`, but everything runs inside the web page. A script parses the live HTML, strips it down to its semantic essentials (HTML dehydration), and indexes every interactive element. That snapshot goes to the LLM, which returns actions referencing elements by index. The agent then simulates mouse/keyboard events on those elements via JS.
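To make the dehydrate-and-index step concrete, here is a minimal sketch. It is not the library's actual code: the `SimpleNode` shape, the `INTERACTIVE_TAGS` list, and the `[index]<tag>label</tag>` output format are all assumptions chosen for illustration, and a plain object tree stands in for the live DOM so the example runs outside a browser.

```typescript
// Hypothetical minimal node shape standing in for a live DOM element.
interface SimpleNode {
  tag: string;
  text?: string;
  attrs?: Record<string, string>;
  children?: SimpleNode[];
}

// Assumed set of tags treated as interactive; a real implementation would
// likely also look at roles, event handlers, visibility, and so on.
const INTERACTIVE_TAGS = new Set(["a", "button", "input", "select", "textarea"]);

function isInteractive(node: SimpleNode): boolean {
  return INTERACTIVE_TAGS.has(node.tag) || node.attrs?.["role"] === "button";
}

interface Snapshot {
  text: string;          // compact text sent to the LLM
  indexed: SimpleNode[]; // position in this array is the index the LLM refers to
}

function dehydrate(root: SimpleNode): Snapshot {
  const lines: string[] = [];
  const indexed: SimpleNode[] = [];

  const visit = (node: SimpleNode, depth: number): void => {
    const indent = "  ".repeat(depth);
    if (isInteractive(node)) {
      const index = indexed.length;
      indexed.push(node);
      // Prefer visible text, then accessibility markup, then placeholder.
      const label =
        node.text ?? node.attrs?.["aria-label"] ?? node.attrs?.["placeholder"] ?? "";
      lines.push(`${indent}[${index}]<${node.tag}>${label}</${node.tag}>`);
    } else if (node.text) {
      // Non-interactive nodes keep only their text as context.
      lines.push(`${indent}${node.text}`);
    }
    for (const child of node.children ?? []) visit(child, depth + 1);
  };

  visit(root, 0);
  return { text: lines.join("\n"), indexed };
}
```

When the LLM answers with something like "click element 1", the agent can look up `indexed[1]` and dispatch the event on the matching element. This also shows why accessibility markup helps: the label falls back to `aria-label` when an element has no visible text.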

This works best on pages with proper semantic HTML and accessibility markup. You can test it right now on any page using the bookmarklet on the homepage (unless that page's CSP blocks script injection, of course).


Not yet. Currently focused on the more common interaction patterns. PRs welcome though!


Gotcha. Still very cool! Congrats on the release.


Thanks!


I added an instruction to the system prompt telling it to skip CAPTCHAs and hand control back to the user. Currently working on a proper human-in-the-loop feature. That's actually one of the key advantages of running the agent inside your own browser.


Makes sense.

Out of curiosity, have you had it attempt CAPTCHAs?

If so, what were the results?


I haven’t. I don’t think it will work well.

I use a text-based approach. Captchas like “crossroad” usually need a screenshot, a visual model and coordinate-based mouse events.


Thanks!

It supports any OpenAI-compatible API out of the box, so AWS Bedrock, LiteLLM, Ollama, etc. should all work. The free testing LLM is just there for a quick demo. Please bring your own LLM for long-term use.
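For readers unsure what "OpenAI-compatible" implies in practice, here is a rough sketch of the request such an endpoint expects. The `LlmConfig` shape and `buildChatRequest` helper are hypothetical and are not the library's configuration API; only the `/chat/completions` path, the Bearer header, and the `model`/`messages` body come from the OpenAI chat completions format.

```typescript
// Hypothetical config shape for illustration; not page-agent's actual options.
interface LlmConfig {
  baseUrl: string; // e.g. "http://localhost:11434/v1" for a local Ollama server
  apiKey: string;  // local servers usually ignore this, but the header is still sent
  model: string;   // whatever model name your endpoint serves
}

interface ChatRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildChatRequest(
  config: LlmConfig,
  systemPrompt: string,
  userPrompt: string
): ChatRequest {
  // Tolerate a trailing slash in the configured base URL.
  const base = config.baseUrl.replace(/\/+$/, "");
  return {
    url: `${base}/chat/completions`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${config.apiKey}`,
    },
    body: JSON.stringify({
      model: config.model,
      messages: [
        { role: "system", content: systemPrompt },
        { role: "user", content: userPrompt },
      ],
    }),
  };
}
```

Switching providers then only means changing `baseUrl`, `apiKey`, and `model`; the request shape stays the same, which is why a gateway or a local server can stand in for a hosted API.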


I'm looking into a European testing endpoint. The legal and compliance requirements are quite a hassle, and persuading my company to pay for that infrastructure is gonna be a tough sell.


Full transparency: I work at Alibaba and published this under Alibaba's open-source org. I sometimes maintain it during work hours, so yes, Alibaba technically pays me for it. That said, this is my project: it's MIT-licensed, includes no backend service, and is open for anyone to audit.

The free testing LLM endpoint is hosted on Alibaba Cloud because I happen to have some company quota to spend, but it's not part of the library. Bring your own LLM and there is zero data transmission to Alibaba or anywhere else you haven't configured yourself.

I highly recommend using it with a local Ollama setup.


Thank you for sharing this!


Please use your own LLM API instead!

The free testing LLM is Qwen hosted by Aliyun. Qwen and DeepSeek are the only ones I can afford to offer for free. It's just there to lower the try-out barrier; please DO NOT rely on it.

The library itself does NOT include any backend service. Your data only goes to the LLM API you configured.

I tested it with local Ollama models and it works fine.


Or why not stay fully local with WebLLM... https://webllm.mlc.ai


That looks great! I also thought about calling the Gemini Nano model embedded in Chrome (only extensions can do that). But after some testing on smaller models I found that anything smaller than 9B can’t really handle the complex tool call schema I use.

Qwen3.5 4B is quite good but still gives messy JSON quite often. But it’s very promising!

Maybe after one more model iteration or some fine-tuning we can go fully embedded?
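As an aside on the "messy JSON" problem: one common mitigation, which the comment does not claim the library uses, is to pull the outermost JSON object out of the model's reply before parsing and to treat any failure as a signal to retry. A minimal sketch, with a hypothetical `extractJsonObject` helper and a made-up action shape:

```typescript
// Try to recover a JSON object from a reply that may wrap it in extra prose.
// Returns null when nothing parseable is found, so the caller can retry.
function extractJsonObject(raw: string): unknown {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) return null;
  try {
    return JSON.parse(raw.slice(start, end + 1));
  } catch {
    // Still malformed (trailing commas, unquoted keys, truncated output, ...).
    return null;
  }
}

// Example reply; the { action, index } shape is an assumption for illustration.
const reply = 'Sure, here is the action:\n{"action":"click","index":3}\nDone.';
const action = extractJsonObject(reply);
```

This only handles the easy case of surrounding chatter. Structurally broken output still fails, which fits the observation that very small models struggle with a complex tool call schema.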


Darn. Pageant would've been a nice name though. Maybe `page-agent.js` is more relevant in the web dev community.


I think every successful Show HN post ends up with a "thought this was about X" or "didn't look up the name first?" comment. Consider it a win! I don't think anyone will mistake a PuTTY tool for yours, but you might share a Google search page with it.


I think page agent is good. I've never heard of PuTTY's Pageant. And I think it's better to distinguish it from the general meaning of pageant (a beauty contest).


Thanks!

