Let AI be your browser operator.
Your AI Operator for Web, Android, Automation & Testing
| Instruction | Video |
|---|---|
| Use JS code to drive task orchestration, collect information about Jay Chou’s concert, and write it into Google Docs (by the UI-TARS model) | |
| Control the Maps app on Android (by the Qwen-2.5-VL model) | |
| Use Midscene MCP to browse the page (https://www.saucedemo.com/), log in, add products, place orders, and finally generate test cases based on the MCP execution steps and a Playwright example | |
aiAssert(), aiLocate(), aiWaitFor().
Midscene.js supports both multimodal LLMs like gpt-4o and visual-language models like Qwen2.5-VL, Doubao-1.5-thinking-vision-pro, gemini-2.5-pro, and UI-TARS.
Visual-language models are recommended for UI automation.
Read more about Choose a model
Midscene will automatically plan the steps and execute them. This style may be slower, and it relies heavily on the quality of the AI model.
await aiAction('click all the records one by one. If one record contains the text "completed", skip it');
Split complex logic into multiple steps to improve the stability of the automation code.
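// Extract the records first, then decide per record in plain JavaScript.
const recordList = await agent.aiQuery('string[], the record list');
for (const record of recordList) {
  // Name the record in the prompt so the check targets this specific item.
  const hasCompleted = await agent.aiBoolean(`check if the record "${record}" contains the text "completed"`);
  if (!hasCompleted) {
    await agent.aiTap(record);
  }
}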
For more details about the workflow style, please refer to Blog - Use JavaScript to Optimize the AI Automation Code
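One way to keep the workflow style maintainable is to wrap the loop in a function that receives the agent as a parameter. The sketch below is illustrative only: `tapIncompleteRecords` and `createStubAgent` are hypothetical names, not part of Midscene.js, and the stub answers from a fixed list instead of calling a model, so the control flow can be checked without a browser.

```javascript
// Workflow-style helper: taps every record that is not marked "completed".
// `agent` is any object exposing aiQuery, aiBoolean and aiTap, as used above.
async function tapIncompleteRecords(agent) {
  const tapped = [];
  const recordList = await agent.aiQuery('string[], the record list');
  for (const record of recordList) {
    const hasCompleted = await agent.aiBoolean(
      `check if the record "${record}" contains the text "completed"`
    );
    if (!hasCompleted) {
      await agent.aiTap(record);
      tapped.push(record);
    }
  }
  return tapped;
}

// Hypothetical stand-in for a real agent (an assumption for illustration).
// It returns canned answers so the loop logic can be exercised offline.
function createStubAgent(records) {
  const taps = [];
  return {
    taps,
    aiQuery: async () => records,
    aiBoolean: async (prompt) => {
      // Inspect only the quoted record name inside the prompt.
      const match = prompt.match(/record "(.*?)" contains/);
      return match !== null && match[1].includes('completed');
    },
    aiTap: async (target) => {
      taps.push(target);
    },
  };
}
```

With a real agent, the same helper would presumably be called as `await tapIncompleteRecords(agent)`. Keeping the branching in plain JavaScript means only the extraction, the check, and the tap depend on the model.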
There are so many UI automation tools out there, and each one seems to be all-powerful. What’s special about Midscene.js?
Debugging Experience: You will soon realize that debugging and maintaining automation scripts is the real challenge. No matter how magical the demo looks, ensuring stability over time requires careful debugging. Midscene.js offers a visualized report file, a built-in playground, and a Chrome Extension to simplify the debugging process. These are the tools most developers truly need, and we’re continually working to improve the debugging experience.
Open Source, Free, Deploy as you want: Midscene.js is an open-source project. It’s decoupled from any cloud service and model provider, so you can choose either public or private deployment. There is always a suitable plan for your business.
Integrate with JavaScript: You can always bet on JavaScript 😎
We would like to thank the following projects:
If you use Midscene.js in your research or project, please cite:
@software{Midscene.js,
  author = {Xiao Zhou and Tao Yu and YiBing Lin},
  title = {Midscene.js: Your AI Operator for Web, Android, Automation \& Testing.},
  year = {2025},
  publisher = {GitHub},
  url = {https://github.com/web-infra-dev/midscene}
}
Midscene.js is MIT licensed.