Let AI be your browser operator.
English | 简体中文
Your AI Operator for Web, Android, Automation & Testing
Midscene.js allows AI to serve as your web and Android operator. Simply describe what you want to achieve in natural language, and it will help you operate the interface, validate content, and extract data. Whether you want a quick trial or in-depth development, you'll find it easy to get started.
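To make the three kinds of work mentioned above (operating the interface, extracting data, validating content) more concrete, here is a minimal sketch. The `UiOperator` interface, its method names, and the `RecordingOperator` stub are hypothetical and exist only for illustration; they are not the Midscene.js API, so check the project's own documentation for the real calls.

```typescript
// Hypothetical interface for illustration only; NOT the actual Midscene.js API.
interface UiOperator {
  act(instruction: string): Promise<void>;        // operate the interface
  extract<T>(description: string): Promise<T>;    // extract structured data
  check(assertion: string): Promise<void>;        // validate content
}

// Stub that only records the prompts it receives. A real operator would
// send each prompt plus a screenshot to a model and then drive the UI.
class RecordingOperator implements UiOperator {
  readonly log: string[] = [];
  private readonly cannedData: unknown;

  constructor(cannedData: unknown) {
    this.cannedData = cannedData;
  }

  async act(instruction: string): Promise<void> {
    this.log.push(`act: ${instruction}`);
  }

  async extract<T>(description: string): Promise<T> {
    this.log.push(`extract: ${description}`);
    // The stub returns whatever data it was constructed with.
    return this.cannedData as T;
  }

  async check(assertion: string): Promise<void> {
    this.log.push(`check: ${assertion}`);
  }
}

// A task written purely as natural-language steps.
async function searchAndCollect(op: UiOperator): Promise<string[]> {
  await op.act('type "headphones" in the search box and press Enter');
  const titles = await op.extract<string[]>(
    "string[], the titles of the items in the result list",
  );
  await op.check("the result list shows at least one item");
  return titles;
}
```

Because the task depends only on the interface, the same script could be pointed at a browser-backed or Android-backed operator without changing the steps.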
Instruction | Video |
---|---|
Post a Tweet (By UI-TARS model) | |
Use JS code to drive task orchestration, collect information about Jay Chou's concert, and write it into Google Docs (By UI-TARS model) | |
Control Maps App on Android (By Qwen2.5-VL model) | |
Besides the default model GPT-4o, we have added two new recommended open-source models to Midscene.js: UI-TARS and Qwen2.5-VL. (Yes, open-source models!) They are dedicated models for image recognition and UI automation and are known for performing well in UI automation scenarios. Read more in Choose a model.
You can use multimodal LLMs like `gpt-4o`, or visual-language models like `Qwen2.5-VL`, `gemini-2.5-pro`, and `UI-TARS`. Of these, `UI-TARS` is an open-source model dedicated to UI automation.
Read more in Choose a model.
There are so many UI automation tools out there, and each one seems to be all-powerful. What's special about Midscene.js?
Debugging Experience: You will soon realize that debugging and maintaining automation scripts is the real challenge. No matter how magical the demo looks, ensuring stability over time requires careful debugging. Midscene.js offers a visualized report file, a built-in playground, and a Chrome Extension to simplify the debugging process. These are the tools most developers truly need, and we're continually working to improve the debugging experience.
Open Source, Free, Deploy as you want: Midscene.js is an open-source project. It's decoupled from any cloud service and model provider, so you can choose either public or private deployment. There is always a suitable plan for your business.
Integrate with JavaScript: You can always bet on JavaScript.
We would like to thank the following projects:
If you use Midscene.js in your research or project, please cite:
@software{Midscene.js,
author = {Xiao Zhou and Tao Yu and YiBing Lin},
title = {Midscene.js: Your AI Operator for Web, Android, Automation \& Testing.},
year = {2025},
publisher = {GitHub},
url = {https://github.com/web-infra-dev/midscene}
}
Midscene.js is MIT licensed.