switchboard
Agent TARS

by ByteDance

ByteDance's multimodal AI agent for desktop and browser automation. It controls GUIs, browsers, and the shell through vision and MCP tools, and is available as an npm CLI and a desktop app.

Skills: 4
Auth: None
Streaming: Yes
Push: No

Skills

GUI Control

Control desktop application interfaces using vision-based element detection, clicking, and keyboard input.

Browser Agent

Navigate websites, fill forms, click elements, and extract data from web pages autonomously via the browser.

Shell Execution

Execute shell commands and scripts as part of multi-step agent workflows with streamed output support.

MCP Tool Integration

Connect to local or remote MCP servers and invoke their tools as steps in multi-tool agent pipelines.

Categories: Browser & Computer Use, Infrastructure & Ops
Tags: computer-use, gui-agent, multimodal, desktop-automation, mcp-tools, browser-control, bytedance
agent-tars
fields
name: Agent TARS
provider: ByteDance
url: https://github.com/bytedance/UI-TARS-desktop
categories: browser-computer · infrastructure
access: cli
auth: none
streaming: true
push: false
verified: true
tags: computer-use, gui-agent, multimodal, desktop-automation, mcp-tools, browser-control, bytedance
skills
gui-control: GUI Control - Control desktop application interfaces using vision-bas…
browser-agent: Browser Agent - Navigate websites, fill forms, click elements, and extr…
shell-execution: Shell Execution - Execute shell commands and scripts as part of multi-ste…
mcp-tool-integration: MCP Tool Integration - Connect to local or remote MCP servers and invoke their…
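
The fields and skills listed above map naturally onto a small typed record. The following is a minimal sketch of one way to model the entry in Python. The class names `AgentCard` and `Skill` and the helper `has_skill` are hypothetical and chosen for illustration; they are not part of any switchboard or Agent TARS API. Field names and values are copied from the listing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    """One skill entry: a slug-style id and a display name (hypothetical type)."""
    id: str
    name: str


@dataclass(frozen=True)
class AgentCard:
    """Hypothetical record mirroring the 'fields' and 'skills' sections of the listing."""
    name: str
    provider: str
    url: str
    access: str
    auth: str
    streaming: bool
    push: bool
    verified: bool
    categories: tuple = ()
    tags: tuple = ()
    skills: tuple = ()

    def has_skill(self, skill_id: str) -> bool:
        """Return True if a skill with the given id is listed on this card."""
        return any(s.id == skill_id for s in self.skills)


# Values below are taken directly from the listing.
AGENT_TARS = AgentCard(
    name="Agent TARS",
    provider="ByteDance",
    url="https://github.com/bytedance/UI-TARS-desktop",
    access="cli",
    auth="none",
    streaming=True,
    push=False,
    verified=True,
    categories=("browser-computer", "infrastructure"),
    tags=(
        "computer-use", "gui-agent", "multimodal", "desktop-automation",
        "mcp-tools", "browser-control", "bytedance",
    ),
    skills=(
        Skill("gui-control", "GUI Control"),
        Skill("browser-agent", "Browser Agent"),
        Skill("shell-execution", "Shell Execution"),
        Skill("mcp-tool-integration", "MCP Tool Integration"),
    ),
)
```

A caller could then filter a collection of such cards with a check like `card.streaming and card.has_skill("browser-agent")`. Freezing the dataclasses keeps a card immutable once loaded, which suits read-only directory data.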