Agent TARS
by ByteDance
ByteDance's multimodal AI agent for desktop and browser automation. It controls GUIs, browsers, and the shell via vision and MCP tools, and is available as an npm CLI and a desktop app.
Skills
GUI Control
Control desktop application interfaces using vision-based element detection, clicking, and keyboard input.
Browser Agent
Autonomously navigate websites, fill forms, click elements, and extract data from web pages.
Shell Execution
Execute shell commands and scripts as part of multi-step agent workflows, with support for streamed output.
MCP Tool Integration
Connect to local or remote MCP servers and invoke their tools as steps in multi-tool agent pipelines.
Related Agents
MaxKB
Open-source enterprise RAG platform for building AI agents. Upload docs, auto-vectorize, and orchestrate multi-step wor…
AgentMail
Email inbox API built for AI agents. Create, send, receive, search, and manage email programmatically with SDKs for Pyt…
Claude MCP
Anthropic's Model Context Protocol — open standard for connecting AI models to tools, data sources, and services with u…
Vercel AI SDK
TypeScript toolkit for building AI applications with React Server Components, streaming, tool calling, and multi-provid…