
Microsoft today announced Fara-7B, its first agentic small language model (SLM) designed for computer use.
“Unlike traditional chat models that generate text-based responses, Computer Use Agent (CUA) models like Fara-7B leverage computer interfaces, such as a mouse and keyboard, to complete tasks on behalf of users,” Microsoft Research explains. “With only 7 billion parameters, Fara-7B achieves state-of-the-art performance within its size class and is competitive with larger, more resource-intensive agentic systems that depend on prompting multiple large models. Fara-7B’s small size now makes it possible to run CUA models directly on devices. This results in reduced latency and improved privacy, as user data remains local.”
Fara-7B supports creating non-research agentic experiences, meaning real-world experiences like filling out web forms, searching, booking travel, managing accounts, and other everyday web tasks. But it’s also an experimental release that Microsoft will use to invite the public to experiment and provide feedback. So the recommendation is that you use it in a sandboxed environment, avoid using sensitive data, and basically just use the model responsibly as it evolves.
Microsoft trained Fara-7B using “a novel synthetic data generation pipeline for multi-step web tasks” and drawing “from real web pages and tasks sourced from human users.” It basically visually perceives a webpage and then takes actions on it, like scrolling, typing, and clicking on individual elements.
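To make that perceive-then-act cycle a bit more concrete, here is a minimal Python sketch of what such a loop could look like. It is not Fara-7B’s actual interface: every name in it (`Action`, `FakeBrowser`, `scripted_policy`, `run_agent`) is hypothetical, the “browser” only records actions, and the “model” is a hard-coded plan standing in for a real CUA model that would look at a screenshot.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    """One UI action. All fields are illustrative, not a real Fara-7B schema."""
    kind: str          # "click", "type", "scroll", or "done"
    target: str = ""   # which element the action applies to
    text: str = ""     # text to enter, for "type" actions


@dataclass
class FakeBrowser:
    """Stand-in for a real browser: it records actions instead of performing them."""
    log: List[Action] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real agent would capture pixels; this placeholder is just a label.
        return f"screen-after-{len(self.log)}-actions"

    def apply(self, action: Action) -> None:
        self.log.append(action)


def scripted_policy(task: str, screenshot: str, history: List[Action]) -> Action:
    """Placeholder for the model: replays a fixed plan and ignores the screenshot."""
    plan = [
        Action("click", target="search box"),
        Action("type", target="search box", text=task),
        Action("scroll"),
        Action("done"),
    ]
    return plan[min(len(history), len(plan) - 1)]


Policy = Callable[[str, str, List[Action]], Action]


def run_agent(task: str, browser: FakeBrowser, policy: Policy,
              max_steps: int = 10) -> List[Action]:
    """Observe the page, ask the policy for one action, apply it, and repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        action = policy(task, browser.screenshot(), history)
        if action.kind == "done":
            break
        browser.apply(action)
        history.append(action)
    return history


if __name__ == "__main__":
    for step in run_agent("cheap flights to Lisbon", FakeBrowser(), scripted_policy):
        print(step)
```

The `max_steps` cap is a design choice in this sketch, not something Microsoft describes: it just keeps a misbehaving policy from acting forever, which fits the advice to run the model in a sandbox.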
Compared to larger, cloud-based models, Fara-7B apparently performs pretty well, and it’s been optimized for some tasks, like price comparisons and locating job postings, that Microsoft says are underrepresented in benchmarks.
You can access Fara-7B now on Microsoft Foundry and Hugging Face, and there’s a version specially optimized for the NPU in Copilot+ PCs too, which Microsoft says you can access via AI Toolkit in Visual Studio Code.