OpenAI’s new Operator Agent Can Do Your Work
Including the latest AI news of the week
Hello, AI Enthusiasts!
Welcome to FavTutor’s AI Recap! We’ve gathered all the latest and most important AI developments from the past 24 hours in one place, just for you.
In Today’s Newsletter: 😀
OpenAI’s new Operator Agent Can Do Your Work
Hugging Face open-sources the World’s Smallest Vision Model
AMD’s Agent Laboratory transforms LLMs into Research Assistants
OpenAI
👋 OpenAI’s new Operator Agent Can Do Your Work
I told you about this feature yesterday, and it has now launched. OpenAI’s Operator is an AI assistant that can navigate the web independently, powered by a new Computer-Using Agent (CUA) model. Operator is currently available only to US-based ChatGPT Pro users.
Insights for you:
OpenAI has launched Operator, an AI agent capable of autonomously navigating and interacting with websites through GPT-4o’s vision capabilities and actions like typing, clicking, and scrolling.
Users can describe a task, such as filling out forms or ordering groceries, and the agent will complete it independently. It handles the task in a separate browser window within the ChatGPT interface.
If it encounters challenges, it can leverage its reasoning capabilities to self-correct. When it gets stuck, it simply hands control back to the user. A rough sketch of this observe-act loop is shown below.
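To make that loop concrete, here is a minimal sketch of how a computer-using agent of this kind could be wired up. OpenAI has not published Operator’s internals or an API, so every name here (Action, propose_action, the agent and browser objects) is hypothetical; the point is only the screenshot-in, action-out cycle described above.

```python
# Hypothetical sketch of a computer-using-agent loop in the style OpenAI
# describes for Operator: the model sees a screenshot, proposes one GUI
# action, the harness executes it, and the loop repeats. All names here
# are illustrative, not OpenAI's actual API.
from dataclasses import dataclass

@dataclass
class Action:
    kind: str          # "click", "type", "scroll", or "done"
    x: int = 0         # screen coordinates for clicks
    y: int = 0
    text: str = ""     # keystrokes for "type" actions

def run_task(agent, browser, task: str, max_steps: int = 50) -> None:
    """Drive a browser until the agent reports the task is done."""
    for _ in range(max_steps):
        screenshot = browser.screenshot()              # raw pixels, no DOM access
        action = agent.propose_action(task, screenshot)
        if action.kind == "done":
            return
        if action.kind == "click":
            browser.click(action.x, action.y)
        elif action.kind == "type":
            browser.type(action.text)
        elif action.kind == "scroll":
            browser.scroll(action.y)
    # Mirrors Operator's described behavior of handing control back when stuck.
    raise TimeoutError("Step budget exhausted; returning control to the user")
```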
Start learning AI in 2025
Everyone talks about AI, but no one has the time to learn it. So, we found the easiest way to learn AI in as little time as possible: The Rundown AI.
It's a free AI newsletter that keeps you up to date on the latest AI news and teaches you how to apply it in just 5 minutes a day.
Plus, complete the quiz after signing up and they’ll recommend the best AI tools, guides, and courses – tailored to your needs.
Hugging Face
🤏 Hugging Face open-sources the World’s Smallest Vision Model
Hugging Face has introduced the world’s smallest vision-language model, SmolVLM-256M, which runs on devices as small as smartphones while outperforming far larger predecessors that required massive data centers. SmolVLM-256M can answer questions about scanned documents, describe videos, and explain charts.
Insights for you:
Hugging Face open-sourced SmolVLM-256M, a new vision-language model with the lowest parameter count in its category.
They released two models: SmolVLM-256M and SmolVLM-500M, which are 256 million parameters and 500 million parameters in size, respectively.
The models are designed to work well on constrained hardware, including laptops with less than 1GB of RAM. According to Hugging Face, they could potentially run in browsers as well, powered by WebGPU. A minimal usage sketch follows below.
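For a sense of how small the barrier to entry is, here is a minimal sketch of querying SmolVLM-256M with the Hugging Face transformers library, following the usage pattern documented on the model card. The image path "chart.png" is a placeholder; swap in any local image.

```python
# Minimal SmolVLM-256M query via Hugging Face transformers.
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

MODEL_ID = "HuggingFaceTB/SmolVLM-256M-Instruct"

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForVision2Seq.from_pretrained(MODEL_ID)  # CPU is enough at 256M params

image = Image.open("chart.png")  # placeholder path
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Explain what this chart shows."},
        ],
    }
]

# Build the chat prompt, bundle it with the image, and generate an answer.
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```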
AMD
🧪 AMD’s Agent Laboratory transforms LLMs into Research Assistants
Johns Hopkins University and AMD have developed Agent Laboratory, a new open-source framework that pairs human creativity with AI-powered workflows. Unlike other AI tools that try to come up with research ideas on their own, Agent Laboratory focuses on helping scientists carry out their research more efficiently.
Insights for you:
Agent Laboratory, an open-source framework developed by AMD and Johns Hopkins University, combines human ideation with AI-driven workflows to accelerate machine learning research.
The workflow is divided into three main phases: literature search by a PhD agent, creation of a detailed research plan by PhD and postdoc agents, and implementation and execution of experiments by an ML engineer agent.
When the experiments are complete, PhD and professor agents work together to write up the findings, using a tool called paper-solver to generate and refine a comprehensive academic report. A simplified sketch of the full pipeline follows.
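For a rough picture of how those phases chain together, here is a hypothetical sketch of the pipeline. The role names mirror the paper’s description, but call_llm and all function signatures are illustrative, not Agent Laboratory’s actual code.

```python
# Hypothetical sketch of Agent Laboratory's three-phase workflow:
# literature review, plan formulation, then experiments and report writing.
def call_llm(role: str, prompt: str) -> str:
    """Stand-in for a chat-completion call with a role-specific system prompt."""
    raise NotImplementedError  # wire up your LLM provider here

def literature_review(topic: str) -> str:
    # Phase 1: the "PhD" agent surveys related work.
    return call_llm("phd student", f"Summarize recent papers relevant to: {topic}")

def formulate_plan(topic: str, review: str) -> str:
    # Phase 2: "PhD" and "postdoc" agents draft and refine a research plan.
    draft = call_llm("phd student", f"Propose experiments for {topic}.\nContext:\n{review}")
    return call_llm("postdoc", f"Refine this plan for feasibility:\n{draft}")

def run_experiments(plan: str) -> str:
    # Phase 3: the "ML engineer" agent implements and executes the experiments.
    return call_llm("ml engineer", f"Write code, run it, and report results for:\n{plan}")

def write_report(topic: str, results: str) -> str:
    # Write-up: "PhD" and "professor" agents iterate on the report
    # (the paper's paper-solver step, reduced here to one refinement pass).
    draft = call_llm("phd student", f"Draft a research report on {topic}:\n{results}")
    return call_llm("professor", f"Edit this draft for clarity and rigor:\n{draft}")

def agent_laboratory(topic: str) -> str:
    review = literature_review(topic)
    plan = formulate_plan(topic, review)
    results = run_experiments(plan)
    return write_report(topic, results)
```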