Microsoft Releases SkillOpt, Letting AI Agents Learn Without Retraining Models

Microsoft has introduced SkillOpt, an open-source framework that can improve the capabilities of AI agents without changing the weights of their underlying models.

According to a VentureBeat report cited on Monday, June 15, SkillOpt is designed to improve the "skills" of AI agents. AI agents are artificial intelligence systems that can perform certain tasks independently, such as writing code, reading documents, or using digital tools.

A skill is a collection of instructions stored in a Markdown document (.md). Its contents can range from work rules and output formats to tool-usage guidance and steps for avoiding errors.

Until now, AI agent skills have usually been improved by hand. Developers change the instructions one by one, and the process often amounts to guessing which sentence makes the AI more accurate and which one actually lowers performance.

SkillOpt tries to make the process more measurable. Microsoft treats skill documents as objects that can be trained. The system reviews the work of AI agents, finds recurring error patterns, and then proposes changes to the instructions.

But the changes are not applied immediately. SkillOpt tests them first. If performance goes up, the change is accepted. If it goes down, the change is rejected and saved as a bad example so that it is not repeated.
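The accept-or-reject loop described above can be sketched in a few lines of Python. This is only an illustrative sketch, not SkillOpt's actual code: the names `optimize_skill`, `propose`, and `evaluate` are hypothetical, the toy scoring function stands in for a real benchmark, and treating a tie as a rejection is an assumption.

```python
from typing import Callable, List, Tuple


def optimize_skill(
    skill: str,
    propose: Callable[[str, List[str]], str],
    evaluate: Callable[[str], float],
    rounds: int = 5,
) -> Tuple[str, float, List[str]]:
    """Hypothetical accept/reject loop over a skill document.

    propose(skill, rejected) returns a candidate skill text;
    evaluate(skill) returns a validation score (higher is better).
    """
    best_score = evaluate(skill)
    rejected: List[str] = []  # "negative memory" of candidates that did not help

    for _ in range(rounds):
        candidate = propose(skill, rejected)
        if candidate == skill or candidate in rejected:
            continue  # nothing new to test, or already known to be bad
        score = evaluate(candidate)
        if score > best_score:
            skill, best_score = candidate, score  # accept the change
        else:
            rejected.append(candidate)  # reject and remember (ties included)
    return skill, best_score, rejected


# Toy stand-ins: the score counts how many required rules the skill contains.
RULES = ["Return JSON.", "Cite the cell range."]


def toy_evaluate(text: str) -> float:
    return float(sum(rule in text for rule in RULES))


def toy_propose(text: str, rejected: List[str]) -> str:
    # Try an unhelpful line first, then the useful ones, skipping known failures.
    for line in ["Be verbose.", *RULES]:
        candidate = text + "\n" + line
        if line not in text and candidate not in rejected:
            return candidate
    return text


final, score, rejected = optimize_skill(
    "Use the spreadsheet tool.", toy_propose, toy_evaluate
)
```

In this toy run the unhelpful "Be verbose." line never reaches the final skill, because every candidate containing it fails to raise the score and is stored in the rejected list.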

Yifan Yang, Senior Research SDE at Microsoft Research Asia, said the main problem is not just changing skills, but ensuring that the change actually improves performance.

"The problem is not whether the team can change the skill, but they can't guarantee that the change is an improvement," Yang told VentureBeat.

According to Yang, there are three sources of problems. Changes can go too far, go unvalidated, or reintroduce old mistakes because the system has no "negative memory".

As an example, he cited an untested change to the instructions that lowered the GPT-5.5 score on SpreadsheetBench from 41.8 to 41.1.

SkillOpt uses a principle similar to deep learning. It limits the number of changes, runs validation tests, and includes mechanisms to preserve learning that has proven useful. The difference is that SkillOpt does not touch the weights of the AI model. Model weights are the core parameters that determine how an AI model works.
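One way to picture the limit on changes is as a cap on how many lines a single proposal may alter, loosely analogous to a small step size in model training. The sketch below is an assumption made for illustration, not SkillOpt's actual mechanism: `changed_lines` and `within_budget` are hypothetical helpers built on Python's standard `difflib` module, and the default budget of two lines is arbitrary.

```python
import difflib


def changed_lines(old: str, new: str) -> int:
    """Count lines added or removed between two versions of a skill."""
    diff = difflib.ndiff(old.splitlines(), new.splitlines())
    return sum(1 for line in diff if line.startswith(("+ ", "- ")))


def within_budget(old: str, new: str, max_changes: int = 2) -> bool:
    """Hypothetical gate: only small edits are allowed through to validation."""
    return changed_lines(old, new) <= max_changes


old_skill = "Use the tool.\nReturn JSON."
small_edit = old_skill + "\nCite the cell range."
full_rewrite = "Ignore everything.\nWrite prose.\nBe long."

small_ok = within_budget(old_skill, small_edit)
rewrite_ok = within_budget(old_skill, full_rewrite)
```

Here the one-line addition passes the gate while the full rewrite does not, so only the small edit would go on to be tested.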

In the tests cited by VentureBeat, Microsoft tried SkillOpt on a range of models, from GPT-5.5 to GPT-5.4-mini and Qwen3.5-4B. The tests covered question answering, code generation with tools, and multimodal document reasoning, i.e. reasoning over documents that combine text and images.

SkillOpt improved performance on all 52 combinations of models, benchmarks, and work environments tested. Benchmarks are standard tests that measure the capabilities of AI models. On GPT-5.5, the average improvement reached 23.5 points compared with running without a skill.

Small models also benefit greatly. GPT-5.4-nano nearly doubled its score on multimodal document question answering and doubled its performance on a sequential decision-making task.

For companies, the technology is attractive because AI agents are still prone to mistakes on tasks that matter: extracting numbers from contracts, invoices, and forms; maintaining formats; using tools correctly; and producing outputs that can be audited.

Yang said the improvement did not come from the AI memorizing answers. The system got better because it learned the work procedure.

Skills produced by SkillOpt can also be moved between environments. A spreadsheet skill trained on the Codex CLI, for example, can be used in Claude Code and produce a 59.7-point increase over Claude Code's built-in capabilities.

For the business world, SkillOpt offers a way to make AI agents more disciplined, consistent, and easy to audit without retraining the underlying models, which is usually expensive and complicated.