ShinkaEvolve produced algorithms that found a state-of-the-art Circle Packing solution.
We introduce ShinkaEvolve, an evolutionary code optimization framework that discovers new algorithms with LLMs and achieves unprecedented sample efficiency. Above, the left panel shows ShinkaEvolve's progress on the challenging Circle Packing task, and the right panel visualizes the path of the evolutionary program search to the best program.
Summary
At Sakana AI, we are inspired by nature’s principles of evolution and collective intelligence to build the future of artificial intelligence. Evolution in nature is a masterful search algorithm, creating sophisticated solutions over millennia. In our work, from Evolutionary Model Merge, LLM², The AI Scientist, Automating the Search for Artificial Life, to the Darwin Gödel Machine, our consistent theme is to bring this incredible search algorithm to AI-driven discovery.
High-level overview of ShinkaEvolve.
The ShinkaEvolve framework constructs an archive of evaluated programs, generates new programs, and evaluates their fitness.

ShinkaEvolve provides a sample-efficient alternative to AlphaEvolve and outperforms its Circle Packing solution.
Modern evolutionary approaches using LLMs (e.g. AlphaEvolve) have shown great promise for scientific discovery. However, they suffer from a critical limitation: they are incredibly sample inefficient, often requiring thousands of attempts to find good solutions. This makes them slow, expensive, and inaccessible to many. We wanted to change that.
In our new work, “ShinkaEvolve: Towards Open-Ended And Sample-Efficient Program Evolution,” we introduce a new framework that leverages LLMs to evolve programs and discover new solutions with state-of-the-art performance and incredible efficiency.
We have published our technical report (https://arxiv.org/abs/2509.19349), and open-sourced our project. The code is incredibly easy to use, and we encourage you to try it out yourself:
GitHub Project: https://github.com/SakanaAI/ShinkaEvolve
Paper: https://arxiv.org/abs/2509.19349
Introducing ShinkaEvolve
The Japanese word ‘Shinka’ (進化) means ‘evolution’ or ‘gradual development’. ShinkaEvolve is an open-source framework (Apache 2.0 License) designed from the ground up to tackle the critical limitations of existing approaches: poor sample efficiency and their closed-source nature. We tested ShinkaEvolve across four completely different domains, and the results demonstrate its power, generality, and efficiency:
1. Mathematical Optimization (Circle Packing)
ShinkaEvolve discovered a new state-of-the-art solution for the classic 26-circle packing problem using only 150 samples. As you can see in the chart from our paper, this is a massive leap in efficiency compared to prior work. The discovered algorithm is a sophisticated hybrid of a golden-angle spiral initialization, gradient-based refinement, and simulated annealing to escape local optima.
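To give a sense of what such a hybrid looks like, here is a minimal, illustrative sketch of a golden-angle spiral initialization for circle centers in the unit square. This is not the evolved program itself: the constants, the gradient-based refinement, and the annealing schedule of the discovered algorithm are described in the paper and repository, and the function name and `spread` parameter below are placeholders.

```python
import numpy as np

GOLDEN_ANGLE = np.pi * (3.0 - np.sqrt(5.0))  # ~2.39996 rad (~137.5 degrees)

def golden_spiral_init(n_circles: int = 26, center=(0.5, 0.5), spread: float = 0.42):
    """Illustrative sketch: place initial circle centers on a golden-angle spiral.

    Only the initialization stage is shown; the evolved program additionally
    applies gradient-based refinement of centers/radii and simulated annealing
    to escape local optima (see the paper for the full algorithm).
    """
    idx = np.arange(n_circles)
    # Radial distance grows with sqrt(i) so points are roughly area-uniform.
    r = spread * np.sqrt((idx + 0.5) / n_circles)
    theta = idx * GOLDEN_ANGLE
    xs = center[0] + r * np.cos(theta)
    ys = center[1] + r * np.sin(theta)
    # Clip to keep all centers strictly inside the unit square.
    return np.clip(np.stack([xs, ys], axis=1), 0.01, 0.99)

centers = golden_spiral_init()
print(centers.shape)  # (26, 2) initial centers; radii are then optimized
```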

Circle Packing: ShinkaEvolve's discovered solution outperforms AlphaEvolve's solution in only 150 generations.

Evolution tree of programs produced by ShinkaEvolve over several generations.

ShinkaEvolve produced algorithms that found this state-of-the-art Circle Packing solution.
2. Agentic System Design (AIME Math Reasoning)
We tasked ShinkaEvolve with designing an agent scaffold for solving challenging math competition problems. Using only 75 generations, it evolved a highly effective three-stage architecture that uses diverse expert personas, critical peer review, and a final synthesis stage. This design significantly outperforms strong baselines on the AIME Math benchmark. Furthermore, the scaffold successfully generalizes to unseen problems from different years and even different underlying LLMs.
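For intuition, the following is a hedged sketch of the general shape of such a three-stage scaffold, not the exact evolved design: `query_llm`, the personas, and the prompts are illustrative placeholders, and the real scaffold's prompts and query budget are described in the paper.

```python
# Hypothetical sketch of a three-stage scaffold: diverse expert personas,
# critical peer review, and a final synthesis step. `query_llm` is a
# placeholder for whatever LLM client the scaffold is built on.

PERSONAS = [
    "a combinatorics specialist who enumerates cases carefully",
    "a geometry expert who looks for coordinate or vector arguments",
    "an algebraist who favors clever substitutions and invariants",
]

def solve_with_scaffold(problem: str, query_llm) -> str:
    # Stage 1: independent drafts from diverse expert personas.
    drafts = [
        query_llm(f"You are {p}. Solve the problem step by step:\n{problem}")
        for p in PERSONAS
    ]
    # Stage 2: critical peer review of all candidate solutions.
    review = query_llm(
        "Critically review these candidate solutions, pointing out errors "
        "and the most promising ideas:\n" + "\n\n".join(drafts)
    )
    # Stage 3: synthesize a single final answer from drafts and review.
    return query_llm(
        f"Problem:\n{problem}\n\nCandidate solutions:\n" + "\n\n".join(drafts)
        + f"\n\nReview:\n{review}\n\nWrite the final, corrected solution."
    )
```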

AIME Agent Scaffold Design: Evolving Agentic Systems for Math Reasoning. ShinkaEvolve illuminates the space of agent scaffold designs and discovers a Pareto frontier that trades off math reasoning performance and the number of LLM queries.

The agent that ShinkaEvolve discovered successfully generalizes to unseen problems from different years and even different underlying LLMs.
3. Competitive Programming (ALE-Bench)
We took the best solutions discovered by a state-of-the-art agent (ALE-Agent) in AtCoder Heuristic Contests, competitive programming contests for NP-hard optimization problems, and used ShinkaEvolve to improve them further. It successfully found improvements across multiple tasks, boosting the agent's average performance. On one task, the improvement was so significant that the evolved solution would have placed 2nd had it been submitted to the competition. ShinkaEvolve introduced clever enhancements to the ALE-Agent solution, such as advanced caching and novel "targeted edge move" operators.

Competitive Programming: ShinkaEvolve improves ALE-Agent initial solutions for AtCoder heuristic programming competitions.

On average, performance improves by 2.3% using the best public score (top-1, evolved fitness), and ShinkaEvolve shows limited overfitting to the public test cases.

Results on the AtCoder (AHC015) optimization problem. Left: ALE-Agent solution (score: 762,641). Right: Shinka Agent solution (score: 817,371).
4. LLM Training Design (Mixture-of-Experts)
In a domain very relevant to the core of LLMs, ShinkaEvolve discovered a novel load balancing loss (LBL) function for training Mixture-of-Experts (MoE) models. After only 30 ShinkaEvolve generations, a newly discovered loss function outperformed the state-of-the-art “Global LBL” designed by the DeepSeek team, improving efficiency and downstream accuracy on seven benchmarks and generalizing to larger MoEs with 5 times more active parameters. Given the time and resource cost of training LLMs, ShinkaEvolve’s efficiency was again a critical component of this discovery, showing its potential even for improving the core training strategies of the future generations of AI models themselves.
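For context, a load balancing loss penalizes routers that send a disproportionate share of tokens to a few experts. The sketch below implements the standard auxiliary load-balancing loss that Global LBL and related variants build on; it is a common baseline for illustration only, not ShinkaEvolve's discovered loss, and the top-k routing shown is an assumption.

```python
import torch
import torch.nn.functional as F

def load_balancing_loss(router_logits: torch.Tensor, top_k: int = 2) -> torch.Tensor:
    """Standard auxiliary MoE load-balancing loss (baseline sketch, not the evolved loss).

    router_logits: [num_tokens, num_experts] raw router scores.
    Returns num_experts * sum_i f_i * P_i, where f_i is the fraction of routed
    token slots assigned to expert i and P_i is the mean router probability
    mass on expert i.
    """
    num_experts = router_logits.shape[-1]
    probs = F.softmax(router_logits, dim=-1)                         # [T, E]
    topk_idx = probs.topk(top_k, dim=-1).indices                     # [T, k]
    dispatch = F.one_hot(topk_idx, num_experts).float().sum(dim=1)   # [T, E]
    f = dispatch.mean(dim=0) / top_k   # fraction of routed slots per expert
    p = probs.mean(dim=0)              # mean router probability per expert
    return num_experts * torch.sum(f * p)

logits = torch.randn(128, 8)           # 128 tokens, 8 experts
aux = load_balancing_loss(logits)      # add (coefficient * aux) to the LM loss
```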

Improved LLM Training: Towards Balanced and Effective Expert Allocation. ShinkaEvolve improves on state-of-the-art load balancing loss functions with reduced inefficient token routing (-5.81%) and higher task performance (+1.73% on average). The effect of this loss adaptively activates only for particularly imbalanced LLM layers and vanishes elsewhere.
Speeding Up Agentic Evolution
Evolutionary algorithms are not known for their sample efficiency. Many evolutionary AI systems are powerful but act like brute-force (random search) engines, burning thousands of samples to find good solutions. This makes discovery slow and expensive. ShinkaEvolve achieves its remarkable sample efficiency through three key innovations that work together:
- Balancing exploration and exploitation: A program parent sampling technique that intelligently balances exploiting known good solutions and exploring new ideas.
- Novelty-Based Program Rejection Sampling: Rejection sampling based on code novelty avoids wasting evaluations on minor, uninteresting variations of existing programs. We compute program text embedding similarities and, for near-duplicates, use an LLM-as-novelty-judge to assess whether a proposal is genuinely creative (see the sketch after this list).
- Task-dependent Language Model Prioritization: A bandit-based strategy that dynamically selects the best-suited LLM from an ensemble for the task at hand, adapting its choices as the evolution progresses.
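As a concrete illustration of the novelty filter, here is a minimal sketch assuming an `embed` function that maps code text to a vector and an `llm_judge` callable acting as the LLM-as-novelty-judge; both names and the similarity threshold are illustrative, not the framework's defaults.

```python
import numpy as np

SIMILARITY_THRESHOLD = 0.95  # illustrative value, not the framework default

def is_novel(candidate_code: str, archive_codes: list[str], embed, llm_judge) -> bool:
    """Hypothetical sketch of novelty-based rejection sampling.

    `embed` maps code text to an embedding vector; `llm_judge` takes the
    candidate and a near-duplicate archive program and returns True if the
    candidate still introduces a meaningfully new idea.
    """
    if not archive_codes:
        return True
    c = np.asarray(embed(candidate_code))
    for code in archive_codes:
        a = np.asarray(embed(code))
        cosine = float(c @ a / (np.linalg.norm(c) * np.linalg.norm(a) + 1e-9))
        if cosine >= SIMILARITY_THRESHOLD:
            # Near-duplicate by embedding similarity: defer to the LLM judge
            # before spending an evaluation on this proposal.
            return llm_judge(candidate_code, code)
    return True
```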
We ran comprehensive ablation experiments to study each individual methodological contribution of ShinkaEvolve. Please read our paper for more details.

ShinkaEvolve Method Ablations on the Circle Packing task: The introduced stepping stone collection dynamics of novelty weighted sampling, adaptive LLM prioritization, and novelty rejection sampling all contribute to ShinkaEvolve's sample efficiency.
A Co-Pilot for Scientists and Engineers
Our competitive programming results demonstrate that ShinkaEvolve can work in tandem with other agents as a co-improver. In the long run, our vision for ShinkaEvolve is to be an easy-to-use companion tool that helps scientists and engineers with their daily work. For instance, in its current form, the tool can assist machine learning researchers and engineers in their research and development work.

ShinkaEvolve generates a search summary document consisting of previous code proposal summaries, a set of insights outlined in a scratchpad, and recommendations for future solution approaches. This can aid humans in achieving further improvements.
In order to further aid this process, we release ShinkaEvolve along with an interactive WebUI for visualizing solutions and discovery runs.

ShinkaEvolve comes with an interactive web interface for monitoring and analyzing program evolution results.
The Future: Open Sourcing a Scientific Discovery Engine
We believe ShinkaEvolve can be a powerful tool for researchers and practitioners across the world. By drastically reducing the computational and LLM query cost and providing an open, extensible framework, we hope to accelerate discovery across a wide range of scientific and engineering problems.

Our system in action.
Looking ahead, we see ShinkaEvolve as a foundation for future extensions and even more ambitious goals. Our framework can leverage models from any AI provider, and its performance and efficiency already improve with the latest AI releases such as GPT-5 and Claude 4.1.
Furthermore, a compelling future extension is to go beyond human-defined performance metrics, having our framework generate its own problems and provide preliminary assessments of its solutions, so it can tackle broader domains like medicine and design, where success cannot be easily measured.
By open-sourcing ShinkaEvolve, we release a powerful, general-purpose discovery engine that we hope can empower the community and accelerate future breakthroughs beyond our work.
We are excited to see what the community builds with it!
Sakana AI
Want to make the AI that improves AI? Please see our Careers page for more information.