
Founded by Josh Purtell
Josh recently worked at Basis, a startup automating accounting tasks with AI agents, as a researcher developing and maintaining pipelines and agents used to serve top accounting firms. While there, he had the opportunity to publish academic research on the topic of agent optimization at EMNLP, a respected AI conference.
Before Basis, he and some friends built a startup applying ML to cybersecurity, which was acqui-hired right out of the gate for a modest amount. Before that, he studied math and machine learning at Yale.
Teams want to deploy software that automates economically meaningful tasks over multiple steps. Iterating on those systems is hard: an agent that takes 100 steps to complete a task will often generate a newspaper’s worth of text in a single run, so reviewing a representative evaluation dataset can be daunting. Moreover, getting language models to demonstrate context-specific agentic behavior, such as following strong plans and making the best use of their environment, is challenging with manual prompt tweaking alone.
Working as an agent researcher at a startup automating accounting tasks, Josh struggled to parse the reams of logs his systems generated every time he wanted to test a potential improvement. Manually tweaking dozens of prompts and reasoning about how changes would propagate through the system was also difficult and, for many problems, ineffective. As agents are deployed in more challenging applications and to more customers, these problems will grow. To build state-of-the-art systems, agent developers need a tool that scales with their agent’s complexity and scope.
Synth works in 3 steps:
The Synth team worked with a startup automating code generation to raise its agent’s success rate on an editing task from ~85% to 100% on a representative evaluation set.
Developers and researchers have been building AI agents with language models for years, but only somewhat recently have base models provided a strong enough foundation to enable the most ambitious applications.
Moreover, some of the most powerful and consistent approaches for tuning agents to address domain-specific challenges, the ones powering Synth, have only been published in the last year or so. The market is ready for the Synth solution, as is the research literature.
