The Data Engineering Show
How Zipline AI Turns Weeks of Engineering Into Minutes of SQL Queries ft. Nikhil Simha
March 10, 2026
What if you could deploy ML features and real-time data pipelines without building complex infrastructure from scratch? In this episode, host Benjamin sits down with Nikhil Simha, CTO at Zipline AI and co-author of Chronon, to explore how Chronon, an open-source system that generates data infrastructure from simple queries, is transforming feature engineering at companies like OpenAI and Airbnb. Learn why iteration speed matters for fraud detection, how to serve thousands of signals at massive scale, and what the future of analytical databases looks like in an AI-first world. Whether you're scaling real-time ML systems or building customer-facing analytics, this conversation is packed with practical insights on bridging the gap between data scientists and ML engineers.
In this episode of The Data Engineering Show, host Benjamin sits down with Nikhil Simha, CTO of Zipline AI and co-author of Chronon, to explore how a declarative feature platform solves the speed-vs-scale paradox in modern ML infrastructure, from fraud detection at Airbnb to powering OpenAI's recommendation systems.


What You'll Learn:









If you enjoyed this episode, make sure to subscribe, rate, and review it on Apple Podcasts, Spotify, and YouTube Podcasts. Instructions on how to do this are here: https://www.fame.so/follow-rate-review


About the Guest(s)


Nikhil Simha is the CTO at Zipline AI, bringing extensive experience from leadership roles at Airbnb and Facebook. He is a co-author of Chronon, an open-source feature engineering platform that automates the generation of ML infrastructure from declarative queries. With deep expertise in real-time data systems, fraud detection, and feature engineering at scale, Nikhil has architected solutions powering recommendation systems and risk detection across billions of user interactions. In this episode, he shares insights on building scalable ML infrastructure, integrating LLMs with real-time feature contexts, and the evolving data engineering landscape. His work has directly impacted how organizations from early-stage startups to Fortune 500 companies approach feature engineering and real-time ML serving, making this conversation essential for engineers building production AI systems.


Quotes


"Fraud is adversarial. Right? Like, someone comes up with a new way to do fraud somewhere around the world, and people at Airbnb need to react to it very quickly." - Nikhil

"Chronon, at its core, generates these systems from queries. So users write queries on Chronon, and we generate all of these under the hood." - Nikhil

"Chronon allows data scientists to operate independently." - Nikhil

"The main problem there was that the traditional model of data scientists writing some logic and ML engineers going and building a system out for that logic, that was too slow for fraud detection." - Nikhil

"They have to come up with a new model in a matter of days. They don't have, like, this three to five month period where they can sit and create the new model, build all of these pipelines." - Nikhil

"There is a real gap in the industry for an engine that goes all the way from single machine scale to thousands of machine scale seamlessly." - Nikhil

"Most people, for 95% of their queries, don't need Spark in RPA. Right? But there is that 5% usually, like, a lot of ML falls into that." - Nikhil

"We are handling query fragments. Right? We take query fragments, generate very specialized logic for that, and run that through Spark's distributed processing topologies." - Nikhil

"The new trend in the industry would be, like, towards these engines that can work at any scale and be useful for interactive and large processing workloads." - Nikhil

"I think Iceberg is great that way because you're not fragmenting to different proprietary data formats, different proprietary engines." - Nikhil


Resources
 

Connect on LinkedIn:


Websites:



Tools & Platforms:



The Data Engineering Show is brought to you by firebolt.io and handcrafted by our friends over at: fame.so

Previous guests include: Joseph Machado of LinkedIn, Matthew Weingarten of Disney, Joe Reis and Matt Housley, authors of Fundamentals of Data Engineering, Zach Wilson of Eczachly Inc, Megan Lieu of Deepnote, Erik Heintare of Bolt, Lior Solomon of Vimeo, Krishna Naidu of Canva, Mike Cohen of Substack, Jens Larsson of Ark, Gunnar Tangring of Klarna, Yoav Shmaria of Similarweb and Xiaoxu Gao of Adyen.
