How Zipline AI Turns Weeks of Engineering Into Minutes of SQL Queries ft. Nikhil Simha

In this episode of The Data Engineering Show, host Benjamin sits down with Nikhil Simha, CTO of Zipline AI and co-author of Chronon, to explore how a declarative feature platform solves the speed-vs-scale paradox in modern ML infrastructure, from fraud detection at Airbnb to powering OpenAI's recommendation systems.

What You'll Learn:

How to eliminate the data scientist-to-ML engineer bottleneck by generating Spark, Flink, and orchestration pipelines automatically from simple SQL queries, enabling data scientists to ship features independently without waiting for engineering resources

Why fraud detection demands real-time feature iteration: The adversarial nature of fraud requires companies to build and deploy new detection models in days, not months, a timeline that is impossible with manual pipeline engineering

The "precompute everything" optimization principle for serving latency: Chronon minimizes query response time by batching feature computation upstream through stream and batch processing, then delivering pre-aggregated signals to models in milliseconds

How to safely ship feature versions in production using dual-write strategies that keep old and new feature versions running simultaneously, enabling A/B testing and instant rollbacks without service disruption

Why context engineering, not just RAG, powers modern LLM applications: ML model predictions (fraud risk scores, user signals, embeddings) feed directly into LLM prompts as structured context, improving decision quality for both human and AI agents

The critical gap in open-source data infrastructure: Modern systems need query engines that scale seamlessly from a single machine to distributed clusters. Today's choice between lightweight tools (DuckDB) and heavyweight platforms (Spark) leaves mid-scale and product-embedded analytics underserved
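
To make the "precompute everything" principle above more concrete, here is a minimal Python sketch. It is not Chronon's API: the event data, the function names (`precompute_features`, `serve_features`), and the in-memory dictionary standing in for a key-value store are all assumptions made purely for illustration. The idea is that aggregation happens ahead of time, so the serving path is only a key lookup.

```python
from collections import defaultdict

# Hypothetical raw events as (user_id, transaction_amount) pairs.
EVENTS = [
    ("u1", 20.0),
    ("u2", 5.0),
    ("u1", 30.0),
]


def precompute_features(events):
    """Aggregate raw events into per-user features ahead of request time."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for user_id, amount in events:
        totals[user_id] += amount
        counts[user_id] += 1
    return {
        user_id: {"txn_count": counts[user_id], "txn_total": totals[user_id]}
        for user_id in totals
    }


# A plain dict stands in for a key-value store in this sketch.
FEATURE_STORE = precompute_features(EVENTS)


def serve_features(user_id):
    """Serving path: one key lookup, with no aggregation at request time."""
    return FEATURE_STORE.get(user_id, {"txn_count": 0, "txn_total": 0.0})
```

In a real deployment the precompute step would run in batch and streaming jobs rather than a single function call, but the split between an expensive upstream aggregation and a cheap lookup at serving time is the same.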

If you enjoyed this episode, make sure to subscribe, rate, and review it on Apple Podcasts, Spotify, and YouTube Podcasts. Instructions on how to do this are here: https://www.fame.so/follow-rate-review

About the Guest(s)

Nikhil Simha is the CTO at Zipline AI, bringing extensive experience from leadership roles at Airbnb and Facebook. He is a co-author of Chronon, an open-source feature engineering platform that automates the generation of ML infrastructure from declarative queries. With deep expertise in real-time data systems, fraud detection, and feature engineering at scale, Nikhil has architected solutions powering recommendation systems and risk detection across billions of user interactions. In this episode, he shares insights on building scalable ML infrastructure, integrating LLMs with real-time feature contexts, and the evolving data engineering landscape. His work has directly impacted how organizations from early-stage startups to Fortune 500 companies approach feature engineering and real-time ML serving, making this conversation essential for engineers building production AI systems.

Quotes

"Fraud is adversarial. Right? Like, someone comes up with a new way to do fraud somewhere around the world, and people at Airbnb need to react to it very quickly." - Nikhil

"Chronon, at its core, generates these systems from queries. So users write queries on Chronon, and we generate all of these under the hood." - Nikhil

"Chronon allows data scientists to operate independently." - Nikhil

"The main problem there was that the traditional model of data scientists writing some logic and ML engineers going and building systems out for that logic, that was too slow for fraud detection." - Nikhil

"They have to come up with a new model in a matter of days. They don't have, like, this three to five month period where they can sit and create the new model, build all of these pipelines." - Nikhil

"There is a real gap in the industry for an engine that goes all the way from single machine scale to thousands of machine scale seamlessly." - Nikhil

"Most people, for ninety-five percent of their queries, don't need Spark in RPA. Right? But there is that 5% usually, like, a lot of ML falls into that." - Nikhil

"We are handling query fragments. Right? We take query fragments, generate very specialized logic for that, and run that through Spark's distributed processing topologies." - Nikhil

"The new trend in the industry would be, like, towards these engines that can work at any scale and be useful for interactive and large processing workloads." - Nikhil

"I think Iceberg is great that way because you're not fragmenting to different proprietary data formats, different proprietary engines." - Nikhil

Resources

Connect on LinkedIn:

Nikhil Simha - https://www.linkedin.com/in/nikhilsimha
Benjamin Wagner - https://www.linkedin.com/in/wagjamin

Websites:

Zipline AI – zipline.ai
Firebolt – firebolt.io

Tools & Platforms:

Chronon – Feature engineering and real-time ML infrastructure platform for generating data pipelines from queries
Apache Spark – Distributed data processing engine for batch and large-scale processing workloads
Apache Flink – Stream processing engine for real-time data transformations
Redis – In-memory key-value store for feature serving
Apache Iceberg – Open table format for data lake storage
Airflow – Workflow orchestration platform for pipeline scheduling
DuckDB – Open-source analytical database for single-machine to moderate-scale processing
BigQuery – Google Cloud data warehouse
Snowflake – Cloud-based data warehouse platform
Kubernetes – Container orchestration platform