How Zipline AI Turns Weeks of Engineering Into Minutes of SQL Queries ft. Nikhil Simha
What if you could deploy ML features and real-time data pipelines without building complex infrastructure from scratch?
In this episode, host Benjamin sits down with Nikhil Simha, CTO at Zipline AI and co-author of Chronon, to explore how Chronon, an open-source system that generates data infrastructure from simple queries, is transforming feature engineering at companies like OpenAI and Airbnb. Learn why iteration speed matters for fraud detection, how to serve thousands of signals at massive scale, and what the future of analytical databases looks like in an AI-first world. Whether you're scaling real-time ML systems or building customer-facing analytics, this conversation is packed with practical insights on bridging the gap between data scientists and ML engineers.
In this episode of The Data Engineering Show, host Benjamin sits down with Nikhil Simha, CTO of Zipline AI and co-author of Chronon, to explore how a declarative feature platform solves the speed-vs-scale paradox in modern ML infrastructure, from fraud detection at Airbnb to powering OpenAI's recommendation systems.
What You'll Learn:
- How to eliminate the data scientist-to-ML engineer bottleneck by generating Spark, Flink, and orchestration pipelines automatically from simple SQL queries, enabling data scientists to ship features independently without waiting for engineering resources
- Why fraud detection demands real-time feature iteration: The adversarial nature of fraud requires companies to build and deploy new detection models in days, not months, a timeline that is impossible with manual pipeline engineering
- The "precompute everything" optimization principle for serving latency: Chronon minimizes query response time by batching feature computation upstream through stream and batch processing, then delivering pre-aggregated signals to models in milliseconds
- How to safely ship feature versions in production using dual-write strategies that keep old and new feature versions running simultaneously, enabling A/B testing and instant rollbacks without service disruption
- Why context engineering, not just RAG, powers modern LLM applications: ML model predictions (fraud risk scores, user signals, embeddings) feed directly into LLM prompts as structured context, improving decision quality for both human and AI agents
- The critical gap in open-source data infrastructure: Modern systems need query engines that scale seamlessly from single-machine to distributed clusters - today's choice between lightweight tools (DuckDB) and heavyweight platforms (Spark) leaves mid-scale and product-embedded analytics underserved
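To make the "precompute everything" and dual-write points above concrete, here is a minimal Python sketch. It is not Chronon's API: the `FeatureStore` class, the `precompute` helper, the event shape, and the version names are all hypothetical, and an in-memory dict stands in for a key-value store such as Redis. The idea it illustrates is that aggregation happens upstream, every feature version is written side by side, and serving is a plain key lookup.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# Hypothetical event shape: (user_id, amount). Not Chronon's data model.
Event = Tuple[str, float]
Aggregator = Callable[[Iterable[float]], float]


def precompute(events: Iterable[Event], agg: Aggregator) -> Dict[str, float]:
    """Aggregate events per user ahead of time (the upstream batch/stream step)."""
    by_user: Dict[str, List[float]] = defaultdict(list)
    for user_id, amount in events:
        by_user[user_id].append(amount)
    return {user_id: agg(amounts) for user_id, amounts in by_user.items()}


class FeatureStore:
    """In-memory stand-in for a key-value serving store (assumption)."""

    def __init__(self, active_version: str) -> None:
        self._data: Dict[Tuple[str, str], float] = {}
        self.active_version = active_version

    def dual_write(self, events: Iterable[Event],
                   aggregators: Dict[str, Aggregator]) -> None:
        # Every version is precomputed and stored side by side,
        # so old and new definitions stay available at the same time.
        materialized = list(events)
        for version, agg in aggregators.items():
            for user_id, value in precompute(materialized, agg).items():
                self._data[(version, user_id)] = value

    def serve(self, user_id: str, version: Optional[str] = None) -> Optional[float]:
        # Serving is a single lookup; no aggregation runs at request time.
        return self._data.get((version or self.active_version, user_id))


events = [("u1", 10.0), ("u1", 30.0), ("u2", 5.0)]
store = FeatureStore(active_version="v1")
store.dual_write(events, {"v1": sum, "v2": max})

total_v1 = store.serve("u1")            # old definition: total spend
largest_v2 = store.serve("u1", "v2")    # new definition: largest purchase
store.active_version = "v2"             # promote the new version
promoted = store.serve("u1")
store.active_version = "v1"             # rolling back is one assignment
rolled_back = store.serve("u1")
```

Because both versions are already materialized, comparing them in an A/B test or rolling back needs no recomputation. A production system would also have to handle time windows, late-arriving events, and storage cost, which this sketch ignores.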
About the Guest(s)
Nikhil Simha is the CTO at Zipline AI, bringing extensive experience from leadership roles at Airbnb and Facebook. He is a co-author of Chronon, an open-source feature engineering platform that automates the generation of ML infrastructure from declarative queries. With deep expertise in real-time data systems, fraud detection, and feature engineering at scale, Nikhil has architected solutions powering recommendation systems and risk detection across billions of user interactions. In this episode, he shares insights on building scalable ML infrastructure, integrating LLMs with real-time feature contexts, and the evolving data engineering landscape. His work has directly impacted how organizations from early-stage startups to Fortune 500 companies approach feature engineering and real-time ML serving, making this conversation essential for engineers building production AI systems.
Quotes
"Fraud is adversarial. Right? Like, someone comes up with a new way to do fraud somewhere around the world, and people at Airbnb need to react to it very quickly." - Nikhil
"Chronon, at its core, generates these systems from queries. So users write queries on Chronon, and we generate all of these under the hood." - Nikhil
"Chronon allows data scientists to operate independently." - Nikhil
"The main problem there was that the traditional model of data scientists writing some logic and ML engineers going and billing system out for that logic, that was too slow for fraud detection." - Nikhil
"They have to come up with a new model in a matter of days. They don't have, like, this three to five month period where they can sit and create the new model, build all of these pipelines." - Nikhil
"There is a real gap in the industry for an engine that goes all the way from single machine scale to thousands of machine scale seamlessly." - Nikhil
"Most people, for ninety-five percent of their queries, don't need Spark in RPA. Right? But there is that 5% usually, like, a lot of ML falls into that." - Nikhil
"We are handling query fragments. Right? We take query fragments, generate very specialized logic for that, and run that through Spark's distributed processing topologies." - Nikhil
"The new trend in the industry would be, like, towards these engines that can work at any scale and be useful for interactive and large processing workloads." - Nikhil
"I think Iceberg is great that way because you're not fragmenting to different proprietary data formats, different proprietary engines." - Nikhil
Resources
Tools & Platforms:
- Chronon – Feature engineering and real-time ML infrastructure platform for generating data pipelines from queries
- Apache Spark – Distributed data processing engine for batch and large-scale processing workloads
- Apache Flink – Stream processing engine for real-time data transformations
- Redis – In-memory key-value store for feature serving
- Apache Iceberg – Open table format for data lake storage
- Apache Airflow – Workflow orchestration platform for pipeline scheduling
- DuckDB – Open-source, in-process analytical database for single-machine processing
- BigQuery – Google Cloud data warehouse
- Snowflake – Cloud-based data warehouse platform
- Kubernetes – Container orchestration platform