Scale By The Bay 2021: Tudor Bosman, Karen Li, How We Built SQL Rollups on Streaming Data

Rockset is a real-time analytics database for serving fast search and analytics at scale. We built SQL rollups in Rockset that can pre-aggregate data from streaming sources, like Apache Kafka and Amazon Kinesis. Using rollups can improve storage efficiency and query performance. Over the course of this project, we encountered multiple challenges in building SQL rollups on streaming data, including:

  • supporting exactly-once write semantics
  • executing SQL on streaming data at ingest time
  • processing out-of-order arrivals correctly

In this talk, we discuss how we overcame these technical challenges to implement rollups.
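
To make the three challenges concrete, here is a minimal sketch in Python of a rollup applied at ingest time. It is not Rockset's implementation: the `Event` and `Rollup` names, the per-partition offset check used for deduplication, and the one-minute bucket size are all assumptions made for illustration. The aggregation is roughly what a GROUP BY on time bucket and key with COUNT and SUM would produce.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical record from a partitioned stream (names are assumptions)."""
    partition: int
    offset: int        # position within the partition, increasing
    event_time: int    # seconds since epoch, set by the producer
    key: str
    value: float


class Rollup:
    """Toy in-memory rollup: COUNT and SUM per (time bucket, key)."""

    def __init__(self, bucket_seconds: int = 60):
        self.bucket_seconds = bucket_seconds
        self.aggregates = defaultdict(lambda: {"count": 0, "sum": 0.0})
        self.last_offset = {}  # partition -> highest offset already applied

    def ingest(self, event: Event) -> bool:
        # Exactly-once: skip any event at or below the last applied offset,
        # so a redelivery after a retry is not counted twice.
        if event.offset <= self.last_offset.get(event.partition, -1):
            return False

        # Out-of-order arrivals: bucket by event time, not arrival time,
        # so a late event still updates the bucket it belongs to.
        bucket = event.event_time - event.event_time % self.bucket_seconds

        # Aggregation at ingest time: only the aggregate row is stored.
        agg = self.aggregates[(bucket, event.key)]
        agg["count"] += 1
        agg["sum"] += event.value

        # In a durable system, this offset and the aggregate update would
        # need to be committed atomically; here both simply live in memory.
        self.last_offset[event.partition] = event.offset
        return True


rollup = Rollup(bucket_seconds=60)
rollup.ingest(Event(partition=0, offset=0, event_time=120, key="a", value=1.0))
rollup.ingest(Event(partition=0, offset=1, event_time=65, key="a", value=2.0))  # late event
rollup.ingest(Event(partition=0, offset=1, event_time=65, key="a", value=2.0))  # redelivery, ignored
```

The sketch keeps one aggregate row per bucket and key instead of one row per event, which is where the storage and query savings of a rollup come from.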