Kafka Summit - questions about real-time updates on Rockset

Hi!

I was at the Kafka Summit and stopped by your booth. There was a question about real-time updates, but I came in at the tail end of it. How do real-time updates work with Rockset? The other question I had was about the lower-level details of how the converged index works and how it differs from other databases like Druid.

Thanks
–ash–

Hi @ashn !

Thanks for stopping by the booth. Yeah, those questions came in during my demo around 11am!

Here’s how real-time updates work on Rockset :timer_clock:
Rockset makes it possible to provide low latency and high write rates with RocksDB-Cloud and remote compaction. Developers also save on compute and I/O with incremental indexing using the Patch API. Overall, real-time insights with millisecond query latency allow applications to move faster.

To elaborate more on the PATCH API: :arrow_heading_down:
The PATCH API solves a problem other databases have: re-indexing the whole document on every update. At Rockset, we use the PATCH API to index the collection incrementally. When a document in a collection is updated, only the fields that changed are reindexed, and the rest of the fields in that document are left untouched. That is what makes updates efficient in compute and I/O.
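To make the difference concrete, here is a minimal in-memory sketch in Python. It is not Rockset's implementation or its API: `ToyIndexedCollection`, `replace`, `patch`, and the `index_writes` counter are all hypothetical names made up for illustration. It only shows why touching one field costs fewer index writes than rewriting the whole document.

```python
from collections import defaultdict


class ToyIndexedCollection:
    """Hypothetical toy model: one inverted-index entry per (field, value)."""

    def __init__(self):
        self.docs = {}                  # doc_id -> {field: value}
        self.index = defaultdict(set)   # (field, value) -> {doc_id, ...}
        self.index_writes = 0           # index entries added or removed

    def _index_field(self, doc_id, field, value):
        self.index[(field, value)].add(doc_id)
        self.index_writes += 1

    def _unindex_field(self, doc_id, field, value):
        self.index[(field, value)].discard(doc_id)
        self.index_writes += 1

    def insert(self, doc_id, doc):
        self.docs[doc_id] = dict(doc)
        for field, value in doc.items():
            self._index_field(doc_id, field, value)

    def replace(self, doc_id, new_doc):
        """Full-document update: every field is unindexed, then indexed again."""
        for field, value in self.docs[doc_id].items():
            self._unindex_field(doc_id, field, value)
        self.insert(doc_id, new_doc)

    def patch(self, doc_id, changes):
        """Field-level update: only the fields named in `changes` are touched."""
        doc = self.docs[doc_id]
        for field, value in changes.items():
            if field in doc:
                self._unindex_field(doc_id, field, doc[field])
            doc[field] = value
            self._index_field(doc_id, field, value)


def demo():
    """Apply the same one-field change both ways and compare index writes."""
    order = {"customer": "ash", "status": "pending", "city": "SF", "total": 42}

    full = ToyIndexedCollection()
    full.insert("o1", order)
    full.index_writes = 0
    full.replace("o1", {**order, "status": "shipped"})

    incremental = ToyIndexedCollection()
    incremental.insert("o1", order)
    incremental.index_writes = 0
    incremental.patch("o1", {"status": "shipped"})

    return full, incremental
```

Calling `demo()` returns both toy collections. They end up holding the same document, but the `index_writes` counter on the patched one only reflects the single `status` field, while the replaced one pays for all four fields.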

How converged indexing works and how it differs from other databases
Converged Index = Row + Column + Search. :smiley:

We store every column of every document in a row-based store, column-based store, AND a search index. :ok_hand:

Converged indexing requires more space on disk, but our queries are faster. We trade off storage for CPU. More importantly, we trade off hardware for human time: humans no longer need to configure indexes, and humans no longer need to wait on slow queries. This is very different from databases where you have to choose and tune indexes yourself.
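As a rough illustration of "Row + Column + Search", the Python sketch below stores the same small set of documents in three toy layouts. The function name `build_converged_index` and the dictionary-based structures are assumptions for illustration only and say nothing about how Rockset lays data out on disk.

```python
from collections import defaultdict


def build_converged_index(docs):
    """docs: {doc_id: {field: value}}. Returns three toy layouts of the same data."""
    row_store = {}                      # doc_id -> full document
    column_store = defaultdict(dict)    # field -> {doc_id: value}
    search_index = defaultdict(set)     # (field, value) -> {doc_id, ...}
    for doc_id, doc in docs.items():
        row_store[doc_id] = dict(doc)
        for field, value in doc.items():
            column_store[field][doc_id] = value
            search_index[(field, value)].add(doc_id)
    return row_store, column_store, search_index


docs = {
    1: {"city": "SF", "amount": 10},
    2: {"city": "NYC", "amount": 25},
    3: {"city": "SF", "amount": 5},
}
rows, columns, search = build_converged_index(docs)

# Point lookup: fetch one whole document from the row store.
doc_2 = rows[2]

# Aggregation: scan a single column without reading the other fields.
total_amount = sum(columns["amount"].values())

# Selective filter: jump straight to the matching doc ids via the search index.
sf_ids = sorted(search[("city", "SF")])
```

Every value is written three times, which is the extra disk space mentioned above. In exchange, each of the three query shapes reads from the layout that suits it.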

We also built a custom SQL query optimizer that analyzes every query and decides on the execution plan. For example, it decides when to use the columnar index vs. the search index.
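Here is a deliberately simplified Python sketch of that kind of decision. It is not Rockset's optimizer: the function `choose_index`, the idea of deciding purely on estimated selectivity, and the `0.1` threshold are all assumptions chosen to illustrate the trade-off.

```python
def choose_index(matching_docs, total_docs, selectivity_threshold=0.1):
    """Toy heuristic: pick "search" for highly selective filters, else "columnar".

    matching_docs: estimated number of documents the filter keeps.
    total_docs: number of documents in the collection.
    selectivity_threshold: hypothetical cut-off, not a real Rockset setting.
    """
    if total_docs <= 0:
        raise ValueError("total_docs must be positive")
    selectivity = matching_docs / total_docs
    # Few matches: look them up directly in the search index.
    # Many matches: scanning the column is cheaper than many lookups.
    return "search" if selectivity <= selectivity_threshold else "columnar"


needle_plan = choose_index(matching_docs=50, total_docs=1_000_000)
scan_plan = choose_index(matching_docs=400_000, total_docs=1_000_000)
```

In this sketch a filter that keeps 50 of a million documents goes to the search index, and one that keeps 400,000 goes to the columnar index. A real optimizer would weigh far more than a single ratio.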

I hope this helps answer your questions. Here are more resources to get you started:

Please let me know if you have more. I hope you had a great time at the conference!

N

@nadine shared a lot of great pointers. Here is a bit more useful context:

How do real-time updates work with Rockset?

I assume you are asking about updates here and not inserts. Rockset has a document data model and stores data in collections. Every field in every document is mutable, and you can update them using the Patch API. When you update a field in Rockset, only that field gets re-indexed, not the entire document as happens in other indexing systems such as Elasticsearch.

The other question I had was about the lower-level details of how the converged index works and how it differs from other databases like Druid.

As far as I know, no other database has anything similar to Converged Indexing. Elasticsearch builds inverted indexes on the entire data set, and columnar databases store everything in one columnar format and do not support secondary indexing, but nothing out there tries to combine the power of the two along with a fully featured SQL engine. Rockset relies heavily on RocksDB-Cloud (a log-structured merge tree optimized to run in cloud environments) to make real-time indexing efficient and scalable.

For more details on Rockset vs. Apache Druid, please take a look at: https://rockset.com/comparisons/rockset-vs-apache-druid
