Trident, Realtime Big Data
Trident is a new high-level abstraction for doing realtime computing on top of Twitter Storm, available in Storm 0.8.0 (released today). It allows you to seamlessly mix high-throughput (millions of messages per second), stateful stream processing with low-latency distributed querying. If you’re familiar with high-level batch processing tools like Pig or Cascading, the concepts of Trident will be very familiar: Trident has joins, aggregations, grouping, functions, and filters. In addition to these, Trident adds primitives for doing stateful, incremental processing on top of any database or persistence store. Trident has consistent, exactly-once semantics, which makes Trident topologies easy to reason about.
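To give a feel for what exactly-once state updates can mean in practice, here is a minimal sketch in plain Python. It is not Trident's API. The `TransactionalCounts` class and its methods are hypothetical names made up for illustration, and the sketch assumes that batches are applied in order and that a replayed batch carries the same transaction id and the same tuples as the original attempt. Under those assumptions, storing the id of the last applied batch next to each count lets a replay be detected and skipped.

```python
from collections import Counter


class TransactionalCounts:
    """Hypothetical in-memory store that keeps (last_txid, count) per key."""

    def __init__(self):
        self._data = {}  # word -> (txid of last applied batch, running count)

    def apply_batch(self, txid, words):
        # Aggregate within the batch first, then update each key once.
        for word, n in Counter(words).items():
            last_txid, count = self._data.get(word, (None, 0))
            if last_txid == txid:
                # This batch was already applied to this key: skip the replay.
                continue
            self._data[word] = (txid, count + n)

    def get(self, word):
        # Unknown words have a count of zero.
        return self._data.get(word, (None, 0))[1]


store = TransactionalCounts()
store.apply_batch(1, ["cow", "dog", "cow"])
store.apply_batch(1, ["cow", "dog", "cow"])  # replay of batch 1, ignored
store.apply_batch(2, ["dog"])
```

After these three calls, `store.get("cow")` and `store.get("dog")` each return 2, even though batch 1 was delivered twice. A real persistence store would need the read and write of each key to be atomic, which this in-memory sketch does not address.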
We’re really excited about Trident and believe it is a major step forward in Big Data processing. It builds upon Storm’s foundation to make realtime computation as easy as batch computation.
I’m sure Trident has some sharp pointy edges, but it also looks like a bag of fun.