
spark-kafka

You can also turn a Kafka topic into a Spark RDD

Spark-kafka is a library that facilitates batch loading data from Kafka into Spark, and from Spark into Kafka.

This library does not provide a Kafka Input DStream for Spark Streaming. For that, please take a look at the spark-streaming-kafka library that is part of Spark itself.

This could come in handy for pre-ingesting data to build up some history before connecting to a Kafka data stream with Spark Streaming.
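To make the pre-ingest idea concrete, here is a minimal sketch of a bounded read of a topic's existing contents. Note that it does not use the spark-kafka library's own API, which isn't shown here. It swaps in the Kafka batch source that ships with Spark SQL (the `spark-sql-kafka-0-10` package, assumed to be on the classpath). The helper names `kafka_batch_options` and `load_topic_history` are hypothetical, as are the broker address and topic name in the usage note.

```python
from typing import Dict


def kafka_batch_options(bootstrap_servers: str, topic: str) -> Dict[str, str]:
    """Build reader options for a bounded (batch) read of a single Kafka topic.

    Hypothetical helper for illustration; the option keys are the ones
    Spark's built-in Kafka source understands.
    """
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        # Read from the oldest retained offset up to the newest offset
        # available when the query starts.
        "startingOffsets": "earliest",
        "endingOffsets": "latest",
    }


def load_topic_history(spark, bootstrap_servers: str, topic: str):
    """Load the topic's current contents and return them as an RDD of rows.

    `spark` is assumed to be an existing SparkSession with the Kafka
    source package available.
    """
    df = (
        spark.read.format("kafka")
        .options(**kafka_batch_options(bootstrap_servers, topic))
        .load()
    )
    # Kafka keys and values arrive as binary; cast them to strings here.
    return df.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)").rdd
```

With a running session, something like `load_topic_history(spark, "localhost:9092", "events")` would give an RDD of the topic's backlog to build history from, before a separate streaming job picks up new messages.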

© 2008-2024 C. Ross Jam. Built using Pelican. Theme based upon Giulio Fidente’s original svbhack, and slightly modified by crossjam.