Spark Deployment

Wow! I didn’t realize how easy it was to run a Spark/Mesos cluster on Amazon EC2:

The spark-ec2 script, located in Spark’s ec2 directory, allows you to launch, manage and shut down Spark clusters on Amazon EC2. It automatically sets up Mesos, Spark and HDFS on the cluster for you. This guide describes how to use spark-ec2 to launch clusters, how to run jobs on them, and how to shut them down. It assumes you’ve already signed up for an EC2 account on the Amazon Web Services site.

Might be fun to just spin one up and run a job against some Common Crawl data.
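
For my own future reference, here is a rough sketch of the launch / login / shut down cycle the guide describes, written as a small Python helper that only builds the command lines and prints them (it does not run anything). The `-k`, `-i`, `-s` flags and the `launch`, `login`, `destroy` actions are as I recall them from the spark-ec2 guide, so check them against your Spark version. The helper function names, the keypair name, the key file, and the cluster name are all made up for illustration.

```python
import shlex

# Script in Spark's ec2 directory, per the guide; run from that directory.
SPARK_EC2 = "./spark-ec2"


def launch_cmd(keypair, key_file, num_slaves, cluster):
    """Command line to launch a cluster with the given number of slave nodes."""
    return [SPARK_EC2, "-k", keypair, "-i", key_file,
            "-s", str(num_slaves), "launch", cluster]


def login_cmd(keypair, key_file, cluster):
    """Command line to SSH into the cluster's master node."""
    return [SPARK_EC2, "-k", keypair, "-i", key_file, "login", cluster]


def destroy_cmd(cluster):
    """Command line to terminate the cluster's instances."""
    return [SPARK_EC2, "destroy", cluster]


if __name__ == "__main__":
    # Hypothetical names: substitute your own EC2 keypair, key file and cluster.
    keypair, key_file, cluster = "my-keypair", "my-key.pem", "crawl-test"
    for cmd in (launch_cmd(keypair, key_file, 2, cluster),
                login_cmd(keypair, key_file, cluster),
                destroy_cmd(cluster)):
        print(shlex.join(cmd))
```

Printing rather than executing keeps this safe to paste around; swapping `print(shlex.join(cmd))` for `subprocess.run(cmd, check=True)` would actually run each step, and would also start costing EC2 money until the destroy step completes.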

© 2008-2024 C. Ross Jam. Built using Pelican. Theme based upon Giulio Fidente’s original svbhack, and slightly modified by crossjam.