An Apache Spark installation comes bundled with the spark-ec2 script, which makes it easy to launch Spark clusters on Amazon EC2. This recipe covers launching a Spark cluster on EC2 using this script.
- Log in to your Amazon AWS account.
- Click on Security Credentials under your account name in the top-right corner.
- Click on Access Keys, and then on Create New Access Key.
- Note down the access key ID and the secret access key.
- Now navigate to Services | EC2.
- Click on Key Pairs in the left-hand menu under NETWORK & SECURITY.
- Click on Create Key Pair and enter kp-spark as the key-pair name.
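If you prefer the command line over the console, the same key pair can be created with the AWS CLI. This is a sketch of an alternative to the console steps above; it assumes the aws tool is installed and already configured with your credentials:

```shell
# Create the kp-spark key pair and save its private key material to a file.
# (Alternative to the console steps; assumes a configured AWS CLI.)
aws ec2 create-key-pair --key-name kp-spark \
    --query 'KeyMaterial' --output text > kp-spark.pem
```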
- Download the private key file and copy it to the /home/hduser/keypairs folder.
- Restrict permissions on the key file to 600 (owner read/write only), as ssh refuses keys with looser permissions.
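The two file-handling steps above can be sketched as follows; the download location ~/Downloads/kp-spark.pem is an assumption, so adjust it to wherever your browser saved the key:

```shell
# Move the downloaded private key into place and lock down its permissions.
KEYDIR=/home/hduser/keypairs
mkdir -p "$KEYDIR"
cp ~/Downloads/kp-spark.pem "$KEYDIR/"   # assumed download path
chmod 600 "$KEYDIR/kp-spark.pem"         # owner read/write only
```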
- Set environment variables to reflect the access key ID and secret access key (replace the sample values with your own):
$ echo "export AWS_ACCESS_KEY_ID=\"AKIAOD7M2LOWATFXFKQ\"" >> /home/hduser/.bashrc
$ echo "export AWS_SECRET_ACCESS_KEY=\"+Xr4UroVYJxiLiY8DLT4DLT4D4sxc3ijZGMx1D3pfZ2q\"" >> /home/hduser/.bashrc
$ echo "export PATH=$PATH:/opt/infoobjects/spark/ec2" >> /home/hduser/.bashrc
- Launch the cluster with the example values:
$ spark-ec2 -k kp-spark -i /home/hduser/keypairs/kp-spark.pem --hadoop-major-version 2 -s 3 launch spark-cluster
Here, -k names the EC2 key pair, -i points to the private key file, --hadoop-major-version 2 selects Hadoop 2, -s 3 requests three slave nodes, and spark-cluster is the name given to the cluster.
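Once the cluster is up, spark-ec2's other subcommands manage its lifecycle. The commands below are a sketch of typical usage against the cluster launched above; they assume the same key pair and cluster name, and a running AWS account:

```shell
# SSH into the master node of the running cluster.
spark-ec2 -k kp-spark -i /home/hduser/keypairs/kp-spark.pem login spark-cluster

# Stop the instances without destroying them (they can be restarted later).
spark-ec2 stop spark-cluster

# Permanently terminate the cluster when you are done, to stop incurring charges.
spark-ec2 destroy spark-cluster
```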
For more details about this recipe, please read Spark Cookbook by Packt Publishing.