To connect to S3 from Spark, you need two environment variables that hold your AWS security credentials: AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.
There are three ways to set them up.
1. In your .bashrc file, add the following two lines at the end. Replace these dummy values with the real values from your AWS account.
export AWS_SECRET_ACCESS_KEY=ed+11LI1zsT62cPFRUmjXswWL7lEa9a5Ncm26VfC
export AWS_ACCESS_KEY_ID=AKIAJOEX7YHFQ5OYSLIQ
After updating .bashrc, run source ~/.bashrc so the new variables take effect in your current shell.
2. You can set them on the command line. Note that variables exported this way last only for the current shell session.
$ export AWS_SECRET_ACCESS_KEY=ed+11LI1zsT62cPFRUmjXswWL7lEa9a5Tcm25VfC
$ export AWS_ACCESS_KEY_ID=AKIAJOEX7YHFQ5OYSLIQ
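Before launching Spark, you can confirm the variables are visible in the current shell by echoing one of them. A minimal check, using the dummy access key ID from above:

```shell
# Export the dummy credentials (replace with your real values)
export AWS_ACCESS_KEY_ID=AKIAJOEX7YHFQ5OYSLIQ
export AWS_SECRET_ACCESS_KEY=ed+11LI1zsT62cPFRUmjXswWL7lEa9a5Tcm25VfC
# Print the access key ID to verify it is set
echo "$AWS_ACCESS_KEY_ID"
```

If the echo prints an empty line, the variables were not exported in this shell and Spark will not see them.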
3. You can set them inside the Spark shell through the Hadoop configuration. Put your actual key values between the empty quotes.
scala> sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", "")
scala> sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", "")
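Once the credentials are in place by any of the three methods, you can read S3 data directly from the Spark shell with an s3n:// path. A minimal sketch, where my-bucket and data.txt are hypothetical names you would replace with your own bucket and object:

scala> val lines = sc.textFile("s3n://my-bucket/data.txt")
scala> lines.count()

If the credentials are missing or wrong, the count() action will fail with an access-denied or missing-credentials error rather than at the textFile() call, since Spark evaluates the read lazily.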
Contributed by the Spark Training Class of February 2016.