I have followed the instructions detailed here to try to use flintrock to create a Spark cluster via EC2 instances on AWS. As background, my ultimate objective is to parallelize operations on Spark across 4 EC2 instances (1 master plus 3 workers) and gather the results on the master node, roughly along the lines of the sketch below.
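For context, the kind of workload I ultimately want to run looks like this minimal sketch (process is a placeholder for the real per-item computation, not code I am running yet):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("example").getOrCreate()
sc = spark.sparkContext

def process(item):
    # Placeholder for the real per-item computation.
    return item * item

# Distribute the work across the cluster, then gather results on the driver.
results = sc.parallelize(range(100), 8).map(process).collect()
print(results)

spark.stop()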
Below is the content of my config.yaml, which is used when I run flintrock launch cluster_name. When I run that command, I get the following error:

An error occurred (InvalidParameterCombination) when calling the RunInstances operation: The parameter iops is not supported for gp2 volumes. Operation aborted.
services:
  spark:
    version: 3.1.2
    # git-commit: latest  # if not 'latest', provide a full commit SHA; e.g. d6dc12ef0146ae409834c78737c116050961f350
    # git-repository:  # optional; defaults to https://github.com/apache/spark
    # optional; defaults to download from a dynamically selected Apache mirror
    #   - can be http, https, or s3 URL
    #   - must contain a {v} template corresponding to the version
    #   - Spark must be pre-built
    #   - files must be named according to the release pattern shown here: https://dist.apache.org/repos/dist/release/spark/
    # download-source: "https://www.example.com/files/spark/{v}/"
    # download-source: "s3://some-bucket/spark/{v}/"
    # executor-instances: 1
  hdfs:
    version: 3.3.0
    # optional; defaults to download from a dynamically selected Apache mirror
    #   - can be http, https, or s3 URL
    #   - must contain a {v} template corresponding to the version
    #   - files must be named according to the release pattern shown here: https://dist.apache.org/repos/dist/release/hadoop/common/
    # download-source: "https://www.example.com/files/hadoop/{v}/"
    # download-source: "http://www-us.apache.org/dist/hadoop/common/hadoop-{v}/"
    # download-source: "s3://some-bucket/hadoop/{v}/"

provider: ec2

providers:
  ec2:
    key-name: spark_cluster
    identity-file: /media/sf_linuxvm/spark_cluster.pem
    instance-type: t2.micro
    region: us-east-1
    # availability-zone: <name>
    ami: ami-0230bd60aa48260c6
    user: ec2-user
    # ami: ami-61bbf104  # CentOS 7, us-east-1
    # user: centos
    # spot-price: <price>
    # spot-request-duration: 7d  # duration a spot request is valid, supports d/h/m/s (e.g. 4d 3h 2m 1s)
    # vpc-id: <id>
    # subnet-id: <id>
    # placement-group: <name>
    # security-groups:
    #   - group-name1
    #   - group-name2
    # instance-profile-name:
    # tags:
    #   - key1,value1
    #   - key2, value2  # leading/trailing spaces are trimmed
    #   - key3,  # value will be empty
    # min-root-ebs-size-gb: <size-gb>
    tenancy: default  # default | dedicated
    ebs-optimized: no  # yes | no
    instance-initiated-shutdown-behavior: terminate  # terminate | stop
    # user-data: /path/to/userdata/script
    # authorize-access-from:
    #   - 10.0.0.42/32
    #   - sg-xyz4654564xyz

launch:
  num-slaves: 3
  # install-hdfs: True
  install-spark: True
  java-version: 8

debug: false
I searched for resolutions to this error online and here on Stack Overflow, but I was not sure how to apply what I found to my specific situation.
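From what I can tell, the error comes from the underlying EC2 RunInstances call: AWS only accepts an explicit Iops setting for io1, io2, and gp3 volumes, not for gp2. A minimal boto3 sketch of the distinction as I understand it (the volume size and Iops values are placeholders, not values from my setup):

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Rejected with InvalidParameterCombination: gp2 volumes
# do not accept an explicit Iops setting.
bad_mapping = {
    "DeviceName": "/dev/xvda",
    "Ebs": {"VolumeSize": 30, "VolumeType": "gp2", "Iops": 3000},
}

# Accepted: Iops is only valid for io1, io2, and gp3 volumes.
ok_mapping = {
    "DeviceName": "/dev/xvda",
    "Ebs": {"VolumeSize": 30, "VolumeType": "gp3", "Iops": 3000},
}

ec2.run_instances(
    ImageId="ami-0230bd60aa48260c6",  # same Amazon Linux AMI as in my config
    InstanceType="t2.micro",
    MinCount=1,
    MaxCount=1,
    BlockDeviceMappings=[ok_mapping],
)

But flintrock builds the RunInstances request itself from config.yaml and the chosen AMI, so I don't see where the iops parameter is being introduced or how I can override it from my side.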