How do I resolve this error when using flintrock to start Spark clusters on AWS?


I have followed the instructions detailed here to try to use flintrock to create a Spark cluster on AWS EC2 instances. As background, my ultimate objective is to parallelize Spark operations across 4 EC2 instances and gather the results on the master node.
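For reference, the kind of workload I ultimately want to run looks roughly like the following PySpark sketch (the master hostname, data, and computation are placeholders, not my actual job):

from pyspark.sql import SparkSession

# Connect to the standalone master that flintrock sets up (hostname is a placeholder).
spark = (
    SparkSession.builder
    .master("spark://<master-hostname>:7077")
    .appName("parallel-demo")
    .getOrCreate()
)

# Distribute a simple computation across the worker nodes ...
rdd = spark.sparkContext.parallelize(range(1000), numSlices=8)
squared = rdd.map(lambda x: x * x)

# ... and gather the results back on the master/driver node.
results = squared.collect()
print(len(results), sum(results))

spark.stop()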

Below is the content of the config.yaml that is used when I run flintrock launch cluster_name. When I run that command, I get the following error:

An error occurred (InvalidParameterCombination) when calling the RunInstances operation: The parameter iops is not supported for gp2 volumes. Operation aborted.

services:
  spark:
    version: 3.1.2
    # git-commit: latest  # if not 'latest', provide a full commit SHA; e.g. d6dc12ef0146ae409834c78737c116050961f350
    # git-repository:  # optional; defaults to https://github.com/apache/spark
    # optional; defaults to download from a dynamically selected Apache mirror
    #   - can be http, https, or s3 URL
    #   - must contain a {v} template corresponding to the version
    #   - Spark must be pre-built
    #   - files must be named according to the release pattern shown here: https://dist.apache.org/repos/dist/release/spark/
    # download-source: "https://www.example.com/files/spark/{v}/"
    # download-source: "s3://some-bucket/spark/{v}/"
    # executor-instances: 1
  hdfs:
    version: 3.3.0
    # optional; defaults to download from a dynamically selected Apache mirror
    #   - can be http, https, or s3 URL
    #   - must contain a {v} template corresponding to the version
    #   - files must be named according to the release pattern shown here: https://dist.apache.org/repos/dist/release/hadoop/common/
    # download-source: "https://www.example.com/files/hadoop/{v}/"
    # download-source: "http://www-us.apache.org/dist/hadoop/common/hadoop-{v}/"
    # download-source: "s3://some-bucket/hadoop/{v}/"

provider: ec2

providers:
  ec2:
    key-name: spark_cluster
    identity-file: /media/sf_linuxvm/spark_cluster.pem
    instance-type: t2.micro
    region: us-east-1
    # availability-zone: <name>
    ami: ami-0230bd60aa48260c6
    user: ec2-user
    # ami: ami-61bbf104  # CentOS 7, us-east-1
    # user: centos
    # spot-price: <price>
    # spot-request-duration: 7d  # duration a spot request is valid, supports d/h/m/s (e.g. 4d 3h 2m 1s)
    # vpc-id: <id>
    # subnet-id: <id>
    # placement-group: <name>
    # security-groups:
    #   - group-name1
    #   - group-name2
    # instance-profile-name:
    # tags:
    #   - key1,value1
    #   - key2, value2  # leading/trailing spaces are trimmed
    #   - key3,  # value will be empty
    # min-root-ebs-size-gb: <size-gb>
    tenancy: default  # default | dedicated
    ebs-optimized: no  # yes | no
    instance-initiated-shutdown-behavior: terminate  # terminate | stop
    # user-data: /path/to/userdata/script
    # authorize-access-from:
    #   - 10.0.0.42/32
    #   - sg-xyz4654564xyz

launch:
  num-slaves: 3
  # install-hdfs: True
  install-spark: True
  java-version: 8

debug: false

I searched for resolutions to this error online and here on Stack Overflow, but I was not sure how to apply what I found to my specific situation.
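From what I can tell, the failure happens inside the EC2 RunInstances call that flintrock makes on my behalf. As a point of reference (this is my own hypothetical reproduction, not flintrock's actual code), a boto3 request that passes an Iops value together with a gp2 root volume should fail with the same message:

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Hypothetical reproduction: requesting iops on a gp2 volume is rejected by EC2.
ec2.run_instances(
    ImageId="ami-0230bd60aa48260c6",
    InstanceType="t2.micro",
    MinCount=1,
    MaxCount=1,
    BlockDeviceMappings=[
        {
            "DeviceName": "/dev/xvda",
            "Ebs": {
                "VolumeSize": 8,
                "VolumeType": "gp2",  # iops is only valid for io1/io2/gp3 volumes
                "Iops": 3000,
            },
        }
    ],
)
# -> botocore.exceptions.ClientError: An error occurred (InvalidParameterCombination)
#    when calling the RunInstances operation: The parameter iops is not supported
#    for gp2 volumes.

This makes me think an iops parameter is being sent alongside a gp2 volume type somewhere in the launch request, but I don't see anything in my config that sets either, and I'm not sure how to work around it.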
