Normally, when I run pyspark with graphframes I have to use this command:
pyspark --packages graphframes:graphframes:0.8.1-spark3.0-s_2.12
The first time I run this, it downloads and installs the graphframes package; later runs do not download it again. In my .bashrc file, I have already added:
export SPARK_OPTS="--packages graphframes:graphframes:0.8.1-spark3.0-s_2.12"
But I still cannot import the package unless I pass the --packages option explicitly.
How can I run pyspark with graphframes using just this simple command?
pyspark
You can make a wrapper script, e.g. myspark.sh, that runs pyspark --packages graphframes:graphframes:0.8.1-spark3.0-s_2.12. That would be the simplest solution.
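A minimal sketch of that wrapper idea, written as a shell function so it can be pasted into ~/.bashrc. The name myspark and the GRAPHFRAMES_PKG variable are just illustrative choices, not anything Spark defines. It assumes pyspark is already on your PATH:

```shell
# Hypothetical helper: the names "myspark" and "GRAPHFRAMES_PKG" are
# illustrative only. Assumes pyspark is on PATH.
GRAPHFRAMES_PKG="graphframes:graphframes:0.8.1-spark3.0-s_2.12"

myspark() {
    # Always pass --packages, then forward any extra arguments unchanged.
    pyspark --packages "$GRAPHFRAMES_PKG" "$@"
}
```

After reloading the shell, running myspark (optionally with extra arguments such as --master local[2]) should start pyspark with graphframes available. If you prefer a standalone myspark.sh file, the body of the function can serve as the script's single command.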