I'm trying to create a new column in a DataFrame. This new column will contain a formatted date string created from a Long timestamp in milliseconds.
I keep getting this error:
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.spark.sql.DataFrameReader.jdbc(Ljava/lang/String;Ljava/lang/String;Ljava/util/Properties;)Lorg/apache/spark/sql/Dataset;
It occurs in this code:
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import org.apache.spark.sql.{DataFrame, SQLContext}
import joptsimple.OptionParser
import org.apache.spark.sql.functions._
import java.text.SimpleDateFormat
import org.apache.spark.sql.functions.udf
.
.
.
val formatDateUDF = udf((ts: Long) => {
  // ts is epoch milliseconds; SimpleDateFormat.format accepts a Number
  // and formats it in the JVM's default time zone
  new SimpleDateFormat("yyyy.MM.dd.HH.mm.ss").format(ts)
})
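For context, this is roughly how I'm applying the UDF (a sketch; df and the column name ts stand in for my real DataFrame and column):

// hypothetical usage: add the formatted string as a new column
val withDate = df.withColumn("formattedDate", formatDateUDF(col("ts")))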
I'm using the following dependencies in build.sbt:
scalaVersion := "2.11.11"
libraryDependencies ++= Seq(
// Spark dependencies
"org.apache.spark" % "spark-hive_2.11" % "2.1.1" % "provided",
"org.apache.spark" % "spark-mllib_2.11" % "2.1.1" % "provided",
// Third-party libraries
"postgresql" % "postgresql" % "9.1-901-1.jdbc4",
"net.sf.jopt-simple" % "jopt-simple" % "5.0.3",
"org.scalactic" %% "scalactic" % "3.0.1",
"org.scalatest" %% "scalatest" % "3.0.1" % "test",
"joda-time" % "joda-time" % "2.9.9"
)
I'm open to other ways of doing this that might be easier (or, at the very least, work).
I think the from_unixtime function might work better here?
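This is the kind of thing I have in mind (a rough sketch, again assuming a DataFrame df with the millisecond timestamp in a column ts; from_unixtime expects seconds, hence the division by 1000):

// from_unixtime works on seconds, so convert the millisecond timestamp first
val withDate = df.withColumn(
  "formattedDate",
  from_unixtime(col("ts") / 1000, "yyyy.MM.dd.HH.mm.ss")
)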