I'm trying to create a new column in a DataFrame. This new column will contain a formatted date string created from a Long timestamp in milliseconds.
I keep getting this error:
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.spark.sql.DataFrameReader.jdbc(Ljava/lang/String;Ljava/lang/String;Ljava/util/Properties;)Lorg/apache/spark/sql/Dataset;
It occurs in this code:
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import org.apache.spark.sql.{DataFrame, SQLContext}
import joptsimple.OptionParser
import org.apache.spark.sql.functions._
import java.text.SimpleDateFormat
import org.apache.spark.sql.functions.udf
.
.
.
val formatDateUDF = udf((ts: Long) => {
  // ts is epoch milliseconds; SimpleDateFormat.format accepts a Number
  // and formats it in the JVM's default time zone
  new SimpleDateFormat("yyyy.MM.dd.HH.mm.ss").format(ts)
})
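For context, this is roughly how I'm applying the UDF (a sketch; df and the column name ts stand in for my real DataFrame and column):

// hypothetical usage: add the formatted string as a new column
val withDate = df.withColumn("formattedDate", formatDateUDF(col("ts")))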
I'm using the following dependencies in build.sbt:
scalaVersion := "2.11.11"
libraryDependencies ++= Seq(
// Spark dependencies
"org.apache.spark" % "spark-hive_2.11" % "2.1.1" % "provided",
"org.apache.spark" % "spark-mllib_2.11" % "2.1.1" % "provided",
// Third-party libraries
"postgresql" % "postgresql" % "9.1-901-1.jdbc4",
"net.sf.jopt-simple" % "jopt-simple" % "5.0.3",
"org.scalactic" %% "scalactic" % "3.0.1",
"org.scalatest" %% "scalatest" % "3.0.1" % "test",
"joda-time" % "joda-time" % "2.9.9"
)
I'm open to other ways of doing this that might be easier (or, at the very least, work).
I think the from_unixtime function might work better here?
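This is the kind of thing I have in mind (a rough sketch, again assuming a DataFrame df with the millisecond timestamp in a column ts; from_unixtime expects seconds, hence the division by 1000):

// from_unixtime works on seconds, so convert the millisecond timestamp first
val withDate = df.withColumn(
  "formattedDate",
  from_unixtime(col("ts") / 1000, "yyyy.MM.dd.HH.mm.ss")
)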