CaseWhen in spark DataFrame


I'd like to understand how to use the CaseWhen expression with the new DataFrame API.

I can't see any reference to it in the documentation, and the only place I saw it was in the code: https://github.com/apache/spark/blob/v1.4.0/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala#L397

I'd like to be able to write something like this:

val col = CaseWhen(Seq(
    $"a" === lit(1), lit(10),
    $"a" === lit(2), lit(15),
    ...
    lit(20)
))

but this code won't compile, because the elements of the Seq are of type Column, while CaseWhen expects Expression.

What is the proper way to use CaseWhen?

There is 1 best solution below.

BEST ANSWER

To be honest, I don't know whether CaseWhen is intended to be used as a user-facing API. Instead, you should use the when and otherwise methods of the Column type; with those you construct the same CaseWhen expression under the hood.

import org.apache.spark.sql.functions.when

// The chain must start with functions.when; calling .when on an
// arbitrary column fails at runtime, since Column.when can only be
// applied to a column previously generated by when().
val result: Column =
  when($"a" === 1, 10).
  when($"a" === 2, 15).
  otherwise(20)
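For completeness, here is a minimal end-to-end sketch of applying such a column to a DataFrame. The setup (the `spark` session, the `df` DataFrame, and the output column name `"b"`) is illustrative, not from the original post, and it uses the modern SparkSession entry point rather than the 1.4-era SQLContext:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.when

// Hypothetical local session for illustration only.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("casewhen-example")
  .getOrCreate()
import spark.implicits._

val df = Seq(1, 2, 3).toDF("a")

// Attach the CASE WHEN result as a new column "b":
// a = 1 -> 10, a = 2 -> 15, anything else -> 20.
val result = df.withColumn("b",
  when($"a" === 1, 10).
  when($"a" === 2, 15).
  otherwise(20))

result.show()
// Expected rows: (1, 10), (2, 15), (3, 20)
```

The same column expression also works inside select, e.g. `df.select($"a", when($"a" === 1, 10).otherwise(20).as("b"))`.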