Spark CREATE TABLE 'data_source' format vs 'Hive format'


In the Spark docs, there are two ways to write CREATE TABLE:

  1. DATA_SOURCE format
  2. Hive format

Syntactically, there doesn't seem to be much difference between them other than the storage clause, i.e. 'USING <>' (for the data source format) vs 'STORED AS <>' (for the Hive format).
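To make the comparison concrete, here is a minimal sketch of the two forms as I understand them. The table names, the columns, and the choice of Parquet are made up for illustration and are not taken from the docs:

```sql
-- Data source format: the storage format is named with USING
CREATE TABLE events_ds (id INT, name STRING)
USING parquet;

-- Hive format: the storage format is named with STORED AS
CREATE TABLE events_hive (id INT, name STRING)
STORED AS PARQUET;
```

Both statements declare the same columns and the same file format, and only the storage clause differs.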

Internally, what are the differences between the two? In what situations should someone prefer one over the other?

Thanks!
