Why does Apache Drill change the column names when extractHeader is enabled?

58 Views Asked by Diksha At 10 May 2023 at 13:06

I have the following data in a CSV file:

PRODUCTID	PRODUCTNAME	SUPPLIERID	CATEGORYID	UNIT	PRICE
1	Chais	1	1	10 boxes x 20 bags	18
2	Chang	1	1	24 - 12 oz bottles	19
3	Aniseed Syrup	1	2	12 - 550 ml bottles	10
4	Chef Anton's Cajun Seasoning	2	2	48 - 6 oz jars	22
5	Chef Anton's Gumbo Mix	2	2	36 boxes	21.35

I've enabled extractHeader in the csv config of the dfs plugin.
Apache Drill version: apache-drill-1.21.0
No. of drillbits: Single
OS: Windows

When I query the CSV file with the following query:

SELECT * FROM dfs.`/var/lib/PRODUCT.csv`

Case 1:
the output is

Why does Drill change the ID column name like that?

Case 2: Drill makes further modifications when a column name contains special characters.
For example:

#UNITS is changed to col_UNITS

FINANCIAL$RECORD is changed to FINANCIAL_RECORD

Are there any criteria by which these changes are made?

The problem is that a SELECT query using the original column names returns no output. I've gone through the documentation and the Apache Drill JIRAs but didn't find anything helpful.
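
A possible workaround, sketched here on the assumption that the renamed headers are exactly the ones from Case 2, is to select the renamed columns and alias them back to the original names with backticks. The file path `/var/lib/EXAMPLE.csv` is hypothetical and stands in for a file that has those two headers:

```sql
-- Hypothetical file whose header row contains #UNITS and FINANCIAL$RECORD.
-- col_UNITS and FINANCIAL_RECORD are the renamed columns reported in Case 2.
SELECT col_UNITS        AS `#UNITS`,
       FINANCIAL_RECORD AS `FINANCIAL$RECORD`
FROM dfs.`/var/lib/EXAMPLE.csv`;
```

Running a SELECT * with LIMIT 1 on the file first should show which column names Drill actually exposes, so the aliases can be matched against them.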

Thanks in advance.
