I have a S3 bucket named test-bucket with the following structure
- test-bucket
- surveys
- abc.json
- def.json
[...]
Now I create a table in Trino like this
create table hive.qual2.surveys
(
id VARCHAR,
name VARCHAR
)
with (
format = 'JSON',
external_location = 's3://test-bucket/surveys'
);
select data from this table works fine. However, now I also need to know where the information is coming from. So I need to enter a column with e.g a value abc.json if the data is coming from this file.
Any clue how I can achieve this?
As described in the docs for the hive connector provided by trino, I could just use the
$pathcolumn to see the file where the information is coming from.to extract the file name without an extension i simply use this:
split(element_at(split("$path", '/'),-1),'.')[1] as survey_idso the whole query looks like this after i created the table as described above.
including
$pathin the create table definition does not make sense, since it already exists as a hidden column.