The source file is JSON, and we compressed it to gzip (.gz) format. The file itself is fine: I can open it in Notepad++.
import gzip
# Open the GZIP file in text mode ('rt')
with gzip.open('example.gz', 'rt') as f:
    file_content = f.read()
    print(file_content)
The error I get is:
BadGzipFile: Not a gzipped file (b'{\n')
I also tried reading the file line by line and got the same error:
import gzip
with gzip.open('example.gz', 'r') as fin:
    for line in fin:
        print('got line:', line)
This is my sample JSON data:
{
    "metadata_version": 1,
    "created": "2024-01-31T16:02:11.400125+00:00",
    "domain": {
        "name": "myname",
        "version": 1,
        "type": "core"
    },
    "id1": "01HNG439A8M7395MB9CWC4XSKC",
    "id2": {
        "id3": "efbc9315-6a27-455b-9050-02ea08eb1b69",
        "id4": "05933069-eeb5-4801-8801-fdd9819d08bf",
        "id5": "8b642da5-e954-402c-bcb9-a196d594ed62"
    },
    "data": "AAAAAAAA22RzW7CMBCEXyXymVQJNOHnVgGlHIoikvbQ2+IsYMnYdNemQlXfvQ4Q4MB1ZvebWftXVASGQTplzcyrWozEWub9LM8HsUyyNE5TxBjSvoyTJEuyfPUMvXUqOmKJ3x7ZTcChGBmvdUeMNQIps3mznnE"
}
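The GZ file was downloaded from AWS S3. During the download, the file was automatically decompressed back to its original JSON, but the file name stayed myfile.gz. So despite the .gz extension, the file on disk is plain JSON, not a gzip archive. That is exactly what the error message reports: the bytes shown in BadGzipFile, b'{\n', are the first two bytes of the file, i.e. the start of the JSON text rather than the gzip magic number (0x1f 0x8b). The fix is to read the file as ordinary text with open() instead of gzip.open().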
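
If it is not known in advance whether a downloaded file is still compressed, one option is to check the first two bytes rather than trusting the extension. The following is a minimal sketch under that assumption; the helper name load_json_maybe_gzipped and the UTF-8 encoding are illustrative choices, not something taken from the original setup.

```python
import gzip
import json

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def load_json_maybe_gzipped(path):
    """Load JSON from `path`, whether or not the file is really gzipped.

    Hypothetical helper: the name and the UTF-8 assumption are illustrative.
    """
    # Peek at the first two bytes instead of trusting the .gz extension.
    with open(path, "rb") as raw:
        is_gzip = raw.read(2) == GZIP_MAGIC

    # gzip.open and the built-in open both accept mode "rt" and an encoding.
    opener = gzip.open if is_gzip else open
    with opener(path, "rt", encoding="utf-8") as f:
        return json.load(f)
```

Calling load_json_maybe_gzipped('example.gz') should then return the parsed dictionary in both cases: when the file is a real gzip archive and when it has already been decompressed but kept its .gz name.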