PySpark performance is slow when reading a large fixed-width file with long lines to convert to a structured format

130 Views Asked by Sanjay Bagal At 03 March 2023 at 03:29

I am trying to convert a fairly large (34 GB) fixed-width file into a structured format using PySpark, but the job takes too long to complete (10+ hours). Each line in the file is very long, almost 50K characters, and I split it using substring into around 5K columns, then store the result in a Parquet table. If anyone has faced and resolved a similar issue, any suggestions would be greatly appreciated. We are on Spark 3.1.1, running through Google's Spark Kubernetes Operator on an OpenShift cluster.
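For reference, here is a minimal sketch of the kind of split described above, written so that all fields are sliced in a single select rather than added one at a time. The layout, column names, and the helpers `field_slices` and `parse_fixed_width` are hypothetical, made up for illustration; the real job would have around 5K entries. Whether this is the actual bottleneck is not known from the question. It is only relevant if the columns are currently added in a loop with withColumn, which the Spark documentation warns can produce very large query plans.

```python
from itertools import accumulate

# Hypothetical layout: (column name, width in characters), in file order.
LAYOUT = [("id", 5), ("name", 10), ("amount", 8)]


def field_slices(layout):
    """Return (name, 1-based start position, length) for each field."""
    # Running sum of widths, starting at 1 because Spark's substring is 1-based.
    starts = accumulate((width for _, width in layout), initial=1)
    return [(name, start, width) for (name, width), start in zip(layout, starts)]


def parse_fixed_width(spark, path, layout):
    """Read each line as a single string column and slice every field in one select."""
    # Imported here so field_slices can be used without PySpark installed.
    from pyspark.sql import functions as F

    lines = spark.read.text(path)  # one string column named "value"
    columns = [
        F.substring("value", start, length).alias(name)
        for name, start, length in field_slices(layout)
    ]
    # A single projection with all columns, instead of thousands of withColumn calls.
    return lines.select(*columns)
```

Assuming an existing SparkSession named spark, the result could then be written with something like parse_fixed_width(spark, input_path, LAYOUT).write.parquet(output_path), where input_path and output_path are placeholders.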
