I am trying to display all values in a table with 50000 rows, but I get the error:
java.lang.OutOfMemoryError: Java heap space
Is there a way to increase memory to avoid this issue?
(Apologies if this is a simple question; I'm very new to this.)
The code used to print the data:
SkillsDF.printSchema()
SkillsDF.show(n=50000)  # this worked for >1000 rows
To show all 50000 rows, the data has to be collected to the driver, which is most likely where the heap space runs out. But even if you manage to increase the driver memory, it won't help, because Databricks notebooks limit the amount of data that can be shown as cell output.
On Databricks it's better to use the display(dataframe) function, but it's also limited to 1000 or 10000 rows. If you need to look at the whole dataset, export it to S3 or similar storage and use other tools to look through it.
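
A minimal sketch of the export approach, assuming a Spark DataFrame like SkillsDF. The helper name export_and_preview and the output path are hypothetical, and Parquet is just one reasonable format choice. The idea is to write the full data out from the executors and only pull a small preview back to the notebook:

```python
# Sketch only: `df` is assumed to be a Spark DataFrame such as SkillsDF,
# and `output_path` is a hypothetical storage location (S3, DBFS, ...).

def export_and_preview(df, output_path, preview_rows=20):
    """Write the whole DataFrame to storage and display only a small preview."""
    # The write is carried out by the executors, so the full dataset
    # does not have to be collected to the driver.
    df.write.mode("overwrite").parquet(output_path)

    # count() sends a single number back to the driver.
    total_rows = df.count()

    # show() only brings `preview_rows` rows back for display.
    df.show(preview_rows, truncate=False)

    return total_rows
```

It could then be called as export_and_preview(SkillsDF, "s3://my-bucket/skills/"), where the bucket name is a placeholder, and the exported files opened with whatever tool suits the full dataset.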