Write an XML string to Azure Data Lake Storage

199 Views Asked by Sam At 11 October 2023 at 09:48

When I try to write an XML string to Azure Data Lake Storage, I get a "file not found" error. I am using a Synapse notebook with Python to write the file. The Synapse notebook and the Data Lake storage are in the same resource group.

I tried to_xml({file_path/output.xml}), but this does not work with XML strings.


There are 2 best solutions below

Hermann12 On 11 October 2023 at 12:16 BEST ANSWER

If you are using pandas, which I assume you are:

import io

import pandas as pd

xml = '''<data><row><tex>text example</tex></row></data>'''

# Parse the XML string into a DataFrame
df = pd.read_xml(io.StringIO(xml))
print(df)

# Write the DataFrame back out as an XML file
out = 'StringXML.xml'
df.to_xml(out, index=False)

This writes the following to the file:

<?xml version='1.0' encoding='utf-8'?>
<data>
  <row>
    <tex>text example</tex>
  </row>
</data>
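
If the XML already exists as a string and no DataFrame is needed, the pandas round trip can be skipped. The following is a minimal sketch using only the standard library; write_xml_string is a hypothetical helper name, and a temporary local directory stands in for the real destination, so writing to ADLS itself is not shown here.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def write_xml_string(xml_text: str, target: Path) -> Path:
    """Validate an XML string and write it to `target` unchanged."""
    # fromstring raises xml.etree.ElementTree.ParseError on malformed input,
    # so a broken document never reaches the file.
    ET.fromstring(xml_text)
    target.write_text(xml_text, encoding="utf-8")
    return target


xml = "<data><row><tex>text example</tex></row></data>"

# A temporary local directory stands in for the real destination.
with tempfile.TemporaryDirectory() as tmp:
    written = write_xml_string(xml, Path(tmp) / "StringXML.xml")
    round_trip = written.read_text(encoding="utf-8")

# Compare what was read back with the original string.
print(round_trip == xml)
```

Unlike to_xml, this keeps the string byte for byte, so no XML declaration or indentation is added.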

DileeprajnarayanThumula On 11 October 2023 at 11:12

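spark.sparkContext.parallelize([xml_string], 1) turns xml_string into an RDD with a single element, stored in a single partition.

.saveAsTextFile(adls_path) writes the content of the RDD to the specified ADLS Gen2 path as text. Note that Spark creates a directory at that path and puts the data in part files inside it (one part file here, because there is one partition), and the call fails if the path already exists.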

I tried the following approach in PySpark:

xml_string = """
<root>
  <person>
    <name>John Doe</name>
    <age>30</age>
  </person>
  <person>
    <name>Jane Smith</name>
    <age>28</age>
  </person>
</root>
"""
adls_path = "abfss://<container>@<storage_account>.dfs.core.windows.net/output.xml"
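from pyspark.sql import SparkSession
# A Synapse notebook already provides a session; getOrCreate() returns it.
spark = SparkSession.builder.appName("WriteXMLToADLS").getOrCreate()
# One partition, so the whole string ends up in a single part file.
spark.sparkContext.parallelize([xml_string], 1).saveAsTextFile(adls_path)
print("XML data has been written to ADLS Gen2.")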


The code above converts the XML string into an RDD and then saves it to the specified ADLS Gen2 path as text.
This is one way to write data to ADLS Gen2 using the distributed processing that PySpark provides in Azure Synapse.
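
For the case where one file at exactly adls_path is wanted rather than Spark's part-file output, here is a rough sketch of an alternative. build_abfss_path and write_single_file are hypothetical helpers, the container and account names are made up, and the put argument is a stand-in for whatever file utility the notebook offers, so the snippet runs without any Azure connection.

```python
from typing import Callable


def build_abfss_path(container: str, account: str, relative_path: str) -> str:
    """Return an abfss:// URI for a file in an ADLS Gen2 container."""
    return (
        f"abfss://{container}@{account}.dfs.core.windows.net/"
        f"{relative_path.lstrip('/')}"
    )


def write_single_file(
    put: Callable[[str, str, bool], object], path: str, content: str
) -> None:
    """Write `content` to `path` through a caller-supplied `put` function."""
    # strip() drops the blank first line that a triple-quoted literal adds.
    put(path, content.strip(), True)  # True = overwrite an existing file


# Hypothetical container and account names, for illustration only.
adls_path = build_abfss_path("mycontainer", "mystorageaccount", "output.xml")

# Local stand-in for the notebook's file utility: it records the call
# instead of touching storage.
calls = []
write_single_file(lambda p, c, o: calls.append((p, c, o)), adls_path, "\n<root/>\n")
print(calls)
```

In a Synapse notebook, mssparkutils.fs.put (imported from notebookutils) takes a path, a content string, and an overwrite flag, so it could probably be passed as put. Treat that as an assumption and check it against your runtime before relying on it.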
