Python minidom snippet (converting xml file produced by SQL Developer to csv)

I wrote a simple python script for converting xml export files generated by SQL Developer to csv format.

Let’s first see what is the output from SQL Developer. Export file of the Oracle table EMP generated by SQL Developer looks like this:

<?xml version='1.0'  encoding='Cp1250' ?>
<RESULTS>
	<ROW>
		<COLUMN NAME="COMM"><![CDATA[]]></COLUMN>
		<COLUMN NAME="DEPTNO"><![CDATA[20]]></COLUMN>
		<COLUMN NAME="EMPNO"><![CDATA[7369]]></COLUMN>
		<COLUMN NAME="ENAME"><![CDATA[SMITH]]></COLUMN>
		<COLUMN NAME="HIREDATE"><![CDATA[17.12.1980]]></COLUMN>
		<COLUMN NAME="JOB"><![CDATA[CLERK]]></COLUMN>
		<COLUMN NAME="MGR"><![CDATA[7902]]></COLUMN>
		<COLUMN NAME="SAL"><![CDATA[800]]></COLUMN>
	</ROW>
	<ROW>
		<COLUMN NAME="COMM"><![CDATA[300]]></COLUMN>
		<COLUMN NAME="DEPTNO"><![CDATA[30]]></COLUMN>
		<COLUMN NAME="EMPNO"><![CDATA[7499]]></COLUMN>
		<COLUMN NAME="ENAME"><![CDATA[ALLEN]]></COLUMN>
		<COLUMN NAME="HIREDATE"><![CDATA[20.02.1981]]></COLUMN>
		<COLUMN NAME="JOB"><![CDATA[SALESMAN]]></COLUMN>
		<COLUMN NAME="MGR"><![CDATA[7698]]></COLUMN>
		<COLUMN NAME="SAL"><![CDATA[1600]]></COLUMN>
	</ROW>
	<ROW>
		<COLUMN NAME="COMM"><![CDATA[500]]></COLUMN>
		<COLUMN NAME="DEPTNO"><![CDATA[30]]></COLUMN>
		<COLUMN NAME="EMPNO"><![CDATA[7521]]></COLUMN>
		<COLUMN NAME="ENAME"><![CDATA[WARD]]></COLUMN>
		<COLUMN NAME="HIREDATE"><![CDATA[22.02.1981]]></COLUMN>
		<COLUMN NAME="JOB"><![CDATA[SALESMAN]]></COLUMN>
		<COLUMN NAME="MGR"><![CDATA[7698]]></COLUMN>
		<COLUMN NAME="SAL"><![CDATA[1250]]></COLUMN>
	</ROW>
</RESULTS>

And my goal was to convert xml file to csv format:

COMM;DEPTNO;EMPNO;ENAME;HIREDATE;JOB;MGR;SAL;
;20;7369;SMITH;17.12.1980;CLERK;7902;800;
300;30;7499;ALLEN;20.02.1981;SALESMAN;7698;1600;
500;30;7521;WARD;22.02.1981;SALESMAN;7698;1250;

In this “exercise” I decided to use minidom for xml parsing which is quite easy to code with, but at the same time is suitable only for small xml files, such as exports of database code books. Perhaps I’ll publish similar python script, where I’ll parse xml with sax module.

# -*- coding: cp1250 -*-
# xml2csv.py -- AlesK, 2009
#
# Purpose:
# simple python code snippet that shows how to use python minidom module
# to convert export file produced by Oracle SQL Developer (Table->Export->to xml)
# to csv format. Note that minidom (DOM) as such is suitable for parsing
# small xml files, such as database codebooks!
#
# Note: Script doesn't even try to do any error handling at all and 
# as such is far away from being anything more than a simple demo.
#
# Example:
# cmd> python xml2csv.py emp.xml > emp.csv
#
import sys
import xml.dom.minidom

doc = xml.dom.minidom.parse(sys.argv[1])

# if encoding is not explicitly declared in xml header then 
# I'm assuming Cp1250 for the output file
if doc.encoding:
    encode = doc.encoding
else:
    # Cp1250 is my personal preference if encoding is not
    # specified in xml header
    encode = 'Cp1250'

# fields delimiter    
csv = ';'    

# during the pass of the first ROW collect
# column names (for csv file header)
firstRowPass = True
header = ''

for row in doc.getElementsByTagName("ROW"):
    row_data = ''
    for column in row.getElementsByTagName("COLUMN"):
        for column_value in column.childNodes:
            row_data += column_value.data
        row_data += csv
        if firstRowPass:
            header += column.getAttribute('NAME') + csv
    if firstRowPass:
        print header.encode(encode)
        firstRowPass = False
    print row_data.encode(encode)

doc.unlink()
#
# End xml2csv.py
#
Advertisement

Posted on 26.02.2009, in Scripting and tagged , . Bookmark the permalink. Comments Off on Python minidom snippet (converting xml file produced by SQL Developer to csv).

Comments are closed.