March 28, 2009

plist to JSON (including your iTunes library...)

This is a quick-and-dirty sed script that converts a plist XML file to compact JSON with no unnecessary whitespace. It has a few caveats, however:

  • It uses extended regular expressions

  • It slurps the entire XML text into memory before spitting it back out; it does not perform the traditional line-by-line editing/printing as most sed scripts do

  • <data> elements are given empty string values
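
To make the second caveat concrete, here is a minimal sketch of the slurp idiom on its own, fed a hand-made input rather than a real plist. The function name "slurp_and_clean" is made up for this illustration; the sed commands inside it are the same kind the script uses.

```shell
# Hypothetical helper: collect every input line in the hold space, then
# edit the whole text in one go when the last line is reached.
slurp_and_clean() {
  sed -En '
    # Append each line to the hold space (sed joins them with newlines)
    H
    ${
      # Last line: copy the accumulated text back into the pattern space
      g
      # A cross-line fix that line-by-line editing could not do
      s|,[[:space:]]*]|]|g
      # Join everything onto one line and print it once
      s|\n||g
      p
    }
  '
}

printf '[\n1,\n2,\n]\n' | slurp_and_clean
# prints: [1,2]
```

The trailing comma sits on a different line from the closing bracket, so it can only be removed once the whole text is in the pattern space. The cost is that memory use grows with the size of the input.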


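Because of the first two caveats, you have to pass the "-En" options when running sed: "-E" enables extended regular expressions, and "-n" suppresses the automatic printing of each line so that only the final result is printed. Without them you will get garbage output (on Mac OS X, at least). Here is the code. I saved it in a file called "plist-to-json.sed" and invoke it as "sed -Enf plist-to-json.sed ...":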

# Kill blank (empty or whitespace-only) lines
/^[[:space:]]*$/ d

# Kill leading whitespace
s|^[[:space:]]*||g

# Kill trailing whitespace
s|[[:space:]]*$||g

# Kill any base64 lines; <data> elements will not be converted.
# The base64 alphabet also includes "+" and "/", so match those too.
\|^[[:alnum:]+/]+=*$| d

# Kill the <?xml ...?> declaration
s|<\?[^>]*>||g

# Kill the <!DOCTYPE ...> declaration (and any other <!...> markup)
s|<![^>]*>||g

# String escape any values for JSON (backslashes first, then double quotes)
s|\\|\\\\|g
s|"|\\"|g

# Convert the top level <plist> element to an object with a "plist" field
s|<plist[^>]*>|{"plist":|
s|</plist>|}|

# Keys, strings and dates get surrounded with quotes, numbers and booleans left alone
s|<key>|"|g
s|</key>|":|g
s|<string>|"|g
s|</string>|",|g
s|<real>||g
s|</real>|,|g
s|<integer>||g
s|</integer>|,|g
s|<true[[:space:]]*/>|true,|g
s|<false[[:space:]]*/>|false,|g
s|<date>|"|g
s|</date>|",|g

# Arrays and dictionaries convert nicely
s|<array>|[|g
s|</array>|],|g
s|<dict>|{|g
s|</dict>|},|g

# Give <data> elements an empty value
s|<data>|""|g
s|</data>|,|g

# Append the pattern space into the hold space
H

# Everything here happens only on the last line of input, after the above stuff has run
${
# Bring the hold space into the pattern space (the entire document)
g

# Remove trailing commas after the last field or element of a JSON object or array
s|,[[:space:]]*}|}|g
s|,[[:space:]]*]|]|g

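# Kill remaining unnecessary whitespace. The colon rule is anchored on the
# closing quote of a key so that string values such as "Foo: Bar" keep their
# spaces, and the "{" is escaped because a leading bare "{" is undefined in
# extended regular expressions.
s|":[[:space:]]*|":|g
s|\{[[:space:]]*|{|g
s|\[[[:space:]]*|[|g
s|\n||g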

# Print out the resulting JSON
p
}
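
As a rough illustration of what the rules do, the sketch below inlines a handful of them and runs them over a tiny hand-written <dict> fragment. The function name "plist_fragment_to_json", the sample track data, and the extra "s|,$||" rule are assumptions made for this example; in the full script the comma after the outermost dictionary is removed when the closing </plist> becomes "}".

```shell
# Hypothetical cut-down version of the script, inlined so the example is
# self-contained. It handles only <dict>, <key>, <string> and <integer>.
plist_fragment_to_json() {
  sed -En '
    # Strip indentation
    s|^[[:space:]]*||
    # Keys and strings get quotes, integers are left bare
    s|<key>|"|g
    s|</key>|":|g
    s|<string>|"|g
    s|</string>|",|g
    s|<integer>||g
    s|</integer>|,|g
    s|<dict>|{|g
    s|</dict>|},|g
    # Accumulate, then clean up the whole text on the last line
    H
    ${
      g
      s|,[[:space:]]*}|}|g
      # Example-only rule: this fragment has no enclosing </plist>
      s|,$||
      s|\n||g
      p
    }
  '
}

plist_fragment_to_json <<'EOF'
<dict>
    <key>Name</key><string>Blue in Green</string>
    <key>Track ID</key><integer>42</integer>
</dict>
EOF
# prints: {"Name":"Blue in Green","Track ID":42}
```

Every value is emitted with a trailing comma, and the last one in each container is cleaned up afterwards. That keeps the per-line rules simple at the price of the end-of-input pass.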


Using this, I was able to convert my "iTunes Music Library.xml" file (2,365,981 bytes) to JSON (1,087,270 bytes), which I intend to use in a Flash/Flex application I want to build for listening to my iTunes music over the internet. Unfortunately, Mediamaster (a company I used to work for) is not going to exist anymore, and I miss the service. This will be my pathetic attempt at a replacement, as I have absolutely no UI skills at all.