The best in command line XML: XMLStarlet
June 23rd, 2008
Quite some time ago I wrote about using xsltproc to process XML on the command line. Thankfully, someone pointed out XMLStarlet. I now use XMLStarlet almost every day. I work with a variety of REST-based APIs to gather information. XMLStarlet along with a simple for loop or xargs gives you an exceedingly powerful set of tools.
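As a rough sketch of the xargs pattern, assume a couple of saved API responses sit on disk. The file names, the element names (response, host, status) and the XML variable are made up for illustration; the XML variable only exists because some packages install the binary as xmlstarlet rather than xml.

```shell
# Use whichever binary name is installed (assumption: it is either
# "xmlstarlet" or "xml"; fall back to "xmlstarlet" if neither is found).
XML=$(command -v xmlstarlet || command -v xml || echo xmlstarlet)

# Hypothetical stand-ins for responses fetched from a REST API.
workdir=$(mktemp -d)
for name in alpha beta; do
  printf '<response><host>%s</host><status>ok</status></response>\n' "$name" \
    > "$workdir/$name.xml"
done

# xargs runs one "sel" per file name read from stdin and appends the
# file name as the last argument of each invocation.
summary=$(printf '%s\n' "$workdir/alpha.xml" "$workdir/beta.xml" |
  xargs -n 1 "$XML" sel -t -v '/response/host' -o ': ' -v '/response/status' -n)
echo "$summary"

rm -rf "$workdir"
```

In real use the list of file names would more likely come from find or ls, or the loop would wrap a curl call instead.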
Here is a quick introduction to the power of XMLStarlet. This is just a teaser as I cannot share the data I work with. However, you should be able to see the power of this tool.
All the links from my RSS feed:
$ curl -s 'http://bashcurescancer.com/rss/' | xml sel -t -m '//link' -v '.' -n
http://bashcurescancer.com
http://bashcurescancer.com/processing-xml-on-the-command-line.html
http://bashcurescancer.com/do-not-close-stderr.html
http://bashcurescancer.com/prepend-to-a-file-with-sponge-from-moreutils.html
http://bashcurescancer.com/bug-in-curl-is-fixed.html
http://bashcurescancer.com/using-kill-to-see-if-a-process-is-alive.html
http://bashcurescancer.com/performance-testing-with-curl.html
http://bashcurescancer.com/new-command-prepend.html
http://bashcurescancer.com/shell-function-which-webserver-does-that-site-run.html
http://bashcurescancer.com/exposing-command-line-programs-as-web-services.html
http://bashcurescancer.com/wrapping-dynamic-languages-in-shell-without-an-extra-script.html
Or how about “Title: link”?
$ curl -s 'http://bashcurescancer.com/rss/' | xml sel -t -m '//item' -v 'title' -o ': ' -v 'link' -n
Processing XML on the Command Line: http://bashcurescancer.com/processing-xml-on-the-command-line.html
Do not close stderr: http://bashcurescancer.com/do-not-close-stderr.html
prepend to a file with sponge from moreutils: http://bashcurescancer.com/prepend-to-a-file-with-sponge-from-moreutils.html
Bug in Curl is fixed: http://bashcurescancer.com/bug-in-curl-is-fixed.html
using kill to see if a process is alive: http://bashcurescancer.com/using-kill-to-see-if-a-process-is-alive.html
Performance testing - with curl: http://bashcurescancer.com/performance-testing-with-curl.html
New command: prepend: http://bashcurescancer.com/new-command-prepend.html
Shell Function - Which Webserver Does That Site Run?: http://bashcurescancer.com/shell-function-which-webserver-does-that-site-run.html
Exposing command line programs as web services: http://bashcurescancer.com/exposing-command-line-programs-as-web-services.html
Wrapping dynamic languages in shell without an extra script: http://bashcurescancer.com/wrapping-dynamic-languages-in-shell-without-an-extra-script.html
You may need to do some reading on XPath and XSLT stylesheets to use the full power of the tool.
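To give a taste of what a little XPath buys you, here is a small sketch that runs against an inline sample feed instead of the live one. The feed contents, the example.com URLs and the XML variable are assumptions made up for illustration.

```shell
# Use whichever binary name is installed (assumption: "xmlstarlet" or "xml").
XML=$(command -v xmlstarlet || command -v xml || echo xmlstarlet)

# A made-up feed with the same shape as an RSS 2.0 document.
feed='<rss version="2.0"><channel>
  <title>Sample feed</title>
  <link>http://example.com</link>
  <item><title>Bug in Curl is fixed</title><link>http://example.com/bug-in-curl-is-fixed.html</link></item>
  <item><title>Do not close stderr</title><link>http://example.com/do-not-close-stderr.html</link></item>
  <item><title>Performance testing - with curl</title><link>http://example.com/performance-testing-with-curl.html</link></item>
</channel></rss>'

# A predicate in square brackets filters the matched nodes. contains() is
# case-sensitive, so "Curl" in the first title does not match "curl".
matches=$(printf '%s\n' "$feed" |
  "$XML" sel -t -m "//item[contains(title, 'curl')]" -v 'link' -n)
echo "$matches"

# XPath functions such as count() can be printed with -v as well.
total=$(printf '%s\n' "$feed" | "$XML" sel -t -v 'count(//item)' -n)
echo "$total"
```

Only the last item's link is printed by the first command, because the predicate keeps items whose title contains the lowercase string.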


June 24th, 2008 at 3:17 am
That's so handy it makes my eyes hurt.
June 24th, 2008 at 5:28 pm
hi, really like your blog.
i use xmlstarlet at work for a couple of nagios tests (before i got into ruby and REXML anyway) and when parsing some xml from a windows service provided to me from a .net team, i had to specify the namespace in the query, and the syntax is a little funny:
$ wget -qO - $URL > $TMPFILE
$ xml sel -N x="http://services.company.com/service/" -t -v "//x:Status" $TMPFILE
the namespace is defined after -N, and then used in the xpath after -v through its prefix "x", which is an arbitrary string. perhaps "namespace" would be more readable, like:
$ xml sel -N namespace="http://services.company.com/service/" -t -v "//namespace:Status" $TMPFILE
the only reason i use a $TMPFILE is because i might want to pull out more than one node from the xml, and this seemed like the clearest way to do it without hitting the service more than once, although i’m no doubt incorrect.
anyway, hope this helps someone trying to use this tool against xml provided by a .net service.
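A minimal sketch of pulling more than one node out of the saved file in a single pass, assuming a made-up response document: the example.com namespace URI and the ServiceResponse and Version elements are hypothetical, and only Status comes from the commands above.

```shell
# Use whichever binary name is installed (assumption: "xmlstarlet" or "xml").
XML=$(command -v xmlstarlet || command -v xml || echo xmlstarlet)

# Stand-in for the file that wget would have written.
TMPFILE=$(mktemp)
cat > "$TMPFILE" <<'EOF'
<?xml version="1.0"?>
<ServiceResponse xmlns="http://services.example.com/service/">
  <Status>OK</Status>
  <Version>1.2</Version>
</ServiceResponse>
EOF

# -N binds the prefix "x" to the document's default namespace; several -v
# options in one template read several nodes without re-fetching anything.
report=$("$XML" sel -N x="http://services.example.com/service/" \
  -t -v '//x:Status' -o ' ' -v '//x:Version' -n "$TMPFILE")
echo "$report"

rm -f "$TMPFILE"
```

Without the -N binding, an unprefixed query like //Status finds nothing here, because the elements live in the default namespace declared on the root element.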
July 1st, 2008 at 5:11 am
For a layman, Bash Cures Cancer may come across as a long list of web page addresses published across the home page and beyond. However, a closer look reveals that the site is intensely technical about UNIX and Linux, and that the author of the blog is an expert in his subject. He showcases his prowess by suggesting code for the problems indicated.
July 30th, 2008 at 12:33 am
hey, why didn't anybody tell before about this stuff? this is very _cool_ shit! it's even in the debian repository. you just got to apt it! xml was until now somewhat of a problem on the commandline for me…thanks a lot!