xq: or how to use jq with XML files
JSON is currently more popular than XML, and there is a great tool to filter, query and transform JSON files: jq. But since XML is still used by many legacy applications, wouldn't it be nice to have a command-line tool like jq for XML files? We do: it is called xq, a wrapper around jq that adds support for XML. Let's see how to install xq and use it with XML files.
Installing xq
To install xq, we need to install yq, a Python package that also ships the xq tool. xq shells out to jq, so jq itself must be installed as well.
$ pip3 install yq

A sample XML file
Create the following file and save it as test.xml. We will use it in all the examples.
<library>
  <book>
    <title>The Great Gatsby</title>
    <author>F. Scott Fitzgerald</author>
    <genre>Fiction</genre>
    <publication_year>1925</publication_year>
  </book>
  <book>
    <title>To Kill a Mockingbird</title>
    <author>Harper Lee</author>
    <genre>Novel</genre>
    <publication_year>1960</publication_year>
  </book>
  <book>
    <title>1984</title>
    <author>George Orwell</author>
    <genre>Dystopian</genre>
    <publication_year>1949</publication_year>
  </book>
</library>

How xq works
xq works by transcoding XML documents to JSON and piping them to jq, which means that everything we already know about jq applies to xq too!

How to query an XML file
If we pipe a file to xq and use the '.' filter, it prints the whole XML document as JSON on the console, provided the document is well-formed.
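To make the XML-to-JSON step more concrete, here is a rough Python sketch of the idea using only the standard library. It is not xq's actual code: the names element_to_value and xml_to_json are made up for this illustration, and the sketch assumes simple documents like our sample (no attributes, no mixed content). Note how repeated tags become a list and how every leaf value stays a string.

```python
import json
import xml.etree.ElementTree as ET


def element_to_value(elem):
    """Return a string for a leaf element, or a dict for an element with children."""
    children = list(elem)
    if not children:
        return (elem.text or "").strip()
    result = {}
    for child in children:
        value = element_to_value(child)
        if child.tag in result:
            # A repeated tag (like <book>) is collected into a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


def xml_to_json(xml_text):
    """Parse XML text and return it as a JSON string keyed by the root tag."""
    root = ET.fromstring(xml_text)
    return json.dumps({root.tag: element_to_value(root)}, indent=2)


# A shortened version of the sample file, inlined to keep the sketch self-contained.
SAMPLE = """<library>
  <book><title>The Great Gatsby</title><publication_year>1925</publication_year></book>
  <book><title>1984</title><publication_year>1949</publication_year></book>
</library>"""

print(xml_to_json(SAMPLE))
```

With xq we get this conversion for free, straight from the command line: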
$ cat test.xml | xq .
{
  "library": {
    "book": [
      {
        "title": "The Great Gatsby",
        "author": "F. Scott Fitzgerald",
        "genre": "Fiction",
        "publication_year": "1925"
      },
      {
        "title": "To Kill a Mockingbird",
        "author": "Harper Lee",
        "genre": "Novel",
        "publication_year": "1960"
      },
      {
        "title": "1984",
        "author": "George Orwell",
        "genre": "Dystopian",
        "publication_year": "1949"
      }
    ]
  }
}

By echoing the exit status of xq, we can verify that the XML document was parsed successfully:
$ echo $?
0

If we feed xq a document that is not well-formed XML, it prints an error message and exits with a non-zero status. This makes xq a handy way to check whether a file is well-formed XML.
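xq reports malformed input as an ExpatError, which is the exception type of the expat parser in Python's standard library. As a small sketch of the same kind of check, assuming we only care about well-formedness and not about validating against a schema, we can call that parser directly. The helper name is_well_formed is hypothetical.

```python
import xml.parsers.expat


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False if expat rejects it."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        # The second argument marks this chunk as the final piece of input.
        parser.Parse(xml_text, True)
        return True
    except xml.parsers.expat.ExpatError:
        return False


print(is_well_formed("<library><book/></library>"))  # True
print(is_well_formed("hello"))                       # False
```

Back on the command line, this is what the failure looks like with xq: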
$ echo "hello" | xq .
xq: Error running jq: ExpatError: syntax error: line 1, column 0.
$ echo $?
1

To print all books, we can use the following query:
$ cat test.xml | xq '.library.book[]'
{
  "title": "The Great Gatsby",
  "author": "F. Scott Fitzgerald",
  "genre": "Fiction",
  "publication_year": "1925"
}
{
  "title": "To Kill a Mockingbird",
  "author": "Harper Lee",
  "genre": "Novel",
  "publication_year": "1960"
}
{
  "title": "1984",
  "author": "George Orwell",
  "genre": "Dystopian",
  "publication_year": "1949"
}

If you want only the title and genre of each entry, use the following:
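$ cat test.xml | xq '.library.book[]|[.title, .genre]'
[
  "The Great Gatsby",
  "Fiction"
]
[
  "To Kill a Mockingbird",
  "Novel"
]
[
  "1984",
  "Dystopian"
]

To format the output as CSV, use the @csv filter. Because we are not passing jq's -r (raw output) option, each row is printed as a JSON string, which is why the quotes are escaped: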
$ cat test.xml | xq '.library.book[]|[.title, .genre]|@csv'
"\"The Great Gatsby\",\"Fiction\""
"\"To Kill a Mockingbird\",\"Novel\""
"\"1984\",\"Dystopian\""

Conclusion
xq is powerful, and because all it really does is convert XML to JSON and hand it to jq, you can reuse any jq example you find. I hope you enjoyed this article and that it makes your work with XML files easier!