xq: or how to use jq with XML files

Photo by Shahadat Rahman on Unsplash
Join Medium with my referral link - Konstantinos Patronas
As a Medium member, a portion of your membership fee goes to writers you read, and you get full access to every story…

JSON is currently more popular than XML, and there is a great tool for filtering, querying and transforming JSON files: jq. But XML is still used by many legacy applications, so it would be nice to have a command-line tool like jq for XML files. There is one: it is called xq, a wrapper around jq that adds support for XML. Let's see how to install xq and use it with XML files.

Installing xq

xq ships as part of yq, a Python package, so installing yq installs xq as well. Because xq is a wrapper around jq, jq itself must also be installed on the system.

$ pip3 install yq

A sample XML file

Create the following file and save it as test.xml; we will use it in the examples.

<library> 
  <book> 
    <title>The Great Gatsby</title> 
    <author>F. Scott Fitzgerald</author> 
    <genre>Fiction</genre> 
    <publication_year>1925</publication_year> 
  </book> 
  <book> 
    <title>To Kill a Mockingbird</title> 
    <author>Harper Lee</author> 
    <genre>Novel</genre> 
    <publication_year>1960</publication_year> 
  </book> 
  <book> 
    <title>1984</title> 
    <author>George Orwell</author> 
    <genre>Dystopian</genre> 
    <publication_year>1949</publication_year> 
  </book> 
</library>

How xq works

How to query an XML file

xq works by transcoding an XML document to JSON and piping the result to jq, which means that everything we already know about jq applies to xq. If we pipe a file to xq and use the '.' filter, it prints the whole XML document as JSON, provided the XML is well-formed.

$ cat test.xml | xq .
{ 
  "library": { 
    "book": [ 
      { 
        "title": "The Great Gatsby", 
        "author": "F. Scott Fitzgerald", 
        "genre": "Fiction", 
        "publication_year": "1925" 
      }, 
      { 
        "title": "To Kill a Mockingbird", 
        "author": "Harper Lee", 
        "genre": "Novel", 
        "publication_year": "1960" 
      }, 
      { 
        "title": "1984", 
        "author": "George Orwell", 
        "genre": "Dystopian", 
        "publication_year": "1949" 
      } 
    ] 
  } 
}
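
One consequence of this conversion is worth knowing before writing filters: repeated sibling elements become a JSON array, while an element that occurs only once becomes a plain object. The sketch below uses tiny inline documents (not test.xml) to show the difference. The `--xml-force-list` option is how I understand the yq documentation, so treat it as an assumption and check `xq --help` on your version.

```shell
# Two <book> elements: .library.book is converted to an array.
two_books='<library><book><title>A</title></book><book><title>B</title></book></library>'
printf '%s' "$two_books" | xq -r '.library.book | type'

# A single <book> element: .library.book is converted to an object,
# so a filter such as '.library.book[]' would iterate over its fields instead.
one_book='<library><book><title>A</title></book></library>'
printf '%s' "$one_book" | xq -r '.library.book | type'

# Assumption: --xml-force-list asks xq to emit an array for the named
# element even when it occurs only once.
printf '%s' "$one_book" | xq -r --xml-force-list book '.library.book | type'
```

If your real documents may contain a single entry, it is worth testing the filter against such a file before relying on it.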

Echoing the exit status of xq confirms that the XML document was parsed successfully.

$ echo $?
0

If we feed xq a document that is not well-formed XML, it prints an error message and exits with a non-zero status, which makes it a handy way to check whether a file is well-formed XML.

$ echo "hello" | xq .
xq: Error running jq: ExpatError: syntax error: line 1, column 0.
$ echo $?
1
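
In a script, the exit status can be tested directly instead of echoing it. The following is a minimal sketch: the helper name `is_well_formed` and the two temporary sample files are made up for the demonstration.

```shell
# Hypothetical helper: succeeds when xq can parse the given file as XML.
is_well_formed() {
  xq . < "$1" > /dev/null 2>&1
}

# Create one good and one bad sample in a temporary directory.
tmpdir=$(mktemp -d)
printf '<note><to>reader</to></note>' > "$tmpdir/good.xml"
printf 'hello' > "$tmpdir/bad.xml"

for f in "$tmpdir/good.xml" "$tmpdir/bad.xml"; do
  if is_well_formed "$f"; then
    echo "$(basename "$f"): well-formed"
  else
    echo "$(basename "$f"): not well-formed"
  fi
done

rm -r "$tmpdir"
```

Keep in mind that this only checks that the file parses; it does not validate the document against a schema or DTD.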

To print all books, we can use the following query:

$ cat test.xml | xq '.library.book[]'
{ 
  "title": "The Great Gatsby", 
  "author": "F. Scott Fitzgerald", 
  "genre": "Fiction", 
  "publication_year": "1925" 
} 
{ 
  "title": "To Kill a Mockingbird", 
  "author": "Harper Lee", 
  "genre": "Novel", 
  "publication_year": "1960" 
} 
{ 
  "title": "1984", 
  "author": "George Orwell", 
  "genre": "Dystopian", 
  "publication_year": "1949" 
}
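
Because xq hands the converted JSON to jq, jq's `select` works here too. Note that element text arrives as strings (the years above are quoted), so a numeric comparison needs `tonumber`. Here is a small sketch that lists the titles published before 1950, using a shortened inline copy of the library so that it runs on its own; the variable name `old_titles` is arbitrary.

```shell
# Shortened inline copy of the sample library so the snippet runs on its own.
old_titles=$(xq -r '.library.book[]
  | select((.publication_year | tonumber) < 1950)
  | .title' <<'XML'
<library>
  <book><title>The Great Gatsby</title><publication_year>1925</publication_year></book>
  <book><title>To Kill a Mockingbird</title><publication_year>1960</publication_year></book>
  <book><title>1984</title><publication_year>1949</publication_year></book>
</library>
XML
)
printf '%s\n' "$old_titles"
```

This should print The Great Gatsby and 1984, one per line. The same filter can be pointed at test.xml by replacing the here-document with a pipe from the file.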

If you want only the title and genre of each entry, use the following:

$ cat test.xml | xq '.library.book[]|[.title, .genre]'
[ 
  "The Great Gatsby", 
  "Fiction" 
] 
[ 
  "To Kill a Mockingbird", 
  "Novel" 
] 
[ 
  "1984", 
  "Dystopian" 
]
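
If the field names should stay attached to the values, jq's object construction can be used instead of an array. A brief sketch on a two-book inline sample, passing jq's `-c` option through xq for compact one-line output:

```shell
# {title, genre} is jq shorthand for {title: .title, genre: .genre}.
compact=$(xq -c '.library.book[] | {title, genre}' <<'XML'
<library>
  <book><title>The Great Gatsby</title><genre>Fiction</genre></book>
  <book><title>1984</title><genre>Dystopian</genre></book>
</library>
XML
)
printf '%s\n' "$compact"
```

Each book is then printed as one JSON object per line, for example {"title":"1984","genre":"Dystopian"}.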

To format the output as CSV, use the @csv format string:

$ cat test.xml | xq '.library.book[]|[.title, .genre]|@csv'
"\"The Great Gatsby\",\"Fiction\"" 
"\"To Kill a Mockingbird\",\"Novel\"" 
"\"1984\",\"Dystopian\""

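The escaped quotes in the CSV output appear because jq prints each row as a JSON string. Passing jq's `-r` (raw output) option through xq prints the rows as plain text instead. A short sketch on a two-book inline sample:

```shell
# -r tells jq to print strings without JSON quoting or escaping.
csv=$(xq -r '.library.book[] | [.title, .genre] | @csv' <<'XML'
<library>
  <book><title>The Great Gatsby</title><genre>Fiction</genre></book>
  <book><title>1984</title><genre>Dystopian</genre></book>
</library>
XML
)
printf '%s\n' "$csv"
```

The rows now look like "The Great Gatsby","Fiction" and can be redirected straight into a .csv file.
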
Conclusion

xq is powerful, and because it simply converts XML to JSON before handing it to jq, you can reuse any jq example you find. I hope you enjoyed this article and that it makes your work with XML files easier!
