XQuery Toolbox of a Java Developer

The other day a colleague chimed in after hearing that I was struggling to run the basic w3schools.com XQuery Tutorial in eXist.

This post is more or less a condensed write-up of that amazing session, tightly packed with a mix of XQuery basics and pearls.

XML Query (XQuery for short) is a functional, side-effect-free, expression-oriented programming language with a simple type system (see Kilpeläinen, Pekka (2012), "Using XQuery for problem solving").

To get an anchor in the Java world you can loosely (the details don't match exactly) compare

  • XQuery sequences with Java 8 Streams,
  • XQuery arrays with Java Lists and
  • XQuery maps with Java Maps (arrays and maps both since XQuery 3.1).
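
To make the comparison a bit more tangible, here is a minimal sketch of the three structures side by side. It assumes an XQuery 3.1 processor (where the array and map prefixes are predeclared); the variable names and values are purely illustrative.

```xquery
xquery version "3.1";

let $seq := (1, (2, 3))                  (: nested sequences flatten to (1, 2, 3) :)
let $arr := [1, [2, 3]]                  (: arrays keep their nesting :)
let $map := map { "one": 1, "two": 2 }   (: string keys mapped to integers :)
return (
  count($seq),                   (: 3 :)
  array:size($arr),              (: 2 :)
  $arr(1),                       (: 1, array positions are 1-based :)
  $map("two"),                   (: 2 :)
  map:contains($map, "three")    (: false :)
)
```

The flattening in the first line is one of the places where the Java analogy breaks down: a sequence can never contain another sequence, while arrays and maps can be nested freely.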

A First XQuery Sequence

We started with two very basic snippets for creating a sequence of the first three natural numbers:

(1, 2, 3)

and

1 to 3

Executing either of those queries results in the expected sequence.

Query returned 3 item(s).

The empty sequence is: ().
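
A few quick checks show how the empty sequence behaves. This is only a small sketch using standard functions; the values are made up for illustration.

```xquery
(
  count(()),            (: 0 :)
  empty(()),            (: true :)
  count((1, (), 2)),    (: 2, the empty sequence simply disappears :)
  (1 to 3)[. > 1]       (: 2, 3 :)
)
```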

Tip: The snippet 1 to 3 comes in handy for checking whether your environment can execute XPath 2 expressions at all.

Running the w3schools.com XQuery Example in eXist

We investigated two ways to work with the example data books.xml from within eXide of eXist.

Read the data

  1. from the filesystem and
  2. from a database collection.

Preparations:

  • Save the XML document books.xml in a folder named transfer.

Let's check the directory first:

(: Note: File system access in eXist needs admin privileges :)
file:directory-list("/transfer", '*')

results in the following output:

<file:list xmlns:file="http://exist-db.org/xquery/file" directory="/transfer">
    <file:file name="books.xml" size="808" human-size="808" modified="2018-07-12T14:40:29Z"/>
</file:list>

The next step in the tutorial is to use the doc() function to open the data file: doc("books.xml"). You might have guessed it already: this doesn't work out of the box in an unnamed (unsaved) editor window.

Using a file URL works fine:

doc("file://transfer/books.xml")/bookstore/book/title

and shows the expected results (a sequence of titles):

<title lang="en">Everyday Italian</title>,
<title lang="en">Harry Potter</title>,
<title lang="en">XQuery Kick Start</title>,
<title lang="en">Learning XML</title>

Since we are lazy programmers and wanted to use the tutorial snippets unmodified, we tried another approach:

Preparations:

  • Store the XML document books.xml from the w3schools tutorial in the collection example.
  • Create a scratchpad.xq in the same collection.

Inside the scratchpad, the relative name works fine.

Running the unmodified predicate example

doc("books.xml")/bookstore/book[price<30]

shows the exact same result as in the online tutorial:

<book category="CHILDREN">
  <title lang="en">Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
  <price>29.99</price>
</book>

FLWOR Expressions

At this point, we abandoned the tutorial path and jumped to FLWOR Expressions (pronounced "flower").

A FLWOR expression is built from the clauses for, let, where, order by, and return.

A variable declaration is one of the simplest FLWOR expressions:

(: For :)
(: Let :)
let $accountId as xs:string := "devop"
(: Where :)
(: Order by :)
(: Return :)
return string-length($accountId)

5

The example above calls string-length() and, surprise, surprise, returns an integer.
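
For comparison, here is a sketch that uses all five clauses at once. To keep it runnable anywhere, it works on a small inline document instead of books.xml; the inline data is an abridged stand-in modelled on the tutorial file, so treat its titles and prices as an assumption.

```xquery
xquery version "3.1";

(: Abridged, hypothetical stand-in for books.xml :)
let $books :=
  <bookstore>
    <book><title>Everyday Italian</title><price>30.00</price></book>
    <book><title>Harry Potter</title><price>29.99</price></book>
    <book><title>XQuery Kick Start</title><price>49.99</price></book>
    <book><title>Learning XML</title><price>39.95</price></book>
  </bookstore>
for $book in $books/book                  (: iterate over the books :)
let $price := xs:decimal($book/price)     (: bind a typed helper variable :)
where $price > 30                         (: keep only the pricier ones :)
order by $price descending                (: most expensive first :)
return $book/title/string()
```

With the data above this returns the two strings "XQuery Kick Start" and "Learning XML", in that order.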

Tip: The "Function Finder" lets you easily browse the available functions of XPath and XQuery Functions and Operators 3.1.

Other common types of atomic values are:

  • xs:string
  • xs:float
  • xs:decimal
  • xs:boolean
  • xs:dateTime
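
The type of a value can be checked with instance of, and a value can be converted with a constructor function or cast as. The following is a small sketch with made-up values; note that type names are case-sensitive, so the date type is spelled xs:dateTime.

```xquery
xquery version "3.1";

(
  "devop" instance of xs:string,                                (: true :)
  3.14 instance of xs:decimal,                                  (: true :)
  xs:float("1.5") instance of xs:float,                         (: true :)
  true() instance of xs:boolean,                                (: true :)
  xs:dateTime("2018-07-12T14:40:29Z") instance of xs:dateTime,  (: true :)
  "42" cast as xs:integer                                       (: 42 :)
)
```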

Looking Around in an eXist Database

After that excursion we went back to eXide, which resulted in the following collection of random queries:

Count all XML elements of all XML documents in the database

count(//*)
6639

Count all XML nodes in all XML documents (this covers elements and text nodes, plus comments and processing instructions, but not attributes)

count(//node())
16703

Count all title elements

count(//title)
17

Oops. That's more than the expected 4 from the books.xml document.

Note: The query above returns the title elements of all documents in the whole database.

Sequence of all title elements (in the freshly created collection /db/data/example)

collection('/db/data/example')//title
<title lang="en">Everyday Italian</title>,
<title lang="en">Harry Potter</title>,
<title lang="en">XQuery Kick Start</title>,
<title lang="en">Learning XML</title>

That's the expected output. It matches the results from our previous example doc("file://transfer/books.xml")/bookstore/book/title.

Let's try a FLWOR expression with for...

List the local names (with fn:local-name()) of all XML elements in the example collection:

for $i in collection('/db/data/example')//*
return local-name($i)

We get a sequence of xs:string values:

bookstore
book
title
author
year
...

Time for a new concept...

Pipes

The simple map operator !, which works much like a pipe, lets you chain the processing of sequences.

The following FLWOR expression uses pipes in the return part:

for $i in collection('/db/data/example')//*
return local-name($i) ! string-length(.)

Each item of the sequence on the left is passed to the expression on the right, where it is available as the context item (written as .).

During the first iteration the loop evaluates: string-length('bookstore') to 9. The overall result is a sequence of xs:integer:

9
4
5
6
4
...
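
The same mechanism can be tried without a database. The sketch below feeds a hand-written sequence of names (the first five from the output above) through !, and then chains two steps on a sequence of numbers; the numbers are purely illustrative.

```xquery
xquery version "3.1";

(
  (: one step: map each name to its length, gives 9, 4, 5, 6, 4 :)
  ("bookstore", "book", "title", "author", "year") ! string-length(.),

  (: two chained steps: double each item, then add one, gives 3, 5, 7 :)
  (1 to 3) ! (. * 2) ! (. + 1)
)
```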

FLWOR expressions can be used inside function arguments, as shown in the distinct-values() example below:

distinct-values(
    for $i in collection('/db/data/example')//*
    return (local-name($i) ! string-length(.)) 
)

This results in a sequence of only 4 items: the distinct lengths of the local names.

9
4
5
6
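
As a side note, nesting function calls is not the only way to write this. Assuming a processor that supports XQuery 3.1, the arrow operator => passes the value on its left as the first argument of the function on its right. The sketch below uses a hand-written list of names instead of the collection and counts the distinct lengths, because the order in which distinct-values() returns its items is implementation-dependent.

```xquery
xquery version "3.1";

(: Lengths are 9, 4, 5, 6, 4, so there are 4 distinct values :)
(("bookstore", "book", "title", "author", "year") ! string-length(.))
  => distinct-values()
  => count()
```

With the five names above this evaluates to 4.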

That was a lot of information. And still not a single line of Java. In a future post I'll show how to query an eXist database with Java.

Stay tuned...

PS: Last but not least, a nifty one-liner that prints information about the eXist instance currently in use:

"eXist " || system:get-version() || " / " || system:get-revision()  || " / build date " || system:get-build()