COSC430 XML Database Assignment 2017
The task is to take some World Development Indicator data downloaded
from the World Bank as *.csv files, load the information into an XML
data base, and run some queries against it.
The material can be found in a directory
in the shared file system and a subdirectory of it.
These directories are all readable and searchable by everyone.
- the source code for Oracle
Berkeley DB XML. You can install this yourselves. You may be
able to run
yourselves. Please try this. I have asked cshelp to install it
for you, but doing it yourselves may work.
- this is the
tutorial for using Berkeley DB XML. It is the very document from
which I learned the command line interface of this database.
- the slides for the XML databases lecture.
- The subdirectory with the World Bank data.
The parent of that directory,
- the first W3C standard for XPath, now superseded, but a much
better place to start trying to understand XPath than its
successors.
- the first W3C standard for XQuery, now superseded, but a much
better place to start trying to understand XQuery than its
successors.
The Register has a useful
tutorial on Berkeley DB XML.
What to do
- dbxml should be available in your VMs now. Check that this is so.
- Ensure that you can reach the data files in
- Start dbxml, create a container, and load
WDI_Country.xml into it using
putDocument (you will need to give this document a name).
Then load WDI_Series.xml into it using
putDocument again (I suggest the name 'series').
Quit dbxml.
You can use
help cmd to get help about
a command called cmd.
- Start dbxml again, open the container,
and check that the data are still there. Using just the data in
Country, how many countries in each Region belong to each
Income Group? (Show your XQuery.)
- Read through
WDI_Data.txt to get an idea of what is in
WDI_Data.csv. It is actually much simpler than
the Country and Series data. The text file also has some notes about
how this information might be represented in XML.
- Think about the nature of the information in WDI_Data.csv
and some queries you might want to ask, then choose or design
a way to represent the information you need from this file
as XML. Write up to a page giving your decision and your
reasons for it.
- Convert the file to your chosen representation.
You may use any programming language you like for converting
CSV to XML. The conversion program must not be
linked with the data base in any way, shape, or form. The
conversion need not even be done on the same machine as the
one you run
dbxml on. You may submit
your source code if you wish, and it will be inspected,
but it will not be marked.
- Devise three questions involving these three XML files,
and express them in XQuery, give them to dbxml, and
show the results. (So the questions had better have
short results...) The queries are expected to use more of
XQuery than just XPath, and in particular, at least two of
them should involve joining data from two or more documents.
- Submit a report with
  - The XQuery and result from step 4 (countries per Region and Income Group).
  - Your design and rationale from step 6 (the XML representation of WDI_Data.csv).
  - The three XQueries and their results from step 8.
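
As a rough illustration of the CSV-to-XML conversion step, here is a minimal sketch in Python using only the standard library, so it is not linked with the data base in any way. It rests on assumptions that you should check against WDI_Data.txt: that the CSV has a header row, that columns whose header is a four-digit year hold that year's value, and that all other columns identify the row. The element names (data, record, value) and the helper names xml_name and csv_to_xml are hypothetical choices made for illustration, not a recommended design; your own representation may well differ.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET


def xml_name(header):
    """Turn a CSV header such as 'Country Code' into a safe attribute name."""
    name = re.sub(r"[^A-Za-z0-9_]", "_", header.strip()).lower()
    return "_" + name if name[0].isdigit() else name


def csv_to_xml(csv_text, root_tag="data", row_tag="record"):
    """Convert wide CSV text (one column per year) to an XML string.

    Assumption (hypothetical layout): a column whose header is a
    four-digit year holds that year's value; every other column
    identifies the row and becomes an attribute of the record.
    """
    root = ET.Element(root_tag)
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = ET.SubElement(root, row_tag)
        for header, value in row.items():
            # Skip unnamed or surplus columns, short rows, and empty cells.
            if not header or not header.strip():
                continue
            if value is None or not value.strip():
                continue
            key = header.strip()
            if re.fullmatch(r"\d{4}", key):
                # Year column: <value year="YYYY">number</value>
                cell = ET.SubElement(record, "value", year=key)
                cell.text = value.strip()
            else:
                # Identifying column: attribute on the record element.
                record.set(xml_name(key), value.strip())
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = "Country Code,Indicator Code,2000,2001\nNZL,SP.POP.TOTL,3857700,\n"
    print(csv_to_xml(sample))
```

This sketch builds the whole tree in memory, which is convenient for a small sample but may be too heavy for a large file; writing each record out as it is read would be a reasonable alternative. If the real file begins with a byte order mark, reading it with encoding "utf-8-sig" keeps the mark out of the first header name.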