While subvalues give us the ability to perform fast updates and still query documents,
comparing subvalue DLN identifiers against non-subvalue DLN identifiers is more
costly. For this reason, eXist will occasionally trigger the background defragmentation
of a document that has had significant updates made to it. The defragmentation
effectively renumbers all of the nodes in the document, removing the subvalues.
The fragmentation of documents occurs when XQuery update or
XUpdate expressions are executed against XML documents. After
the evaluation of an XQuery running XQuery update expressions
or an XUpdate document, the updated documents will be checked
for fragmentation. If they exceed the allowed fragmentation limit
set in $EXIST_HOME/conf.xml , then they will be queued for
defragmentation. Defragmentation happens in the background in
the database, but during defragmentation a document is locked
against further writes!
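
The check-then-queue behavior described above can be sketched roughly as follows. This is a minimal illustration in Python, not eXist's implementation: the `Document` and `DefragQueue` classes, the `fragmentation` counter, and the `fragmentation_limit` parameter are all hypothetical names standing in for whatever eXist tracks internally and reads from conf.xml.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical stand-in for a stored document; not an eXist class."""
    uri: str
    fragmentation: int = 0      # hypothetical measure of fragmentation
    write_locked: bool = False


class DefragQueue:
    """Hypothetical queue of documents waiting for background defragmentation."""

    def __init__(self, fragmentation_limit: int):
        self.fragmentation_limit = fragmentation_limit
        self.pending = deque()

    def after_update(self, updated_docs):
        """Check each updated document once the whole query has been evaluated."""
        for doc in updated_docs:
            over_limit = doc.fragmentation > self.fragmentation_limit
            if over_limit and doc not in self.pending:
                self.pending.append(doc)

    def run_one(self):
        """Defragment the next queued document, locking it against writes meanwhile."""
        if not self.pending:
            return None
        doc = self.pending.popleft()
        doc.write_locked = True
        try:
            doc.fragmentation = 0   # stands in for renumbering all of the nodes
        finally:
            doc.write_locked = False
        return doc
```

In this sketch the check runs only after the update has finished, and the write lock is held only for the duration of the renumbering step, which mirrors the ordering the text describes.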
Paging and Caching
Several of the core database files (dom.dbx, structure.dbx, and collections.dbx) that are
kept on disk are organized into pages of 4 KB. A page is simply a contiguous region
that is read or written in an atomic operation. Pages are used because mechanical
rotational storage systems (such as hard disks) are much faster at larger linear reads
and writes than at randomly seeking and reading individual bytes as required. However,
the pages themselves are not necessarily stored in the order that you need to answer a
query; as such, good random-access speed is still a requirement of the filesystem and
underlying storage system.
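
To make the idea of page-oriented access concrete, here is a small Python sketch, assuming a fixed 4 KB page size. The function names `locate`, `read_page`, and `read_byte` are invented for this illustration and do not correspond to eXist's API; an in-memory buffer stands in for a .dbx file.

```python
import io

PAGE_SIZE = 4096  # 4 KB, the page size mentioned in the text


def locate(byte_offset: int, page_size: int = PAGE_SIZE):
    """Map an absolute byte offset to (page number, offset within that page)."""
    return divmod(byte_offset, page_size)


def read_page(f, page_no: int, page_size: int = PAGE_SIZE) -> bytes:
    """Read one whole page in a single call rather than byte by byte."""
    f.seek(page_no * page_size)
    return f.read(page_size)


def read_byte(f, byte_offset: int, page_size: int = PAGE_SIZE) -> int:
    """Fetch a single byte by loading the page that contains it."""
    page_no, within = locate(byte_offset, page_size)
    page = read_page(f, page_no, page_size)
    return page[within]


# An in-memory file of exactly three pages stands in for a file on disk.
data = bytes(range(256)) * 48
store = io.BytesIO(data)
page_no, within = locate(5000)   # (1, 904): byte 5000 lives in the second page
value = read_byte(store, 5000)   # same byte as data[5000]
```

Even when only one byte is needed, the whole containing page is read; a cache of recently used pages can then serve nearby requests without touching the disk again.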
The size of a page in eXist is configurable in $EXIST_HOME/conf.xml
via the attribute indicated by the XPath
/exist/db-connection/@pageSize. This should be aligned with the block size of
your filesystem; today 4 KB is typically correct. You can also
experiment with setting this to a multiple of the block size when
testing for the optimal performance of eXist with your data.
However, be aware that you can only change the page size before
creating a database (i.e., with no .dbx files in
$EXIST_HOME/webapp/WEB-INF/data).
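
As a rough illustration of inspecting this setting, the following Python sketch reads the pageSize attribute and checks it against a block size. The XML fragment is a cut-down stand-in for conf.xml (a real file holds many more settings), the value 4096 assumes the attribute is expressed in bytes, and the helper names `read_page_size` and `is_aligned` are made up for this example.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only; a real conf.xml contains many more settings.
SAMPLE_CONF = """
<exist>
    <db-connection pageSize="4096"/>
</exist>
"""


def read_page_size(conf_xml: str) -> int:
    """Return the value addressed by /exist/db-connection/@pageSize."""
    root = ET.fromstring(conf_xml)        # the root element is <exist>
    conn = root.find("db-connection")     # its db-connection child
    if conn is None or conn.get("pageSize") is None:
        raise ValueError("no /exist/db-connection/@pageSize in configuration")
    return int(conn.get("pageSize"))


def is_aligned(page_size: int, block_size: int) -> bool:
    """True when the page size is a whole multiple of the filesystem block size."""
    return page_size >= block_size and page_size % block_size == 0


page_size = read_page_size(SAMPLE_CONF)
aligned = is_aligned(page_size, block_size=4096)
```

The block size passed to `is_aligned` has to come from your own filesystem; the 4096 used here simply follows the "4 KB is typically correct" guidance above.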
The persistent on-disk DOM (dom.dbx) and collections (collections.dbx) files are each
split into two parts: a data section and an index section (which is a B+ tree). The data
section contains the nodes (in dom.dbx) or the document and collection metadata (in
collections.dbx), while the index section ensures the quick lookup of collections,
documents, and nodes. The structural index (structure.dbx) file consists solely of an
index and has no data section.
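
The split between a data section and an index section can be pictured with a toy model. In the Python sketch below, a sorted list searched with `bisect` stands in for the B+ tree, and a byte array stands in for the data section; the `TwoPartFile` class and its methods are hypothetical and say nothing about eXist's actual on-disk layout.

```python
import bisect


class TwoPartFile:
    """Toy model: a data section plus an ordered index over it."""

    def __init__(self):
        self.data = bytearray()   # data section: records stored back to back
        self.keys = []            # index section: keys kept in sorted order ...
        self.locations = []       # ... with the (offset, length) of each record

    def put(self, key: str, record: bytes) -> None:
        """Append the record to the data section and index its location."""
        offset = len(self.data)
        self.data += record
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            self.locations[i] = (offset, len(record))   # repoint an existing key
        else:
            self.keys.insert(i, key)
            self.locations.insert(i, (offset, len(record)))

    def get(self, key: str):
        """Look the key up in the index, then read the record from the data section."""
        i = bisect.bisect_left(self.keys, key)
        if i == len(self.keys) or self.keys[i] != key:
            return None
        offset, length = self.locations[i]
        return bytes(self.data[offset:offset + length])
```

A lookup never scans the data section: the ordered index yields the record's location directly, which is the role the text assigns to the B+ tree.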