on a terminator to signify the end of a string. This traversability is useful when the
MongoDB server needs to introspect documents.
Performance
Finally, BSON is designed to be fast to encode to and decode from. It uses C-style
representations for types, which are fast to work with in most programming
languages.
For the exact BSON specification, see http://www.bsonspec.org.
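As a rough illustration of the length-prefixed, C-style layout described above, the following sketch hand-encodes a document containing only string fields. The function names are made up for this example and are not part of any driver; a real application would use its driver's BSON library instead.

```python
import struct


def encode_string_element(name: str, value: str) -> bytes:
    """Encode one BSON string element (hypothetical helper, strings only)."""
    key = name.encode("utf-8") + b"\x00"    # keys are NUL-terminated cstrings
    data = value.encode("utf-8") + b"\x00"  # string values also end in NUL
    # 0x02 is the BSON type code for a UTF-8 string. The value is prefixed
    # with its byte length (little-endian int32, counting the trailing NUL),
    # so a reader can skip over it without scanning for a terminator.
    return b"\x02" + key + struct.pack("<i", len(data)) + data


def encode_document(fields: dict) -> bytes:
    """Encode a flat dict of str -> str as a BSON document."""
    body = b"".join(encode_string_element(k, v) for k, v in fields.items())
    # Total length covers the 4-byte prefix itself, the elements, and the
    # single NUL byte that closes the document.
    total = 4 + len(body) + 1
    return struct.pack("<i", total) + body + b"\x00"


doc = encode_document({"hello": "world"})
print(len(doc), doc)
```

Because every document and every string carries its own length, a reader can jump from one element to the next using fixed-width integer reads.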
Wire Protocol
Drivers access the MongoDB server using a lightweight TCP/IP wire protocol. The
protocol is documented on the MongoDB wiki; in essence, it is a thin wrapper
around BSON data. For example, an insert message consists of 20 bytes of header data
(which includes a code telling the server to perform an insert and the message length),
the collection name to insert into, and a list of BSON documents to insert.
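The insert message described above can be sketched as follows. This assumes the legacy OP_INSERT layout from the wire protocol documentation: a 16-byte standard header (message length, request ID, response-to, opcode) followed by a 4-byte flags field, which together account for the 20 bytes of header data. The function name is hypothetical, and the sample document is a pre-encoded BSON {"hello": "world"}.

```python
import struct

OP_INSERT = 2002  # legacy opcode for insert messages


def build_insert_message(request_id, full_collection_name, bson_documents, flags=0):
    """Assemble a legacy OP_INSERT message (illustrative helper, not a driver API)."""
    body = struct.pack("<i", flags)                          # 4-byte flags field
    body += full_collection_name.encode("utf-8") + b"\x00"   # e.g. "foo.bar" as a cstring
    body += b"".join(bson_documents)                         # documents back to back
    # Standard header: messageLength, requestID, responseTo, opCode,
    # each a little-endian int32. The length includes the header itself.
    header = struct.pack("<iiii", 16 + len(body), request_id, 0, OP_INSERT)
    return header + body


# BSON bytes for {"hello": "world"}, written out by hand for the example.
sample_doc = b"\x16\x00\x00\x00\x02hello\x00\x06\x00\x00\x00world\x00\x00"
msg = build_insert_message(1, "foo.bar", [sample_doc])
print(len(msg))
```

Apart from the fixed-width integers and the collection name, the payload is nothing more than the BSON documents themselves, which is what makes the protocol a thin wrapper.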
Data Files
Inside the MongoDB data directory, which is /data/db/ by default, there are separate
files for each database. Each database has a single .ns file and several data files, which
have monotonically increasing numeric extensions. So, the database foo would be
stored in the files foo.ns, foo.0, foo.1, foo.2, and so on.
Each new numeric data file for a database is twice the size of the previous one, up to a
maximum file size of 2GB. This behavior keeps small databases from wasting too much
space on disk, while keeping large databases in mostly contiguous regions on disk.
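The doubling rule can be made concrete with a short calculation. The text above does not state the size of the first data file; the 64MB default used here is an assumption, and the function name is invented for the example.

```python
MAX_FILE_SIZE_MB = 2048  # the 2GB cap mentioned in the text


def data_file_sizes(db_name, file_count, first_file_mb=64):
    """Return (file name, size in MB) pairs for a database's numeric data files.

    first_file_mb=64 is an assumed starting size, not taken from the text.
    """
    sizes = []
    size = first_file_mb
    for i in range(file_count):
        sizes.append((f"{db_name}.{i}", size))
        size = min(size * 2, MAX_FILE_SIZE_MB)  # double, but never exceed the cap
    return sizes


for name, mb in data_file_sizes("foo", 7):
    print(name, mb)
```

Under that assumption the sizes run 64, 128, 256, 512, 1024, 2048, 2048: a small database occupies only one or two small files, while a large one quickly reaches files of the maximum size.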
MongoDB also preallocates data files to ensure consistent performance. (This behavior
can be disabled using the --noprealloc option.) Preallocation happens in the background
and is initiated every time a data file is filled. This means that the MongoDB
server will always attempt to keep an extra, empty data file for each database to avoid
blocking on file allocation.
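The effect of preallocation can be pictured with a toy model. This is not how the server is implemented: the class and method names are hypothetical, the spare file is created synchronously rather than in the background, and the starting state (one file in use plus one spare) is an assumption.

```python
class DataFileSet:
    """Toy model of a database's data files with an optional preallocated spare."""

    def __init__(self, db_name, prealloc=True):
        self.db_name = db_name
        self.prealloc = prealloc
        self.in_use = [f"{db_name}.0"]
        # With preallocation on, an empty spare file is kept ready.
        self.spare = f"{db_name}.1" if prealloc else None

    def _next_name(self):
        count = len(self.in_use) + (1 if self.spare else 0)
        return f"{self.db_name}.{count}"

    def on_file_filled(self):
        """Handle a full data file; return True if the writer had to wait."""
        if self.spare is not None:
            self.in_use.append(self.spare)  # spare is ready: no waiting
            self.spare = None
            blocked = False
        else:
            self.in_use.append(self._next_name())  # must allocate right now
            blocked = True
        if self.prealloc:
            # The real server would start this allocation in the background.
            self.spare = self._next_name()
        return blocked


with_prealloc = DataFileSet("foo")
without_prealloc = DataFileSet("foo", prealloc=False)
print(with_prealloc.on_file_filled(), without_prealloc.on_file_filled())
```

In the model, the database with preallocation never waits when a file fills, because the next file already exists; with preallocation disabled, every filled file forces an allocation on the write path.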
Namespaces and Extents
Within its data files, each database is organized into namespaces, each storing a specific
type of data. The documents for each collection have their own namespace, as does
each index. Metadata for namespaces is stored in the database's .ns file.
The data for each namespace is grouped on disk into sections of the data files, called
extents. In Figure C-1, the foo database has three data files, the third of which has been
preallocated and is empty. The first two data files have been divided up into extents
belonging to several different namespaces.
 