ourselves working with rough estimates until such point as production data is available
to verify our assumptions.
Although ideally we would always test with a production-sized dataset, it is often not
possible or desirable to reproduce extremely large volumes of data in a test environment.
In such cases, we should at least ensure that we build a representative dataset whose size
exceeds the capacity of the object cache. That way, we'll be able to observe the effects of
cache evictions and of querying for portions of the graph not currently held in main
memory.
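As a rough sanity check (a sketch only, not part of the book's example), a test run might compare the expected on-disk size of the generated store against the JVM's maximum heap, on the assumption that the object cache is bounded by the heap; the 8 GB figure below is purely illustrative and should be replaced with a measured value:

long maxHeapBytes = Runtime.getRuntime().maxMemory();
// Illustrative figure; substitute the measured size of your test store
long estimatedStoreSizeOnDisk = 8L * 1024 * 1024 * 1024;

if ( estimatedStoreSizeOnDisk <= maxHeapBytes )
{
    System.out.println( "Warning: dataset may fit entirely in the object cache; "
        + "cache evictions are unlikely to be observed." );
}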
Representative datasets also help with capacity planning. Whether we create a full-sized
dataset, or a scaled-down sample of what we expect the production graph to be, our
representative dataset will give us some useful figures for estimating the size of the
production data on disk. These figures then help us plan how much memory to allocate
to the filesystem cache and the Java virtual machine (JVM) heap (see “Capacity Planning” on page 93 for more details).
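As a minimal sketch of that estimate, suppose we have built a 10% sample and measured its store size on disk; assuming store size grows roughly linearly with data volume, we can extrapolate the production figure (the numbers here are illustrative, not real measurements):

// Illustrative figures only
long sampleStoreBytes = 2L * 1024 * 1024 * 1024;   // 2 GB on disk for the sample
double sampleFraction = 0.10;                      // sample is 10% of expected production volume

long estimatedProductionBytes = (long) ( sampleStoreBytes / sampleFraction );

System.out.printf( "Estimated production store size: ~%d GB%n",
    estimatedProductionBytes / ( 1024 * 1024 * 1024 ) );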
In the following example we're using a dataset builder called Neode to build a sample
social network: 7
private void createSampleDataset( GraphDatabaseService db )
{
    DatasetManager dsm = new DatasetManager( db, new Log()
    {
        @Override
        public void write( String value )
        {
            System.out.println( value );
        }
    } );

    // User node specification
    NodeSpecification userSpec =
        dsm.nodeSpecification( "user",
            indexableProperty( "name" ) );

    // FRIEND relationship specification
    RelationshipSpecification friend =
        dsm.relationshipSpecification( "FRIEND" );

    Dataset dataset =
        dsm.newDataset( "Social network example" );

    // Create user nodes
    NodeCollection users =
        userSpec.create( 1000000 )
            .update( dataset );
7. Max De Marzi describes an alternative graph generation technique.