ourselves working with rough estimates until such point as production data is available
to verify our assumptions.
Although ideally we would always test with a production-sized dataset, it is often not
possible or desirable to reproduce extremely large volumes of data in a test environment.
In such cases, we should at least ensure that we build a representative dataset whose size
exceeds the capacity of the object cache. That way, we'll be able to observe the effects of
cache evictions and of querying for portions of the graph not currently held in main
memory.
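As a rough sanity check (a sketch only, not part of the book's example), a test run might compare the expected on-disk size of the generated store against the JVM's maximum heap, on the assumption that the object cache is bounded by the heap; the 8 GB figure below is purely illustrative and should be replaced with a measured value:

long maxHeapBytes = Runtime.getRuntime().maxMemory();
// Illustrative figure; substitute the measured size of your test store
long estimatedStoreSizeOnDisk = 8L * 1024 * 1024 * 1024;

if ( estimatedStoreSizeOnDisk <= maxHeapBytes )
{
    System.out.println( "Warning: dataset may fit entirely in the object cache; "
        + "cache evictions are unlikely to be observed." );
}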
Representative datasets also help with capacity planning. Whether we create a full-sized
dataset, or a scaled-down sample of what we expect the production graph to be, our
representative dataset will give us some useful figures for estimating the size of the
production data on disk. These figures then help us plan how much memory to allocate
to the filesystem cache and the Java virtual machine (JVM) heap (see “Capacity Planning” on page 93 for more details).
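As a minimal sketch of that estimate, suppose we have built a 10% sample and measured its store size on disk; assuming store size grows roughly linearly with data volume, we can extrapolate the production figure (the numbers here are illustrative, not real measurements):

// Illustrative figures only
long sampleStoreBytes = 2L * 1024 * 1024 * 1024;   // 2 GB on disk for the sample
double sampleFraction = 0.10;                      // sample is 10% of expected production volume

long estimatedProductionBytes = (long) ( sampleStoreBytes / sampleFraction );

System.out.printf( "Estimated production store size: ~%d GB%n",
    estimatedProductionBytes / ( 1024 * 1024 * 1024 ) );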
In the following example we're using a dataset builder called Neode to build a sample
social network: 7
private void createSampleDataset( GraphDatabaseService db )
{
    DatasetManager dsm = new DatasetManager( db, new Log()
    {
        @Override
        public void write( String value )
        {
            System.out.println( value );
        }
    } );

    // User node specification
    NodeSpecification userSpec =
        dsm.nodeSpecification( "user",
            indexableProperty( "name" ) );

    // FRIEND relationship specification
    RelationshipSpecification friend =
        dsm.relationshipSpecification( "FRIEND" );

    Dataset dataset =
        dsm.newDataset( "Social network example" );

    // Create user nodes
    NodeCollection users =
        userSpec.create( 1000000 )
            .update( dataset );
7. Max De Marzi describes an alternative graph generation technique.