Neo4j: embedded versus server mode - Neo4j in Action


By way of an initial comparison, look at the results detailed in table 10.2, showing the time in milliseconds for embedded and server modes (with nodes per second in parentheses) taken to create one million new user nodes with a name property. Each new node was created in its own transaction (TX) using the raw Java API in embedded mode versus the raw REST API of the server mode.

Table 10.2. The initial results of embedded versus server mode performance when creating new nodes

Scenario  Description                    Embedded                     Server
1         1 TX per node (1,000,000 × 1)  168,815 ms (5,952 nodes/s)   2,380,140 ms (420 nodes/s)

* Run on a MacBook Pro with 16 GB of RAM and 1 TB SSD (with FileVault FS encryption turned on). The Neo4j server was run on the same local machine that the unit test was run on, and the unit test made use of the neo4j-rest-graphdb REST client library detailed earlier.

On the face of it, these numbers don't make for very good reading, even for the embedded mode. Nearly three minutes to create 1 million nodes? Don't despair just yet; this example was designed to prove a point. We could argue that the question should never be a simple case of which one is faster. Rather, given a scenario, what can be done to get the best performance, and is this performance level acceptable? A few adjustments to the way in which the operations are performed can have a drastic effect on performance.

Table 10.3 shows how much the performance improves when you start to make better use of transactions (native transactions in embedded mode, and batches for server mode). This simple change has a big effect on the original numbers: about 10 times faster for the embedded mode and 16 times faster for the server mode using batches.
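
To make the idea of grouping work into fewer transactions concrete, here is a minimal sketch that commits once per batch of creations. It assumes a hypothetical `TxStore` interface and an in-memory `CountingStore` so that it runs without a database; neither is part of Neo4j, and this is not the benchmark code used for the measurements. In embedded mode, the `begin`, `createUser`, and `commit` steps would map onto the transaction and node-creation calls of the Java API.

```java
// Hypothetical stand-in for a transactional store; NOT the Neo4j API.
interface TxStore {
    void begin();
    void createUser(String name);
    void commit();
}

// In-memory implementation that only counts work, so the sketch runs anywhere.
class CountingStore implements TxStore {
    int commits = 0;   // number of committed transactions
    int created = 0;   // number of users visible after commit
    private int pending = 0;
    private boolean open = false;

    public void begin() {
        if (open) throw new IllegalStateException("transaction already open");
        open = true;
    }

    public void createUser(String name) {
        if (!open) throw new IllegalStateException("no open transaction");
        pending++;
    }

    public void commit() {
        if (!open) throw new IllegalStateException("no open transaction");
        created += pending;
        pending = 0;
        commits++;
        open = false;
    }
}

class BatchedWrites {
    // Creates `total` users, opening and committing one transaction per batch.
    static void createUsers(TxStore store, int total, int batchSize) {
        if (batchSize <= 0) throw new IllegalArgumentException("batchSize must be positive");
        for (int start = 0; start < total; start += batchSize) {
            int end = Math.min(start + batchSize, total); // last batch may be smaller
            store.begin();
            for (int i = start; i < end; i++) {
                store.createUser("user" + i);
            }
            store.commit();
        }
    }

    public static void main(String[] args) {
        CountingStore store = new CountingStore();
        createUsers(store, 1_000_000, 50_000);
        // prints "20 commits for 1000000 users"
        System.out.println(store.commits + " commits for " + store.created + " users");
    }
}
```

With a batch size of 1 the loop behaves like one transaction per node, with a batch size equal to the total it behaves like a single transaction for all nodes, and 50,000 gives the 20 batches of the batched scenario.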

Table 10.3. Extended results of embedded versus server mode performance when creating new nodes

Scenario  Description                       Embedded                     Server
1         1 TX per node (1,000,000 × 1)     168,815 ms (5,952 nodes/s)   2,380,140 ms (420 nodes/s)
2         1 TX for all nodes (1 × 1,000,000)  25,654 ms (40,000 nodes/s)   Took too long, hung
3         Batched TXs (20 × 50,000)         16,081 ms (62,500 nodes/s)   148,357 ms (6,756 nodes/s)
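
The figures in table 10.3 can be cross-checked with a little arithmetic. The sketch below is only an illustration: the class and method names are made up, and the whole-second truncation is an observation about the published numbers (it happens to reproduce every nodes-per-second value in the table), not a documented description of how they were computed.

```java
import java.util.Locale;

class ThroughputCheck {
    // Nodes per second, with elapsed time truncated to whole seconds first.
    // This truncation is an assumption that matches the table's figures.
    static long nodesPerSecond(long nodes, long elapsedMs) {
        long wholeSeconds = elapsedMs / 1000; // integer division truncates
        if (wholeSeconds == 0) throw new IllegalArgumentException("elapsed time under one second");
        return nodes / wholeSeconds;
    }

    // How many times faster the improved run is than the baseline run.
    static double speedup(long baselineMs, long improvedMs) {
        return (double) baselineMs / improvedMs;
    }

    public static void main(String[] args) {
        long n = 1_000_000L;
        System.out.println(nodesPerSecond(n, 168_815));   // 5952  (embedded, scenario 1)
        System.out.println(nodesPerSecond(n, 2_380_140)); // 420   (server, scenario 1)
        System.out.println(nodesPerSecond(n, 25_654));    // 40000 (embedded, scenario 2)
        System.out.println(nodesPerSecond(n, 16_081));    // 62500 (embedded, scenario 3)
        System.out.println(nodesPerSecond(n, 148_357));   // 6756  (server, scenario 3)

        // Scenario 3 versus scenario 1, per mode.
        System.out.println(String.format(Locale.ROOT, "%.1f", speedup(168_815, 16_081)));    // 10.5
        System.out.println(String.format(Locale.ROOT, "%.1f", speedup(2_380_140, 148_357))); // 16.0
    }
}
```

The two ratios correspond to the roughly 10-fold and 16-fold improvements for embedded and server modes respectively.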

Whenever you're presented with performance numbers, make sure you understand how the performance test was put together and what factors are in play, or not. To lay all of our cards on the table, listings 10.9 and 10.10 show the code used to perform these comparisons.
