a certain time period; that, say, 98% of requests are satisfied in under 200 ms. It is
important to keep a record of subsequent test runs so that we can compare performance
figures over time, and thereby quickly identify slowdowns and anomalous behavior.
As with unit tests and query performance tests, application performance tests prove
most valuable when employed in an automated delivery pipeline, where successive
builds of the application are automatically deployed to a testing environment, the tests
executed, and the results automatically analyzed. Log files and test results should be
stored for later retrieval, analysis, and comparison. Regressions and failures should fail
the build, prompting developers to address the issues in a timely manner. One of the
big advantages of conducting performance testing over the course of an application's
development life cycle, rather than at the end, is that failures and regressions can very
often be tied back to a recent piece of development; this enables us to diagnose, pinpoint,
and remedy issues rapidly.
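The percentile check described above (say, 98% of requests under 200 ms) can be automated as a build gate. The following is a minimal sketch; the nearest-rank percentile method, the threshold values, and the function names are illustrative assumptions, not prescribed by the text.

```python
# Sketch of a percentile-based regression gate for a delivery pipeline.
# The threshold (200 ms at p98) mirrors the example in the text; the
# nearest-rank percentile method and names are illustrative choices.
import math


def percentile(latencies_ms, p):
    """Return the p-th percentile of latencies using the nearest-rank method."""
    ordered = sorted(latencies_ms)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def check_run(latencies_ms, threshold_ms=200.0, p=98):
    """Return True if at least p% of requests finished under threshold_ms.

    In a pipeline, a False result would fail the build, prompting
    developers to investigate the regression while it is still recent.
    """
    return percentile(latencies_ms, p) <= threshold_ms
```

Storing each run's latencies (or just their percentiles) alongside the build lets later runs be compared against a baseline, making slowdowns visible as soon as they appear.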
For generating load, we'll need a load-generating agent. For web applications, there are
several open source stress and load testing tools available, including Grinder, JMeter,
and Gatling.6 When testing load-balanced web applications, we should ensure our test
clients are distributed across different IP addresses so that requests are balanced across
the cluster.
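At its core, a load-generating agent fires many concurrent requests and records the latency of each. The sketch below shows that idea with a thread pool; the worker counts, request counts, and `request_fn` parameter are illustrative assumptions, and real tools such as Grinder, JMeter, and Gatling offer far richer control (ramp-up profiles, distributed injectors, reporting).

```python
# A minimal load-generation sketch: run request_fn many times across a
# pool of concurrent workers and collect per-request latencies in ms.
# All parameter values here are illustrative, not from the text.
import time
from concurrent.futures import ThreadPoolExecutor


def run_load(request_fn, total_requests=100, concurrency=10):
    """Execute request_fn total_requests times across concurrency workers.

    Returns the observed latency of each call in milliseconds, ready to
    feed into percentile analysis.
    """
    def timed_call(_):
        start = time.perf_counter()
        request_fn()  # e.g. an HTTP GET against the application under test
        return (time.perf_counter() - start) * 1000.0

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(timed_call, range(total_requests)))
```

When the target sits behind a load balancer, the same principle from the text applies: the injectors should be spread across different IP addresses so the balancer distributes their requests across the whole cluster rather than pinning them to one node.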
Testing with representative data
For both query performance testing and application performance testing we will need
a dataset that is representative of the data we will encounter in production. It will,
therefore, be necessary to create or otherwise source such a dataset. In some cases we
can obtain a dataset from a third party, or adapt an existing dataset that we own, but
unless these datasets are already in the form of a graph, we will have to write some
custom export-import code.
In many cases, however, we're starting from scratch. If this is the case, we must dedicate
some time to creating a dataset builder. As with the rest of the software development
life cycle, this is best done in an iterative and incremental fashion. Whenever we
introduce a new element into our domain's data model, as documented and tested in our
unit tests, we add the corresponding element to our performance dataset builder. That way,
our performance tests will come as close to real-world usage as our current
understanding of the domain allows.
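A dataset builder of this kind can start very small and grow with the model. The sketch below generates nodes and relationships while respecting per-node relationship invariants; the domain (User nodes, FRIEND relationships), the invariant numbers, and the fixed seed are all illustrative assumptions.

```python
# Sketch of an incremental performance-dataset builder. Each time a new
# element enters the domain model, the builder gains a corresponding
# generator. Domain names and invariant ranges here are hypothetical.
import random


def build_dataset(num_nodes=1000, min_rels=1, max_rels=10, seed=42):
    """Generate nodes plus relationships, honouring the domain invariant
    that every node has between min_rels and max_rels outgoing
    relationships. A fixed seed keeps runs reproducible and comparable.
    """
    rng = random.Random(seed)
    nodes = [{"id": i, "label": "User"} for i in range(num_nodes)]
    rels = []
    for node in nodes:
        for _ in range(rng.randint(min_rels, max_rels)):
            target = rng.randrange(num_nodes)
            rels.append((node["id"], "FRIEND", target))
    return nodes, rels
```

As the next paragraph notes, the generated data should reproduce the invariants we have identified: relationship counts per node, the spread of relationship types, and realistic property value ranges.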
When creating a representative dataset, we try to reproduce any domain invariants we
have identified: the minimum, maximum, and average number of relationships per
node, the spread of different relationship types, property value ranges, and so on. Of
course, it's not always possible to know these things upfront, and often we'll find
6. Max De Marzi describes using Gatling to test Neo4j.