a certain time period; that, say, 98% of requests are satisfied in under 200 ms. It is
important to keep a record of subsequent test runs so that we can compare performance
figures over time, and thereby quickly identify slowdowns and anomalous behavior.
As with unit tests and query performance tests, application performance tests prove
most valuable when employed in an automated delivery pipeline, where successive
builds of the application are automatically deployed to a testing environment, the tests
executed, and the results automatically analyzed. Log files and test results should be
stored for later retrieval, analysis, and comparison. Regressions and failures should fail
the build, prompting developers to address the issues in a timely manner. One of the
big advantages of conducting performance testing over the course of an application's
development life cycle, rather than at the end, is that failures and regressions can very
often be tied back to a recent piece of development; this enables us to diagnose, pinpoint,
and remedy issues rapidly.
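The percentile check described above (say, 98% of requests under 200 ms) can be automated as a build gate. The following is a minimal sketch; the nearest-rank percentile method, the threshold values, and the function names are illustrative assumptions, not prescribed by the text.

```python
# Sketch of a percentile-based regression gate for a delivery pipeline.
# The threshold (200 ms at p98) mirrors the example in the text; the
# nearest-rank percentile method and names are illustrative choices.
import math


def percentile(latencies_ms, p):
    """Return the p-th percentile of latencies using the nearest-rank method."""
    ordered = sorted(latencies_ms)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def check_run(latencies_ms, threshold_ms=200.0, p=98):
    """Return True if at least p% of requests finished under threshold_ms.

    In a pipeline, a False result would fail the build, prompting
    developers to investigate the regression while it is still recent.
    """
    return percentile(latencies_ms, p) <= threshold_ms
```

Storing each run's latencies (or just their percentiles) alongside the build lets later runs be compared against a baseline, making slowdowns visible as soon as they appear.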
For generating load, we'll need a load-generating agent. For web applications, there are
several open source stress and load testing tools available, including Grinder, JMeter,
and Gatling.6 When testing load-balanced web applications, we should ensure our test
clients are distributed across different IP addresses so that requests are balanced across
the cluster.
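At its core, a load-generating agent fires many concurrent requests and records the latency of each. The sketch below shows that idea with a thread pool; the worker counts, request counts, and `request_fn` parameter are illustrative assumptions, and real tools such as Grinder, JMeter, and Gatling offer far richer control (ramp-up profiles, distributed injectors, reporting).

```python
# A minimal load-generation sketch: run request_fn many times across a
# pool of concurrent workers and collect per-request latencies in ms.
# All parameter values here are illustrative, not from the text.
import time
from concurrent.futures import ThreadPoolExecutor


def run_load(request_fn, total_requests=100, concurrency=10):
    """Execute request_fn total_requests times across concurrency workers.

    Returns the observed latency of each call in milliseconds, ready to
    feed into percentile analysis.
    """
    def timed_call(_):
        start = time.perf_counter()
        request_fn()  # e.g. an HTTP GET against the application under test
        return (time.perf_counter() - start) * 1000.0

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(timed_call, range(total_requests)))
```

When the target sits behind a load balancer, the same principle from the text applies: the injectors should be spread across different IP addresses so the balancer distributes their requests across the whole cluster rather than pinning them to one node.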
Testing with representative data
For both query performance testing and application performance testing we will need
a dataset that is representative of the data we will encounter in production. It will,
therefore, be necessary to create or otherwise source such a dataset. In some cases we
can obtain a dataset from a third party, or adapt an existing dataset that we own, but
unless these datasets are already in the form of a graph, we will have to write some
custom export-import code.
In many cases, however, we're starting from scratch. If this is the case, we must dedicate
some time to creating a dataset builder. As with the rest of the software development
life cycle, this is best done in an iterative and incremental fashion. Whenever we
introduce a new element into our domain's data model, as documented and tested in our
unit tests, we add the corresponding element to our performance dataset builder. That way,
our performance tests will come as close to real-world usage as our current
understanding of the domain allows.
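A dataset builder of this kind can start very small and grow with the model. The sketch below generates nodes and relationships while respecting per-node relationship invariants; the domain (User nodes, FRIEND relationships), the invariant numbers, and the fixed seed are all illustrative assumptions.

```python
# Sketch of an incremental performance-dataset builder. Each time a new
# element enters the domain model, the builder gains a corresponding
# generator. Domain names and invariant ranges here are hypothetical.
import random


def build_dataset(num_nodes=1000, min_rels=1, max_rels=10, seed=42):
    """Generate nodes plus relationships, honouring the domain invariant
    that every node has between min_rels and max_rels outgoing
    relationships. A fixed seed keeps runs reproducible and comparable.
    """
    rng = random.Random(seed)
    nodes = [{"id": i, "label": "User"} for i in range(num_nodes)]
    rels = []
    for node in nodes:
        for _ in range(rng.randint(min_rels, max_rels)):
            target = rng.randrange(num_nodes)
            rels.append((node["id"], "FRIEND", target))
    return nodes, rels
```

As the next paragraph notes, the generated data should reproduce the invariants we have identified: relationship counts per node, the spread of relationship types, and realistic property value ranges.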
When creating a representative dataset, we try to reproduce any domain invariants we
have identified: the minimum, maximum, and average number of relationships per
node, the spread of different relationship types, property value ranges, and so on. Of
course, it's not always possible to know these things upfront, and often we'll find
6. Max De Marzi describes using Gatling to test Neo4j.