LOAD CSV
LOAD CSV is a memory-intensive operation, and tuning it can significantly improve the
overall data-loading process. In one of our use cases, appropriate configuration and
tuning parameters reduced the overall data insertion time by 40 percent.
Apart from the common memory parameters, the following are a few considerations for the
LOAD CSV process:
• Simple LOAD CSV statements, run as multiple passes over the same file or over multiple .csv files, consume less memory than complex LOAD CSV statements
• The MERGE command should not be used for nodes and relationships in a single LOAD CSV command
• In a single LOAD CSV statement, either CREATE nodes or MERGE nodes, but not both
• Neo4j shell provides optimum performance for batch imports
• Make sure that auto indexing is ON (set node_auto_indexing=true) and also define the columns that need to be indexed (node_keys_indexable=id,name,type) in <$NEO4J_HOME>/conf/neo4j.properties
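
The first three considerations can be illustrated with a minimal sketch that splits one import into two simple passes. The file paths, column names (id, name, from_id, to_id), the Person label, and the KNOWS relationship type are all hypothetical and chosen only for illustration; the accepted file URL form also depends on the Neo4j version in use.

```cypher
// Pass 1: nodes only, using CREATE and nothing else.
// Hypothetical file with the columns: id, name
LOAD CSV WITH HEADERS FROM "file:///tmp/people.csv" AS row
CREATE (:Person {id: row.id, name: row.name});

// Pass 2: relationships only, in a separate statement.
// The nodes already exist, so they are looked up with MATCH
// instead of being created or merged again.
// Hypothetical file with the columns: from_id, to_id
LOAD CSV WITH HEADERS FROM "file:///tmp/friendships.csv" AS row
MATCH (a:Person {id: row.from_id})
MATCH (b:Person {id: row.to_id})
CREATE (a)-[:KNOWS]->(b);
```

Each statement does one job, so neither pass has to create or merge nodes and relationships at the same time. If the input may contain duplicate people, the first pass could use MERGE in place of CREATE, as long as that statement does not mix the two.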