Titan Graph Databases with Cassandra - Beginning Apache Cassandra Development


// in vertex
Vertex in = bgraph.addVertex(Math.random() + "");
in.setProperty("twitter_tag", in_twitter_tag);
in.setProperty("fname", in_fname);
in.setProperty("lname", in_lname);

// out vertex
Vertex out = bgraph.addVertex(Math.random() + "");
out.setProperty("twitter_tag", out_twitter_tag);
out.setProperty("fname", out_fname);
out.setProperty("lname", out_lname);

// assign edge
bgraph.addEdge(null, in, out, edgeName);

7. Finally, we can call commit after successfully reading all records from the .csv file and populating BatchGraph:

bgraph.commit();

Here, the batch size is the number of vertices and edges to be loaded before commit is invoked on the graph. We should take care to set a moderate value as the batch size, to avoid heap size issues while processing a big graph with millions or billions of edges.
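The effect of the batch size can be sketched in plain Java. The class below is a simplified model and not the Blueprints BatchGraph implementation; BatchCommitSketch, its methods, and the batch size of 1000 are hypothetical names and values chosen only for illustration. It buffers elements in memory and flushes them once batchSize elements have accumulated, which suggests why a larger batch size keeps more objects on the heap between commits.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of batched loading (hypothetical, not the Blueprints API).
class BatchCommitSketch {
    private final int batchSize;                            // elements buffered before each commit
    private final List<String> pending = new ArrayList<>(); // stands in for buffered vertices and edges
    private int commitCount = 0;

    BatchCommitSketch(int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
    }

    // Buffer one vertex or edge; commit automatically once the buffer is full.
    void add(String element) {
        pending.add(element);
        if (pending.size() >= batchSize) {
            commit();
        }
    }

    // Flush whatever is buffered; does nothing when the buffer is empty.
    void commit() {
        if (pending.isEmpty()) {
            return;
        }
        pending.clear(); // a real loader would write to the backing store here
        commitCount++;
    }

    int commitCount() {
        return commitCount;
    }

    int pendingSize() {
        return pending.size();
    }

    public static void main(String[] args) {
        BatchCommitSketch loader = new BatchCommitSketch(1000);
        for (int i = 0; i < 2500; i++) {
            loader.add("element-" + i);
        }
        loader.commit(); // final commit for the elements left in the buffer
        System.out.println("commits: " + loader.commitCount());
    }
}
```

In this model the final explicit commit plays the same role as the bgraph.commit() call above: it flushes whatever remains in the buffer after the last record has been read.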

The Supernode Problem

In the real world, big data-based graphs can be very large, and there can be a group of vertices having a very high number of incident edges. In graph theory, such vertices are called supernodes. With so many complex paths, a random traversal in a graph can lead us to such supernodes, and expanding their many incident edges would badly affect the system's performance.

Figure 7-21 shows my LinkedIn social graph, where the marked vertices can be termed supernodes.
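To make the idea of a supernode concrete, here is a minimal sketch in plain Java that counts incident edges per vertex and flags those at or above a chosen threshold. It does not use Titan or Blueprints; SupernodeSketch, the edge-list representation, and the threshold value are assumptions made for illustration. What counts as "a very high number" of edges depends on the graph and the workload.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: flags vertices whose degree reaches a threshold.
class SupernodeSketch {

    // Count incident edges per vertex. Each edge is a {from, to} pair of vertex ids.
    // A self-loop contributes 2 to its vertex, as in the usual definition of degree.
    static Map<String, Integer> degrees(List<String[]> edges) {
        Map<String, Integer> degree = new HashMap<>();
        for (String[] edge : edges) {
            degree.merge(edge[0], 1, Integer::sum);
            degree.merge(edge[1], 1, Integer::sum);
        }
        return degree;
    }

    // Return the ids of all vertices with at least `threshold` incident edges, sorted.
    static Set<String> supernodes(List<String[]> edges, int threshold) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : degrees(edges).entrySet()) {
            if (entry.getValue() >= threshold) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> edges = Arrays.asList(
                new String[] {"hub", "a"},
                new String[] {"hub", "b"},
                new String[] {"hub", "c"},
                new String[] {"a", "b"});
        System.out.println(supernodes(edges, 3));
    }
}
```

A scan like this touches every edge once, so on a graph with billions of edges it would be run as an offline job rather than during a traversal.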
