Big Data - Graph Analysis and Visualization

Finally, notice that, because you gathered nodes linked in both directions, there is a duplicate reference to the Drawing Graphs book, indicating that each book is listed as being similar to the other. If the title is any indication, they are also the most similar in subject. You can use the dedup() step to eliminate duplicates, as shown here:

gremlin> g.V('group','Book').has('title',CONTAINS,'Graph').has('title',
           CONTAINS,'Visualization').both('similar').dedup().title
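
The effect of dedup() can be pictured outside the graph database with a short, plain-Python sketch. This is not Gremlin and not the book's code: the `titles` list is a hand-written stand-in for what the both('similar') traversal might emit (titles abbreviated, order assumed), and the `dedup` helper is a hypothetical name chosen to mirror the step.

```python
# Hand-written stand-in for the stream of titles reaching dedup().
# Titles are abbreviated and their order is an assumption for illustration.
titles = [
    "Computational Modeling of Genetic and Biochemical Networks",
    "Drawing Graphs : Methods and Models",
    "Introduction to Graph Theory",
    "Drawing Graphs : Methods and Models",  # reached a second time via the reverse edge
    "Computational Analysis of Biochemical Systems",
]


def dedup(items):
    """Keep the first occurrence of each item and preserve the original order."""
    seen = set()
    unique = []
    for item in items:
        if item not in seen:
            seen.add(item)
            unique.append(item)
    return unique


unique_titles = dedup(titles)
for title in unique_titles:
    print(title)
```

The sketch only drops repeated elements from the stream; as in the Gremlin query, it does not record how many times each one appeared.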

Alternatively, you can use the groupCount() step to count the instances of each book. Processing the grouped results requires more advanced Gremlin, and it shows how Gremlin tends to become more complex when the output of a step is not easily expressed as a simple list of single elements. To output a list of counts by book, the cap step used in the following code moves back one step in the pipeline output (which is otherwise a list of counts) to the map produced as a side effect. The scatter() step then unrolls the map into a list of key/value entries, and the transform step outputs each entry's count together with its book title.

gremlin> g.V('group','Book').has('title',CONTAINS,'Graph').has('title',
           CONTAINS,'Visualization').both('similar').groupCount().
           cap.scatter().transform(){[it.value, it.key.title]}

==>[1, Computational Modeling of Genetic and Biochemical Networks (Computational Molecular Biology)]
==>[2, Drawing Graphs : Methods and Models (Lecture Notes in Computer Science)]
==>[1, Introduction to Graph Theory (Dover topics on Advanced Mathematics)]
==>[1, Computational Analysis of Biochemical Systems : A Practical Guide for Biochemists and Molecular Biologists]
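
The groupCount, cap, scatter, and transform sequence can be approximated in plain Python to show what each stage contributes. This is a rough analogue under stated assumptions, not Gremlin: `similar_titles` is a hand-written list with abbreviated titles in an assumed order, and because the keys here are already title strings, there is no counterpart to the `it.key.title` property lookup on a vertex.

```python
from collections import Counter

# Hand-written stand-in for the elements flowing into groupCount().
# Titles are abbreviated and their order is an assumption for illustration.
similar_titles = [
    "Computational Modeling of Genetic and Biochemical Networks",
    "Drawing Graphs : Methods and Models",
    "Introduction to Graph Theory",
    "Drawing Graphs : Methods and Models",
    "Computational Analysis of Biochemical Systems",
]

# Roughly groupCount() followed by cap: build the map of element -> count
# and make that map itself the thing we work with next.
counts = Counter(similar_titles)

# Roughly scatter(): unroll the map into individual key/value entries.
entries = list(counts.items())

# Roughly transform{[it.value, it.key.title]}: emit [count, title] per entry.
rows = [[count, title] for title, count in entries]

for row in rows:
    print(row)
```

Each row pairs a count with a title, matching the shape of the console output above, with the Drawing Graphs entry carrying a count of 2.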
