Off Topic: Exploring the Cayley Graph Database (part 2)

We ended the last blog-post just as things were starting to get somewhat interesting.

Has function

So far we have used Is to get to the initial node. There is also a Has path function, which can be used both to select the first node(s) and to filter out specific nodes once we are well on our path.

graph.V()
.Has("<type>", "</people/person>")
.Has("<name>", "Jackie Chan").All()

{
	"result": [
		{
			"id": "</en/jackie_chan>"
		}
	]
}
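To make the filtering behaviour of Has more concrete, here is a minimal sketch in plain JavaScript that imitates it over a handful of in-memory triples. It is not how Cayley implements Has; the `has` helper and the `</en/example_person>` node are assumptions made up for illustration, and unlike graph.V() it starts only from subject nodes.

```javascript
// Toy triples: [subject, predicate, object].
// </en/example_person> is a hypothetical node, added only so there is something to filter out.
const triples = [
  ["</en/jackie_chan>", "<type>", "</people/person>"],
  ["</en/jackie_chan>", "<name>", "Jackie Chan"],
  ["</en/example_person>", "<type>", "</people/person>"],
  ["</en/example_person>", "<name>", "Example Person"],
];

// Keep only the nodes that have an outgoing edge `predicate` pointing at `object`.
function has(nodes, predicate, object) {
  return nodes.filter((n) =>
    triples.some(([s, p, o]) => s === n && p === predicate && o === object)
  );
}

// Simplification: start from all distinct subjects instead of every vertex.
const allSubjects = [...new Set(triples.map(([s]) => s))];

// Chaining two filters mirrors .Has("<type>", ...).Has("<name>", ...)
const people = has(allSubjects, "<type>", "</people/person>");
const result = has(people, "<name>", "Jackie Chan");

console.log(result); // [ '</en/jackie_chan>' ]
```

Each call narrows the current set of nodes, which is why Has works both as a starting point and as a filter further down a path.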

A few more hops

Let's find all the actors that Mr. Jackie Chan worked with as a director.

graph.V().Has("<name>", "Jackie Chan")
.In("</film/film/directed_by>")
.Out("</film/film/starring>")
.Out("</film/performance/actor>")
.Unique()
.Out("<name>").All()

{
	"result": [
		{
			"id": "Sammo Hung"
		},
		{
			"id": "Yuen Biao"
		},
		{
			"id": "Jackie Chan"
		},
		{
			"id": "Alan Tam"
		},
		...
	]
}

To explain: first we find the node of the person named "Jackie Chan". Then we move In along the predicate "directed_by" to all nodes that link to it, which gives us the movies he directed. From the movies we move Out along the predicate "starring" to the performance nodes, and from those along the predicate "actor" to the actors. We remove the duplicates and move Out to the names. We found the predicates by using OutPredicates, as explained in the first blog-post.
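The direction of each hop is easy to mix up, so here is a small sketch of the same In, Out, Out, Unique sequence in plain JavaScript over toy data. The node ids (`film1`, `perf1`, `actorA`, ...), the shortened predicate names and the helpers `inbound`, `outbound` and `unique` are all hypothetical stand-ins, not Cayley's API or real data.

```javascript
// Toy triples: [subject, predicate, object]. All ids are invented for illustration.
const triples = [
  ["film1", "directed_by", "director1"],
  ["film1", "starring", "perf1"],
  ["film1", "starring", "perf2"],
  ["film2", "directed_by", "director1"],
  ["film2", "starring", "perf3"],
  ["perf1", "actor", "actorA"],
  ["perf2", "actor", "actorB"],
  ["perf3", "actor", "actorA"],
];

// Out: follow the predicate from subject to object.
const outbound = (nodes, pred) =>
  nodes.flatMap((n) =>
    triples.filter(([s, p]) => s === n && p === pred).map(([, , o]) => o)
  );

// In: follow the predicate backwards, from object to subject.
const inbound = (nodes, pred) =>
  nodes.flatMap((n) =>
    triples.filter(([, p, o]) => o === n && p === pred).map(([s]) => s)
  );

// Unique: drop duplicates while keeping first-seen order.
const unique = (nodes) => [...new Set(nodes)];

const films = inbound(["director1"], "directed_by"); // films pointing at the director
const performances = outbound(films, "starring");    // one node per role
const actors = outbound(performances, "actor");      // actorA appears twice here

console.log(unique(actors)); // [ 'actorA', 'actorB' ]
```

Without the Unique step, an actor who appeared in several of the director's movies would show up once per performance.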

Morphisms and Intersect

We can do the same for Bruce Lee and use the Intersect function to see which actors worked with both directors. Because we follow the same path for both directors, we can shorten our code using a so-called Morphism.

var actorsOfDirector = g.Morphism()
.In("</film/film/directed_by>")
.Out("</film/film/starring>")
.Out("</film/performance/actor>")

g.V().Has("<name>", "Jackie Chan")
.Follow(actorsOfDirector)
.Intersect(g.V()
           .Has("<name>", "Bruce Lee")
           .Follow(actorsOfDirector))
.Unique().Out("<name>").All()

{
	"result": [
		{
			"id": "Sammo Hung"
		},
		{
			"id": "Yuen Biao"
		}
	]
}
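One way to think about a Morphism is as a path without a starting point, which can then be applied to any set of nodes. The sketch below models that idea in plain JavaScript as an ordinary function, with Intersect as a simple set intersection. The data, ids and helper names are hypothetical, and this only illustrates the concept, not Cayley's internals.

```javascript
// Toy triples: [subject, predicate, object]. All ids are invented for illustration.
const triples = [
  ["film1", "directed_by", "directorX"],
  ["film1", "starring", "perf1"],
  ["film1", "starring", "perf2"],
  ["film2", "directed_by", "directorY"],
  ["film2", "starring", "perf3"],
  ["film2", "starring", "perf4"],
  ["perf1", "actor", "actorA"],
  ["perf2", "actor", "actorB"],
  ["perf3", "actor", "actorB"],
  ["perf4", "actor", "actorC"],
];

const outbound = (nodes, pred) =>
  nodes.flatMap((n) =>
    triples.filter(([s, p]) => s === n && p === pred).map(([, , o]) => o)
  );
const inbound = (nodes, pred) =>
  nodes.flatMap((n) =>
    triples.filter(([, p, o]) => o === n && p === pred).map(([s]) => s)
  );

// The "morphism": the same three hops, reusable from any starting nodes.
const actorsOfDirector = (start) =>
  outbound(outbound(inbound(start, "directed_by"), "starring"), "actor");

// Intersect: keep the nodes reached by both paths.
const intersect = (a, b) => a.filter((n) => b.includes(n));

const shared = [
  ...new Set(
    intersect(actorsOfDirector(["directorX"]), actorsOfDirector(["directorY"]))
  ),
];

console.log(shared); // [ 'actorB' ]
```

Defining the path once keeps the two branches of the query identical by construction, which is the same benefit Follow gives in the query above.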

Result visualization

All the results we got so far were quite flat. That undersells graphs, which are all about connections, clusters, shape discovery ...

The Cayley frontend lets us visualize the results of queries. To do that, we need to tag nodes as source and target.

Here we find Jackie Chan, then all the movies he acted in. We tag their names as sources; next we find all the directors these movies had and tag their names as targets.

graph.Vertex()
.Has("<name>", "Jackie Chan")
.In("</film/performance/actor>")
.In("</film/film/starring>")
.Out("<name>").Tag("source").In("<name>")
.Out("</film/film/directed_by>")
.Out("<name>").Tag("target").All()

What comes out of it is much richer in information than the previous arrays of objects. We can immediately observe different patterns and suspect stories behind them. We see multiple "one-movie" directors, but there are also some clusters where one director directed multiple movies. We also spot co-directors, some of whom then made more movies with Jackie Chan. At the center of the biggest cluster is Jackie Chan himself, as a director or co-director.
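The same patterns can also be pulled out of the raw result in code. The sketch below assumes that each result row carries the tag names as extra keys next to "id" (here "source" and "target"); the rows themselves are hypothetical placeholders, not actual query output.

```javascript
// Hypothetical rows, in the shape a tagged query result is assumed to have.
const rows = [
  { id: "Director One", source: "Movie A", target: "Director One" },
  { id: "Director One", source: "Movie B", target: "Director One" },
  { id: "Director Two", source: "Movie C", target: "Director Two" },
];

// Group the movies (sources) under their director (target).
const moviesByDirector = new Map();
for (const { source, target } of rows) {
  if (!moviesByDirector.has(target)) moviesByDirector.set(target, []);
  moviesByDirector.get(target).push(source);
}

// Directors with more than one movie form the "clusters";
// the rest are the "one-movie" directors.
const clusters = [...moviesByDirector]
  .filter(([, movies]) => movies.length > 1)
  .map(([director]) => director);

console.log(clusters); // [ 'Director One' ]
```

A grouping like this is only a rough stand-in for what the visualization shows at a glance, but it can be handy when the graph becomes too dense to read.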

At least one more blog-post about this theme is coming.

ryelang
