
Off Topic: Exploring the Cayley Graph Database (part 2)

 
We ended the last blog-post just as things were starting to get somewhat interesting.

Has function

First of all: until now we used Is to get to the initial node. There is also a Has path function, which can be used both to get to the first node(s) and to filter out specific nodes once we are well along a path.

graph.V()
.Has("<type>", "</people/person>")
.Has("<name>", "Jackie Chan").All()

{
	"result": [
		{
			"id": "</en/jackie_chan>"
		}
	]
}
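
One way to picture what Has does: it keeps only those nodes that have an outgoing link with the given predicate pointing at the given value. The plain JavaScript below is a rough sketch of that idea over a tiny hand-made list of triples. It is only a mental model, not how Cayley implements it; the helper `has` and the node `</en/some_other_person>` are made up for illustration.

```javascript
// Toy triples as [subject, predicate, object]. A mental model only, not Cayley internals.
const triples = [
  ["</en/jackie_chan>", "<type>", "</people/person>"],
  ["</en/jackie_chan>", "<name>", "Jackie Chan"],
  ["</en/some_other_person>", "<type>", "</people/person>"],  // hypothetical node
  ["</en/some_other_person>", "<name>", "Some Other Person"], // hypothetical name
];

// Every distinct subject, standing in for graph.V() (subjects only, for brevity).
const allNodes = [...new Set(triples.map(([s]) => s))];

// Keep only the nodes that have an outgoing `predicate` link to `object`.
function has(nodes, predicate, object) {
  return nodes.filter((node) =>
    triples.some(([s, p, o]) => s === node && p === predicate && o === object)
  );
}

// Same shape as the query above: two Has filters chained one after another.
const people = has(allNodes, "<type>", "</people/person>");
const result = has(people, "<name>", "Jackie Chan");
console.log(result);
```

Because each Has only narrows the current set of nodes, the same call works at the start of a path and in the middle of one.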

 
A few more hops

Let's find all the actors that Mr. Jackie Chan worked with as a director.

graph.V().Has("<name>", "Jackie Chan")
.In("</film/film/directed_by>")
.Out("</film/film/starring>")
.Out("</film/performance/actor>")
.Unique()
.Out("<name>").All()

{
	"result": [
		{
			"id": "Sammo Hung"
		},
		{
			"id": "Yuen Biao"
		},
		{
			"id": "Jackie Chan"
		},
		{
			"id": "Alan Tam"
		},
		...
	]
}

To explain: first we find the node of the person named "Jackie Chan". Then we move In to all nodes that link to it with the predicate "directed_by", which gives us the movies he directed. From the movies we move Out along the predicate "starring" to the performance nodes, and from those along the predicate "actor" to the actors. We remove the duplicates and move Out once more to get the names. We found the predicates by using OutPredicates, as explained in the first blog-post.
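
The hops described above can be sketched in plain JavaScript. This is a toy model under stated assumptions, not Cayley code: the triples, the placeholder ids (`film1`, `perf1`, `actorA`, ...) and the helpers `out` and `into` are all invented, and the predicates are shortened for readability.

```javascript
// Toy triples as [subject, predicate, object]; every id below is a placeholder.
const triples = [
  ["film1", "directed_by", "director"],
  ["film2", "directed_by", "director"],
  ["film1", "starring", "perf1"],
  ["film1", "starring", "perf2"],
  ["film2", "starring", "perf3"],
  ["perf1", "actor", "actorA"],
  ["perf2", "actor", "actorB"],
  ["perf3", "actor", "actorA"], // actorA appears in both films
];

// Follow links forwards: subject -> object (the idea behind Out).
const out = (nodes, pred) =>
  nodes.flatMap((n) =>
    triples.filter(([s, p]) => s === n && p === pred).map(([, , o]) => o));

// Follow links backwards: object -> subject (the idea behind In).
const into = (nodes, pred) =>
  nodes.flatMap((n) =>
    triples.filter(([, p, o]) => o === n && p === pred).map(([s]) => s));

const films = into(["director"], "directed_by"); // films pointing at the director
const performances = out(films, "starring");     // one node per performance
const actors = out(performances, "actor");       // may contain repeats
const uniqueActors = [...new Set(actors)];       // the idea behind Unique
console.log(uniqueActors);
```

Without the deduplication step, an actor who appeared in several of the director's films would show up once per performance.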

Morphisms and Intersect

We can do the same for Bruce Lee and use the Intersect function to see which actors worked with both directors. Because we follow the same path for both directors, we can shorten our code using a so-called Morphism, a reusable path that is not yet attached to a starting node.

var actorsOfDirector = g.Morphism()
.In("</film/film/directed_by>")
.Out("</film/film/starring>")
.Out("</film/performance/actor>")

g.V().Has("<name>", "Jackie Chan")
.Follow(actorsOfDirector)
.Intersect(g.V()
           .Has("<name>", "Bruce Lee")
           .Follow(actorsOfDirector))
.Unique().Out("<name>").All()

{
	"result": [
		{
			"id": "Sammo Hung"
		},
		{
			"id": "Yuen Biao"
		}
	]
}
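
To make the two ideas a bit more concrete, here is a small plain JavaScript sketch. It models a morphism as an ordinary function from nodes to nodes, and Intersect as keeping only what both paths reached. Everything in it is an assumption made for illustration: the triples, the placeholder ids and the helper names are invented, and Cayley's real implementation is certainly different.

```javascript
// Toy triples as [subject, predicate, object]; every id below is a placeholder.
const triples = [
  ["film1", "directed_by", "directorX"],
  ["film2", "directed_by", "directorY"],
  ["film1", "starring", "perf1"],
  ["film1", "starring", "perf2"],
  ["film2", "starring", "perf3"],
  ["film2", "starring", "perf4"],
  ["perf1", "actor", "actorA"],
  ["perf2", "actor", "actorB"],
  ["perf3", "actor", "actorB"], // actorB worked with both directors
  ["perf4", "actor", "actorC"],
];

// Forward and backward hops, as in Out and In.
const out = (nodes, pred) =>
  nodes.flatMap((n) =>
    triples.filter(([s, p]) => s === n && p === pred).map(([, , o]) => o));
const into = (nodes, pred) =>
  nodes.flatMap((n) =>
    triples.filter(([, p, o]) => o === n && p === pred).map(([s]) => s));

// A "morphism": a path with no starting node yet, here simply a reusable function.
const actorsOfDirector = (nodes) =>
  out(out(into(nodes, "directed_by"), "starring"), "actor");

// Intersect: keep only the nodes that both paths arrived at.
const intersect = (a, b) => [...new Set(a)].filter((n) => b.includes(n));

const shared = intersect(
  actorsOfDirector(["directorX"]),
  actorsOfDirector(["directorY"])
);
console.log(shared);
```

The path is written once and applied to two different starting points, which is the saving a morphism gives in the query above.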

 

Result visualization

All the results we got so far were quite flat. That sells graphs short, since they are all about connections, clusters, shape discovery ...

The Cayley frontend lets us visualize the results of queries. To do that, we need to tag nodes along the path as source and target.

Here we find Jackie Chan, then all the movies he acted in. We tag their names as sources; next we find all the directors these movies had and tag their names as targets.

graph.Vertex()
.Has("<name>", "Jackie Chan")
.In("</film/performance/actor>")
.In("</film/film/starring>")
.Out("<name>").Tag("source").In("<name>")
.Out("</film/film/directed_by>")
.Out("<name>").Tag("target").All()
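
A rough way to think about Tag: every path remembers the node it was standing on when the tag was set, and that value travels along with the path until the end. The plain JavaScript below sketches this under stated assumptions. The triples, placeholder names and helpers (`start`, `tag`, `all`, ...) are invented, and the row shape (an "id" plus one key per tag) is my understanding of the output rather than a guarantee.

```javascript
// Toy triples as [subject, predicate, object]; ids and names are placeholders.
const triples = [
  ["film1", "name", "Film One"],
  ["film2", "name", "Film Two"],
  ["film1", "directed_by", "dir1"],
  ["film2", "directed_by", "dir1"],
  ["dir1", "name", "Director One"],
];

// A path is the current node plus whatever was tagged on the way there.
const start = (nodes) => nodes.map((node) => ({ node, tags: {} }));

// Forward and backward hops that carry the tags along.
const out = (paths, pred) =>
  paths.flatMap(({ node, tags }) =>
    triples.filter(([s, p]) => s === node && p === pred)
      .map(([, , o]) => ({ node: o, tags })));
const into = (paths, pred) =>
  paths.flatMap(({ node, tags }) =>
    triples.filter(([, p, o]) => o === node && p === pred)
      .map(([s]) => ({ node: s, tags })));

// Tag: remember the current node under `name`, without moving.
const tag = (paths, name) =>
  paths.map(({ node, tags }) => ({ node, tags: { ...tags, [name]: node } }));

// All: one row per path, with the tags as extra keys next to "id".
const all = (paths) => paths.map(({ node, tags }) => ({ id: node, ...tags }));

let cur = start(["film1", "film2"]);
cur = tag(out(cur, "name"), "source"); // standing on the film name, tag it
cur = into(cur, "name");               // step back to the film node
cur = out(cur, "directed_by");
cur = tag(out(cur, "name"), "target"); // standing on the director name, tag it
const rows = all(cur);
console.log(rows);
```

Each row then holds one source and target pair, which is what a visualization needs to draw an edge. Note that stepping out to a name and back in would also land on any other node that happens to share that name.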

What comes out of it is much richer in information than the previous arrays of objects. We can immediately observe different patterns and suspect the stories behind them. We see multiple "one-movie" directors, but there are also some clusters where one director directed multiple movies. We also spot co-directors, some of whom then went on to make more movies with Jackie Chan. At the center of the biggest cluster is Jackie Chan himself, as director or co-director.

At least one more blog-post about this theme is coming.
