
Off Topic: Exploring the Cayley Graph Database (part 1)


I am trying to integrate Gizmo, the query language of the Cayley graph database, into Rye. To do this, I need to know it well: not just the query syntax itself, but also how to approach problems with a graph database in general.

Since I couldn't find many resources about Gizmo outside of the official documentation and tutorial, I decided to write down what I learn.

I am using the 30kmoviedata graph, which you can find in Cayley's GitHub repository. As far as I know, the graph includes at least a network of movies, directors and actors. I will compare Gizmo to SQL where applicable.

I am just learning, so take it all with a grain of salt and let me know if I made any mistakes.

Very basics

Let's start with the basics and get the title of a film by its ID. In SQL this would be something like:

select name from films where id = '/en/alien_1979';

and in Cayley's Gizmo:

g.V().Is("</en/alien_1979>").Out("<name>").All()

The query language is inspired by Gremlin and uses JavaScript syntax. "g" and "V()" are short for "graph" and "Vertex()"; the long forms work as well.

In general, we pick some starting nodes and then follow paths to other nodes. The name is not a property of the film node per se; it is a separate node that the film node points to through a "<name>" edge.

A vertex is another name for a node, and edges are the links between nodes.
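
A rough way to picture this is a list of (subject, predicate, object) triples. The sketch below is plain JavaScript, not Gizmo and not how Cayley works internally; the `triples` array and the `out` helper are made up for illustration, and the three facts in it are the ones that appear in the query results in this post.

```javascript
// A toy graph: every fact is a (subject, predicate, object) triple.
// Hypothetical model for illustration only, not Cayley's storage format.
const triples = [
  ["</en/alien_1979>", "<name>", "Alien"],
  ["</en/alien_1979>", "</film/film/directed_by>", "</en/ridley_scott>"],
  ["</en/ridley_scott>", "<name>", "Ridley Scott"],
];

// Follow edges labelled `predicate` from each node in `nodes` and
// return the nodes they point to (roughly what Out("<...>") does).
function out(nodes, predicate) {
  return triples
    .filter(([s, p]) => nodes.includes(s) && p === predicate)
    .map(([, , o]) => o);
}

const film = ["</en/alien_1979>"];
const filmName = out(film, "<name>");
console.log(filmName); // [ 'Alien' ]
```

The film's name is reached by walking an edge, exactly like any other linked node would be.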

Predicates

The functions Out and In follow links from the current nodes: Out follows outgoing links and In follows incoming ones. Without an argument they follow links of any type; with an argument they follow only links of the given type.

g.V().Is("</en/alien_1979>").Out().All()

{
    "result": [
        {
            "id": "</en/ridley_scott>"
        },
        {
            "id": "_:2597"
        },
        ... more of these ...
        {
            "id": "Alien"
        },
        {
            "id": "</film/film>"
        }
    ]
}

This wasn't very helpful. We got all the nodes that the film links to. To see what kinds of links exist, we use the functions OutPredicates and InPredicates:

g.V().Is("</en/alien_1979>").OutPredicates().All()

{
    "result": [
        {
            "id": "</film/film/directed_by>"
        },
        {
            "id": "</film/film/starring>"
        },
        {
            "id": "<name>"
        },
        {
            "id": "<type>"
        }
    ]
}

So our film node links to its directors, its actors, its name and its type.

InPredicates returns nothing in this case, so nothing links to this node.
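
As a mental model of what OutPredicates and InPredicates compute, the sketch below collects the distinct edge labels leaving or entering a node in a toy triple list. It is plain JavaScript, not Gizmo; the helper names `outPredicates` and `inPredicates` and the data layout are assumptions for illustration, and treating `_:2597` as the target of the starring edge is a guess on my part.

```javascript
// Toy triple list (subject, predicate, object); illustration only.
const triples = [
  ["</en/alien_1979>", "</film/film/directed_by>", "</en/ridley_scott>"],
  ["</en/alien_1979>", "</film/film/starring>", "_:2597"], // assumed target
  ["</en/alien_1979>", "<name>", "Alien"],
  ["</en/alien_1979>", "<type>", "</film/film>"],
];

// Distinct labels of edges that start at `node`.
function outPredicates(node) {
  return [...new Set(triples.filter(([s]) => s === node).map(([, p]) => p))];
}

// Distinct labels of edges that end at `node`.
function inPredicates(node) {
  return [...new Set(triples.filter(([, , o]) => o === node).map(([, p]) => p))];
}

console.log(outPredicates("</en/alien_1979>")); // the four labels listed above
console.log(inPredicates("</en/alien_1979>")); // []
console.log(inPredicates("</en/ridley_scott>")); // [ '</film/film/directed_by>' ]
```

In this toy data the film has only outgoing edges, which matches the empty InPredicates result above, while the director node has one incoming edge.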

Finding director(s)

We can see above that a film node points to its directors. Let's get their names for this movie:

graph.V().Is("</en/alien_1979>").Out("</film/film/directed_by>").Out("<name>").All()

{
    "result": [
        {
            "id": "Ridley Scott"
        }
    ]
}


At first glance, we would do this in SQL with a single join:

select p.name
from films f join people p on f.directed_by = p.id
where f.id = '/en/alien_1979';


But there is a problem: as we will see later, a film can have more than one director, so we need an additional many-to-many table, which makes the code a little messier:

select p.name
from films f
 join film2director f2d on f.id = f2d.id_film
 join people p on p.id = f2d.id_director
where f.id = '/en/alien_1979';
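
In the graph model, by contrast, a second director is just one more edge, so the traversal keeps the same shape. Here is a small plain-JavaScript sketch of that idea; the film `</en/example_film>`, the two director nodes and the `out` helper are all hypothetical and do not come from the dataset.

```javascript
// Toy triple list with a made-up film that has two directors.
const triples = [
  ["</en/example_film>", "</film/film/directed_by>", "</en/director_a>"],
  ["</en/example_film>", "</film/film/directed_by>", "</en/director_b>"],
  ["</en/director_a>", "<name>", "Director A"],
  ["</en/director_b>", "<name>", "Director B"],
];

// Follow edges labelled `predicate` from every node in `nodes`.
const out = (nodes, predicate) =>
  triples
    .filter(([s, p]) => nodes.includes(s) && p === predicate)
    .map(([, , o]) => o);

// Same two hops as Out("</film/film/directed_by>").Out("<name>").
const directors = out(["</en/example_film>"], "</film/film/directed_by>");
const directorNames = out(directors, "<name>");
console.log(directorNames); // [ 'Director A', 'Director B' ]
```

No extra table or change to the query is needed when a film gains another director; only another edge is added.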

 
We will go a little deeper in the next blog post.
