Dgraph - Part 4: Creating a schema

In the first part of this series I mentioned that Dgraph works without a schema. In part 3 we adjusted the schema to make a field searchable. There are further reasons to deal with schemas in Dgraph, because some functions do not work without one. An example is the expand() function, which returns all fields of a data record without you having to request them explicitly. For this to work, we first have to define a schema in Dgraph.

Define a schema and data types

The schema definition in Dgraph is relatively simple. With the keyword type we tell Dgraph that a type definition follows. Within the definition, we list the data fields (predicates) that make up the type.

After the type definition, we can define the individual fields more precisely.

type Person {
    name
    hometown
    friend_of
}

name: string @index(hash) .
hometown: string .
friend_of: [uid] .

Here we define the field name as type string and tell Dgraph to index it with a hash index. For the field friend_of we use [uid] to indicate that it holds connections to other data records.
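If the schema is later sent to Dgraph from a program, it can help to generate the text from a small description of the type. The following is a minimal Go sketch that uses only the standard library. Predicate and RenderSchema are hypothetical names chosen for illustration and are not part of any Dgraph client.

```go
package main

import (
	"fmt"
	"strings"
)

// Predicate describes one field of a type. The struct is an illustrative
// assumption, not a Dgraph API. Directive is optional, e.g. "@index(hash)".
type Predicate struct {
	Name      string
	Type      string // e.g. "string" or "[uid]"
	Directive string
}

// RenderSchema returns the type block followed by one definition line
// per predicate, in the same layout as the schema shown above.
func RenderSchema(typeName string, preds []Predicate) string {
	var b strings.Builder
	fmt.Fprintf(&b, "type %s {\n", typeName)
	for _, p := range preds {
		fmt.Fprintf(&b, "    %s\n", p.Name)
	}
	b.WriteString("}\n\n")
	for _, p := range preds {
		line := p.Name + ": " + p.Type
		if p.Directive != "" {
			line += " " + p.Directive
		}
		b.WriteString(line + " .\n")
	}
	return b.String()
}

func main() {
	fmt.Print(RenderSchema("Person", []Predicate{
		{Name: "name", Type: "string", Directive: "@index(hash)"},
		{Name: "hometown", Type: "string"},
		{Name: "friend_of", Type: "[uid]"},
	}))
}
```

The sketch does not validate names or types. It only keeps the type block and the predicate definitions in sync, so a field cannot be listed in one place and forgotten in the other.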

Dgraph does not assign node types automatically: a node that happens to have all the fields of the Person type is not treated as a Person for that reason alone. We therefore have to tell Dgraph explicitly when a record is a person:

{
  set{
    _:Peter <name> "Peter Parker" .
    _:Peter <hometown> "New York" .
    _:Peter <dgraph.type> "Person" .
    _:Mary <name> "Mary Jane Watson" .
    _:Mary <hometown> "New York" .
    _:Mary <friend_of> _:Peter .
    _:Mary <dgraph.type> "Person" .
    _:Harry <name> "Harry Osborne" .
    _:Harry <hometown> "New York" .
    _:Harry <friend_of> _:Peter .
    _:Harry <friend_of> _:Mary .
    _:Harry <dgraph.type> "Person" .
  }
}

The N-Quad _:Peter <dgraph.type> "Person" . for example assigns the type Person to the node for Peter Parker. Of course, this also works in JSON format:

{
  "set": [
    {
      "name": "Peter Parker",
      "hometown": "New York",
      "dgraph.type": "Person"
    },
    {
      "name": "Mary Jane Watson",
      "hometown": "New York",
      "dgraph.type": "Person"
    }
  ]
}
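When a JSON mutation like this is produced by a program instead of being typed by hand, the type assignment is easy to forget. The following Go sketch, using only the standard library, builds the same payload and sets dgraph.type in one place. Person, NewPerson and BuildSetMutation are hypothetical names for illustration. How the payload is then sent to Dgraph is not shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Person mirrors the Person type from the schema. The struct is an
// illustrative assumption, not part of any Dgraph client library.
type Person struct {
	Name     string `json:"name"`
	Hometown string `json:"hometown"`
	Type     string `json:"dgraph.type"`
}

// setMutation wraps records in the {"set": [...]} envelope used above.
type setMutation struct {
	Set []Person `json:"set"`
}

// NewPerson returns a record that already carries the Person type,
// so the dgraph.type assignment cannot be left out by accident.
func NewPerson(name, hometown string) Person {
	return Person{Name: name, Hometown: hometown, Type: "Person"}
}

// BuildSetMutation serializes the given people into a JSON mutation body.
func BuildSetMutation(people ...Person) (string, error) {
	body, err := json.Marshal(setMutation{Set: people})
	if err != nil {
		return "", fmt.Errorf("marshal mutation: %w", err)
	}
	return string(body), nil
}

func main() {
	body, err := BuildSetMutation(
		NewPerson("Peter Parker", "New York"),
		NewPerson("Mary Jane Watson", "New York"),
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
}
```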

A couple of functions that result from the schema definition

Thanks to the schema definition, we now have access to a few standard functions that are not available without a schema. For example, we can now search specifically for nodes of the type Person:

{
  persons(func: type(Person)){
    uid
    name
  }
}

This makes queries more targeted, since you usually work with very concrete data structures. In an ordering system, for example, you have orders, customers, and so on. You define each of these as its own type and can then address it specifically in queries.

Another useful function is expand:

{
  persons(func: type(Person)){
    expand(_all_) {
      expand(_all_)
    }
  }
}

With expand(_all_), all fields that belong to the type of a data record are returned and do not have to be specified individually. The nested expand(_all_) does the same for the records reached through connections such as friend_of.

I've read many times that schemaless is touted as an advantage of databases. That is basically true, but in the end you have to think about the structure of your data one way or another in every project. In the case of Dgraph, I think it makes perfect sense to work with a schema. A schema also does not take away the flexibility in Dgraph. Even if a data record has been created with a type, you can still add any fields that are not included in the type definition. You just have to be aware that expand does not return these fields, so you have to request them explicitly in a query:

{
  persons(func: type(Person)){
    fieldname
    expand(_all_) {
      expand(_all_)
    }
  }
}
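A query of this shape can also be assembled in code, which keeps the explicitly requested extra fields next to the expand call. The following is a minimal Go sketch using only the standard library. BuildTypeQuery, the block name nodes and the field nickname are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// BuildTypeQuery returns a query that selects all nodes of typeName,
// requests extraFields explicitly, and expands the fields of the type
// definition on the node and on the nodes it links to.
// Names are inserted as given, so only pass trusted values.
func BuildTypeQuery(typeName string, extraFields ...string) string {
	var b strings.Builder
	fmt.Fprintf(&b, "{\n  nodes(func: type(%s)) {\n", typeName)
	for _, field := range extraFields {
		fmt.Fprintf(&b, "    %s\n", field)
	}
	b.WriteString("    expand(_all_) {\n      expand(_all_)\n    }\n  }\n}")
	return b.String()
}

func main() {
	// "nickname" stands for a field stored on the node but missing
	// from the Person type definition.
	fmt.Println(BuildTypeQuery("Person", "nickname"))
}
```

The function only produces the query text. Because type and field names are inserted without escaping, it is meant for names that come from your own code rather than from user input.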

I will probably stop at this point with the basics and show in the next few posts how to interact with Dgraph using the Go client.
