Hello World

Setting up the data model

For this example, we need a very simple data model. You can insert it into your CaosDB instance by saving the following to a file called model.yml:

HelloWorld:
  recommended_properties:
    time:
      datatype: DATETIME
    note:
      datatype: TEXT

and insert the model using

python -m caosadvancedtools.models.parser model.yml --sync

Let’s look first at how the CaosDB Crawler synchronizes Records that are created locally with those that might already exist on the CaosDB server.

For this you need a file called identifiables.yml with this content:

HelloWorld:
  - name

Synchronizing data

You can run the following interactively in (I)Python, but we recommend copying the code into a script and executing it to save yourself some typing.

import caosdb as db
from datetime import datetime
from caoscrawler import Crawler, SecurityMode
from caoscrawler.identifiable_adapters import CaosDBIdentifiableAdapter


# Create a Record that will be synced
hello_rec = db.Record(name="My first Record")
hello_rec.add_parent("HelloWorld")
hello_rec.add_property(name="time", value=datetime.now().isoformat())

# Create a Crawler instance that we will use for synchronization
crawler = Crawler(securityMode=SecurityMode.UPDATE)
# This defines how Records on the server are identified with the ones we have locally
identifiables_definition_file = "identifiables.yml"
ident = CaosDBIdentifiableAdapter()
ident.load_from_yaml_definition(identifiables_definition_file)
crawler.identifiableAdapter = ident

# Here we synchronize the Record
inserts, updates = crawler.synchronize(commit_changes=True, unique_names=True,
                                       crawled_data=[hello_rec])
print(f"Inserted {len(inserts)} Records")
print(f"Updated {len(updates)} Records")

Now, start by executing the code. What happens? The output suggests that one entity was inserted. Please go to the web interface of your instance and have a look. You can use the query FIND HelloWorld. You should see a brand new Record with the current timestamp.

So, how did this happen? In our script, we created a “HelloWorld” Record and gave it to the Crawler. The Crawler checks how “HelloWorld” Records are identified. With our identifiables.yml, we told the Crawler to use the name. The Crawler therefore checked whether a “HelloWorld” Record with that name already exists on the server. It did not, so the Record that we provided was inserted into the server.
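The decision described above can be illustrated with a small toy model. This is only a sketch and not the Crawler's actual implementation: the "server" is a plain dictionary, Records are plain dictionaries, and the names IDENTIFIABLES, identity_key and synchronize are hypothetical helpers invented for this illustration.

```python
# Toy model of identifiable-based synchronization.
# Assumption: this is NOT the Crawler's code; the server is just a dict.

# Mirrors identifiables.yml: HelloWorld Records are identified by their name.
IDENTIFIABLES = {"HelloWorld": ["name"]}


def identity_key(record):
    """Build a lookup key from the attributes listed as identifiables."""
    attrs = IDENTIFIABLES[record["parent"]]
    return (record["parent"],) + tuple(record[a] for a in attrs)


def synchronize(server, records):
    """Insert Records with an unknown key, update changed ones.

    Returns the lists (inserts, updates).
    """
    inserts, updates = [], []
    for rec in records:
        key = identity_key(rec)
        if key not in server:
            # No Record with this identity exists yet: insert it.
            server[key] = dict(rec)
            inserts.append(rec)
        elif server[key] != rec:
            # A Record with this identity exists but differs: update it.
            server[key].update(rec)
            updates.append(rec)
    return inserts, updates


server = {}  # an empty toy server
rec = {"parent": "HelloWorld", "name": "My first Record",
       "time": "2024-01-01T10:00:00"}
first_inserts, first_updates = synchronize(server, [rec])
print(f"Inserted {len(first_inserts)} Records")
print(f"Updated {len(first_updates)} Records")
```

Because the toy server starts out empty, the lookup by name finds nothing and the Record ends up in the list of inserts, just as in the first run of the real script.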

Running the synchronization again

Now, run the script again. What happens? There is an update! This time, a Record with the required name existed. Thus the “time” Property of the existing Record was updated.

The Crawler does not touch Properties that are not present in the local data. Thus, if you add a “note” Property to the Record on the server (e.g. with the edit mode in the web interface) and run the script again, this Property remains unchanged. This means that you can extend Records that were created using the Crawler.
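The following sketch illustrates this behaviour with plain dictionaries. It is a simplified model under stated assumptions, not the Crawler's real update logic: the function apply_update and the example values are hypothetical and exist only for this illustration.

```python
# Toy model of an update that leaves server-only Properties untouched.
# Assumption: Records are modelled as plain dicts; apply_update is hypothetical.


def apply_update(server_record, local_record):
    """Overwrite only the Properties present locally and keep all others."""
    merged = dict(server_record)   # start from what the server has
    merged.update(local_record)    # overwrite with the local values
    return merged


# State on the server after someone added a "note" in the web interface
on_server = {"name": "My first Record",
             "time": "2024-01-01T10:00:00",
             "note": "added in the web interface"}
# Local Record created by the script: it has no "note" Property
local = {"name": "My first Record", "time": "2024-01-02T09:30:00"}

result = apply_update(on_server, local)
print(result["time"])
print(result["note"])
```

In this model, the "time" value is replaced by the local one, while "note" survives because the local Record says nothing about it.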

Note that if you change the name of the “HelloWorld” Record in the script and run it again, the Crawler inserts a new Record. This is because in the identifiables.yml we told the Crawler to use the name to check whether a “HelloWorld” Record already exists on the server.

So far, you have seen how the Crawler handles synchronization in a very simple scenario. In the following tutorials, you will learn what this looks like when multiple connected Records are involved that cannot simply be identified by name. Also, we created the Record “manually” in this example, whereas the typical use case is to create it automatically from files or directories. How this is done will also be shown in the following chapters.