Data Insertion

Data Models

Data is stored and structured in CaosDB using a concept of RecordTypes, Properties, Records etc. If you do not know what these are, please look at the chapter Data Model in the CaosDB server documentation.

To insert actual data, we first need a data model made of RecordTypes and Properties (you may skip this step if your CaosDB instance already has the required types). When you create a new Property, you must supply a datatype. So, let’s create a simple Property called “a” of datatype double. This is very easy in pylib:

import caosdb as db

a = db.Property(name="a", datatype=db.DOUBLE)

There are a few basic datatypes like db.INTEGER, db.DOUBLE, or db.TEXT. See the data types for a full list.

We can create our own small data model for e.g. a simulation by adding two more Properties and a RecordType:

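b = db.Property(name="b", datatype=db.DOUBLE)
epsilon = db.Property(name="epsilon", datatype=db.DOUBLE)
recordtype = db.RecordType(name="BarkleySimulation")
# Attach the three Properties to the RecordType
recordtype.add_property(a)
recordtype.add_property(b)
recordtype.add_property(epsilon)
container = db.Container()
container.extend([a, b, epsilon, recordtype])
container.insert()  # Insert the whole data model at once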

Inheritance of Properties

Suppose you want to create a new RecordType “2D_BarkleySimulation” that denotes spatially extended Barkley simulations. This is a subtype of the “BarkleySimulation” RecordType above and should have all its parameters, i.e., properties. It may be assigned more, e.g., spatial resolution, but we’ll omit this for the sake of brevity for now.

rt = db.RecordType(name="2D_BarkleySimulation",
                   description="Spatially extended Barkley simulation")
# inherit all properties from the BarkleySimulation RecordType
rt.add_parent(name="BarkleySimulation", inheritance="all")
rt.insert()

# rt now has an "epsilon" property with the same importance as in "BarkleySimulation"
print(rt.get_property(name="epsilon").importance)

The parameter inheritance=(obligatory|recommended|fix|all|none) of Entity.add_parent() tells the server to assign all properties of the parent RecordType with the chosen importance (and properties with a higher importance) to the child RecordType automatically upon insertion. See the chapter on importance in the documentation of the CaosDB server for more information on the importance and inheritance of properties.


The inherited properties will only be visible after the insertion since they are set by the CaosDB server, not by the Python client.
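
The effect of the inheritance parameter can be pictured with a small toy model in plain Python. This is only a sketch, not the server's implementation: it assumes the importance ordering obligatory > recommended > suggested, leaves out the fix level, and uses made-up importances for a, b, and epsilon.

```python
# Toy model of the `inheritance` parameter; NOT the CaosDB server code.
# Assumption: importances are ordered obligatory > recommended > suggested.
IMPORTANCE_RANK = {"obligatory": 2, "recommended": 1, "suggested": 0}


def inherited_properties(parent_properties, inheritance):
    """Return the names of the parent properties a child would inherit.

    parent_properties maps a property name to its importance.
    """
    if inheritance == "none":
        return []
    if inheritance == "all":
        return sorted(parent_properties)
    threshold = IMPORTANCE_RANK[inheritance]
    return sorted(name for name, importance in parent_properties.items()
                  if IMPORTANCE_RANK[importance] >= threshold)


# Hypothetical importances for the BarkleySimulation properties.
barkley = {"a": "obligatory", "b": "recommended", "epsilon": "suggested"}

print(inherited_properties(barkley, "obligatory"))   # ['a']
print(inherited_properties(barkley, "recommended"))  # ['a', 'b']
print(inherited_properties(barkley, "all"))          # ['a', 'b', 'epsilon']
print(inherited_properties(barkley, "none"))         # []
```

In this model, a lower inheritance level pulls in more properties, which matches the description above: the chosen importance plus everything more important.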

Insert Actual Data

Suppose the RecordType “Experiment” and the Property “date” exist in the database. You can then create individual Records by using the corresponding Python class:

rec = db.Record() # rec.id is None
rec.add_parent(name="Experiment")
rec.add_property(name="date", value="2020-01-07")
rec.insert()
print(rec.id) # rec.id set by the server

Here, the record has a parent, the RecordType “Experiment”, and a Property date with a value "2020-01-07". After the successful insertion, our new Record is assigned an id by the server. In the following, let’s assume this id to be 256.

Reference Properties

Now suppose we want to insert an analysis that references the above experiment record as its source data. Since we know that the id of the experiment record is 256, we can do the following:

ana = db.Record().add_parent(name="Analysis") # Create record and assign parent in one line
ana.add_property(name="Experiment", value=256)
ana.add_property(name="date", value="2020-01-08")
# possibly, add more properties here ...

The experiment record’s id is used as the value of the Experiment property of the analysis Record (note how we use the RecordType Experiment as a REFERENCE property here). Sending a CaosDB query like FIND RECORD Experiment WHICH IS REFERENCED BY A Analysis WITH date=2020-01-08 would now return our original experiment record.

Equivalently, we can also use the Python object of the experiment record, i.e., rec as the value of the Experiment property:

ana = db.Record().add_parent(name="Analysis")
ana.add_property(name="Experiment", value=rec)
ana.add_property(name="date", value="2020-01-08")
# possibly, add more properties here ...

Finally, we can also insert both records at the same time using a db.Container:

rec = db.Record()
rec.add_parent(name="Experiment")
rec.add_property(name="date", value="2020-01-07")
ana = db.Record().add_parent(name="Analysis")
ana.add_property(name="Experiment", value=rec)
ana.add_property(name="date", value="2020-01-08")

cont = db.Container().extend([rec, ana]) # Add experiment and analysis
                                         # records to our container
cont.insert() # Insert both at the same time, the CaosDB server will
              # resolve the reference upon insertion.

All three ways result in an Analysis record which references an Experiment record.


Instead of using the Experiment RecordType as a REFERENCE property, we can also create an actual property with data type Experiment: db.Property(name="source", datatype="Experiment"). Now you can add this property to the analysis record with the experiment record as a value as explained above. As a rule of thumb, using a separate property for these references is meaningful whenever you want to highlight that, e.g., this particular experiment provided the source data for your analysis (as opposed to another experiment that was used for validation).

Advanced insertions

Most Records do not have a name, but it can absolutely make sense to assign one. In that case, use the name argument when creating the Record. Another useful feature is that properties can have units:

rec = db.Record("DeviceNo-AB110")
rec.add_property(name="weight", value="1749", unit="kg")

When you produce many Records at once, for example during an analysis, you can insert them in batch mode with a container. Suppose you have a Python list analysis_results of (date, result) pairs:

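cont = db.Container()
for date, result in analysis_results:
    rec = db.Record()
    rec.add_property(name="date", value=date)
    rec.add_property(name="result", value=result)
    cont.append(rec)  # collect the record in the container

cont.insert()  # insert all collected records in one request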
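
The loop above unpacks every element of analysis_results into a date and a result, so the list has to hold pairs. The following sketch shows one way such a list might be built. The raw measurements and the use of a mean as the "result" are hypothetical and only serve as an illustration.

```python
from datetime import date
from statistics import mean

# Hypothetical raw measurements, keyed by the day they were taken.
raw_measurements = {
    date(2020, 1, 7): [1.0, 2.0, 3.0],
    date(2020, 1, 8): [2.5, 3.5],
}

# One (date, result) pair per day; dates become ISO strings like "2020-01-07".
analysis_results = [
    (day.isoformat(), mean(values))
    for day, values in sorted(raw_measurements.items())
]

for day, result in analysis_results:
    print(day, result)
```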


It may also be useful to know that you can insert tabular data directly.

from caosadvancedtools.table_converter import from_tsv

recs = from_tsv("test.csv", "Experiment")

Try it yourself with this example file test.csv!
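
If the example file is not at hand, a small table can be generated with the Python standard library. This sketch assumes, as the function name from_tsv suggests, that the file is tab-separated and that the header row names the Properties. The columns date and result and their values are made up for illustration.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical columns and values; the real example file may differ.
rows = [
    {"date": "2020-01-07", "result": "2.0"},
    {"date": "2020-01-08", "result": "3.0"},
]

# Written to a temporary directory to avoid clobbering an existing test.csv.
table_path = Path(tempfile.mkdtemp()) / "test.csv"
with open(table_path, "w", newline="") as handle:
    writer = csv.DictWriter(handle, fieldnames=["date", "result"],
                            delimiter="\t")
    writer.writeheader()
    writer.writerows(rows)

print(table_path.read_text())
```

The path of the generated file can then be passed to from_tsv in place of "test.csv".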

List Properties

As you may already know, properties can also have list values instead of scalar values. They can be accessed, set, and updated as you would expect from any list-valued attribute in Python, as the following example illustrates.

import caosdb as db
db.Property(name="TestList", datatype=db.LIST(db.DOUBLE)).insert()
db.Record(name="TestRec").add_property(
    name="TestList", value=[1,2,3]).insert()
retrieved = db.Record(name="TestRec").retrieve()
retrieved.get_property("TestList").value += [4,5]
retrieved.update()

# Check update
retrieved = db.Record(name="TestRec").retrieve()
print(retrieved.get_property("TestList").value)


Properties of Entities that shall be updated need to have IDs. Let’s look at an example:

experiment = db.Record(id=1111).retrieve()
experiment.add_property(name='date', value="2020-01-01")
experiment.update()  # Fails! The 'date' Property needs to have an ID.

The easiest way to get around this is to use the corresponding entity getter:

experiment = db.Record(id=1111).retrieve()
experiment.add_property(db.get_entity_by_name('date'), value="2020-01-01")
experiment.update()  # Works!

There are also the functions get_entity_by_path and get_entity_by_id. You can easily use cached versions of those functions (see caching options).
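
The idea behind a cached getter can be sketched with functools.lru_cache from the standard library. This is not the library's own caching facility: lookup_by_name is a hypothetical stand-in for a server lookup such as get_entity_by_name, and it merely records its calls so that the effect of the cache is visible.

```python
from functools import lru_cache

# Hypothetical stand-in for a server lookup; it records every call.
calls = []


def lookup_by_name(name):
    calls.append(name)
    return {"name": name, "id": 100 + len(calls)}


@lru_cache(maxsize=None)
def cached_lookup_by_name(name):
    """Ask the 'server' only once per name and reuse the answer afterwards."""
    return lookup_by_name(name)


first = cached_lookup_by_name("date")
second = cached_lookup_by_name("date")  # served from the cache

print(first is second)  # True
print(len(calls))       # 1
```

A cache of this kind saves round trips when the same Property is added to many Records, but cached entities can become stale if they change on the server in the meantime.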

File Update

An existing file can be updated by uploading a new version:

  1. Retrieve the file record of interest, e.g. by ID:

import caosdb as db

file_upd = db.File(id=174).retrieve()

  2. Set the new local file path. The remote file path is stored in the file object as file_upd.path while the local path can be found in file_upd.file.

file_upd.file = "./supplements.pdf"

  3. Update the file: