YAML data model specification

The caosadvancedtools library lets you create and update CaosDB data models from a simplified definition in YAML format.

Let’s start with an example taken from model.yml in the library sources.

Project:
   obligatory_properties:
      projectId:
         datatype: INTEGER
         description: 'UID of this project'
Person:
   recommended_properties:
      firstName:
         datatype: TEXT
         description: 'first name'
      lastName:
         datatype: TEXT
         description: 'last name'
LabbookEntry:
   recommended_properties:
      Project:
      entryId:
         datatype: INTEGER
         description: 'UID of this entry'
      responsible:
         datatype: Person
         description: 'the person responsible for these notes'
      textElement:
         datatype: TEXT
         description: 'a text element of a labbook recording'
      associatedFile:
         datatype: FILE
         description: 'A file associated with this recording'
      table:
         datatype: FILE
         description: 'A table document associated with this recording'
extern:
   - Textfile

This example defines three RecordTypes and references one external entity:

  • A Project with one obligatory property, projectId, of datatype INTEGER

  • A Person with a firstName and a lastName (as recommended properties)

  • A LabbookEntry with multiple recommended properties of different data types

  • Textfile is listed under extern: it is assumed that the server already knows a RecordType or Property with this name.

One major advantage of using this interface (in contrast to the standard Python interface) is that properties can be defined and added to record types “on-the-fly”. For example, the three lines for firstName as sub-entries of Person have two effects on CaosDB:

  • A new property with name firstName, datatype TEXT and description first name is inserted (or updated, if already present) into CaosDB.

  • The new property is added as a recommended property to record type Person.

Any further occurrences of firstName in the YAML file will reuse the definition provided for Person.
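The two effects described above can be pictured with a small stand-alone sketch. It does not use caosadvancedtools and is not how the library is implemented. The function name collect_definitions and the plain-dict input (roughly what a YAML loader would produce for part of the example) are assumptions made for illustration only.

```python
# Illustrative sketch only; not the caosadvancedtools implementation.
IMPORTANCE_KEYS = {
    "obligatory_properties": "obligatory",
    "recommended_properties": "recommended",
    "suggested_properties": "suggested",
}


def collect_definitions(model):
    """Return (properties, record_types) gathered from a parsed model dict.

    properties maps a property name to its first non-empty definition;
    record_types maps a record type name to {property name: importance}.
    """
    properties = {}
    record_types = {}
    for rt_name, rt_body in model.items():
        if rt_name == "extern":
            continue  # names listed here are assumed to exist on the server
        attached = record_types.setdefault(rt_name, {})
        for key, importance in IMPORTANCE_KEYS.items():
            for prop_name, definition in ((rt_body or {}).get(key) or {}).items():
                # Effect 1: define the property once, reuse it afterwards.
                if definition and prop_name not in properties:
                    properties[prop_name] = definition
                # Effect 2: attach the property to the record type.
                attached[prop_name] = importance
    return properties, record_types


# Hypothetical input: Person defines firstName, LabbookEntry only names it.
model = {
    "Person": {
        "recommended_properties": {
            "firstName": {"datatype": "TEXT", "description": "first name"},
        },
    },
    "LabbookEntry": {
        "recommended_properties": {
            "firstName": None,  # reuses the definition given for Person
        },
    },
}

properties, record_types = collect_definitions(model)
print(properties["firstName"]["datatype"])  # TEXT
print(record_types["LabbookEntry"])  # {'firstName': 'recommended'}
```

In this sketch the second occurrence of firstName carries no attributes, yet it is still attached to LabbookEntry with the definition recorded for Person.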

Note the difference between the following three property declarations:

  • Project: This record type is added directly as a property of LabbookEntry. Therefore it does not specify any further attributes. Compare to the original declaration of record type Project.

  • responsible: This defines and adds a property with name “responsible” to LabbookEntry, which has the datatype Person. Person is defined above.

  • firstName: This defines and adds a property with the standard data type TEXT to record type Person.

If the data model depends on record types or properties which already exist in CaosDB, those can be added using the extern keyword: extern takes a list of previously defined names.

Datatypes

You can use any data type understood by CaosDB as the datatype attribute in the YAML model.

List datatypes use a special syntax:

datatype: LIST<DOUBLE>

would declare a list datatype of DOUBLE elements.

datatype: LIST<Project>

would declare a list of elements with datatype Project.
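To make the LIST<...> notation concrete, the following minimal sketch splits a datatype string into a list flag and the element type. The helper name split_datatype and the regular expression are assumptions for illustration. The library's own parsing may work differently.

```python
import re

# Illustrative sketch only; the library's own parsing may differ.
_LIST_PATTERN = re.compile(r"^LIST<(?P<element>[^<>]+)>$")


def split_datatype(datatype):
    """Return (is_list, element_type) for a datatype string."""
    text = datatype.strip()
    match = _LIST_PATTERN.match(text)
    if match:
        return True, match.group("element").strip()
    return False, text


print(split_datatype("LIST<DOUBLE>"))   # (True, 'DOUBLE')
print(split_datatype("LIST<Project>"))  # (True, 'Project')
print(split_datatype("TEXT"))           # (False, 'TEXT')
```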

Keywords

  • importance: Importance of this entity. Possible values: “recommended”, “obligatory”, “suggested”

  • datatype: The datatype of this property, e.g. TEXT, INTEGER or Project.

  • unit: The unit of the property, e.g. “m/s”.

  • description: A description for this entity.

  • recommended_properties: Add properties to this entity with importance “recommended”.

  • obligatory_properties: Add properties to this entity with importance “obligatory”.

  • suggested_properties: Add properties to this entity with importance “suggested”.

  • inherit_from_XXX: This keyword accepts a list of other RecordTypes, where XXX is one of obligatory, recommended or suggested. Those RecordTypes are added as parents, and all Properties with at least the importance XXX are inherited. For example, inherit_from_recommended will inherit all Properties of importance recommended and obligatory, but not suggested.

  • parent: Parent of this entity. Same as inherit_from_obligatory. (Deprecated)
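The “at least the importance XXX” rule of inherit_from_XXX can be sketched as a simple ranking. This is only an illustration of the rule as stated above, not the library's code. The names RANK and inherited_properties, as well as the parent's property names, are hypothetical.

```python
# Illustrative sketch only; not the caosadvancedtools implementation.
# Lower rank means higher importance.
RANK = {"obligatory": 0, "recommended": 1, "suggested": 2}


def inherited_properties(parent_properties, level):
    """Return the names of parent properties with at least importance `level`."""
    threshold = RANK[level]
    return [
        name
        for name, importance in parent_properties.items()
        if RANK[importance] <= threshold
    ]


# Hypothetical parent record type with one property per importance level.
parent = {"projectId": "obligatory", "title": "recommended", "comment": "suggested"}

# Corresponds to inherit_from_recommended: obligatory and recommended only.
print(inherited_properties(parent, "recommended"))  # ['projectId', 'title']
```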

Usage

You can use the YAML parser directly in Python as follows:

from caosadvancedtools.models import parser

# Parse the YAML definition into a DataModel object.
model = parser.parse_model_from_yaml("model.yml")

This creates a DataModel object containing all entities defined in the YAML file.

You can then use the methods of caosadvancedtools.models.data_model.DataModel to synchronize the model with a CaosDB instance, e.g.:

model.sync_data_model()