This document describes the SQL tables used by CaosDB and how they represent the internal data structure.

Tables

entities

All entities (RecordTypes, Records, Properties, …) have an entry here. The columns are:

  • id :: The (unique) ID of this entity.

  • name :: Name of the entity.

  • description :: A description of the entity.

  • role :: The role of the entity (e.g. data type, RecordType, Property, Record, …).

  • acl :: Access control ID.

isa

Inheritance as in “A is a B” is stored in this table. The columns:

  • child :: The child entity.

  • parent :: The parent entity.

  • type :: The inheritance type (either INHERITANCE or SUBTYPING (what’s “subtyping”?)).

TODO Why is this table empty on the default example?

isa_cache

Here, information about nested inheritance is cached (upon creation?).

  • child :: The child entity.

  • parent :: The parent entity.

  • rpath :: Possible intermediate steps on a path from child to parent. See the Inheritance section for more details.

foo_data

Property data is stored here, in tables named after the data type:

  • date_data

  • datetime_data

  • double_data

  • integer_data

  • text_data

  • enum_data

  • name_data

  • null_data

  • reference_data

These tables share (generally) the same set of columns:

  • domain_id :: The domain of an item, see the multipurpose subdomains section. For “normal” data, this is 0.

  • entity_id :: The entity to which this property belongs. In most cases, this is a “normal” entity like a Record or RecordType.

  • property_id :: In most cases, the property to which the value belongs.

  • value :: The value of the property.

  • status :: The importance of the property (?).

  • pidx :: The property index, becomes nonzero when a property occurs more than once in an entity.

name_data

This table is used to name RecordTypoes, Records, Properties etc. The column property_id has the value 20, because that is the entity ID for names:

> SELECT * FROM entities WHERE id=20;
+----+-------------------+----------+------+
| id | description       | role     | acl  |
+----+-------------------+----------+------+
| 20 | Name of an entity | PROPERTY |    0 |
+----+-------------------+----------+------+

data_type

The type of properties is stored here. The columns are:

  • domain_id :: Property identitification, same as for plain data?

  • entity_id :: Property identitification, same as for plain data?

  • property_id :: Property identitification, same as for plain data?

  • datatype :: The data type of the property, a reference to an entity.

files

  • file_id :: ID.

  • path :: (Relative) path to the file.

  • size :: Size in bytes.

  • hash :: Hash of the file contents, as binary. TODO Which algorithm?

  • checked_timestamp :: Timestamp when last checked?

feature_config

+——————-+———+ | _key | _value | +——————-+———+ | ENTITY_VERSIONING | ENABLED | +——————-+———+

A key-value store for features. Use the is_feature_config procedure to check for the status of specific features.

Versioning tables

There are a few extra tables for implementing versioning. Also there is a special column value_iversion in the reference_data table which stores the version of the referenced entity at the time of the version. If value_iversion is NULL, no specific version is stored.

The _iversion is an incremental version counter, starting implicitly at 1. Previous versions of entities are stored, along with their _iversion, in the archive_* tables.

transactions

Replacement for transaction_log, holds ID, user and time of transaction:

  • srid :: Server request ID, used to identify transactions

  • username :: User name

  • realm :: Realm for which the user name is valid

  • seconds :: Time of transaction: seconds

  • nanos :: Time of transaction: sub-second time resolution

entity_version

Versioning info for entities:

  • entity_id :: persistent ID of the entity

  • hash :: Hash, for future use.

  • version :: External version string, may be globally unique. Should be used by API calls.

  • _iversion :: Version ID for this entity for internal use, typically an incremental counter, starting with 1

  • _ipparent :: Primary parent ID (internal) for this version, i.e. predecessor

  • srid :: Server request / transaction ID which created this version

archive_footable

Older (i.e. not current) data, from previous versions of an entity which were stored in footable.

  • domain_id :: Same as in footable.

  • entity_id :: Same as in footable.

  • property_id :: Same as in footable.

  • value :: The value at the given version.

  • status :: The value at the given version.

  • pidx :: Same as in footable.

  • _iversion :: Version index of this version.

  • … :: Data type specific columns may also be there.

TODO

  • entity_acl

  • groups

  • logging

  • desc_overrides

  • name_overrides

  • passwd

  • passwords

  • permissions

  • query_template_def :: User-defined query templates?

  • roles :: possible user roles

  • stats :: benchmarking?

  • transaction_log :: benchmarking?

  • units_lin_con :: fraction and unit conversion?

  • user_info :: Metadata for (local?) user accounts

  • user_roles :: User permissions?

Inheritance caching

The isa_cache table does not only store direct inheritances, but also nested parentships. This is done by adding intermediate inheritance steps in the rpath column:

  • For direct inheritance, the rpath column contains the child itself.

  • For all other possible paths between the child and parent, it contains the entities between, separated by the > character.

Example

Consider this inheritance structure and the resulting table:

   0    Read this from top to bottom: "1 is a 0",
  / \   "2 is a 0", and so on.
 1   2
  \ /
   3
   |
   4
   |
   5

|——-|——–|——-| | child | parent | rpath | |——-|——–|——-| | 1 | 0 | 1 | | 2 | 0 | 2 | | 3 | 0 | 1 | | 3 | 0 | 2 | | 3 | 1 | 3 | | 3 | 2 | 3 | | 4 | 0 | 3>1 | | 4 | 0 | 3>2 | | 4 | 1 | 3 | | 4 | 2 | 3 | | 4 | 3 | 4 | | 5 | 0 | 4>3>1 | | 5 | 0 | 4>3>2 | | 5 | 1 | 4>3 | | 5 | 2 | 4>3 | | 5 | 3 | 4 | | 5 | 4 | 5 | |——-|——–|——-|

Multipurpose subdomains

Multipurpose subdomains are generically used to work with composite properties. Currently implemented examples are:

  • Properties with units (if multiple Properties exist)

  • Lists

  • Name overrides

Example

Let’s have a look at this Record (simplified XML):

<R1>
    <P1 name="Comment">Hello World</P1>
    <P2 name="voltage" unit="V">
        23
    </P2>
    <P3 comment="list of something">
        V1, V2, V3, V4, ...
    </P3>
</R1>