Complex Data Models

With LinkAhead it is possible to create very complex data models.

For example, properties can themselves carry properties, which helps to model complex relations in data management workflows.

One use case is metadata attached to very specific properties of a dataset: data privacy information, for instance, can be added to properties that could themselves already be considered metadata of that dataset.

The example below covers some of these complex cases:

Examples

import linkahead as db

# Create two record types with descriptions:
rt1 = db.RecordType(name="TypeA", description="The first type")
rt2 = db.RecordType(name="TypeB", description="The second type")

# Create a record using the first record type as parent:
r1 = db.Record(name="Test_R_1", description="A record")
r1.add_parent(rt1)

# Create two files (the files named test.txt and testfile2.txt should exist in the
# current working directory):
f1 = db.File(name="Test file", path="/test.txt", file="test.txt")
f2 = db.File(name="Test file 2", path="/testfile2.txt", file="testfile2.txt")

# Create two properties with different data types:
p1 = db.Property(name="TestFileProperty", datatype=db.FILE)
p2 = db.Property(name="TestDoubleProperty", datatype=db.DOUBLE, unit="m")
p3 = db.Property(name="TestIntegerProperty", datatype=db.INTEGER, unit="s")

# Create a reference property that points to records of record type 2:
p4 = db.Property(name="TestReferenceProperty", datatype=rt2)

# Create a complex record:
r2 = db.Record(name="Test_R_2", description="A second record")
r2.add_parent(rt2)
r2.add_property(rt1, value=r1)  # this is a reference to the first record, using the first record type as property
r2.add_property(p1, value=f1)  # this is a reference to the first file
r2.add_property(p2, value=24.8)  # this is a double property with a value
r2.add_property(p3, value=1)  # this is an integer property with a value

# Very complex part of the data model:
# Case 1: File added to another file
f2.add_property(p1, value=f1)  # this adds a file property with value first file
                       # to the second file

# Case 2: Property added to a property
p2.add_property(p3, value=27)  # this adds an integer property with value 27 to the
                       # double property

# Case 3: Reference property added to a property
# The property p2 now has two sub properties, one of which points to
# record r2, which itself has the property p2. Therefore this can be
# considered a loop in the data model.
p2.add_property(p4, value=r2)  # this adds a reference property pointing to
                       # record 2 to the double property

# Insert a container containing all the newly created entities:
c = db.Container().extend([rt1, rt2, r1, r2, f1, p1, p2, p3, f2, p4])
c.insert()

# Useful for testing: wait until the user presses Enter.
# Meanwhile have a look at the WebUI: You can e.g. query "FIND ENTITY Test*"
# to view all the entities created here and see the relations and links
# between them.
input("Press Enter to clean up.")
# Clean up everything once the user has pressed Enter.
c.delete()
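
The loop described in Case 3 can be made explicit with a small sketch that does not use LinkAhead at all. It assumes a plain dictionary (the hypothetical `references`) that maps each entity name from the example to the names of the properties it carries and the entities its values point to, and a hypothetical helper `find_cycle` that performs a depth-first search. This only illustrates the structure of the example and is not how LinkAhead itself handles such loops.

```python
def find_cycle(graph):
    """Return one cycle as a list of node names, or None if there is none."""
    visiting = []   # nodes on the current search path
    done = set()    # nodes that were fully explored without finding a cycle

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            # The node is already on the current path: a loop is closed.
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for target in graph.get(node, []):
            cycle = visit(target)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in graph:
        cycle = visit(start)
        if cycle:
            return cycle
    return None


# Assumed, hand-written summary of the relations created in the example above.
references = {
    "Test_R_2": ["Test_R_1", "Test file", "TestDoubleProperty",
                 "TestIntegerProperty"],
    # p2 carries p3 and, via the reference property p4, points to r2.
    "TestDoubleProperty": ["TestIntegerProperty", "Test_R_2"],
    "Test file 2": ["Test file"],
}

cycle = find_cycle(references)
print(" -> ".join(cycle))
```

With the relations listed here, the search reports the path from Test_R_2 to TestDoubleProperty and back to Test_R_2, which is the loop mentioned in Case 3.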

Finding parents and properties

To find a specific parent or property of an Entity, its ParentList or PropertyList can be filtered using names, ids, or entities. A short example:

import linkahead as db

# Set up a record with six properties
r = db.Record()
p1_1 = db.Property(id=101, name="Property 1")
p1_2 = db.Property(name="Property 1")
p2_1 = db.Property(id=102, name="Property 2")
p2_2 = db.Property(id=102)
p2_3 = db.Property(id=102, name="Other Property")
p3 = db.Property(id=104, name="Other Property")
r.add_property(p1_1).add_property(p1_2).add_property(p2_1)
r.add_property(p2_2).add_property(p2_3).add_property(p3)
properties = r.properties

# As r only has one property with id 101, this returns a list containing only p1_1
properties.filter_by_identity(pid=101)
# Result: [p1_1]

# Filtering with name="Property 1" returns both p1_1 and p1_2, as they share their name
properties.filter_by_identity(name="Property 1")
# Result: [p1_1, p1_2]

# If both name and pid are given, matching is based only on pid for all entities that have an id
properties.filter_by_identity(pid="102", name="Other Property")
# Result: [p2_1, p2_2, p2_3]

# However, filter_by_identity with name="Property 1" and pid=101 returns both p1_1 and p1_2, because
# p1_2 does not have an id and matches the name
properties.filter_by_identity(pid="101", name="Property 1")
# Result: [p1_1, p1_2]

# We can also filter using an entity, in which case the name and id of the entity are used:
properties.filter_by_identity(pid="102", name="Property 2") == properties.filter_by_identity(p2_1)
# Result: True

# If we only need properties that match both id and name, we can set the parameter
# conjunction to True:
properties.filter_by_identity(pid="102", name="Property 2", conjunction=True)
# Result: [p2_1]

The filter function of ParentList works analogously.
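
The matching rule illustrated by the results above can be summarized in a small stand-alone model. This is a minimal sketch inferred only from those example results, not the library's implementation: the `Item` class and this re-implemented `filter_by_identity` function are hypothetical stand-ins that do not import linkahead, and the rule is assumed to be "compare ids whenever both sides have one, otherwise fall back to the name".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    """Hypothetical stand-in for an entity; only id and name matter here."""
    label: str
    id: Optional[int] = None
    name: Optional[str] = None


def filter_by_identity(items, pid=None, name=None, conjunction=False):
    """Simplified model of the filtering rule; returns the matching labels."""
    matches = []
    for item in items:
        # An id comparison is only possible if both sides have an id.
        id_comparable = pid is not None and item.id is not None
        # Compare as strings so that pid="102" matches id=102.
        id_match = id_comparable and str(item.id) == str(pid)
        name_match = name is not None and item.name == name
        if conjunction:
            keep = id_match and name_match
        else:
            # The name only decides when the ids cannot be compared.
            keep = id_match or (not id_comparable and name_match)
        if keep:
            matches.append(item.label)
    return matches


items = [
    Item("p1_1", 101, "Property 1"),
    Item("p1_2", None, "Property 1"),
    Item("p2_1", 102, "Property 2"),
    Item("p2_2", 102),
    Item("p2_3", 102, "Other Property"),
    Item("p3", 104, "Other Property"),
]

print(filter_by_identity(items, pid="102", name="Other Property"))
print(filter_by_identity(items, pid="101", name="Property 1"))
print(filter_by_identity(items, pid="102", name="Property 2", conjunction=True))
```

Under these assumptions the three calls reproduce the results listed above: p3 is excluded from the first call because its id can be compared and differs, while p1_2 is included in the second call because it has no id and its name matches.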

Finding entities in a Container

A Container can be filtered in the same way as described above. A short example:

import linkahead as db

# Set up a container with two properties
p1 = db.Property(id=101, name="Property 1")
p2 = db.Property(name="Property 2")
c = db.Container().extend([p1, p2])
c.filter_by_identity(name="Property 1")
# Result: [p1]