Data Insertion
Data Models
Data is stored and structured in LinkAhead using a concept of RecordTypes, Properties, Records etc. If you do not know what these are, please look at the chapter Data Model in the LinkAhead server documentation.
In order to insert some actual data, we need to create a data model using RecordTypes and Properties (you may skip this if you use a LinkAhead instance that already has the required types). When you create a new Property, you must supply a datatype. So, let’s create a simple Property called “a” of datatype double. This is straightforward in pylib:
import linkahead as db
a = db.Property(name="a", datatype=db.DOUBLE)
There are a few basic datatypes like db.INTEGER, db.DOUBLE, or db.TEXT. See the data types for a full list.
We can create our own small data model, e.g. for a simulation, by adding two more Properties and a RecordType:
b = db.Property(name="b", datatype=db.DOUBLE)
epsilon = db.Property(name="epsilon", datatype=db.DOUBLE)
recordtype = db.RecordType(name="BarkleySimulation")
recordtype.add_property(a)
recordtype.add_property(b)
recordtype.add_property(epsilon)
container = db.Container()
container.extend([a, b, epsilon, recordtype])
container.insert()
Inheritance of Properties
Suppose you want to create a new RecordType “2D_BarkleySimulation” that denotes spatially extended Barkley simulations. This is a subtype of the “BarkleySimulation” RecordType above and should have all its parameters, i.e., properties. It may be assigned more, e.g., spatial resolution, but we’ll omit this for the sake of brevity for now.
rt = db.RecordType(name="2D_BarkleySimulation",
                   description="Spatially extended Barkley simulation")
# inherit all properties from the BarkleySimulation RecordType
rt.add_parent(name="BarkleySimulation", inheritance="all")
rt.insert()
# rt now has an "epsilon" property with the same importance as in "BarkleySimulation"
print(rt.get_property(name="epsilon").importance)
The parameter inheritance=(obligatory|recommended|fix|all|none) of Entity.add_parent() tells the server to automatically assign all properties of the parent RecordType with the chosen importance (and properties with a higher importance) to the child RecordType upon insertion. See the chapter on importance in the documentation of the LinkAhead server for more information on the importance and inheritance of properties.
Note
The inherited properties will only be visible after the insertion since they are set by the LinkAhead server, not by the Python client.
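The rule "chosen importance and higher" can be pictured with a small pure-Python model. This is only a mental model under stated assumptions, not the server's implementation: the importance level "suggested" and the ordering of the levels are assumptions, the function name inherited_properties is hypothetical, and the fix mode is deliberately left out.

```python
# Importance levels ordered from highest to lowest (assumed ordering).
IMPORTANCE_RANK = {"obligatory": 0, "recommended": 1, "suggested": 2}


def inherited_properties(parent_properties, inheritance):
    """Return the names of the parent properties a child would receive.

    parent_properties maps property names to their importance in the parent.
    Only the modes "none", "all", "obligatory" and "recommended" are modelled.
    """
    if inheritance == "none":
        return []
    if inheritance == "all":
        return sorted(parent_properties)
    threshold = IMPORTANCE_RANK[inheritance]
    # Keep every property whose importance is at least as high as the chosen one.
    return sorted(
        name
        for name, importance in parent_properties.items()
        if IMPORTANCE_RANK[importance] <= threshold
    )


# Hypothetical importances; these are not the values from the tutorial above.
parent = {"a": "obligatory", "b": "recommended", "epsilon": "suggested"}
for mode in ("none", "obligatory", "recommended", "all"):
    print(mode, inherited_properties(parent, mode))
```

The authoritative behaviour is the one described in the server documentation; the sketch only illustrates how the threshold works.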
Insert Actual Data
Suppose the RecordType “Experiment” and the Property “date” exist in the database. You can then create single data Records by using the corresponding Python class:
rec = db.Record() # rec.id is None
rec.add_parent(name="Experiment")
rec.add_property(name="date", value="2020-01-07")
rec.insert()
print(rec.id) # rec.id set by the server
Here, the record has a parent, the RecordType “Experiment”, and a Property date with the value "2020-01-07". After the successful insertion, our new Record is assigned an id by the server. In the following, let’s assume this id to be 256.
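The date in this example is written as an ISO 8601 string. If your dates are held as Python date objects, a minimal sketch of producing such a string with the standard library could look as follows. That the date Property accepts exactly this format is an assumption based on the example above, and measurement_day is a hypothetical variable name.

```python
from datetime import date

# Hypothetical date of the experiment, held as a Python object.
measurement_day = date(2020, 1, 7)

# isoformat() yields "YYYY-MM-DD", the format used in the example above.
date_value = measurement_day.isoformat()
print(date_value)
```

The resulting string could then be passed as the value in rec.add_property(name="date", value=date_value).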
Reference Properties
Now suppose we want to insert an analysis that references the above experiment record as its source data. Since we know that the id of the experiment record is 256, we can do the following:
ana = db.Record().add_parent(name="Analysis") # Create record and assign parent in one line
ana.add_property(name="Experiment", value=256)
ana.add_property(name="date", value="2020-01-08")
# possibly, add more properties here ...
ana.insert()
The experiment record’s id is used as the value of the Experiment property of the analysis Record (note how we use the RecordType Experiment as a REFERENCE property here). Sending a LinkAhead query like FIND RECORD Experiment WHICH IS REFERENCED BY A Analysis WITH date=2020-01-08 would now return our original experiment record.
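If such queries are needed for several RecordTypes or dates, the query string can be assembled programmatically. The following is a minimal sketch that only builds the string; the helper name referenced_by_query and its parameters are hypothetical and not part of pylib.

```python
def referenced_by_query(target_type, referencing_type, prop, value):
    """Build a query for records of target_type that are referenced by
    records of referencing_type with the given property value."""
    return (f"FIND RECORD {target_type} WHICH IS REFERENCED BY A "
            f"{referencing_type} WITH {prop}={value}")


query = referenced_by_query("Experiment", "Analysis", "date", "2020-01-08")
print(query)
```

The resulting string is the query from the paragraph above and could be sent to the server, e.g. with db.execute_query(query).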
Equivalently, we can also use the Python object of the experiment record, i.e., rec, as the value of the Experiment property:
ana = db.Record().add_parent(name="Analysis")
ana.add_property(name="Experiment", value=rec)
ana.add_property(name="date", value="2020-01-08")
# possibly, add more properties here ...
ana.insert()
Finally, we can also insert both records at the same time using a db.Container:
rec = db.Record()
rec.add_parent(name="Experiment")
rec.add_property(name="date", value="2020-01-07")
ana = db.Record().add_parent(name="Analysis")
ana.add_property(name="Experiment", value=rec)
ana.add_property(name="date", value="2020-01-08")
# Add the experiment and analysis records to our container.
cont = db.Container().extend([rec, ana])
# Insert both at the same time; the LinkAhead server will resolve the
# reference upon insertion.
cont.insert()
All three ways result in an Analysis record which references an Experiment record.
Note
Instead of using the Experiment RecordType as a REFERENCE property, we can also create an actual property with data type Experiment: db.Property(name="source", datatype="Experiment"). Now you can add this property to the analysis record with the experiment record as a value as explained above. As a rule of thumb, using a separate property for these references is meaningful whenever you want to highlight that, e.g., this particular experiment provided the source data for your analysis (as opposed to another experiment that was used for validation).
Advanced insertions
Most Records do not have a name; however, it can make sense to assign one. In that case, use the name argument when creating the Record. Another useful feature is that properties can have units:
rec = db.Record(name="DeviceNo-AB110")
rec.add_parent(name="SlicingMachine")
rec.add_property(name="weight", value="1749", unit="kg")
rec.insert()
When running an analysis, you can insert many Records in one batch using a container. For example, if you have a Python list analysis_results:
cont = db.Container()
for date, result in analysis_results:
    rec = db.Record()
    rec.add_parent(name="Experiment")
    rec.add_property(name="date", value=date)
    rec.add_property(name="result", value=result)
    cont.append(rec)
cont.insert()
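The loop above unpacks every element into a date and a result, so analysis_results is expected to be a list of two-element tuples. A minimal sketch of building such a list, assuming hypothetical raw measurements and taking the mean as the result, might look like this:

```python
# Hypothetical raw measurements, keyed by the ISO date of each experiment.
raw_measurements = {
    "2020-01-07": [1.0, 2.0, 3.0],
    "2020-01-08": [4.0, 6.0],
}

# One (date, result) tuple per experiment; here the result is the mean value.
analysis_results = [
    (day, sum(values) / len(values))
    for day, values in sorted(raw_measurements.items())
]
print(analysis_results)
```

Any other source of (date, result) pairs works equally well, as long as each element can be unpacked into exactly two values.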
It may also be useful to know that you can insert tabular data directly.
from caosadvancedtools.table_converter import from_tsv
recs = from_tsv("test.csv", "Experiment")
print(recs)
recs.insert()
Try it yourself with this example file test.csv!
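If the example file is not at hand, a small table can be generated with the standard library. The sketch below rests on assumptions: that from_tsv expects tab-separated columns with a header row (as its name suggests) and that the column headers correspond to Property names. The file name experiments.tsv and the columns date and result are hypothetical.

```python
import csv

# Hypothetical rows; the column names "date" and "result" are assumptions.
rows = [
    {"date": "2020-01-07", "result": 0.5},
    {"date": "2020-01-08", "result": 0.75},
]

# Write a tab-separated file with a header row.
with open("experiments.tsv", "w", newline="") as handle:
    writer = csv.DictWriter(handle, fieldnames=["date", "result"], delimiter="\t")
    writer.writeheader()
    writer.writerows(rows)

# Read the file back to check its structure; all values come back as strings.
with open("experiments.tsv", newline="") as handle:
    loaded = list(csv.DictReader(handle, delimiter="\t"))
print(loaded)
```

Under the assumptions above, the generated file could be passed to from_tsv in place of test.csv.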
List Properties
As you may already know, properties can also have list values instead of scalar values. They can be accessed, set, and updated as you would expect from any list-valued attribute in Python, as the following example illustrates.
import linkahead as db
db.Property(name="TestList", datatype=db.LIST(db.DOUBLE)).insert()
db.RecordType(name="TestType").add_property(name="TestList").insert()
db.Record(name="TestRec").add_parent("TestType").add_property(
    name="TestList", value=[1,2,3]).insert()
retrieved = db.Record(name="TestRec").retrieve()
retrieved.get_property("TestList").value += [4,5]
retrieved.update()
# Check update
retrieved = db.Record(name="TestRec").retrieve()
print(retrieved.get_property("TestList").value)
Note
Properties of Entities that shall be updated need to have IDs. Let’s look at an example:
experiment = db.Record(id=1111).retrieve()
experiment.add_property(name='date', value="2020-01-01")
experiment.update() # Fails! The 'date' Property needs to have an ID.
The easiest way to get around this is to use the corresponding entity getter:
experiment = db.Record(id=1111).retrieve()
experiment.add_property(db.get_entity_by_name('date'), value="2020-01-01")
experiment.update() # Works!
There are also the functions get_entity_by_path and get_entity_by_id. You can easily use cached versions of those functions (see caching options).
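The library's own cached helpers are described in the caching documentation. As a generic illustration of what caching saves, here is a minimal sketch using functools.lru_cache around a stand-in lookup. The function fetch_entity_id and the dictionary FAKE_SERVER are hypothetical and not part of LinkAhead; they only mimic a lookup that would otherwise contact the server each time.

```python
from functools import lru_cache

# Stand-in for the server: maps entity names to made-up IDs.
FAKE_SERVER = {"date": 101, "result": 102}

# Records every lookup that actually reaches the stand-in server.
lookups = []


@lru_cache(maxsize=None)
def fetch_entity_id(name):
    """Hypothetical lookup; results are cached per name."""
    lookups.append(name)
    return FAKE_SERVER[name]


# Repeated calls with the same name hit the cache after the first one.
for _ in range(3):
    fetch_entity_id("date")
print(len(lookups))
```

Keep in mind that a cache can return stale entities if the server-side data changes in the meantime.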
File Update
You can update an existing file by uploading a new version.
Retrieve the file record of interest, e.g. by ID:
import linkahead as db
file_upd = db.File(id=174).retrieve()
Set the new local file path. The remote file path is stored in the file object as file_upd.path, while the local path can be found in file_upd.file.
file_upd.file = "./supplements.pdf"
Update the file:
file_upd.update()