CaosDB Query Language Examples
See syntax specification in CaosDB Query Language Syntax.
In this chapter, the CaosDB Query Language (CQL) is presented as a means of formulating search commands, commonly referred to as queries. It is highly recommended that you experiment with the examples provided, such as those found on https://demo.indiscale.com. An interactive tour is also available on this public instance, which includes a comprehensive overview of the query language. Therefore, it is suggested that you begin there and subsequently proceed with this more detailed explanation.
Introduction
Queries typically start with the keyword FIND
, followed by a description of
what you want to find. For example, you can search for all musical instruments
with FIND MusicalInstrument
.
Note, the CQL is case insensitive. We will write keywords of CQL in all caps to illustrate what parts are part of the language.
The most common way is to provide a RecordType name after FIND
(as in the
example above). However, you can also use the name of some other entity:
FIND 'My first guitar'
.
Note, that we put the name here in quotes. Spaces are used in CQL as separator of words. Thus, if something contains quotes, like the name here, it needs to be quoted.
While queries like the last one are great to get an impression of the data, often we need to be more specific. Therefore, queries can include various conditions to restrict the result set.
Example: FIND MusicalAnalysis WITH quality_factor>0.5 AND date IN 2019
. The keyword WITH
signifies that for each Record of the type
MusicalAnalysis
, an assessment is made to determine whether it possesses a
Property labelled quality_factor
that exceeds 0.5, as well as another
Property labelled date
that may correspond to any day within the year 2019.
In order to make CQL easier to learn and to remember we designed it to be close
to natural spoken English language. For example, you can write
FIND Guitar WHICH HAS A PROPERTY price
. Here, “HAS A PROPERTY” is what we call
syntactic sugar. It lets the query role off the tongue more easily than
FIND Guitar WHICH price
but it is actually not needed and does not change
the meaning of the query. In fact, you could also write FIND Guitar WITH price
.
If you are only interested in the number of Entities that match your query, you
can replace FIND
with COUNT
and the query will only return the number of
Entities in the result set.
Sometimes the list of Records that you get using a FIND
query is not what you
need; especially if you want to export a subset of the data for the analysis
with some external tool.
SELECT
queries offer to represent the query result in a tabular form.
If you replace the FIND
keyword of a query with SELECT x, y, z FROM
, then
CaosDB will return the result as tabular data.
For example, instead of FIND Guitar
, try out
SELECT name, electric FROM Guitar
As you can see, those queries are design to allow very specific data requests.
If you do not want/need to be as specific you can omit the first keyword (FIND
or SELECT
) which creates a search for anything that has a text Property with
something like your expression. For example, the query “John” will search for
any Records that has a text property that contains this string.
With this, we conclude our introduction of CQL. You now know about the basic elements. The following will cover the various aspects in more detail and you will for example learn how you can use references among Records, or meta data like the creation time of a Record to restrict the query result set.
What am I searching for?
We already learned, that we can provide the name of a RecordType after the FIND
keyword. Let’s call this part of the query “entity expression”. In general, we
need to identify with the entity expression one or more entities via their name, CaosDB ID
or a pattern.
FIND Guitar
FIND Guit*
(’*’ represents none, one or more characters)FIND <<[gG]ui.*>>
(a regular expression surrounded by << and ‘>>’. see below)FIND 110
The result set will contain Entities that are either identified by the entity expression directly (i.e. they have the name or the given ID) or the have such an Entity as parent.
As you know, CaosDB distincts among different Entity roles:
Entity
Record
RecordType
Property
File
You can provide the role directly after the FIND
keyword and before the
entity expression: FIND RECORD Guitar
. The result set will then restricted
to Entities with that role.
Conditions / Filters
POV - Property-Operator-Value
The following queries are equivalent and will restrict the result set to records which have a property named pname1 that has a value val1.
FIND ename.pname1=val1
FIND ename WITH pname1=val1
FIND ename WHICH HAS A PROPERTY pname1=val1
FIND ename WHICH HAS A pname1=val1
Again, the resultset can be restricted to any other entity role as well:
FIND RECORDTYPE ename WHICH HAS A pname1=val1
currently known operators: =, !=, <=, <, >=, >
(and cf. next paragraphes!)
Special Operator: LIKE
The LIKE can be used with wildcards. The *
is a wildcard for any (possibly empty) sequence of characters.
Examples:
FIND ename WHICH HAS A pname1 LIKE va*
FIND ename WHICH HAS A pname1 LIKE va*1
FIND ename WHICH HAS A pname1 LIKE *al1
Note: The LIKE operator will only produce expectable results with text properties.
Special Case: References
In general a reference can be addressed just like a POV filter. So
FIND ename1 WITH pname1=ename2
will also return any record named ename1 which references the entity with name or id ename2 via a reference property named pname1. However, it will also return any entity with a text property of that name with the string value ename2. In order to restrict the result set to reference properties one may make use of special reference operators:
reference operators: ->, REFERENCES, REFERENCE TO
The query looks like this:
FIND ename1 WHICH HAS A pname1 REFERENCE TO ename2
FIND ename1 WHICH HAS A pname1->ename2
Special Case: DateTime
DateTime operators: =, !=, <, >, IN, NOT IN
d1=d2
: Equivalence relation.
‘’True’’ iff d1 and d2 are equal in every respect (same DateTime flavor, same fields are defined/undefined and all defined fields are equal respectively).
‘’False’’ iff they have the same DateTime flavor but have different fields defined or fields with differing values.
‘’Undefined’’ otherwise.
Examples:
2015-04-03=2015-04-03T00:00:00
is undefined.2015-04-03T00:00:00=2015-04-03T00:00:00.0
is undefined (second precision vs. nanosecond precision).2015-04-03T00:00:00.0=2015-04-03T00:00:00.0
is true.2015-04-03T00:00:00=2015-04-03T00:00:00
is true.2015-04=2015-05
is false.2015-04=2015-04
is true.
d1!=d2
: Intransitive, symmetric relation.
‘’True’’ iff
d1=d2
is false.‘’False’’ iff
d1=d2
is true.‘’Undefined’’ otherwise.
Examples:
2015-04-03!=2015-04-03T00:00:00
is undefined.2015-04-03T00:00:00!=2015-04-03T00:00:00.0
is undefined.2015-04-03T00:00:00.0!=2015-04-03T00:00:00.0
is false.2015-04-03T00:00:00!=2015-04-03T00:00:00
is false.2015-04!=2015-05
is true.2015-04!=2015-04
is false.
d1>d2
: Transitive, non-symmetric relation.
Semantics depend on the flavors of d1 and d2. If both are…
UTCDateTime
‘’True’’ iff the time of d1 is after the the time of d2 according to UTC.
‘’False’’ otherwise.
SemiCompleteDateTime
‘’True’’ iff
d1.ILB>d2.EUB
is true ord1.ILB=d2.EUB
is true.‘’False’’ iff
d1.EUB<d2.ILB}} is true or {{{d1.EUB=d2.ILB
is true.‘’Undefined’’ otherwise.
Examples:
2015>2014
is true.2015-04>2014
is true.2015-01-01T20:15.00>2015-01-01T20:14
is true.2015-04>2015
is undefined.2015>2015-04
is undefined.2015-01-01T20:15>2015-01-01T20:15:15
is undefined.2014>2015
is false.2014-04>2015
is false.2014-01-01>2015-01-01T20:15:30
is false.
d1<d2
: Transitive, non-symmetric relation.
Semantics depend on the flavors of d1 and d2. If both are…
UTCDateTime
‘’True’’ iff the time of d1 is before the the time of d2 according to UTC
‘’False’’ otherwise.
SemiCompleteDateTime
‘’True’’ iff
d1.EUB<d2.ILB
is true ord1.EUB=d2.ILB
is true.‘’False’’ iff
d1.ILB>d2.EUB}} is true or {{{d1.ILB=d2.EUB
is true.‘’Undefined’’ otherwise.
Examples:
2014<2015
is true.2014-04<2015
is true.2014-01-01<2015-01-01T20:15:30
is true.2015-04<2015
is undefined.2015<2015-04
is undefined.2015-01-01T20:15<2015-01-01T20:15:15
is undefined.2015<2014
is false.2015-04<2014
is false.2015-01-01T20:15.00<2015-01-01T20:14
is false.
d1 IN d2
: Transitive, non-symmetric relation.
Semantics depend on the flavors of d1 and d2. If both are…
SemiCompleteDateTime
‘’True’’ iff (
d1.ILB>d2.ILB
is true ord1.ILB=d2.ILB
is true) and (d1.EUB<d2.EUB
is true ord1.EUB=d2.EUB
is true).‘’False’’ otherwise.
Examples:
2015-01-01 IN 2015
is true.2015-01-01T20:15:30 IN 2015-01-01
is true.2015-01-01T20:15:30 IN 2015-01-01T20:15:30
is true.2015 IN 2015-01-01
is false.2015-01-01 IN 2015-01-01T20:15:30
is false.
d1 NOT IN d2
: Transitive, non-symmetric relation.
Semantics depend on the flavors of d1 and d2. If both are…
SemiCompleteDateTime
‘’True’’ iff
d1.ILB IN d2.ILB
is false.‘’False’’ otherwise.
Examples:
2015 NOT IN 2015-01-01
is true.2015-01-01 NOT IN 2015-01-01T20:15:30
is true.2015-01-01 NOT IN 2015
is false.2015-01-01T20:15:30 NOT IN 2015-01-01
is false.2015-01-01T20:15:30 NOT IN 2015-01-01T20:15:30
is false.
Note
These semantics follow a three-valued logic with true
, false
and undefined
as truth
values. Only true
is truth preserving. I.e. only those expressions which evaluate to true
pass the POV filter.
FIND ... WHICH HAS A somedate=2015-01
only returns entities for which somedate=2015-01
is
true. On the other hand, FIND ... WHICH DOESN'T HAVE A somedate=2015-01
returns entities for which
somedate=2015-01
is false or undefined. Shortly put, NOT d1=d2
is not equivalent to
d1!=d2
. The latter assertion is stronger.
Omitting the Property or the Value
One doesn’t have to specify the property or the value at all. The following query filters the result set for records which have any property with a value greater than val1.
FIND ename WHICH HAS A PROPERTY > val1
FIND ename . > val1
FIND ename.>val1
And for references…
FIND ename1 WHICH HAS A REFERENCE TO ename2
FIND ename1 WHICH REFERENCES ename2
FIND ename1 . -> ename2
FIND ename1.->ename2
The following query returns records which have a pname1 property with any value.
FIND ename WHICH HAS A PROPERTY pname1
FIND ename WHICH HAS A pname1
FIND ename WITH pname1
FIND ename WITH A pname1
FIND ename WITH A PROPERTY pname1
FIND ename WITH PROPERTY pname1
FIND ename . pname1
FIND ename.pname1
TransactionFilter
FIND ename WHICH (sugar|negated_sugar)? (NOT)? (CREATED|INSERTED|UPDATED) (by_clause time_clause?| time_clause by_clause?)
Definition
sugar:: HAS BEEN
| HAVE BEEN
| HAD BEEN
| WAS
| IS
negated_sugar:: HAS NOT BEEN
| HASN'T BEEN
| WAS NOT
| WASN'T
| IS NOT
| ISN'T
| HAVN'T BEEN
| HAVE NOT BEEN
| HADN'T BEEN
| HAD NOT BEEN
by_clause:: BY (ME | username | SOMEONE ELSE (BUT ME)? | SOMEONE ELSE BUT username)
datetime:: A datetime string of the form YYYY[-MM[-DD(T| )[hh[:mm[:ss[.nnn][(+|-)zzzz]]]]]]
or TODAY
.
time_clause:: [AT|ON|IN|BEFORE|AFTER|UNTIL|SINCE] (datetime)
Examples
FIND ename WHICH HAS BEEN CREATED BY ME ON 2014-12-24
FIND ename WHICH HAS BEEN CREATED BY SOMEONE ELSE ON 2014-12-24
FIND ename WHICH HAS BEEN CREATED BY erwin ON 2014-12-24
FIND ename WHICH HAS BEEN UPDATED BY SOMEONE ELSE BUT erwin ON 2014-12-24
FIND ename WHICH HAS BEEN INSERTED BY erwin
FIND ename WHICH HAS BEEN INSERTED SINCE 2021-04
Note that SINCE
and UNTIL
are inclusive, while BEFORE
and AFTER
are not.
File Location
Search for file objects by their location:
FIND FILE WHICH IS STORED AT a/certain/path/
Wildcards
STORED AT can be used with wildcards similar to unix wildcards.
*
matches any characters or none at all, but not the directory separator/
**
matches any character or none at all.A leading
*
is a shortcut for/**
Asterisks directly between two other asterisks are ignored:
***
is the same as**
.Escape character:
\
(E.g.\\
is a literal backslash.\*
is a literal star. But\\*
is a literal backslash followed by a wildcard.)
Examples:
Find any files ending with .acq
:
FIND FILE WHICH IS STORED AT *.acq
or
FIND FILE WHICH IS STORED AT **.acq
or
FIND FILE WHICH IS STORED AT /**.acq
Find files stored one directory below /data/
, ending with .acq
:
FIND FILE WHICH IS STORED AT /data/*/*.acq
Find files stored in /data/
, ending with .acq
:
FIND FILE WHICH IS STORED AT /data/*.acq
Find files stored in a directory at any depth in the tree below /data/
, ending with .acq
:
FIND FILE WHICH IS STORED AT /data/**.acq
Find any file in a directory which begins with 2016-02
:
FIND FILE WHICH IS STORED AT */2016-02*/*
Back References
The back reference filters for entities that are referenced by another entity. The following query returns records of the type ename1 which are referenced by ename2 entities via the reference property pname1.
FIND ename1 WHICH IS REFERENCED BY ename2 AS A pname1
FIND ename1 WITH @ ename2 / pname1
FIND ename1 . @ ename2 / pname1
One may omit the property specification:
FIND ename1 WHICH IS REFERENCED BY ename2
FIND ename1 WHICH HAS A PROPERTY @ ename2
FIND ename1 WITH @ ename2
FIND ename1 . @ ename2
Nested queries, or filtering by sub-properties
Nested queries can easily be searched by simply concatenating WHICH
or WITH
expressions:
FIND ename WHICH HAS A pname WHICH HAS A subpname=val
For example:
FIND AN experiment WHICH HAS A camera WHICH HAS A 'serial number'= 1234567890
Combining Filters with Propositional Logic
Any result set can be filtered by logically combining POV filters or back reference filters:
Conjunction (AND)
As we saw above, we can combine conditions:
FIND MusicalAnalysis WHICH HAS quality_factor>0.5 AND date IN 2019
In general, the conjunction takes the form FIND <eexpr> WHICH <filter1> AND <filter2>
. You can
also use &
instead of AND
or chain more than two conditions. If you mix conjunctions with
disjunctions, you need to add brackets to define the priority. For example: FIND <eexpr> WHICH (<filter1> AND <filter2>) OR <filter3>
.
FIND Guitar WHICH REFERENCES Manufacturer AND price
is a combination of a reference filter and a
POV filter. For readability, you can also writeFIND Guitar WHICH REFERENCES Manufacturer AND WHICH HAS A price
. However, the additional “WHICH
HAS A” is purely cosmetic (syntactic sugar).
Disjunction (OR)
The rules for disjunctions (OR
or |
) are the same as for conjunctions, see above.
FIND ename1 WHICH HAS A PROPERTY pname1=val1 OR A PROPERTY pname2=val2 Or A PROPERTY...
FIND ename1 WHICH HAS A PROPERTY pname1=val1 OR A pname2=val2 OR ...
FIND ename1 . pname1=val1 | pname2=val2 | ...
Negation (NOT)
You can negate any filter by prefixing the filter with NOT
or !
:
FIND <eexpr> WHICH NOT <filter1>
.
There are many syntactic sugar alternatives which are treated the same as “NOT”:
DOES NOT HAVE
ISN'T
and many more
FIND ename1 WHICH DOES NOT HAVE A PROPERTY pname1=val1
FIND ename1 WHICH DOESN'T HAVE A pname1=val1
FIND ename1 . NOT pname2=val2
FIND ename1 . !pname2=val2
Parentheses
Basically, you can put parantheses around filter expressions and con- or disjunctions.
FIND Guitar WHICH (REFERENCES Manufacturer AND WHICH HAS A price)
.FIND Guitar WHICH (REFERENCES Manufacturer) AND (WHICH HAS A price)
.
For better readability, the above query can be written as:
FIND Guitar WHICH (REFERENCES Manufacturer AND HAS A price)
. Note, that without syntactic sugar this query looks like:FIND Guitar WHICH (REFERENCES Manufacturer AND price)
.
FIND ename1 WHICH HAS A pname1=val1 AND DOESN'T HAVE A pname2<val2 AND ((WHICH HAS A pname3=val3 AND A pname4=val4) OR DOES NOT HAVE A (pname5=val5 AND pname6=val6))
FIND ename1 . pname1=val1 & !pname2<val2 & ((pname3=val3 & pname4=val4) | !(pname5=val5 & pname6=val6))
FIND ename1.pname1=val1&!pname2<val2&((pname3=val3&pname4=val4)|!(pname5=val5&pname6=val6))
A Few Important Expressions
A:: The indistinct article. This is only syntactic suger. Equivalent expressions:
A, AN
AND:: The logical and. Equivalent expressions:
AND, &
FIND:: The beginning of the query.
NOT:: The logical negation. Equivalent expressions:
NOT, DOESN'T HAVE A PROPERTY, DOES NOT HAVE A PROPERTY, DOESN'T HAVE A, DOES NOT HAVE A, DOES NOT, DOESN'T, IS NOT, ISN'T, !
OR:: The logical or. Equivalent expressions:
OR, |
RECORD,RECORDTYPE,FILE,PROPERTY:: Role expression for restricting the result set to a specific role.
WHICH:: The marker for the beginning of the filters. Equivalent expressions:
WHICH, WHICH HAS A, WHICH HAS A PROPERTY, WHERE, WITH (A), .
REFERENCE:: This one is tricky:
REFERENCE TO
expresses a the state of having a reference property.REFERENCED BY
expresses the state of being referenced by another entity.COUNT::
COUNT
works likeFIND
but doesn’t return the entities.
Select Queries
In contrast to FIND
queries, which always return the complete entity, there are SELECT
queries which only return the entity with only those properties which are specified in the query. The syntax is very similar to FIND
queries - just replace the FIND
by SELECT <comma separated list of selectors> FROM
:
SELECT p1, p2, p3 FROM ename
However, the SELECT
query can also return properties of referenced entities and thereby are a means of joining entities together and return a custom view or projection:
SELECT Conductor.Last Name FROM Experiment
would return the conductor’s last name, when Conductor
is a reference property of Experiment
and Last Name
is a property of the Conductor
records.
Selectors
Selectors are strings of entity names which are separated by .
(dot). E.g. Conductor.Last Name
or Conductor.Address
or even Experiment.Conductor.Last name
. Selectors in a SELECT
queries are separated by ,
(comma). E.g. Conductor.First Name, Conductor.Last Name
.
Evaluation of Selectors
The query will return all those properties which have the same name as
specified by the selector (case-insensitive). However, SELECT
queries are
also capable of subtyping in the selectors:
SELECT Person FROM Experiment
would return all Person
properties but all Conductors
as well, if Conductor
is a child of Person
.
Note: When a property responsible
with data type Person
exists, the above SELECT
statement would not include records that use this property (since responsible
is not a child of Person).
Versioning
Since Caosdb 0.2 entities are optionally version controlled. The query language will be extended to include versioning in the future. A current minimal implementation introduces the ANY VERSION OF
modifier which can be used to return all matching versions in the results of COUNT
, FIND
, and SELECT
queries.
Example
FIND ANY VERSION OF RECORD WITH pname=value
returns the all past and present versions of records withpname = value
.
Scope and current limitations
The
ANY VERSION OF
modifier currently the only expression for taking the versioning into account when using the query language.Subproperties are not supported yet, e.g.
FIND ANY VERSION OF ENTITY WHICH IS REFERENCED BY ename WITH ...
. This applies to all cases where you specify properties of referenced entities or referencing entities.
Configuration
In CaosDB Server implementations before version 0.9.0, the FIND ename
query
would return any entity with that name and all children, regardless of the
entity’s role. Basically, FIND ename
was equivalent to FIND ENTITY ename
.
Since 0.9.0 the default behavior has changed and now FIND ename
is equivalent
to FIND RECORD ename
. This default is, however, configurable via the
FIND_QUERY_DEFAULT_ROLE
server property. See Server Configuration.
Future
Additional versioning queries:
Add
(LATEST|LAST|OLDEST|NEWEST|FIRST) VERSION OF
modifiers.Add
(ANY|LATEST|LAST|OLDEST|NEWEST|FIRST) VERSION (BEFORE|AFTER) (<timestamp>|<transaction id>|<entity@version>) OF
modifier.Add support for subproperties, e.g.
FIND ANY VERSION OF ENTITY WHICH IS REFERENCED BY ename WITH ...
.Find deleted entities:
FIND ename WHICH WAS DELETED (BY ME | ON 2014-12-24)
More Logic, especially
ANY
,ALL
,NONE
, andSUCH THAT
key words (and equivalents) for logical quantisation:FIND ename1 SUCH THAT ALL ename2 WHICH HAVE A REFERENCE TO ename1 HAVE A pname=val
. This is likeFIND experiment SUCH THAT ALL person WHICH ARE REFERENCED BY THIS experiment AS conductor HAVE AN 'academic title'=professor.