CaosDB Query Language Examples

See syntax specification in CaosDB Query Language Syntax.

In this chapter, the CaosDB Query Language (CQL) is presented as a means of formulating search commands, commonly referred to as queries. It is highly recommended that you experiment with the examples provided, such as those found on https://demo.indiscale.com. An interactive tour is also available on this public instance, which includes a comprehensive overview of the query language. Therefore, it is suggested that you begin there and subsequently proceed with this more detailed explanation.

Introduction

Queries typically start with the keyword FIND, followed by a description of what you want to find. For example, you can search for all musical instruments with FIND MusicalInstrument.

Note, the CQL is case insensitive. We will write keywords of CQL in all caps to illustrate what parts are part of the language.

The most common way is to provide a RecordType name after FIND (as in the example above). However, you can also use the name of some other entity: FIND 'My first guitar'.

Note, that we put the name here in quotes. Spaces are used in CQL as separator of words. Thus, if something contains quotes, like the name here, it needs to be quoted.

While queries like the last one are great to get an impression of the data, often we need to be more specific. Therefore, queries can include various conditions to restrict the result set.

Example: FIND MusicalAnalysis WITH quality_factor>0.5 AND date IN 2019. The keyword WITH signifies that for each Record of the type MusicalAnalysis, an assessment is made to determine whether it possesses a Property labelled quality_factor that exceeds 0.5, as well as another Property labelled date that may correspond to any day within the year 2019.

In order to make CQL easier to learn and to remember we designed it to be close to natural spoken English language. For example, you can write FIND Guitar WHICH HAS A PROPERTY price. Here, “HAS A PROPERTY” is what we call syntactic sugar. It lets the query role off the tongue more easily than FIND Guitar WHICH price but it is actually not needed and does not change the meaning of the query. In fact, you could also write FIND Guitar WITH price.

If you are only interested in the number of Entities that match your query, you can replace FIND with COUNT and the query will only return the number of Entities in the result set.

Sometimes the list of Records that you get using a FIND query is not what you need; especially if you want to export a subset of the data for the analysis with some external tool. SELECT queries offer to represent the query result in a tabular form.

If you replace the FIND keyword of a query with SELECT x, y, z FROM, then CaosDB will return the result as tabular data.

For example, instead of FIND Guitar, try out SELECT name, electric FROM Guitar

As you can see, those queries are design to allow very specific data requests. If you do not want/need to be as specific you can omit the first keyword (FIND or SELECT) which creates a search for anything that has a text Property with something like your expression. For example, the query “John” will search for any Records that has a text property that contains this string.

With this, we conclude our introduction of CQL. You now know about the basic elements. The following will cover the various aspects in more detail and you will for example learn how you can use references among Records, or meta data like the creation time of a Record to restrict the query result set.

What am I searching for?

We already learned, that we can provide the name of a RecordType after the FIND keyword. Let’s call this part of the query “entity expression”. In general, we need to identify with the entity expression one or more entities via their name, CaosDB ID or a pattern.

  • FIND Guitar

  • FIND Guit* (’*’ represents none, one or more characters)

  • FIND <<[gG]ui.*>> (a regular expression surrounded by << and ‘>>’. see below)

  • FIND 110

The result set will contain Entities that are either identified by the entity expression directly (i.e. they have the name or the given ID) or the have such an Entity as parent.

As you know, CaosDB distincts among different Entity roles:

  • Entity

  • Record

  • RecordType

  • Property

  • File

You can provide the role directly after the FIND keyword and before the entity expression: FIND RECORD Guitar. The result set will then restricted to Entities with that role.

Conditions / Filters

POV - Property-Operator-Value

The following queries are equivalent and will restrict the result set to records which have a property named pname1 that has a value val1.

FIND ename.pname1=val1

FIND ename WITH pname1=val1

FIND ename WHICH HAS A PROPERTY pname1=val1

FIND ename WHICH HAS A pname1=val1

Again, the resultset can be restricted to any other entity role as well:

FIND RECORDTYPE ename WHICH HAS A pname1=val1

currently known operators: =, !=, <=, <, >=, > (and cf. next paragraphes!)

Special Operator: LIKE

The LIKE can be used with wildcards. The * is a wildcard for any (possibly empty) sequence of characters.

Examples:

FIND ename WHICH HAS A pname1 LIKE va*

FIND ename WHICH HAS A pname1 LIKE va*1

FIND ename WHICH HAS A pname1 LIKE *al1

Note: The LIKE operator will only produce expectable results with text properties.

Special Case: References

In general a reference can be addressed just like a POV filter. So

FIND ename1 WITH pname1=ename2

will also return any record named ename1 which references the entity with name or id ename2 via a reference property named pname1. However, it will also return any entity with a text property of that name with the string value ename2. In order to restrict the result set to reference properties one may make use of special reference operators:

reference operators: ->, REFERENCES, REFERENCE TO

The query looks like this:

FIND ename1 WHICH HAS A pname1 REFERENCE TO ename2

FIND ename1 WHICH HAS A pname1->ename2

Special Case: DateTime

DateTime operators: =, !=, <, >, IN, NOT IN

d1=d2: Equivalence relation.
  • ‘’True’’ iff d1 and d2 are equal in every respect (same DateTime flavor, same fields are defined/undefined and all defined fields are equal respectively).

  • ‘’False’’ iff they have the same DateTime flavor but have different fields defined or fields with differing values.

  • ‘’Undefined’’ otherwise.

Examples:

  • 2015-04-03=2015-04-03T00:00:00 is undefined.

  • 2015-04-03T00:00:00=2015-04-03T00:00:00.0 is undefined (second precision vs. nanosecond precision).

  • 2015-04-03T00:00:00.0=2015-04-03T00:00:00.0 is true.

  • 2015-04-03T00:00:00=2015-04-03T00:00:00 is true.

  • 2015-04=2015-05 is false.

  • 2015-04=2015-04 is true.

d1!=d2: Intransitive, symmetric relation.
  • ‘’True’’ iff d1=d2 is false.

  • ‘’False’’ iff d1=d2 is true.

  • ‘’Undefined’’ otherwise.

Examples:

  • 2015-04-03!=2015-04-03T00:00:00 is undefined.

  • 2015-04-03T00:00:00!=2015-04-03T00:00:00.0 is undefined.

  • 2015-04-03T00:00:00.0!=2015-04-03T00:00:00.0 is false.

  • 2015-04-03T00:00:00!=2015-04-03T00:00:00 is false.

  • 2015-04!=2015-05 is true.

  • 2015-04!=2015-04 is false.

d1>d2: Transitive, non-symmetric relation.

Semantics depend on the flavors of d1 and d2. If both are…

UTCDateTime
  • ‘’True’’ iff the time of d1 is after the the time of d2 according to UTC.

  • ‘’False’’ otherwise.

SemiCompleteDateTime
  • ‘’True’’ iff d1.ILB>d2.EUB is true or d1.ILB=d2.EUB is true.

  • ‘’False’’ iff d1.EUB<d2.ILB}} is true or {{{d1.EUB=d2.ILB is true.

  • ‘’Undefined’’ otherwise.

Examples:

  • 2015>2014 is true.

  • 2015-04>2014 is true.

  • 2015-01-01T20:15.00>2015-01-01T20:14 is true.

  • 2015-04>2015 is undefined.

  • 2015>2015-04 is undefined.

  • 2015-01-01T20:15>2015-01-01T20:15:15 is undefined.

  • 2014>2015 is false.

  • 2014-04>2015 is false.

  • 2014-01-01>2015-01-01T20:15:30 is false.

d1<d2: Transitive, non-symmetric relation.

Semantics depend on the flavors of d1 and d2. If both are…

UTCDateTime
  • ‘’True’’ iff the time of d1 is before the the time of d2 according to UTC

  • ‘’False’’ otherwise.

SemiCompleteDateTime
  • ‘’True’’ iff d1.EUB<d2.ILB is true or d1.EUB=d2.ILB is true.

  • ‘’False’’ iff d1.ILB>d2.EUB}} is true or {{{d1.ILB=d2.EUB is true.

  • ‘’Undefined’’ otherwise.

Examples:

  • 2014<2015 is true.

  • 2014-04<2015 is true.

  • 2014-01-01<2015-01-01T20:15:30 is true.

  • 2015-04<2015 is undefined.

  • 2015<2015-04 is undefined.

  • 2015-01-01T20:15<2015-01-01T20:15:15 is undefined.

  • 2015<2014 is false.

  • 2015-04<2014 is false.

  • 2015-01-01T20:15.00<2015-01-01T20:14 is false.

d1 IN d2: Transitive, non-symmetric relation.

Semantics depend on the flavors of d1 and d2. If both are…

SemiCompleteDateTime
  • ‘’True’’ iff (d1.ILB>d2.ILB is true or d1.ILB=d2.ILB is true) and (d1.EUB<d2.EUB is true or d1.EUB=d2.EUB is true).

  • ‘’False’’ otherwise.

Examples:

  • 2015-01-01 IN 2015 is true.

  • 2015-01-01T20:15:30 IN 2015-01-01 is true.

  • 2015-01-01T20:15:30 IN 2015-01-01T20:15:30 is true.

  • 2015 IN 2015-01-01 is false.

  • 2015-01-01 IN 2015-01-01T20:15:30 is false.

d1 NOT IN d2: Transitive, non-symmetric relation.

Semantics depend on the flavors of d1 and d2. If both are…

SemiCompleteDateTime
  • ‘’True’’ iff d1.ILB IN d2.ILB is false.

  • ‘’False’’ otherwise.

Examples:

  • 2015 NOT IN 2015-01-01 is true.

  • 2015-01-01 NOT IN 2015-01-01T20:15:30 is true.

  • 2015-01-01 NOT IN 2015 is false.

  • 2015-01-01T20:15:30 NOT IN 2015-01-01 is false.

  • 2015-01-01T20:15:30 NOT IN 2015-01-01T20:15:30 is false.

Note

These semantics follow a three-valued logic with true, false and undefined as truth values. Only true is truth preserving. I.e. only those expressions which evaluate to true pass the POV filter.

FIND ... WHICH HAS A somedate=2015-01 only returns entities for which somedate=2015-01 is true. On the other hand, FIND ... WHICH DOESN'T HAVE A somedate=2015-01 returns entities for which somedate=2015-01 is false or undefined. Shortly put, NOT d1=d2 is not equivalent to d1!=d2. The latter assertion is stronger.

Omitting the Property or the Value

One doesn’t have to specify the property or the value at all. The following query filters the result set for records which have any property with a value greater than val1.

FIND ename WHICH HAS A PROPERTY > val1

FIND ename . > val1

FIND ename.>val1

And for references…

FIND ename1 WHICH HAS A REFERENCE TO ename2

FIND ename1 WHICH REFERENCES ename2

FIND ename1 . -> ename2

FIND ename1.->ename2

The following query returns records which have a pname1 property with any value.

FIND ename WHICH HAS A PROPERTY pname1

FIND ename WHICH HAS A pname1

FIND ename WITH pname1

FIND ename WITH A pname1

FIND ename WITH A PROPERTY pname1

FIND ename WITH PROPERTY pname1

FIND ename . pname1

FIND ename.pname1

TransactionFilter

FIND ename WHICH (sugar|negated_sugar)? (NOT)? (CREATED|INSERTED|UPDATED) (by_clause time_clause?| time_clause by_clause?)

Definition

sugar:: HAS BEEN | HAVE BEEN | HAD BEEN | WAS | IS

negated_sugar:: HAS NOT BEEN | HASN'T BEEN | WAS NOT | WASN'T | IS NOT | ISN'T | HAVN'T BEEN | HAVE NOT BEEN | HADN'T BEEN | HAD NOT BEEN

by_clause:: BY (ME | username | SOMEONE ELSE (BUT ME)? | SOMEONE ELSE BUT username) datetime:: A datetime string of the form YYYY[-MM[-DD(T| )[hh[:mm[:ss[.nnn][(+|-)zzzz]]]]]] or TODAY. time_clause:: [AT|ON|IN|BEFORE|AFTER|UNTIL|SINCE] (datetime)

Examples

FIND ename WHICH HAS BEEN CREATED BY ME ON 2014-12-24

FIND ename WHICH HAS BEEN CREATED BY SOMEONE ELSE ON 2014-12-24

FIND ename WHICH HAS BEEN CREATED BY erwin ON 2014-12-24

FIND ename WHICH HAS BEEN UPDATED BY SOMEONE ELSE BUT erwin ON 2014-12-24

FIND ename WHICH HAS BEEN INSERTED BY erwin

FIND ename WHICH HAS BEEN INSERTED SINCE 2021-04

Note that SINCE and UNTIL are inclusive, while BEFORE and AFTER are not.

File Location

Search for file objects by their location:

FIND FILE WHICH IS STORED AT a/certain/path/

Wildcards

STORED AT can be used with wildcards similar to unix wildcards.

  • * matches any characters or none at all, but not the directory separator /

  • ** matches any character or none at all.

  • A leading * is a shortcut for /**

  • Asterisks directly between two other asterisks are ignored: *** is the same as **.

  • Escape character: \ (E.g. \\ is a literal backslash. \* is a literal star. But \\* is a literal backslash followed by a wildcard.)

Examples:

Find any files ending with .acq: FIND FILE WHICH IS STORED AT *.acq or FIND FILE WHICH IS STORED AT **.acq or FIND FILE WHICH IS STORED AT /**.acq

Find files stored one directory below /data/, ending with .acq: FIND FILE WHICH IS STORED AT /data/*/*.acq

Find files stored in /data/, ending with .acq: FIND FILE WHICH IS STORED AT /data/*.acq

Find files stored in a directory at any depth in the tree below /data/, ending with .acq: FIND FILE WHICH IS STORED AT /data/**.acq

Find any file in a directory which begins with 2016-02: FIND FILE WHICH IS STORED AT */2016-02*/*

Back References

The back reference filters for entities that are referenced by another entity. The following query returns records of the type ename1 which are referenced by ename2 entities via the reference property pname1.

  • FIND ename1 WHICH IS REFERENCED BY ename2 AS A pname1

  • FIND ename1 WITH @ ename2 / pname1

  • FIND ename1 . @ ename2 / pname1

One may omit the property specification:

  • FIND ename1 WHICH IS REFERENCED BY ename2

  • FIND ename1 WHICH HAS A PROPERTY @ ename2

  • FIND ename1 WITH @ ename2

  • FIND ename1 . @ ename2

Nested queries, or filtering by sub-properties

Nested queries can easily be searched by simply concatenating WHICH or WITH expressions:

  • FIND ename WHICH HAS A pname WHICH HAS A subpname=val

  • For example: FIND AN experiment WHICH HAS A camera WHICH HAS A 'serial number'= 1234567890

Combining Filters with Propositional Logic

Any result set can be filtered by logically combining POV filters or back reference filters:

Conjunction (AND)

As we saw above, we can combine conditions:

FIND MusicalAnalysis WHICH HAS quality_factor>0.5 AND date IN 2019

In general, the conjunction takes the form FIND <eexpr> WHICH <filter1> AND <filter2>. You can also use & instead of AND or chain more than two conditions. If you mix conjunctions with disjunctions, you need to add brackets to define the priority. For example: FIND <eexpr> WHICH (<filter1> AND <filter2>) OR <filter3>.

FIND Guitar WHICH REFERENCES Manufacturer AND price is a combination of a reference filter and a POV filter. For readability, you can also write
FIND Guitar WHICH REFERENCES Manufacturer AND WHICH HAS A price. However, the additional “WHICH HAS A” is purely cosmetic (syntactic sugar).

Disjunction (OR)

The rules for disjunctions (OR or |) are the same as for conjunctions, see above.

  • FIND ename1 WHICH HAS A PROPERTY pname1=val1 OR A PROPERTY pname2=val2 Or A PROPERTY...

  • FIND ename1 WHICH HAS A PROPERTY pname1=val1 OR A pname2=val2 OR ...

  • FIND ename1 . pname1=val1 | pname2=val2 | ...

Negation (NOT)

You can negate any filter by prefixing the filter with NOT or !: FIND <eexpr> WHICH NOT <filter1>.

There are many syntactic sugar alternatives which are treated the same as “NOT”:

  • DOES NOT HAVE

  • ISN'T

  • and many more

  • FIND ename1 WHICH DOES NOT HAVE A PROPERTY pname1=val1

  • FIND ename1 WHICH DOESN'T HAVE A pname1=val1

  • FIND ename1 . NOT pname2=val2

  • FIND ename1 . !pname2=val2

Parentheses

Basically, you can put parantheses around filter expressions and con- or disjunctions.

  • FIND Guitar WHICH (REFERENCES Manufacturer AND WHICH HAS A price).

  • FIND Guitar WHICH (REFERENCES Manufacturer) AND (WHICH HAS A price).

For better readability, the above query can be written as:

  • FIND Guitar WHICH (REFERENCES Manufacturer AND HAS A price). Note, that without syntactic sugar this query looks like:

  • FIND Guitar WHICH (REFERENCES Manufacturer AND price).

  • FIND ename1 WHICH HAS A pname1=val1 AND DOESN'T HAVE A pname2<val2 AND ((WHICH HAS A pname3=val3 AND A pname4=val4) OR DOES NOT HAVE A (pname5=val5 AND pname6=val6))

  • FIND ename1 . pname1=val1 & !pname2<val2 & ((pname3=val3 & pname4=val4) | !(pname5=val5 & pname6=val6))

  • FIND ename1.pname1=val1&!pname2<val2&((pname3=val3&pname4=val4)|!(pname5=val5&pname6=val6))

A Few Important Expressions

  • A:: The indistinct article. This is only syntactic suger. Equivalent expressions: A, AN

  • AND:: The logical and. Equivalent expressions: AND, &

  • FIND:: The beginning of the query.

  • NOT:: The logical negation. Equivalent expressions: NOT, DOESN'T HAVE A PROPERTY, DOES NOT HAVE A PROPERTY, DOESN'T HAVE A, DOES NOT HAVE A, DOES NOT, DOESN'T, IS NOT, ISN'T, !

  • OR:: The logical or. Equivalent expressions: OR, |

  • RECORD,RECORDTYPE,FILE,PROPERTY:: Role expression for restricting the result set to a specific role.

  • WHICH:: The marker for the beginning of the filters. Equivalent expressions: WHICH, WHICH HAS A, WHICH HAS A PROPERTY, WHERE, WITH (A), .

  • REFERENCE:: This one is tricky: REFERENCE TO expresses a the state of having a reference property. REFERENCED BY expresses the state of being referenced by another entity.

  • COUNT:: COUNT works like FIND but doesn’t return the entities.

Select Queries

In contrast to FIND queries, which always return the complete entity, there are SELECT queries which only return the entity with only those properties which are specified in the query. The syntax is very similar to FIND queries - just replace the FIND by SELECT <comma separated list of selectors> FROM:

SELECT p1, p2, p3 FROM ename

However, the SELECT query can also return properties of referenced entities and thereby are a means of joining entities together and return a custom view or projection:

SELECT Conductor.Last Name FROM Experiment

would return the conductor’s last name, when Conductor is a reference property of Experiment and Last Name is a property of the Conductor records.

Selectors

Selectors are strings of entity names which are separated by . (dot). E.g. Conductor.Last Name or Conductor.Address or even Experiment.Conductor.Last name. Selectors in a SELECT queries are separated by , (comma). E.g. Conductor.First Name, Conductor.Last Name.

Evaluation of Selectors

The query will return all those properties which have the same name as specified by the selector (case-insensitive). However, SELECT queries are also capable of subtyping in the selectors:

SELECT Person FROM Experiment would return all Person properties but all Conductors as well, if Conductor is a child of Person.

Note: When a property responsible with data type Person exists, the above SELECT statement would not include records that use this property (since responsible is not a child of Person).

Versioning

Since Caosdb 0.2 entities are optionally version controlled. The query language will be extended to include versioning in the future. A current minimal implementation introduces the ANY VERSION OF modifier which can be used to return all matching versions in the results of COUNT, FIND, and SELECT queries.

Example

  • FIND ANY VERSION OF RECORD WITH pname=value returns the all past and present versions of records with pname =  value.

Scope and current limitations

  • The ANY VERSION OF modifier currently the only expression for taking the versioning into account when using the query language.

  • Subproperties are not supported yet, e.g. FIND ANY VERSION OF ENTITY WHICH IS REFERENCED BY ename WITH .... This applies to all cases where you specify properties of referenced entities or referencing entities.

Configuration

In CaosDB Server implementations before version 0.9.0, the FIND ename query would return any entity with that name and all children, regardless of the entity’s role. Basically, FIND ename was equivalent to FIND ENTITY ename. Since 0.9.0 the default behavior has changed and now FIND ename is equivalent to FIND RECORD ename. This default is, however, configurable via the FIND_QUERY_DEFAULT_ROLE server property. See Server Configuration.

Future

  • Additional versioning queries:

    • Add (LATEST|LAST|OLDEST|NEWEST|FIRST) VERSION OF modifiers.

    • Add (ANY|LATEST|LAST|OLDEST|NEWEST|FIRST) VERSION (BEFORE|AFTER) (<timestamp>|<transaction id>|<entity@version>) OF modifier.

    • Add support for subproperties, e.g. FIND ANY VERSION OF ENTITY WHICH IS REFERENCED BY ename WITH ....

    • Find deleted entities: FIND ename WHICH WAS DELETED (BY ME | ON 2014-12-24)

  • More Logic, especially ANY, ALL, NONE, and SUCH THAT key words (and equivalents) for logical quantisation: FIND ename1 SUCH THAT ALL ename2 WHICH HAVE A REFERENCE TO ename1 HAVE A pname=val. This is like FIND experiment SUCH THAT ALL person WHICH ARE REFERENCED BY THIS experiment AS conductor HAVE AN 'academic title'=professor.