Curl Access for CaosDB

Because the API used to communicate with the CaosDB server is XML over HTTP, simple HTTP clients such as cURL can also be used. cURL is a long-established command-line program for transferring data over networks; it implements various protocols, including HTTP and HTTPS. It is installed by default on many Linux distributions and is therefore very useful for testing and debugging CaosDB.

This small manual also gives some practical insights into the CaosDB protocol itself.

Doing a simple retrieve

So, let’s start right away with a few basic examples.

Let’s do a query on our demo instance:

curl "https://demo.indiscale.com/Entity/?query=FIND%20Experiment"

By default, cURL sends an HTTP GET request, which is what CaosDB expects for queries. Queries must be sent to /Entity. The query itself is passed in the URL's query string, after ?query=. %20 is the URL encoding (percent-encoding) of a space (see https://en.wikipedia.org/wiki/Percent-encoding for details). So the actual query was FIND Experiment, which should return all entities that are named Experiment or have Experiment as a parent.
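Encoding a query by hand quickly becomes tedious. As a minimal sketch, the URL from above can be assembled in the shell. The variable names BASE, QUERY and ENCODED are only illustrative, and the sed call handles spaces only, not other reserved characters.

```shell
# Base URL of the demo instance and the query in readable form.
BASE="https://demo.indiscale.com/Entity/"
QUERY="FIND Experiment"

# Replace every space by %20. This is NOT a general URL encoder:
# other reserved characters (&, =, #, ...) would need encoding too.
ENCODED=$(printf '%s' "$QUERY" | sed 's/ /%20/g')
URL="${BASE}?query=${ENCODED}"

echo "$URL"
# The request itself would then be: curl "$URL"
```

For queries containing other special characters, cURL can do the encoding itself: with -G --data-urlencode "query=FIND Experiment" the encoded field is appended to the URL and a GET request is still sent. Either way, the request should be equivalent to the one above.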

The response should look like:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="https://demo.indiscale.com/webinterface/webcaosdb.xsl" ?>
<Response srid="b56e3d1a442460c46dde924a54e8afba" timestamp="1561453363382" baseuri="https://demo.indiscale.com" count="3">
  <UserInfo>
    <Roles>
      <Role>anonymous</Role>
    </Roles>
  </UserInfo>
  <Query string="FIND Experiment" results="3">
    <ParseTree>(cq FIND (entity Experiment) &lt;EOF&gt;)</ParseTree>
    <Role />
    <Entity>Experiment</Entity>
    <TransactionBenchmark since="Tue Jun 25 11:02:43 CEST 2019" />
  </Query>
  <RecordType id="212" name="Experiment">
    <Permissions />
  </RecordType>
  <RecordType id="215" name="UnicornExperiment">
    <Permissions />
    <Parent id="212" name="Experiment" />
  </RecordType>
  <Record id="230">
    <Permissions />
    <Parent id="215" name="UnicornExperiment" />
    <Property id="216" name="date" datatype="DATETIME" importance="FIX">
      2025-10-11
      <Permissions />
    </Property>
    <Property id="217" name="species" datatype="TEXT" importance="FIX">
      Unicorn
      <Permissions />
    </Property>
    <Property id="214" name="Conductor" datatype="TEXT" importance="FIX">
      Anton Beta
      <Permissions />
    </Property>
    <Property id="221" name="Photo" datatype="Photo" importance="FIX">
      226
      <Permissions />
    </Property>
    <Property id="220" name="LabNotes" datatype="LabNotes" importance="FIX">
      227
      <Permissions />
    </Property>
    <Property id="218" name="UnicornVideo" datatype="UnicornVideo" importance="FIX">
      228
      <Permissions />
    </Property>
    <Property id="219" name="UnicornECG" datatype="UnicornECG" importance="FIX">
      229
      <Permissions />
    </Property>
  </Record>
</Response>

We can see that two RecordTypes and one Record are returned. In addition, the response contains some further information:

  • Attributes in the Response tag:
    • srid="b56e3d1a442460c46dde924a54e8afba" A unique identifier for this request.
    • timestamp="1561453363382" The UNIX timestamp of this request, in milliseconds.
    • baseuri="https://demo.indiscale.com" The base URI of the instance of CaosDB that performed the request.
    • count="3" The number of results (2 RecordTypes and 1 Record)
  • Information about the user in UserInfo. In this case we have not logged in, so we are anonymous.
  • Detailed information about the query. This includes, for example, the parse tree and can be used for debugging and testing. Depending on the settings of the server instance, this tag contains more or less detail, for example a more detailed transaction benchmark.
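The attributes of the Response tag can be picked out of a saved response with standard tools. The following is a rough sketch that works on a sample file containing only the opening tag from the example; the file name sample_response.xml and the variable names are assumptions. It treats the timestamp as milliseconds, which matches the date in the TransactionBenchmark tag of the example. Parsing XML with sed is fragile and only meant for quick checks.

```shell
# Sample data: the opening Response tag from the example above.
# In practice the file would come from: curl -s "$URL" > sample_response.xml
cat > sample_response.xml <<'EOF'
<Response srid="b56e3d1a442460c46dde924a54e8afba" timestamp="1561453363382" baseuri="https://demo.indiscale.com" count="3">
</Response>
EOF

# Extract attribute values from the Response tag with sed.
COUNT=$(sed -n 's/.*<Response[^>]* count="\([0-9]*\)".*/\1/p' sample_response.xml)
TIMESTAMP_MS=$(sed -n 's/.*<Response[^>]* timestamp="\([0-9]*\)".*/\1/p' sample_response.xml)

# The timestamp is given in milliseconds; convert it to seconds.
TIMESTAMP_S=$((TIMESTAMP_MS / 1000))

echo "results: $COUNT"
echo "seconds since epoch: $TIMESTAMP_S"
# With GNU date, a readable form is available via: date -u -d "@$TIMESTAMP_S"
```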

More details about the retrieve

The cURL command used in the previous section relied on many of cURL’s default settings. Let’s take a closer look at the options. (I assigned the above URL to a shell variable to make the command more readable.)

URL="https://demo.indiscale.com/Entity/?query=FIND%20Experiment"

curl -X GET -b cookie.txt -D head.txt "$URL"

This command specifies three more options:

  • -X GET Do a GET request. This can of course be replaced by POST, PUT, DELETE or any other HTTP method.
  • -b cookie.txt This instructs cURL to use cookies from the file cookie.txt (which we don’t have yet, see below)
  • -D head.txt Tell cURL to store the received header in the file head.txt.

Running this command gives us a response similar to the one in the previous section and additionally creates a file head.txt:

HTTP/1.1 200 OK
Content-Type: text/xml; charset=UTF-8
Date: Tue, 25 Jun 2019 09:17:23 GMT
Accept-Ranges: bytes
Server: Restlet-Framework/2.3.12
Vary: Accept-Charset, Accept-Encoding, Accept-Language, Accept
Transfer-Encoding: chunked

There is nothing special in this header. Most importantly, the request has led to a response without an HTTP error.
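When cURL is used in scripts, the status line of the stored header can be checked automatically. This is a small sketch using a sample header file; the file name sample_head.txt and the variable STATUS are assumptions made for illustration.

```shell
# Sample data: the first lines of the header shown above.
# In practice the file is written by: curl -D head.txt ...
printf 'HTTP/1.1 200 OK\r\nContent-Type: text/xml; charset=UTF-8\r\n' > sample_head.txt

# The status code is the second field of the status line.
STATUS=$(awk 'NR == 1 { print $2 }' sample_head.txt)

if [ "$STATUS" -ge 200 ] && [ "$STATUS" -lt 300 ]; then
    echo "request succeeded with status $STATUS"
else
    echo "request failed with status $STATUS" >&2
fi
```

Alternatively, cURL’s -f option makes the command itself exit with a non-zero code when the server answers with an HTTP error (status 400 or above).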

Logging in

You might have asked yourself what we need the cookie for. The simple answer is: authentication. For many operations on the CaosDB server, we have to log in first. The demo instance is configured to allow anonymous read access by default, but depending on the instance you are accessing, even this might be disallowed.

You can log in to the server using cURL with the following command:

URL="https://demo.indiscale.com/login"

curl -X POST -c cookie.txt -D head.txt -H "Content-Type: application/x-www-form-urlencoded" -d username=salexan -d "password=$PW" "$URL"

So here we are doing a POST request instead of a GET. We instruct cURL to store the cookies it receives in the file cookie.txt. This is done using the -c option (instead of the -b option above). This time we explicitly specify the content type of the data we are sending (in the POST request) with the -H option. The actual content consists of the two fields specified using the -d options. This boils down to the two key-value pairs “username=” and “password=”.

This time we are sending the information to a different context, identified by /login.

If you don’t want to supply your password in plain text, you can, for example, use a password manager. I use pass; the following command stores the password in the variable $PW:

PW=$(pass your/password/identifier)
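After the login request, it can be useful to check that cURL actually stored a cookie. The sketch below inspects a sample cookie jar in the Netscape format that cURL writes with -c. The file name sample_cookie.txt as well as the cookie name and value are made up for illustration; the cookies sent by a real CaosDB server will look different.

```shell
# Sample cookie jar in the Netscape format that cURL writes with -c.
# The cookie name and value are made up for illustration.
printf '# Netscape HTTP Cookie File\n' > sample_cookie.txt
printf '#HttpOnly_demo.indiscale.com\tFALSE\t/\tTRUE\t0\texample_session\texample_value\n' >> sample_cookie.txt

# Every cookie is a line with seven tab-separated fields;
# the sixth field holds the cookie name.
COOKIE_COUNT=$(awk -F '\t' 'NF == 7 { n++ } END { print n + 0 }' sample_cookie.txt)
COOKIE_NAMES=$(awk -F '\t' 'NF == 7 { print $6 }' sample_cookie.txt)

if [ "$COOKIE_COUNT" -gt 0 ]; then
    echo "stored cookies: $COOKIE_NAMES"
else
    echo "no cookies stored, login probably failed" >&2
fi
```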

Custom Certificates

If you are running your own CaosDB instance it might be necessary to use a custom SSL certificate.

You can specify this using the option --cacert, e.g.:

--cacert "/path/to/certificate/root.cert.pem"
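To avoid repeating the option in every call, it can be wrapped in a small shell function. This is only a sketch: the function name caosdb_curl, the variable CA_CERT and the URL in the usage comment are hypothetical, and the certificate path is the placeholder from above.

```shell
# Path to the CA certificate, taken from the example above.
CA_CERT="/path/to/certificate/root.cert.pem"

# Hypothetical helper: run cURL with the custom CA certificate and the
# stored cookies, passing all further arguments through unchanged.
caosdb_curl() {
    curl --cacert "$CA_CERT" -b cookie.txt "$@"
}

# Usage (not executed here):
# caosdb_curl "https://your.caosdb.example/Entity/?query=FIND%20Experiment"
```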

Uploading files

According to the specification, a file upload is a POST request with multipart form data. This can be achieved with cURL using the following simple command line:

curl -b cookie.txt \
     -F "FileRepresentation=<file.xml" \
     -F "testfile.bla=@testfile.bla" \
     "$URL"

Here I am using a previously stored cookie from cookie.txt.

There are two prerequisites for executing the above command:

  • The file representation, which in this case is stored in the file file.xml. The < prefix in -F sends the content of that file as the value of the form field FileRepresentation.
  • The actual file to be uploaded, here named testfile.bla. The @ prefix in -F attaches it as a file. The left-hand side of the assignment is very important: the identifier given here (in this case testfile.bla) is used in the XML to identify the file that is described in the corresponding XML tag.

Let’s have a look at the contents of file.xml:

<Post>
  <File upload="testfile.bla" destination="/test/testfile.bla" description="bla"
checksum="672f8ff4ae8530de295f9dd963724947841e6277edec3b21820b5e44d0a64baef90fb04e22048028453d715f79357acc5bd2d566fe6ede65f981ba3dda06bae4"
size="3"/>
</Post>

The attributes have the following meaning:

  • upload="testfile.bla" The value given here is not actually a filename, but an identifier used to find the multipart data chunk that contains the file. I called it testfile.bla for simplicity.
  • destination="/test/testfile.bla" The destination path on the CaosDB server file system.

Before looking at the other attributes let’s have a look at the file testfile.bla itself:

ok

The file has a size of 3 bytes, which can be verified on Linux using:

stat testfile.bla

Its SHA-512 hash sum is important for checking the file’s integrity after the transfer to the server. It can be computed on Linux using:

sha512sum testfile.bla

This information has to be supplied in the remaining attributes (checksum and size) of the XML.
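Instead of copying the size and checksum by hand, file.xml can be generated from the file itself. The following sketch assumes that testfile.bla contains the text ok followed by a newline (which is consistent with the size of 3) and that sha512sum from GNU coreutils is available; the variable names are illustrative.

```shell
FILE="testfile.bla"

# Sample content as in the example above: "ok" plus a newline, i.e. 3 bytes.
printf 'ok\n' > "$FILE"

# Size in bytes (tr strips the padding that some wc implementations add).
SIZE=$(wc -c < "$FILE" | tr -d ' ')

# SHA-512 checksum; sha512sum prints "<hash>  <filename>".
CHECKSUM=$(sha512sum "$FILE" | cut -d ' ' -f 1)

# Write the file representation with the computed values.
cat > file.xml <<EOF
<Post>
  <File upload="$FILE" destination="/test/$FILE" description="bla"
        checksum="$CHECKSUM"
        size="$SIZE"/>
</Post>
EOF

cat file.xml
```

The generated file.xml can then be passed to the upload command shown earlier in this section.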