CaosDB’s Internal Structure

The CaosDB server

  • builds upon the Restlet framework to provide a REST interface to the network. See the HTTP Resources section for more information.

  • uses an SQL database (MariaDB or MySQL) as the backend for data storage. This is documented in the MySQL Backend section.

  • has an internal scheduling framework to organize the required backend jobs. Read more on this in the Transactions and Schedules section.

  • may use a number of authentication providers. Documentation for this is still to come.

HTTP Resources

HTTP resources are implemented in the resource package, in classes inheriting from AbstractCaosDBServerResource (which inherits Restlet’s Resource class). The main CaosDBServer class defines which HTTP resource (for example /Entity/{specifier}) will be handled by which class (EntityResource in this case).

Implementing classes need to override methods such as httpGetInChildClass() (or the corresponding methods for other HTTP request methods such as POST, PUT, …). Typically, these methods call the execute() method of a Transaction object. Transactions are explained in detail in the Transactions and Schedules section.

@startuml
abstract AbstractCaosDBServerResource {
  {abstract} httpGetInChildClass()
  {abstract} httpPostInChildClass()
  {abstract} ...InChildClass()
}
abstract RetrieveEntityResource
class EntityResource
AbstractCaosDBServerResource <|-- RetrieveEntityResource
RetrieveEntityResource <|-- EntityResource
@enduml
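
The dispatch pattern described above can be sketched in plain Java. This is a simplified illustration that does not use Restlet. Only the idea of per-method hooks such as httpGetInChildClass() comes from the text; the classes ResourceSketch and EntityResourceSketch, the handle() method, and the string return values are assumptions made for the example.

```java
// Simplified stand-in for AbstractCaosDBServerResource (the real class builds on Restlet).
abstract class ResourceSketch {
  // Hypothetical entry point: routes an HTTP method name to the matching hook.
  final String handle(String httpMethod, String specifier) {
    switch (httpMethod) {
      case "GET":
        return httpGetInChildClass(specifier);
      case "POST":
        return httpPostInChildClass(specifier);
      default:
        return "405 Method Not Allowed";
    }
  }

  // Hooks that every child class has to implement.
  protected abstract String httpGetInChildClass(String specifier);

  protected abstract String httpPostInChildClass(String specifier);
}

// Stand-in for EntityResource, which handles /Entity/{specifier}.
class EntityResourceSketch extends ResourceSketch {
  @Override
  protected String httpGetInChildClass(String specifier) {
    // A real resource would typically create a Transaction and call execute() here.
    return "200 OK: entity " + specifier;
  }

  @Override
  protected String httpPostInChildClass(String specifier) {
    // Not supported in this sketch.
    return "405 Method Not Allowed";
  }
}
```

Calling new EntityResourceSketch().handle("GET", "42") ends up in the child's httpGetInChildClass() method, which is the only place that knows how to serve the request.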

MySQL Backend

The MySQL backend in CaosDB may be substituted by other backends, but at the time of writing, only MySQL is implemented (MariaDB is used for testing). The following main packages handle the backend:

backend.interfaces

Interfaces which backends may implement. The main method for most interfaces is execute(...), with arguments depending on the specific interface. Benchmarking methods (getBenchmark() and setTransactionBenchmark(b)) may also be required.

backend.implementation.MySQL

MySQL implementations of the interfaces. Typical “simple” implementations create a prepared SQL statement from the arguments to execute(...) and send it to the SQL server. They may also have methods for undoing and cleanup, using an UndoHandler.

backend.transaction classes

Subclasses of the abstract BackendTransaction which implement the execute() method. These classes may use specific backend implementations (such as the MySQL implementations) to interact with the backend database.

For example, the structure when getting an Entity ID by name looks like this:

@startuml
together {
  abstract BackendTransaction {
    HashMap impl // stores all implementations
    {abstract} execute()
  }
  note left of BackendTransaction::impl
    Stores the
    implementation
    for each
    interface.
  end note
  package ...backend.interfaces {
    interface GetIDByNameImpl {
      {abstract} execute(String name, String role, String limit)
    }
  }
}
together {
  package ...backend.transaction {
    class GetIDByName extends BackendTransaction {
      execute()
    }
  }
  package ...backend.implementation.MySQL {
    class MySQLGetIDByName implements GetIDByNameImpl {
      execute(String name, String role, String limit)
    }
  }
}

GetIDByName::execute --r-> MySQLGetIDByName
@enduml
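
The structure in the diagram can be sketched in Java as follows. The names GetIDByNameImpl and the execute(String name, String role, String limit) signature are taken from the diagram. Everything else is an assumption for illustration: the Integer return type, the static registry with setImpl() and getImplementation(), and the in-memory implementation, which stands in for MySQLGetIDByName so that the example runs without a database.

```java
import java.util.HashMap;
import java.util.Map;

// Interface a backend must implement (the return type is assumed).
interface GetIDByNameImpl {
  Integer execute(String name, String role, String limit);
}

// Hypothetical in-memory backend. MySQLGetIDByName would instead build a
// prepared SQL statement from the arguments and send it to the SQL server.
class InMemoryGetIDByName implements GetIDByNameImpl {
  private final Map<String, Integer> idsByName = new HashMap<>();

  InMemoryGetIDByName() {
    idsByName.put("Experiment", 101);
  }

  @Override
  public Integer execute(String name, String role, String limit) {
    // role and limit are ignored in this sketch.
    return idsByName.get(name);
  }
}

abstract class BackendTransactionSketch {
  // Stores the implementation for each interface (static here by assumption).
  private static final Map<Class<?>, Object> impl = new HashMap<>();

  static <T> void setImpl(Class<T> iface, T implementation) {
    impl.put(iface, implementation);
  }

  protected <T> T getImplementation(Class<T> iface) {
    T found = iface.cast(impl.get(iface));
    if (found == null) {
      throw new IllegalStateException("No implementation for " + iface.getName());
    }
    return found;
  }

  abstract void execute();
}

// Backend transaction which is independent of the concrete backend.
class GetIDByNameSketch extends BackendTransactionSketch {
  private final String name;
  private Integer id;

  GetIDByNameSketch(String name) {
    this.name = name;
  }

  @Override
  void execute() {
    id = getImplementation(GetIDByNameImpl.class).execute(name, null, "1");
  }

  Integer getId() {
    return id;
  }
}
```

The point of the indirection is that GetIDByNameSketch only depends on the interface, so a different backend can be plugged in by registering another implementation.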

Transactions and Schedules

In CaosDB, several client requests may be handled concurrently. This poses no problem as long as only read-only requests are processed, but write transactions need to block other requests. Therefore, all transactions (between their first and last access) block write transactions other than themselves from writing to the backend. Read transactions may happen at any time, except while a write transaction actually writes to the backend.

Note

There is a fine distinction between write transactions on the CaosDB server and actually writing to the backend: even transactions which need only very short write access to the backend may require extensive read access beforehand, for example to check permissions or to check whether the intended write action makes sense (linked entities must exist, they may need to be of the correct RecordType, etc.).
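
One way to picture this two-level blocking is the following sketch. It is not CaosDB's actual locking code: the class AccessSketch and all of its methods are hypothetical, built from the standard java.util.concurrent locks. A write transaction holds one lock for its whole duration, so that only one write transaction is active at a time, but it takes the exclusive backend lock only for the short phase in which it actually writes.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

// Hypothetical sketch of the blocking scheme, not the real CaosDB classes.
class AccessSketch {
  // Held by a write transaction from its first to its last access.
  private final ReentrantLock writeTransactionLock = new ReentrantLock();
  // Shared by readers, taken exclusively only while actually writing.
  private final ReentrantReadWriteLock backendLock = new ReentrantReadWriteLock();

  // Read access: many readers may hold the read lock at the same time.
  <T> T read(Supplier<T> query) {
    backendLock.readLock().lock();
    try {
      return query.get();
    } finally {
      backendLock.readLock().unlock();
    }
  }

  // Write transaction: a long read-only check phase, then a short write phase.
  void writeTransaction(Runnable checks, Runnable write) {
    writeTransactionLock.lock();
    try {
      // Check phase: other readers can still proceed.
      read(() -> {
        checks.run();
        return null;
      });
      // Write phase: waits until all readers are done, then blocks new ones.
      backendLock.writeLock().lock();
      try {
        write.run();
      } finally {
        backendLock.writeLock().unlock();
      }
    } finally {
      writeTransactionLock.unlock();
    }
  }

  boolean isBackendWriteLocked() {
    return backendLock.isWriteLocked();
  }
}
```

Because no other write transaction can start between the check phase and the write phase, the results of the checks are still valid when the write happens.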

The request handling in CaosDB is organized in the following way:

  • HTTP resources usually create a Transaction object and call its Transaction.execute() method. Entities are passed to and from the transaction via TransactionContainers (basically normal Containers, enriched with some metadata).

  • The Transaction keeps a Schedule of related Jobs (each also wrapping a specific Transaction), which may be called at different stages, called TransactionStages.

  • The Transaction’s execute() method, when called, in turn calls a number of methods for initialization, checks, preparations, cleanup, etc. Additionally, the scheduled jobs are executed at their specified stages, for example all jobs scheduled for INIT are executed immediately after calling Transaction.init(). Please consult the API documentation for Transaction.execute() for details.

    Most importantly, execute() calls the (abstract) method transaction(). In inheriting classes, transaction() typically interacts with the backend via execute(BackendTransaction, Access), which in turn calls the BackendTransaction’s BackendTransaction.executeTransaction() method (just a thin wrapper around its execute() method).

In summary, the classes are connected like this:

@startuml
hide empty members

class Container
class TransactionContainer extends Container

abstract Transaction {
  Schedule schedule
  TransactionContainer container
  execute()
  execute(BackendTransaction t, Access access)\n    // -> t.executeTransaction(t)
}

class Schedule
class ScheduledJob
abstract Job {
  TransactionStage stage
  Transaction transaction
  execute(BackendTransaction t)\n    // -> transaction.execute(t, transaction.access)
}

Schedule "*" *- ScheduledJob
ScheduledJob *- Job
Job o--d- Transaction

TransactionContainer -* Transaction::container
Transaction::schedule *- Schedule
@enduml
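
The interplay of a transaction, its schedule, and the stages can be sketched as follows. This is a strongly simplified illustration under several assumptions: only the INIT stage is named in the text, so the CHECK and CLEANUP stages are illustrative; jobs are plain Runnables instead of Job objects wrapping a Transaction; and all class names ending in Sketch are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Only INIT is named in the text; the other stages are illustrative.
enum StageSketch { INIT, CHECK, CLEANUP }

// Keeps the jobs of one transaction, grouped by the stage at which they run.
class ScheduleSketch {
  private final Map<StageSketch, List<Runnable>> jobs = new EnumMap<>(StageSketch.class);

  void add(StageSketch stage, Runnable job) {
    jobs.computeIfAbsent(stage, s -> new ArrayList<>()).add(job);
  }

  void runJobs(StageSketch stage) {
    List<Runnable> scheduled = jobs.getOrDefault(stage, Collections.emptyList());
    for (Runnable job : scheduled) {
      job.run();
    }
  }
}

abstract class TransactionSketch {
  final ScheduleSketch schedule = new ScheduleSketch();
  final List<String> log = new ArrayList<>();

  // Template method: fixed order of stages, with jobs run at their stage.
  final void execute() {
    try {
      init();
      schedule.runJobs(StageSketch.INIT); // immediately after init()
      schedule.runJobs(StageSketch.CHECK);
      transaction();
    } finally {
      schedule.runJobs(StageSketch.CLEANUP);
    }
  }

  void init() {
    log.add("init");
  }

  // Inheriting classes interact with the backend here.
  protected abstract void transaction();
}

class RetrieveSketch extends TransactionSketch {
  @Override
  protected void transaction() {
    log.add("transaction");
  }
}
```

Jobs may be added in any order, since the stage they were scheduled for determines when execute() runs them.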