Benchmarking CaosDB
===================

Benchmarking CaosDB may encompass several distinct areas: How much time
is spent in the server’s Java code, how much time is spent inside the
SQL backend, are the same costly methods called more than once? This
documentation tries to answer some questions connected with these
benchmarking aspects and give you the tools to answer your own
questions.

Before you start
----------------

In order to obtain meaningful results, you should disable caching.

MariaDB
~~~~~~~

Set the corresponding variable to 0: ``SET GLOBAL query_cache_type = 0;``

Java Server
~~~~~~~~~~~

In the config:

.. code:: cfg

   CACHE_DISABLE=true

Tools for the benchmarking
--------------------------

For averaging over many runs of comparable requests and for putting the
database into a representative state, Python scripts are used. The
scripts can be found in the ``caosdb-dev-tools`` repository, located at
https://gitlab.indiscale.com/caosdb/src/caosdb-dev-tools in the folder
``benchmarking``:

Python Script ``fill_database.py``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This commandline script is meant for filling the database with enough
data to represeny an actual real-life case, it can easily create
hundreds of thousands of Entities.

The script inserts predefined amounts of randomized Entities into the
database, RecordTypes, Properties and Records. Each Record has a random
(but with defined average) number of Properties, some of which may be
references to other Records which have been inserted before. Actual
insertion of the Entities into CaosDB is done in chunks of a defined
size.

Users can tell the script to store times needed for the insertion of
each chunk into a tsv file.

Python Script ``measure_execution_time.py``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A somewhat outdated script which executes a given query a number of
times and then save statistics about the ``TransactionBenchmark``
readings (see below for more information about the transaction
benchmarks) delivered by the server.

Python Script ``sql_routine_measurement.py``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Simply call ``./sql_routine_measurement.py`` in the scripts directory.
An sql file is automatically executed which enables the correct
``performance_schema`` tables. However, the performance_schema of
mariadb needs to be enabled. Add ``performance_schema=ON`` to the
configuration file of mariadb as it needs to be enabled on start up.
This script expects the MariaDB server to be accessible on 127.0.0.1
with the default caosdb user and password (caosdb;random1234).

You might consider to increase
``performance_schema_events_transactions_history_long_size``.

::

   performance_schema_events_transactions_history_long_size=1000000

The performance schema must be enabled (see below).

MariaDB General Query Log
~~~~~~~~~~~~~~~~~~~~~~~~~

MariaDB and MySQL have a feature to enable the logging of SQL queries’
times. This logging must be turned on on the SQL server as described in
the `upstream
documentation <https://mariadb.com/kb/en/general-query-log/>`__: Add to
the mysql configuration:

::

   log_output=TABLE
   general_log

or calling

.. code:: sql

   SET GLOBAL log_output = 'TABLE';
   SET GLOBAL general_log = 'ON';

In the Docker environment LinkAhead, this can conveniently be done with
``linkahead mysqllog {on,off,store}``.

MariaDB Slow Query Log
~~~~~~~~~~~~~~~~~~~~~~

See `slow query log
docs <https://mariadb.com/kb/en/slow-query-log-overview/>`__

MariaDB Performance Schema
~~~~~~~~~~~~~~~~~~~~~~~~~~

The most detailed information on execution times can be acquired using
the performance schema.

To use it, the ``performance_schema`` setting in the MariaDB server must
be
enabled(`docs <https://mariadb.com/kb/en/performance-schema-overview/#enabling-the-performance-schema>`__,
for example by setting this in the config files:

::

   [mysqld]

   performance_schema=ON

The performance schema provides many different tables in the
``performance_schema``. You can instruct MariaDB to create those tables
by setting the appropriate ``instrument`` and ``consumer`` variables.
E.g.

.. code:: sql

   update performance_schema.setup_instruments set enabled='YES', timed='YES' WHERE NAME LIKE '%statement%';
   update performance_schema.setup_consumers set enabled='YES' WHERE NAME LIKE '%statement%';

This can also be done via the configuration.

::

   [mysqld]

   performance_schema=ON
   performance-schema-instrument='statement/%=ON'
   performance-schema-consumer-events-statements-history=ON                        
   performance-schema-consumer-events-statements-history-long=ON

You may want to look at the result of the following commands:

.. code:: sql


   select * from performance_schema.setup_consumers;
   select * from performance_schema.setup_instruments;

Note, that the ``base_settings.sql`` enables appropriate instruments and
consumers.

Before you start a measurement, you will want to empty the tables. E.g.:

.. code:: sql

   truncate table  performance_schema.events_statements_history_long ;

The procedure ``reset_stats`` in ``base_settings.sql`` clears the
typically used ones.

The tables contain many columns. An example to get an informative view
is

.. code:: sql

   select left(sql_text,50), left(digest_text,50), ms(timer_wait) from performance_schema.events_statements_history_long order by ms(timer_wait);

where the function ``ms`` is defined in ``base_settings.sql``. Or a very
useful one:

.. code:: sql

   select  left(digest_text,100) as digest,ms(sum_timer_wait) as time_ms, count_star from performance_schema.events_statements_summary_by_digest order by time_ms;

Useful SQL configuration with docker
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In order to allow easy testing and debugging the following is useful
when using docker. Change the docker-compose file to include the
following for the mariadb service:

::

       networks:
         # available on port 3306, host name 'sqldb'
         - caosnet
       ports:
         - 3306:3306

Check it with ``mysql -ucaosdb -prandom1234 -h127.0.0.1 caosdb`` Add the
appropriate changes (e.g. ``performance_schema=ON``) to
``profiles/empty/custom/mariadb.conf.d/mariadb.cnf`` (or in the profile
folder that you use).

Manual Java-side benchmarking
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Benchmarking can be done using the ``TransactionBenchmark`` class (in
package ``org.caosdb.server.database.misc``).

-  Single timings can be added to instances of that class via the
   ``addBenchmark(object, time)`` method. Multiple benchmarks for the
   same object (typically just strings) can be averaged.
-  Benchmarks can be serialized into XML, ``Container`` and ``Query``
   objects already use this with their included benchmarks to output
   benchmarking results.
-  To work with the benchmarks of often used objects, use these methods:

   -  ``Container.getTransactionBenchmark().addBenchmark()``
   -  ``Query.addBenchmark()``

To enable transaction benchmarks and disable caching in the server, set
these server settings:

.. code:: cfg

   TRANSACTION_BENCHMARK_ENABLED=true
   CACHE_DISABLE=true

Additionally, the server should be started via ``make run-debug``
(instead of ``make run-single``), otherwise the benchmarking will not be
active.

Notable benchmarks and where to find them
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

+----------------------+---------------------------+------------------+
| Name                 | Where measured            | What measured    |
+======================+===========================+==================+
| ``Retrieve.init``    | transac                   | transaction/     |
|                      | tion/Transaction.java#135 | Retrieve.java#48 |
+----------------------+---------------------------+------------------+
| ``Re                 | transac                   | transaction/R    |
| trieve.transaction`` | tion/Transaction.java#174 | etrieve.java#133 |
+----------------------+---------------------------+------------------+
| ``Retriev            | transac                   | transaction/     |
| e.post_transaction`` | tion/Transaction.java#182 | Retrieve.java#77 |
+----------------------+---------------------------+------------------+
| ``EntityResource.h   | resource/transactio       | all except XML   |
| ttpGetInChildClass`` | n/EntityResource.java#118 | generation       |
+----------------------+---------------------------+------------------+
| ``ExecuteQuery``     | ?                         | ?                |
+----------------------+---------------------------+------------------+
|                      |                           |                  |
+----------------------+---------------------------+------------------+

External JVM profilers
~~~~~~~~~~~~~~~~~~~~~~

Additionally to the transaction benchmarks, it is possible to benchmark
the server execution via external Java profilers. For example,
`VisualVM <https://visualvm.github.io/>`__ can connect to JVMs running
locally or remotely (e.g. in a Docker container). To enable this in
LinkAhead’s Docker environment, set

.. code:: yaml

   devel:
     profiler: true

Alternatively, start the server (without docker) with the
``run-debug-single`` make target, it will expose the JMX interface, by
default on port 9090.

Most profilers, like as VisualVM, only gather cumulative data for call
trees, they do not provide complete call graphs (as
callgrind/kcachegrind would do). They also do not differentiate between
calls with different query strings, as long as the Java process flow is
the same (for example, ``FIND Record 1234`` and
``FIND Record A WHICH HAS A Property B WHICH HAS A Property C>100``
would be handled equally).

Example settings for VisualVM
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In the sampler settings, you may want to add these expressions to the
blocked packages: ``org.restlet.**, com.mysql.**``. Branches on the call
tree which are entirely inside the blacklist, will become leaves.
Alternatively, specify a whitelist, for example with
``org.caosdb.server.database.backend.implementation.**``, if you only
want to see the time spent for certain MySQL calls.

How to set up a representative database
---------------------------------------

For reproducible results, it makes sense to start off with an empty
database and fill it using the ``fill_database.py`` script, for example
like this:

.. code:: sh

   ./fill_database.py -t 500 -p 700 -r 10000 -s 100 --clean

The ``--clean`` argument is not strictly necessary when the database was
empty before, but it may make sense when there have been previous runs
of the command. This example would create 500 RecordTypes, 700
Properties and 10000 Records with randomized properties, everything is
inserted in chunks of 100 Entities.

How to measure request times
----------------------------

If the execution of the Java components is of interest, the VisualVM
profiler should be started and connected to the server before any
requests to the server are started.

When doing performance tests which are used for detailed analysis, it is
important that

1. CaosDB is in a reproducible state, which should be documented
2. all measurements are repeated several times to account for inevitable
   variance in access (for example file system caching, network
   variablity etc.)

Filling the database
~~~~~~~~~~~~~~~~~~~~

By simply adding the option ``-T logfile.tsv`` to the
``fill_database.py`` command above, the times for inserting the records
are stored in a tsv file and can be analyzed later.

Obtain statistics about a query
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To repeat single queries a number of times,
``measure_execution_time.py`` can be used, for example:

.. code:: sh

   ./measure_execution_time.py -n 120 -q "FIND MusicalInstrument WHICH IS REFERENCED BY Analysis"

This command executes the query 120 times, additional arguments could
even plot the TransactionBenchmark results directly.

On method calling order and benchmarked events
----------------------------------------------

-  ``Transaction.execute()`` :: Logs benchmarks for events like:

   -  ``INIT`` :: The transaction’s ``init()`` method.
   -  ``PRE_CHECK``
   -  ``CHECK``
   -  ``POST_CHECK``
   -  ``PRE_TRANSACTION``
   -  ``TRANSACTION`` -> typically calls
      ``database.backend.transaction.[BackendTransaction].execute()``,
      which in turn calls, some levels deeper,
      ``backend.transaction.....execute(<k extends   BackendTransaction> t)``
      -> see next point
   -  …

-  ``backend.transaction.[...].execute(transaction)`` :: This method is
   benchmarked again (via parent class ``BackendTransaction``), this is
   probably the deepest level of benchmarking currently (Benchmark is
   logged as e.g. ``<RetrieveFullEntity>...</>``). It finally calls
   ``[MySQLTransaction].execute()``.
-  ``[MySQLTransaction].execute()`` :: This is the deepest backend
   implementation part, it typically creates a prepared statement and
   executes it.
-  Currently not benchmarked separately:

   -  Getting the actual implementation (probably fast?)
   -  Preparing the SQL statement
   -  Executing the SQL statement
   -  Java-side caching

What is measured
----------------

For a consistent interpretation, the exact definitions of the measured
times are as follows:

SQL logs
~~~~~~~~

As per https://mariadb.com/kb/en/general-query-log, the logs store only
the time at which the SQL server received a query, not the duration of
the query.

Possible future enhancements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-  The ``query_response_time`` plugin may be additionally used in the
   future, see https://mariadb.com/kb/en/query-response-time-plugin

Transaction benchmarks
~~~~~~~~~~~~~~~~~~~~~~

Transaction benchmarking manually collects timing information for each
transaction. At defined points, different measurements can be made,
accumulated and will finally be returned to the client. Benchmark
objects may consist of sub benchmarks and have a number of measurement
objects, which contain the actual statistics.

Because transaction benchmarks must be manually added to the server
code, they only monitor those code paths where they are added. On the
other hand, their manual nature allows for a more abstracted analysis of
performance bottlenecks.

Java profiler
~~~~~~~~~~~~~

VisualVM records for each thread the call tree, specifically which
methods were called how often and how much time was spent inside these
methods.

Global requests
~~~~~~~~~~~~~~~

Python scripts may measure the global time needed for the execution of
each request. ``fill_database.py`` obtains its numbers this way.