caoscrawler.identifiable_adapters module

caoscrawler.identifiable_adapters.get_children_of_rt(rtname)

Supply the name of a record type. This name and the names of all child RecordTypes are returned in a list.
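To illustrate the shape of the returned list, the following minimal sketch collects a record type name together with the names of all its descendants from an in-memory mapping. The mapping `CHILDREN` and the helper `collect_rt_names` are hypothetical and only serve the example; the sketch does not reproduce how the library itself obtains the hierarchy.

```python
from typing import Dict, List

# Hypothetical hierarchy: parent record type name -> names of direct children.
CHILDREN: Dict[str, List[str]] = {
    "Experiment": ["Measurement", "Simulation"],
    "Measurement": ["TimeSeries"],
}


def collect_rt_names(rtname: str) -> List[str]:
    """Return rtname followed by the names of all descendant record types."""
    names = [rtname]
    for child in CHILDREN.get(rtname, []):
        names.extend(collect_rt_names(child))
    return names


names = collect_rt_names("Experiment")
# names == ["Experiment", "Measurement", "TimeSeries", "Simulation"]
```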

caoscrawler.identifiable_adapters.convert_value(value: Any) str

Return a string representation of the value suitable for the search query.

This is for search queries looking for the identified record.

Parameters:

value (Any) – The value to be converted.

Returns:

out – the string representation of the value.

Return type:

str
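A conversion of this kind might look like the sketch below. It is not the library's implementation: the function name `convert_value_sketch`, the upper-case spelling of booleans, the ISO format for datetimes and the quote escaping are all assumptions made for illustration.

```python
from datetime import datetime
from typing import Any


def convert_value_sketch(value: Any) -> str:
    """Hypothetical stand-in: render a value as text for a search query."""
    # bool is checked first so that it is not handled by the generic fallback.
    if isinstance(value, bool):
        return str(value).upper()  # assumed spelling: TRUE / FALSE
    if isinstance(value, datetime):
        return value.isoformat()  # assumed format: ISO 8601
    if isinstance(value, str):
        # Assumed escaping so the text can be placed inside single quotes.
        return value.replace("'", "\\'")
    return str(value)
```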

class caoscrawler.identifiable_adapters.IdentifiableAdapter

Bases: object

Base class for identifiable adapters.

Some terms:

  • A registered identifiable defines an identifiable template, for example by specifying:
    • Parent record types

    • Properties

    • is_referenced_by statements

  • An identifiable belongs to a concrete record. It consists of identifying attributes which “fill in” the registered identifiable. In code, it can be represented as a Record based on the registered identifiable with all the values filled in.

  • An identified record is the result of retrieving a record from the database, based on the identifiable (and its values).

General question to clarify:

  • Do we want to support multiple identifiables per RecordType?

  • Current implementation supports only one identifiable per RecordType.

The list of referenced by statements is currently not implemented.

The IdentifiableAdapter can be used to retrieve the three objects mentioned above (registered identifiable, identifiable and identified record) for a Record.
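The relationship between the three terms can be sketched with plain Python objects. All names here (`RegisteredIdentifiableSketch`, `IdentifiableSketch`, `fill_in`, `retrieve`, `DATABASE`) are hypothetical and are not part of caoscrawler; the dictionary merely stands in for the database.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class RegisteredIdentifiableSketch:
    """Template: the record type and the names of the identifying properties."""
    record_type: str
    property_names: Tuple[str, ...]


@dataclass(frozen=True)
class IdentifiableSketch:
    """The template with the values of one concrete record filled in."""
    record_type: str
    properties: Tuple[Tuple[str, object], ...]


def fill_in(template: RegisteredIdentifiableSketch,
            record: Dict[str, object]) -> IdentifiableSketch:
    """Pick only the identifying properties of the record."""
    return IdentifiableSketch(
        template.record_type,
        tuple((name, record[name]) for name in template.property_names),
    )


template = RegisteredIdentifiableSketch("Experiment", ("date", "operator"))
record = {"date": "2024-05-24", "operator": "A. Example", "comment": "ignored"}
ident = fill_in(template, record)

# Hypothetical database: identifiable -> id of the stored (identified) record.
DATABASE: Dict[IdentifiableSketch, int] = {ident: 101}


def retrieve(identifiable: IdentifiableSketch) -> Optional[int]:
    """Return the id of the identified record, or None if there is none."""
    return DATABASE.get(identifiable)
```

In this sketch the non-identifying `comment` property plays no role in the lookup, which is the point of separating the identifiable from the full record.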

static create_query_for_identifiable(ident: Identifiable, startswith: bool = False)

This function is taken from the old crawler: caosdb-advanced-user-tools/src/caosadvancedtools/crawler.py

Uses the properties of ident to create a query that can determine whether the required record already exists.

If startswith is True, use LIKE for long string values to test whether the string starts with the first 200 characters of the value.
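The effect of startswith can be sketched as follows. The helpers `property_condition` and `build_query` are hypothetical, and the exact query syntax (quoting, the `*` wildcard, the `FIND RECORD ... WITH` form) is an assumption for illustration rather than the output of the real function.

```python
from typing import Dict

MAX_LEN = 200  # prefix length stated in the description above


def property_condition(name: str, value: str, startswith: bool = False) -> str:
    """Hypothetical helper: one query condition for a string property."""
    if startswith and len(value) > MAX_LEN:
        # Compare only the first 200 characters; '*' is the assumed wildcard.
        return f"'{name}' LIKE '{value[:MAX_LEN]}*'"
    return f"'{name}'='{value}'"


def build_query(record_type: str, properties: Dict[str, str],
                startswith: bool = False) -> str:
    """Hypothetical helper: join all property conditions into one query."""
    conditions = " AND ".join(
        property_condition(name, value, startswith)
        for name, value in properties.items()
    )
    return f"FIND RECORD '{record_type}' WITH {conditions}"


query = build_query("Experiment", {"identifier": "exp-001"})
# query == "FIND RECORD 'Experiment' WITH 'identifier'='exp-001'"
```

Short values are compared for equality in both modes; only values longer than 200 characters switch to the prefix comparison.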

all_identifying_properties_exist(node: SyncNode, raise_exception: bool = True)

Checks whether all identifying properties exist and raises an error if that is not the case. It also raises an error if “name” is part of the identifiable but the node does not have a name.

If raise_exception is False, the function returns False instead of raising an error.

Backreferences are not checked.

Returns True if all identifying properties exist.

Last review by Alexander Schlemmer on 2024-05-24.
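The raise_exception switch of all_identifying_properties_exist can be pictured with a small sketch. The function below works on a plain dictionary instead of a SyncNode and raises ValueError; both choices, like the name `all_identifying_properties_exist_sketch`, are assumptions for illustration.

```python
from typing import Dict, Iterable


def all_identifying_properties_exist_sketch(
    node: Dict[str, object],
    identifying_names: Iterable[str],
    raise_exception: bool = True,
) -> bool:
    """Return True if every identifying property has a value in node."""
    for name in identifying_names:
        # In this sketch a missing "name" is treated like any other property.
        if node.get(name) is None:
            if raise_exception:
                raise ValueError(f"Identifying property '{name}' is missing.")
            return False
    return True


node = {"name": "exp1", "date": "2024-05-24"}
complete = all_identifying_properties_exist_sketch(node, ["name", "date"])
incomplete = all_identifying_properties_exist_sketch(
    node, ["operator"], raise_exception=False
)
```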

static create_property_query(entity: Identifiable, startswith: bool = False)

Create a POV query part with the entity’s properties.

Parameters:
  • entity (Identifiable) – The Identifiable whose properties shall be used.

  • startswith (bool, optional) – If True, check string typed properties against the first 200 characters only. Default is False.

abstract get_registered_identifiable(record: Entity)

Check whether an identifiable is registered for this record and return its definition. If there is no identifiable registered, return None.

abstract get_file(identifiable: File)
static get_identifying_referenced_entities(record, registered_identifiable)

Create a list of all entities that are referenced by record and that are used as identifying properties of the identifiable.

Last review by Alexander Schlemmer on 2024-05-29.

get_identifiable(se: SyncNode, identifiable_backrefs: set[SyncNode]) Identifiable

Take the registered identifiable of given SyncNode se and fill the property values to create an identifiable.

Parameters:
  • se – the SyncNode for which the Identifiable shall be created.

  • identifiable_backrefs – a set (type: set[SyncNode]) that contains SyncNodes with a certain RecordType that reference se.

Returns:

Identifiable, the identifiable for se.

Last review by Alexander Schlemmer on 2024-05-29.

abstract retrieve_identified_record_for_identifiable(identifiable: Identifiable)

Retrieve the identified record for a given identifiable.

This function will return None if there is either no identifiable registered or no corresponding identified record in the database for a given record.

Warning: this function is not expected to work correctly for file identifiables.

static referencing_entity_has_appropriate_type(parents, register_identifiable)

Returns True if one of the parents is listed by the ‘is_referenced_by’ property.

This function also returns True if ‘is_referenced_by’ contains the wildcard ‘*’.

Last review by Alexander Schlemmer on 2024-05-29.
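The check performed by referencing_entity_has_appropriate_type, including the wildcard, can be sketched on plain strings. The helper `has_appropriate_type` is hypothetical and takes lists of names rather than the entity objects the real function receives.

```python
from typing import Iterable


def has_appropriate_type(parent_names: Iterable[str],
                         is_referenced_by: Iterable[str]) -> bool:
    """Return True if a parent is listed or the wildcard '*' is present."""
    allowed = set(is_referenced_by)
    if "*" in allowed:
        return True
    return any(name in allowed for name in parent_names)


listed = has_appropriate_type(["Experiment"], ["Project", "Experiment"])
wildcard = has_appropriate_type(["Anything"], ["*"])
unlisted = has_appropriate_type(["Sample"], ["Project"])
```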

class caoscrawler.identifiable_adapters.LocalStorageIdentifiableAdapter

Bases: IdentifiableAdapter

Identifiable adapter which can be used for unit tests.

register_identifiable(name: str, definition: RecordType)
get_records()
get_file(identifiable: Identifiable)

Just look in records for a file with the same path.

store_state(filename)
restore_state(filename)
is_identifiable_for_record(registered_identifiable: RecordType, record: Record)

Check whether this registered_identifiable is an identifiable for the record.

That means:

  • The properties of the registered_identifiable are a subset of the properties of record.

  • One of the parents of record is the parent of registered_identifiable.

Return True in that case and False otherwise.
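The two conditions of is_identifiable_for_record can be expressed with set operations. This is a sketch on plain names, assuming properties and parents are compared by name only; `is_identifiable_for_record_sketch` is a hypothetical helper, not the method itself.

```python
from typing import Set


def is_identifiable_for_record_sketch(
    registered_parent: str,
    registered_properties: Set[str],
    record_parents: Set[str],
    record_properties: Set[str],
) -> bool:
    """Subset check on the properties plus membership check on the parent."""
    return (registered_properties <= record_properties
            and registered_parent in record_parents)


matches = is_identifiable_for_record_sketch(
    "Experiment", {"date"}, {"Experiment"}, {"date", "comment"}
)
missing_property = is_identifiable_for_record_sketch(
    "Experiment", {"date", "operator"}, {"Experiment"}, {"date"}
)
```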

get_registered_identifiable(record: Entity)

Check whether an identifiable is registered for this record and return its definition. If there is no identifiable registered, return None.

check_record(record: Record, identifiable: Identifiable)

Check whether a record from the local storage (named “record”) is the identified record for an identifiable that was created by a run of the crawler.

Naming of the parameters could be confusing: record is the record from the local database to check against. identifiable is the record that was created during the crawler run.

retrieve_identified_record_for_identifiable(identifiable: Identifiable)

Retrieve the identified record for a given identifiable.

This function will return None if there is either no identifiable registered or no corresponding identified record in the database for a given record.

Warning: this function is not expected to work correctly for file identifiables.

class caoscrawler.identifiable_adapters.CaosDBIdentifiableAdapter

Bases: IdentifiableAdapter

Identifiable adapter which can be used for production.

load_from_yaml_definition(path: str)

Load identifiables defined in a YAML file.

load_from_yaml_object(identifiable_data)

Load identifiables defined in a YAML object.

register_identifiable(name: str, definition: RecordType)
get_file(identifiable: Identifiable)
get_registered_identifiable(record: Entity)

Returns the registered identifiable for the given Record.

It is assumed that there is exactly one identifiable for each RecordType. Only the first parent of the given Record is considered; others are ignored.
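The consequence of considering only the first parent can be seen in a small sketch. The registry `REGISTERED` and the helper `lookup_registered` are hypothetical and operate on parent names only.

```python
from typing import Dict, List, Optional

# Hypothetical registry: record type name -> identifying property names.
REGISTERED: Dict[str, List[str]] = {"Experiment": ["date", "operator"]}


def lookup_registered(parent_names: List[str]) -> Optional[List[str]]:
    """Look up the registered identifiable using the first parent only."""
    if not parent_names:
        return None
    return REGISTERED.get(parent_names[0])


found = lookup_registered(["Experiment", "Other"])
not_found = lookup_registered(["Other", "Experiment"])
```

In this sketch the second call finds nothing although "Experiment" is among the parents, because it is not the first one.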

retrieve_identified_record_for_identifiable(identifiable: Identifiable)

Retrieve the identified record for a given identifiable.

This function will return None if there is either no identifiable registered or no corresponding identified record in the database for a given record.

Warning: this function is not expected to work correctly for file identifiables.