caoscrawler.identifiable_adapters module

class caoscrawler.identifiable_adapters.CaosDBIdentifiableAdapter

Bases: IdentifiableAdapter

Identifiable adapter which can be used for production.

get_file(identifiable: Identifiable)
get_registered_identifiable(record: Record)

Return the registered identifiable for the given Record.

It is assumed that there is exactly one identifiable for each RecordType. Only the first parent of the given Record is considered; all others are ignored.

load_from_yaml_definition(path: str)

Load identifiables defined in a YAML file.

register_identifiable(name: str, definition: RecordType)
resolve_reference(record: Record)

The current implementation just sets the ID of this record as the value. It still needs to be verified that all references contain an ID.

retrieve_identified_record_for_identifiable(identifiable: Identifiable)

Retrieve the identified record for a given identifiable.

This function returns None if either no identifiable is registered or no corresponding identified record exists in the database for the given record.

Warning: this function is not expected to work correctly for file identifiables.

class caoscrawler.identifiable_adapters.IdentifiableAdapter

Bases: object

Base class for identifiable adapters.

Some terms:

  • A registered identifiable defines an identifiable template, for example by specifying:
    • Parent record types

    • Properties

    • is_referenced_by statements

  • An identifiable belongs to a concrete record. It consists of identifying attributes which “fill in” the registered identifiable. In code, it can be represented as a Record based on the registered identifiable with all the values filled in.

  • An identified record is the result of retrieving a record from the database, based on the identifiable (and its values).

General question to clarify:

  • Do we want to support multiple identifiables per RecordType?

  • Current implementation supports only one identifiable per RecordType.

The list of referenced by statements is currently not implemented.

The IdentifiableAdapter can be used to retrieve the three objects mentioned above (registered identifiable, identifiable, and identified record) for a Record.
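The three terms can be illustrated with plain Python data. This is only a sketch: the dictionaries below are hypothetical stand-ins and not the classes used by caoscrawler, and the record type, property names, and values are invented for the illustration.

```python
# Hypothetical stand-ins for illustration; not the caoscrawler data model.

# Registered identifiable: a template that names the parent record type
# and the properties used to identify records of that type.
registered_identifiable = {
    "parent": "Person",
    "properties": ["first_name", "last_name"],
}

# Identifiable: the template "filled in" with the values of one
# concrete record.
identifiable = {
    "parent": "Person",
    "properties": {"first_name": "Ada", "last_name": "Lovelace"},
}

# Identified record: what a database lookup based on the identifiable
# returns. It carries an ID and may have additional properties.
identified_record = {
    "id": 101,
    "parent": "Person",
    "first_name": "Ada",
    "last_name": "Lovelace",
    "email": "ada@example.org",
}

# The identifiable provides a value for exactly the template properties.
template_names = set(registered_identifiable["properties"])
filled_names = set(identifiable["properties"])
```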

static create_property_query(entity: Identifiable, startswith: bool = False)

Create a POV query part with the entity’s properties.

Parameters:
  • entity (Identifiable) – The Identifiable whose properties shall be used.

  • startswith (bool, optional) – If True, check string typed properties against the first 200 characters only. Default is False.
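How such a query part might be assembled can be sketched as follows. This is not the library's implementation: the helper name property_query_part, the quoting style, and the use of * as the wildcard after LIKE are assumptions made for the illustration, and quote escaping is left out.

```python
def property_query_part(properties, startswith=False, limit=200):
    """Join one condition per property with AND (illustrative sketch).

    properties is a plain dict mapping property names to values.
    Assumption: values contain no quotes, so no escaping is done here.
    """
    parts = []
    for name, value in properties.items():
        is_long_string = isinstance(value, str) and len(value) > limit
        if startswith and is_long_string:
            # Compare against the first `limit` characters only.
            parts.append(f"'{name}' LIKE '{value[:limit]}*'")
        else:
            parts.append(f"'{name}'='{value}'")
    return " AND ".join(parts)


short_query = property_query_part({"name": "x", "n": 3})
long_text = "a" * 250
prefix_query = property_query_part({"text": long_text}, startswith=True)
exact_query = property_query_part({"text": long_text})
```

With startswith left at False, the long value is compared in full; with startswith set to True, only its first 200 characters enter the query.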

static create_query_for_identifiable(ident: Identifiable, startswith: bool = False)

This function is taken from the old crawler: caosdb-advanced-user-tools/src/caosadvancedtools/crawler.py

It uses the properties of ident to create a query that can determine whether the required record already exists.

If startswith is True, use LIKE for long string values to test whether the string starts with the first 200 characters of the value.

abstract get_file(identifiable: File)
get_identifiable(record: Record, referencing_entities=None)

Retrieve the registered identifiable and fill in the property values to create an identifiable.

Parameters:
  • record – The record for which the Identifiable shall be created.

  • referencing_entities – A dictionary (type: dict[str, list[db.Entity]]) that allows looking up entities of a certain RecordType that reference record.

Returns:

Identifiable, the identifiable for record.
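The filling step and the role of referencing_entities can be sketched with plain dictionaries. Everything here is hypothetical: build_identifiable, the dictionary layout, and the ValueError for a missing property are assumptions for the illustration and do not describe how caoscrawler handles these cases.

```python
def build_identifiable(record, registered, referencing_entities=None):
    """Fill a registered identifiable with values (illustrative sketch)."""
    referencing_entities = referencing_entities or {}

    # Copy only the identifying properties from the record.
    properties = {}
    for name in registered["properties"]:
        if name not in record["properties"]:
            raise ValueError(f"identifying property {name!r} is missing")
        properties[name] = record["properties"][name]

    # Look up referencing entities by record type name, as the
    # dict[str, list[...]] layout of referencing_entities suggests.
    backrefs = []
    for rt_name in registered.get("is_referenced_by", []):
        backrefs.extend(referencing_entities.get(rt_name, []))

    return {
        "record_type": registered["record_type"],
        "properties": properties,
        "backrefs": backrefs,
    }


registered = {
    "record_type": "Measurement",
    "properties": ["date"],
    "is_referenced_by": ["Experiment"],
}
record = {"properties": {"date": "2024-01-01", "comment": "x"}}
referencing = {"Experiment": ["exp-1"], "Project": ["proj-7"]}
result = build_identifiable(record, registered, referencing)
```

Only the entry for Experiment is used, because the hypothetical registered identifiable names no other referencing record type.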

static get_identifying_referenced_entities(record, registered_identifiable)
static get_identifying_referencing_entities(referencing_entities, registered_identifiable)
abstract get_registered_identifiable(record: Record)

Check whether an identifiable is registered for this record and return its definition. If there is no identifiable registered, return None.

abstract resolve_reference(record: Record)
abstract retrieve_identified_record_for_identifiable(identifiable: Identifiable)

Retrieve the identified record for a given identifiable.

This function returns None if either no identifiable is registered or no corresponding identified record exists in the database for the given record.

Warning: this function is not expected to work correctly for file identifiables.

retrieve_identified_record_for_record(record: Record, referencing_entities=None)

This function combines all functionality of the IdentifiableAdapter by returning the identified record after having checked for an appropriate registered identifiable.

If there is no appropriate registered identifiable, or if no identified record can be found, the return value is None.
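The control flow described here can be sketched with a toy class. ToyAdapter is hypothetical and does not inherit from IdentifiableAdapter; it only mirrors the method names, and it works on plain dictionaries instead of Record objects.

```python
class ToyAdapter:
    """In-memory stand-in that mirrors the adapter's method names."""

    def __init__(self, registered, stored_records):
        self.registered = registered          # record type -> property names
        self.stored_records = stored_records  # list of dict "records"

    def get_registered_identifiable(self, record):
        return self.registered.get(record["parent"])

    def get_identifiable(self, record):
        names = self.get_registered_identifiable(record)
        return {
            "parent": record["parent"],
            "properties": {n: record[n] for n in names},
        }

    def retrieve_identified_record_for_identifiable(self, identifiable):
        for stored in self.stored_records:
            same_parent = stored["parent"] == identifiable["parent"]
            same_values = all(
                stored.get(k) == v
                for k, v in identifiable["properties"].items()
            )
            if same_parent and same_values:
                return stored
        return None

    def retrieve_identified_record_for_record(self, record):
        # Case 1: nothing is registered for this record type.
        if self.get_registered_identifiable(record) is None:
            return None
        identifiable = self.get_identifiable(record)
        # Case 2: the lookup itself may find nothing and return None.
        return self.retrieve_identified_record_for_identifiable(identifiable)


adapter = ToyAdapter(
    registered={"Person": ["first_name", "last_name"]},
    stored_records=[{"id": 101, "parent": "Person",
                     "first_name": "Ada", "last_name": "Lovelace"}],
)
found = adapter.retrieve_identified_record_for_record(
    {"parent": "Person", "first_name": "Ada", "last_name": "Lovelace"})
unregistered = adapter.retrieve_identified_record_for_record(
    {"parent": "Device", "serial": "X1"})
unknown = adapter.retrieve_identified_record_for_record(
    {"parent": "Person", "first_name": "Alan", "last_name": "Turing"})
```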

class caoscrawler.identifiable_adapters.LocalStorageIdentifiableAdapter

Bases: IdentifiableAdapter

Identifiable adapter which can be used for unit tests.

check_record(record: Record, identifiable: Identifiable)

Check whether a record from the local storage (named “record”) is the identified record for an identifiable that was created by a run of the crawler.

The parameter names can be confusing: record is the record from the local database to check against; identifiable is the record that was created during the crawler run.

get_file(identifiable: Identifiable)

Just look in records for a file with the same path.

get_records()
get_registered_identifiable(record: Record)

Check whether an identifiable is registered for this record and return its definition. If there is no identifiable registered, return None.

is_identifiable_for_record(registered_identifiable: RecordType, record: Record)

Check whether this registered_identifiable is an identifiable for the record.

That means:

  • The properties of the registered_identifiable are a subset of the properties of record.

  • One of the parents of record is the parent of registered_identifiable.

Return True in that case and False otherwise.
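The two conditions can be expressed as a parent check plus a set comparison. The sketch below uses plain dictionaries with hypothetical keys ("parent", "parents", "properties") instead of RecordType and Record objects, so it illustrates the rule rather than the actual method.

```python
def is_identifiable_for_record(registered_identifiable, record):
    """Return True if both conditions hold (illustrative sketch)."""
    # Condition: one of the record's parents is the parent of the
    # registered identifiable.
    if registered_identifiable["parent"] not in record["parents"]:
        return False
    # Condition: the registered properties are a subset of the
    # record's properties.
    needed = set(registered_identifiable["properties"])
    return needed <= set(record["properties"])


registered = {"parent": "Person", "properties": ["first_name", "last_name"]}

matching = {
    "parents": ["Person", "Employee"],
    "properties": {"first_name": "Ada", "last_name": "Lovelace", "age": 36},
}
missing_property = {
    "parents": ["Person"],
    "properties": {"first_name": "Ada"},
}
wrong_parent = {
    "parents": ["Device"],
    "properties": {"first_name": "Ada", "last_name": "Lovelace"},
}
```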

register_identifiable(name: str, definition: RecordType)
resolve_reference(value: Record)
restore_state(filename)
retrieve_identified_record_for_identifiable(identifiable: Identifiable)

Retrieve the identified record for a given identifiable.

This function returns None if either no identifiable is registered or no corresponding identified record exists in the database for the given record.

Warning: this function is not expected to work correctly for file identifiables.

store_state(filename)
caoscrawler.identifiable_adapters.convert_value(value: Any) → str

Return a string representation of the value suitable for the search query.

This is for search queries looking for the identified record.

Parameters:

value (Any) – The value to be converted.

Returns:

out – The string representation of the value.

Return type:

str
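Which conversions the function applies is not stated here. The sketch below shows what a type-dependent conversion for query strings could look like; the function name to_query_string and every rule in it (upper-case booleans, ISO timestamps, backslash-escaped single quotes) are assumptions for the illustration only.

```python
from datetime import datetime


def to_query_string(value):
    """Turn a value into a string for a query (illustrative sketch)."""
    # Assumption: booleans are written in upper case. bool is tested
    # first so that it is not handled by the generic fallback.
    if isinstance(value, bool):
        return str(value).upper()
    # Assumption: timestamps are written in ISO 8601 format.
    if isinstance(value, datetime):
        return value.isoformat()
    # Assumption: single quotes are escaped because the value ends up
    # inside a single-quoted string in the query.
    if isinstance(value, str):
        return value.replace("'", "\\'")
    return str(value)


as_bool = to_query_string(True)
as_time = to_query_string(datetime(2020, 1, 2, 3, 4, 5))
as_float = to_query_string(3.5)
as_text = to_query_string("it's")
```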

caoscrawler.identifiable_adapters.get_children_of_rt(rtname)

Supply the name of a RecordType. This name and the names of all child RecordTypes are returned in a list.