Kestrel Data Source Interface

The abstract interface for building a data source interface for Kestrel.

A Kestrel data source interface is a Python package with the following rules:

  • The package name should use the prefix kestrel_datasource_.

  • The package should have one and only one root-level class that inherits from AbstractDataSourceInterface.

    • There is no restriction on the internal structure of the package.

    • There is no restriction on interface class name.

    • The interface class should inherit from AbstractDataSourceInterface.

    • The interface class should be importable from the package directly, i.e., it needs to be imported into __init__.py of the package.

    • If no class in the package inherits from AbstractDataSourceInterface, an exception is raised.

    • If multiple classes in the package inherit from AbstractDataSourceInterface, an exception is raised.
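
The "one and only one root-level class" rule can be illustrated with a small check. This is a minimal sketch, not Kestrel's actual loader: AbstractDataSourceInterface is redefined locally as a stand-in so the example runs without Kestrel installed, and find_interface_class, ExampleInterface, and the package name kestrel_datasource_example are hypothetical.

```python
import inspect
import types
from abc import ABC


class AbstractDataSourceInterface(ABC):
    """Local stand-in for kestrel.datasource.interface.AbstractDataSourceInterface."""


def find_interface_class(package, base=AbstractDataSourceInterface):
    """Return the single root-level subclass of `base` exposed by `package`.

    Hypothetical helper: raises ValueError for zero or multiple candidates.
    """
    candidates = [
        obj
        for _, obj in inspect.getmembers(package, inspect.isclass)
        if issubclass(obj, base) and obj is not base
    ]
    if len(candidates) != 1:
        raise ValueError(
            f"expected exactly one interface class, found {len(candidates)}"
        )
    return candidates[0]


class ExampleInterface(AbstractDataSourceInterface):
    pass


# Simulate a package whose __init__.py imports its interface class,
# making the class importable from the package root.
package = types.ModuleType("kestrel_datasource_example")
package.ExampleInterface = ExampleInterface

found = find_interface_class(package)
```

Because the check only looks at root-level attributes of the package, the class may live in any submodule as long as __init__.py imports it.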

class kestrel.datasource.interface.AbstractDataSourceInterface

Bases: abc.ABC

The abstract class for building a data source interface.

Why do we design the interface this way? Strictly speaking, a class is not needed to build the interface, since all methods are static. However, Python requires a class if we want to force developers to implement every method when developing a concrete interface. This is done by applying both the @staticmethod and @abstractmethod decorators to all methods. When using an interface, the Kestrel runtime does not instantiate an object from the interface class but calls the static methods directly. The design is not elegant, and we hope to have something comparable to Haskell's type classes for non-OOP interface abstraction in the future.
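
The decorator combination described above behaves as follows in plain Python. This is a minimal sketch with hypothetical class names (Interface, Complete, Incomplete); it shows the language mechanism only and makes no claim about how the Kestrel runtime validates interfaces.

```python
from abc import ABC, abstractmethod


class Interface(ABC):
    """Stand-in for an interface whose methods are all abstract and static."""

    @staticmethod
    @abstractmethod
    def schemes():
        ...


class Complete(Interface):
    @staticmethod
    def schemes():
        return ["example"]


class Incomplete(Interface):
    pass


# Static methods are called on the class itself; no object is created.
complete_schemes = Complete.schemes()

# ABC records which abstract methods a class has not yet implemented.
missing = Incomplete.__abstractmethods__

# Instantiating a class with unimplemented abstract methods raises TypeError.
try:
    Incomplete()
    instantiation_failed = False
except TypeError:
    instantiation_failed = True
```

Note that Python raises the TypeError only on instantiation. A caller that never instantiates the class could inspect __abstractmethods__ instead to detect an incomplete implementation.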

abstract static schemes()

The schemes (the URI prefix before ://) of the data source interface.

Every data source interface should have at least one unique scheme, which appears at the beginning of the data source URI. To develop a new data source interface, check the public Kestrel data source packages and choose a scheme that is not already taken. Schemes defined here should be in lowercase; the Kestrel data source manager normalizes the schemes of incoming URIs to lowercase.

Returns

A list of schemes; a URI with one of these schemes will be processed by this interface.

Return type

[str]
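
The lowercase normalization and scheme lookup can be sketched as below. The helper normalize_scheme, the registered mapping, and the URI EXAMPLE://host101 are hypothetical and only illustrate the idea; they are not the data source manager's actual code.

```python
def normalize_scheme(uri):
    """Split a data source URI at '://' and lowercase the scheme part."""
    scheme, separator, rest = uri.partition("://")
    if not separator:
        raise ValueError(f"not a data source URI: {uri!r}")
    return scheme.lower(), rest


# Hypothetical registry mapping lowercase schemes to interface names.
registered = {"example": "ExampleInterface"}

# An incoming URI with an uppercase scheme still resolves to the interface.
scheme, name = normalize_scheme("EXAMPLE://host101")
interface_name = registered[scheme]
```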

abstract static list_data_sources(config)

List data source names accessible from this interface.

Parameters

config (dict) – a layered list/dict that contains config for the interface and can be edited/updated by the interface.

Returns

A list of data source names accessible from this interface.

Return type

[str]
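
A concrete implementation might simply read the names from its configuration. This is a minimal sketch under a stated assumption: the "profiles" key and its layout are hypothetical, since the structure of config is defined by each interface and not by this page.

```python
class ExampleInterface:
    """Hypothetical interface; a real one would inherit AbstractDataSourceInterface."""

    @staticmethod
    def list_data_sources(config):
        # Assumed layout: config["profiles"] maps data source names to settings.
        return sorted(config.get("profiles", {}))


example_config = {
    "profiles": {
        "host101": {"connector": "example"},
        "host102": {"connector": "example"},
    }
}

names = ExampleInterface.list_data_sources(example_config)
```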

abstract static query(uri, pattern, session_id, config)

Send a data query to a specific data source.

If the interface modifies the store of the session and loads the data directly into it under a query_id, it should return kestrel.datasource.ReturnFromStore.

If the interface uses local files as intermediate/temporary storage before the data is loaded into the store, it should return kestrel.datasource.ReturnFromFile.

Parameters
  • uri (str) – the full URI including the scheme and data source name.

  • pattern (str) – the pattern to query (currently we support STIX).

  • session_id (str) – id of the session, may be useful for analytics directly writing into the store.

  • config (dict) – a layered list/dict that contains config for the interface and can be edited/updated by the interface.

Returns

The returned data. Currently there are two choices: kestrel.datasource.ReturnFromFile and kestrel.datasource.ReturnFromStore.

Return type

kestrel.datasource.retstruct.AbstractReturnStruct
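
The file-based path of query can be sketched as follows. This is an illustration under loud assumptions: FileResult is a hypothetical stand-in for kestrel.datasource.ReturnFromFile, whose constructor is not documented in this section, and the function writes an empty placeholder result instead of contacting a real data source.

```python
import json
import tempfile
import uuid
from dataclasses import dataclass
from pathlib import Path


@dataclass
class FileResult:
    """Hypothetical stand-in for kestrel.datasource.ReturnFromFile."""

    query_id: str
    file_paths: list


def query(uri, pattern, session_id, config):
    # The full URI carries both the scheme and the data source name.
    _scheme, _, data_source = uri.partition("://")
    query_id = str(uuid.uuid4())

    # Placeholder result; a real interface would send `pattern` to the source.
    records = {"data_source": data_source, "pattern": pattern, "objects": []}

    # Local file used as temporary storage before loading into the store.
    out_dir = Path(tempfile.mkdtemp(prefix=f"{session_id}-"))
    out_file = out_dir / f"{query_id}.json"
    out_file.write_text(json.dumps(records))
    return FileResult(query_id, [str(out_file)])


result = query(
    "example://host101",
    "[ipv4-addr:value = '192.0.2.1']",
    "session-0",
    {},
)
```

An interface that writes straight into the session store would skip the temporary file and return the store-based structure instead.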