Source code for kestrel.datasource.interface

"""The abstract interface for building a data source interface for Kestrel.

A Kestrel data source interface is a Python package with the following rules:

- The package name should use prefix ``kestrel_datasource_``.

- The package should have one and only one root level class inherited from
  :class:`AbstractDataSourceInterface`.

  - There is no restriction on package structure for the package.

  - There is no restriction on interface class name.

  - The interface class should inherit from :class:`AbstractDataSourceInterface`.

  - The interface class should be importable from the package directly, i.e.,
    it needs to be imported into ``__init__.py`` of the package.

  - Zero classes inherited from :class:`AbstractDataSourceInterface` will
    result in an exception.

  - Multiple classes inherited from :class:`AbstractDataSourceInterface` will
    result in an exception.

"""

from abc import ABC, abstractmethod

MODULE_PREFIX = "kestrel_datasource_"


class AbstractDataSourceInterface(ABC):
    """The abstract class for building a data source interface.

    Why do we design the interface this way?

    Actually we do not need a class for building the interface since all
    methods are static. However, in Python, we need to have a class if we'd
    like to require developers to implement the methods when developing a
    concrete interface. This is done by using both ``@staticmethod`` and
    ``@abstractmethod`` decorators for all methods/functions.

    When using an interface, Kestrel runtime will not instantiate an object
    from an interface class but use the static methods directly. This may not
    look beautiful in design, and we hope to have something comparable to
    ``typeclass`` in Haskell for non-OOP interface abstraction in the future.

    """

    @staticmethod
    @abstractmethod
    def schemes():
        """``scheme`` (the URI prefix before ``://``) of the data source interface.

        Every data source interface should have at least one *unique* scheme to
        use at the beginning of the data source URI. To develop a new data
        source, one needs to check public Kestrel data source packages to pick
        a scheme that is not taken.

        Note that the schemes defined here should be in lowercase, and the
        Kestrel data source manager will normalize schemes of incoming URIs
        into lowercase.

        Returns:
            [str]: A list of schemes; a URI with one of the schemes will be
            processed by this interface.
        """
        return []

    @staticmethod
    @abstractmethod
    def list_data_sources(config):
        """List data source names accessible from this interface.

        Args:
            config (dict): a layered list/dict that contains config for the
              interface and can be edited/updated by the interface.

        Returns:
            [str]: A list of data source names accessible from this interface.
        """
        return []

    @staticmethod
    @abstractmethod
    def query(uri, pattern, session_id, config, store=None, limit=None):
        """Send a data query to a specific data source.

        If the interface modifies the store of the session and loads the data
        directly into a ``query_id``, it should return
        :attr:`kestrel.datasource.ReturnFromStore`.

        If the interface uses local files as intermediate/temporary storage
        before loading the data to the store, it should return
        :attr:`kestrel.datasource.ReturnFromFile`.

        Args:
            uri (str): the full URI including the scheme and data source name.

            pattern (str): the pattern to query (currently we support STIX).

            session_id (str): id of the session, may be useful for analytics
              directly writing into the store.

            config (dict): a layered list/dict that contains config for the
              interface and can be edited/updated by the interface.

            store (firepit.SqlStorage): the internal store used by the session.

            limit (Optional[int]): limit on the number of records to return;
              None if there is no limit.

        Returns:
            kestrel.datasource.retstruct.AbstractReturnStruct: returned data.
            Currently there are two choices:
            :attr:`kestrel.datasource.ReturnFromFile` and
            :attr:`kestrel.datasource.ReturnFromStore`.
        """
        return None
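
The following is a minimal sketch of how a concrete interface might implement the three static methods and how a caller could select it by URI scheme without instantiating it. It is illustrative only: the stand-in base class is redefined here so the snippet runs without Kestrel installed, and ``DemoInterface``, the ``demo`` scheme, the ``datasources`` config key, and ``find_interface`` are hypothetical names, not part of Kestrel.

```python
from abc import ABC, abstractmethod


class AbstractDataSourceInterface(ABC):
    """Stand-in mirroring the abstract interface so the sketch runs alone."""

    @staticmethod
    @abstractmethod
    def schemes():
        return []

    @staticmethod
    @abstractmethod
    def list_data_sources(config):
        return []

    @staticmethod
    @abstractmethod
    def query(uri, pattern, session_id, config, store=None, limit=None):
        return None


class DemoInterface(AbstractDataSourceInterface):
    """Hypothetical concrete interface; scheme and config layout are made up."""

    @staticmethod
    def schemes():
        # Schemes are declared in lowercase.
        return ["demo"]

    @staticmethod
    def list_data_sources(config):
        # Assumed config layout: {"datasources": {name: settings, ...}}
        return sorted(config.get("datasources", {}))

    @staticmethod
    def query(uri, pattern, session_id, config, store=None, limit=None):
        # A real interface would return ReturnFromFile or ReturnFromStore.
        raise NotImplementedError("placeholder: no backend in this sketch")


def find_interface(uri, interfaces):
    """Hypothetical dispatcher: match the lowercased scheme of ``uri``."""
    scheme = uri.split("://", 1)[0].lower()
    for interface in interfaces:
        if scheme in interface.schemes():
            return interface
    return None


config = {"datasources": {"host102": {}, "host101": {}}}
# The class itself is used; no object is created from it.
interface = find_interface("DEMO://host101", [DemoInterface])
names = interface.list_data_sources(config)
```

Because every abstract method is overridden, ``DemoInterface.__abstractmethods__`` is empty; a subclass that omits one of them would keep that name in the set, which is how the missing implementation can be detected.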