Python Analytics Interface
The Python analytics interface executes Python functions as Kestrel analytics.
Use a Python Analytics
Create a profile for each analytics in the Python analytics interface config file (YAML):
- Default path: ~/.config/kestrel/pythonanalytics.yaml
- A customized path specified in the environment variable KESTREL_PYTHON_ANALYTICS_CONFIG
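The lookup of the config file can be sketched as follows. This is a minimal illustration, assuming the environment variable takes precedence over the default path whenever it is set; the helper name `resolve_config_path` is hypothetical and not part of Kestrel.

```python
import os
from pathlib import Path

# Default location and override variable, as named in the text above.
DEFAULT_CONFIG = Path("~/.config/kestrel/pythonanalytics.yaml")
ENV_VAR = "KESTREL_PYTHON_ANALYTICS_CONFIG"


def resolve_config_path(environ=None):
    """Return the config path, preferring the environment variable if set.

    The precedence rule is an assumption made for this sketch.
    """
    environ = os.environ if environ is None else environ
    custom = environ.get(ENV_VAR)
    chosen = Path(custom) if custom else DEFAULT_CONFIG
    # Expand a leading "~" to the current user's home directory.
    return chosen.expanduser()
```

Passing a plain dict as `environ` makes the helper easy to try out without touching the real process environment.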
Example of the Python analytics interface config file:

    profiles:
        analytics-name-1: # the analytics name to use in the APPLY command
            module: /home/user/kestrel-analytics/analytics/piniponmap/analytics.py
            func: analytics # the analytics function in the module to call
        analytics-name-2:
            module: /home/user/kestrel-analytics/analytics/suspiciousscoring/analytics.py
            func: analytics
Develop a Python Analytics
A Python analytics is a Python function that follows these rules:
The function takes in one or more Kestrel variable dumps in Pandas DataFrames.
The return of the function is a tuple containing either or both:
- Updated variables. The number of variables can be either 0, e.g., for a visualization analytics, or the same as the number of input Kestrel variables. The updated variables should follow the same order as the input variables.
- An object to display, which can be any of the following types:
  - Kestrel display object
  - HTML element as a string
  - Matplotlib figure (by default, Pandas DataFrame plots use this)
The display object can be either before or after the updated variables. In other words, if the input variables are var1, var2, and var3, the return of the analytics can be any of the following:

    # the analytics enriches variables without returning a display object
    return var1_updated, var2_updated, var3_updated

    # this is a visualization analytics and no variable updates
    return display_obj

    # the analytics does both variable updates and visualization
    return var1_updated, var2_updated, var3_updated, display_obj

    # the analytics does both variable updates and visualization
    return display_obj, var1_updated, var2_updated, var3_updated
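To make the return rules concrete, here is a minimal sketch of an analytics for a single input variable that returns both an updated variable and an HTML display object. The column `number_observed`, the threshold, and the new column name `x_is_large` are assumptions chosen for illustration; a real analytics would use whatever attributes its input variable carries.

```python
import pandas as pd


def analytics(dataframe):
    """Hypothetical enrichment analytics for one input Kestrel variable."""
    # Work on a copy so the caller's DataFrame is left untouched.
    updated = dataframe.copy()
    # Add a new attribute column; name and threshold are illustrative only.
    updated["x_is_large"] = updated["number_observed"] > 10
    # Build an HTML string to serve as the display object.
    flagged = int(updated["x_is_large"].sum())
    display_obj = f"<p>{flagged} of {len(updated)} records flagged</p>"
    # One updated variable (same count as inputs), then the display object.
    return updated, display_obj
```

Returning `display_obj, updated` instead would be equally valid under the rules above, since the display object may come before or after the updated variables.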
Parameters in the APPLY command are passed in as environment variables. The names of the environment variables are the exact parameter keys given in the APPLY command. For example, the following command

    APPLY python://a1 ON var1 WITH XPARAM=src_ref.value, YPARAM=number_observed

creates environment variables $XPARAM with value src_ref.value and $YPARAM with value number_observed to be used by the analytics a1. After the execution of the analytics, the environment variables will be rolled back to the original state.

The Python function could spawn other processes or execute other binaries, where the Python function just acts like a wrapper. Check our domain name lookup analytics as an example.
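As a rough illustration of how an analytics might consume these parameters, the sketch below reads XPARAM and YPARAM from the environment. The helper `read_parameters`, the fallback column names, and the HTML summary are assumptions made for this example and are not part of Kestrel.

```python
import os


def read_parameters(defaults):
    """Return APPLY parameters from the environment, with fallbacks.

    `defaults` maps each expected parameter key to the value used when
    the analytics is applied without that parameter.
    """
    return {key: os.environ.get(key, value) for key, value in defaults.items()}


def analytics(dataframe):
    """Hypothetical visualization analytics driven by APPLY parameters."""
    params = read_parameters(
        {"XPARAM": "first_observed", "YPARAM": "number_observed"}
    )
    x_column, y_column = params["XPARAM"], params["YPARAM"]
    # A real analytics would plot dataframe[x_column] against
    # dataframe[y_column]; this sketch only reports the selection.
    return f"<p>x axis: {x_column}, y axis: {y_column}</p>"
```

Because the values arrive through the environment, they are always strings; an analytics that needs numbers would have to convert them itself.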
- class kestrel_analytics_python.interface.PythonInterface
Bases: kestrel.analytics.interface.AbstractAnalyticsInterface
- class kestrel_analytics_python.interface.PythonAnalytics(profile_name, profiles, parameters)
Bases: contextlib.AbstractContextManager
Handler of a Python Analytics
Use it as a context manager:

    with PythonAnalytics(profile_name, profiles, parameters) as func:
        func(input_kestrel_variables)
Validate and retrieve profile data. The data should be a dict with “module” and “func”, plus appropriate values.
Prepare the analytics by loading the module. Also verify the function exists.
Execute the analytics and process return intelligently.
Clean the environment.
- Parameters
profile_name (str) – The name of the profile/analytics.
profiles (dict) – name to profile (dict) mapping.
parameters (dict) – key-value pairs of parameters.
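The lifecycle described above (validate the profile, load the module, expose the function, clean the environment) can be approximated with standard-library tools. The class below is an illustrative stand-in named `AnalyticsHandlerSketch`, not the actual PythonAnalytics implementation. It assumes parameters are exported as environment variables on entry and restored on exit, and it leaves out any processing of the analytics return value.

```python
import importlib.util
import os
from contextlib import AbstractContextManager


class AnalyticsHandlerSketch(AbstractContextManager):
    """Illustrative stand-in; not the real PythonAnalytics class."""

    def __init__(self, profile_name, profiles, parameters):
        # Validate and retrieve the profile: a dict with "module" and "func".
        profile = profiles.get(profile_name)
        if not isinstance(profile, dict) or not {"module", "func"}.issubset(profile):
            raise ValueError(f"invalid profile: {profile_name}")
        self.module_path = profile["module"]
        self.func_name = profile["func"]
        # Environment variables must be strings.
        self.parameters = {str(k): str(v) for k, v in parameters.items()}
        self._saved_env = {}

    def __enter__(self):
        # Load the module from its file path and verify the function exists.
        spec = importlib.util.spec_from_file_location(
            "analytics_module", self.module_path
        )
        if spec is None or spec.loader is None:
            raise ImportError(f"cannot load module: {self.module_path}")
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        func = getattr(module, self.func_name, None)
        if not callable(func):
            raise AttributeError(f"function not found: {self.func_name}")
        # Export parameters, remembering previous values for the rollback.
        for key, value in self.parameters.items():
            self._saved_env[key] = os.environ.get(key)
            os.environ[key] = value
        return func

    def __exit__(self, exc_type, exc_value, traceback):
        # Clean the environment: restore or remove each variable we set.
        for key, old_value in self._saved_env.items():
            if old_value is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = old_value
        self._saved_env.clear()
        return False  # do not suppress exceptions raised by the analytics
```

Restoring the environment in `__exit__` means the rollback happens even when the analytics raises, which is the main reason to model the handler as a context manager.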