Share data in the AI Toolkit
What data is collected
The AI Toolkit collects the following basic usage information:
| Component | Description | Example |
|---|---|---|
| ai_processing_time | Time taken to process the ai command request. Triggered during ai command usage. | |
| algo_name | Name of the algorithm used in fit or apply. | |
| app_context | Name of the app from which the search is run. | |
| apply_time | Time the apply command took. | |
| app.session.Splunk_ML_Toolkit.changeSmartAssistantStep | User progress through an AI Toolkit Smart Assistant. | |
| app.session.Splunk_ML_Toolkit.createExperiment | User creating an AI Toolkit Experiment. | |
| app.session.Splunk_ML_Toolkit.createExperimentAlert | Users creating alerts for AI Toolkit Experiments. | |
| app.session.Splunk_ML_Toolkit.loadAssistant | Number of times the user has loaded an AI Toolkit Assistant. | |
| app.session.Splunk_ML_Toolkit.saveExperiment | Users saving their work in AI Toolkit Experiments. | |
| app.session.Splunk_ML_Toolkit.scheduleExperimentTraining | Users scheduling model re-training for AI Toolkit Experiments. | |
| col_dimension | Dimension of the dataset, collected from the model schema. Triggered during apply. | |
| columns | Number of columns being run through the fit command. | |
| command | The command that was run: fit, apply, or score. | |
| csv_parse_time | CSV parse time. | |
| csv_read_time | CSV read time. | |
| csv_render_time | CSV render time. | |
| deployment.app | Apps installed per Splunk instance. | |
| df_shape | Shape of the data input received from Splunk. Triggered during apply. | |
| example_name | Name of the Showcase example being run. | |
| experiment_id | ID of the fit and apply run on the Experiments page. All preprocessing steps and the final fit have the same ID. | |
| fit_time | Amount of time it took to run the fit command. | |
| full_punct | The punct of the data during fit or apply. | |
| handle_time | Time for the handler to handle the data. | |
| metrics_type | Type of request sent. Used to differentiate model upload and model inference call flows. Contains one of two values. | |
| model | LLM model name under the specific provider, captured while running the ai command. | |
| modelId | ID under which the user saves their model. | |
| model_upload | Monitors the model upload process to determine whether the model has been successfully uploaded and is ready for inference. | |
| numColumns | Total number of columns in the dataset. | |
| numRows | Total number of rows (events) in the dataset. | |
| num_fields | Total number of fields. | |
| num_fields_fs | Number of fields that have the fs (Field Selector) prefix. | |
| num_fields_PC | Number of fields that have the PC (preprocessed) prefix. | |
| num_fields_prefixed | Total number of preprocessed fields. | |
| num_fields_RS | Number of fields that have the RS (Robust Scaler) prefix. | |
| num_fields_SS | Number of fields that have the SS (Standard Scaler) prefix. | |
| num_fields_tfidf | Number of fields that have used term frequency-inverse document frequency preprocessing. | |
| onnx_input_shape | Shape of the input data stored in the ONNX model schema. Triggered during apply. | |
| onnx_model_size_on_disk | Total size in MB taken up by the model file on disk after encoding. Triggered during model upload. | |
| onnx_upload_time | Time taken to upload an ONNX model file from the UI. Triggered during model upload. | |
| orig_sourcetype | The original sourcetype of the machine data. | |
| params | Optional parameters used in the fit step. | |
| params | Boolean value of supervise_split_by. Checks whether DecisionTreeRegressor is used as part of DensityFunction. | |
| partialFit | Whether or not the fit is a partial fit action. | |
| PID | Process identifier associated with the command. | |
| pipeline_stage | Each preprocessing step on the Experiments page is assigned a number starting from 0. This helps determine the order of the preprocessing steps and the length of the pipeline. | |
| provider | Provider name, captured while running the ai command. | |
| rows | Number of rows being run through the fit command. | |
| rows | Number of rows processed in a given ai command request. | |
| rows_processor_time | Time taken, in seconds, to process the rows during an ai command request. | |
| SageMaker model apply/inference event | The AWS SageMaker model apply/inference event. | |
| scoringName | Name of the scoring operation if whitelisted. If the name is not whitelisted, the hash of the scoringName is logged instead. | |
| scoringTimeSec | Time taken by the scoring operation. | |
| UUID | Universally unique identifier associated with the command. This 128-bit value keeps each fit or apply run unique. | |
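
Two of the entries above describe behavior that may be easier to picture with a small example: scoringName is logged as-is only when it is whitelisted and as a hash otherwise, and UUID is a 128-bit value that keeps each fit or apply run unique. The following Python sketch illustrates the general idea only. It is not the AI Toolkit's implementation: the names `ALLOWED_SCORING_NAMES`, `reported_scoring_name`, and `new_run_id` are hypothetical, the list contents are made up for illustration, and SHA-256 is an assumption because the hash function is not named here.

```python
import hashlib
import uuid

# Hypothetical list of approved names; the list the AI Toolkit actually
# uses is not given in this topic.
ALLOWED_SCORING_NAMES = {"accuracy_score", "f1_score"}


def reported_scoring_name(name: str) -> str:
    """Return the name unchanged if it is approved, otherwise a hash of it."""
    if name in ALLOWED_SCORING_NAMES:
        return name
    # Assumption: SHA-256. The hash function is not specified in this topic.
    return hashlib.sha256(name.encode("utf-8")).hexdigest()


def new_run_id() -> str:
    """Generate a random 128-bit identifier for a single fit or apply run."""
    return str(uuid.uuid4())


if __name__ == "__main__":
    print(reported_scoring_name("accuracy_score"))    # approved: logged as-is
    print(reported_scoring_name("my_custom_metric"))  # not approved: hashed
    print(new_run_id())
```

Hashing a name that is not on the list means a custom scoring name is never reported in readable form, while identical names still map to the same value and can be counted together.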