great_expectations.validation_operators.validation_operators

Module Contents

Classes

ValidationOperator()

The base class of all validation operators.

ActionListValidationOperator(data_context, action_list, name, result_format={'result_format': 'SUMMARY'})

ActionListValidationOperator validates each batch in its run method’s assets_to_validate argument against the Expectation Suite included within that batch.

WarningAndFailureExpectationSuitesValidationOperator(data_context, action_list, name, base_expectation_suite_name=None, expectation_suite_name_suffixes=None, stop_on_first_error=False, slack_webhook=None, notify_on='all', notify_with=None, result_format={'result_format': 'SUMMARY'})

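WarningAndFailureExpectationSuitesValidationOperator is a validation operator that accepts a list of batches of data assets (or the information necessary to fetch these batches).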

great_expectations.validation_operators.validation_operators.logger
class great_expectations.validation_operators.validation_operators.ValidationOperator

The base class of all validation operators.

It defines the signature of the public run method. This method and the validation_operator_config property are the only contract for the operators’ API. Everything else is up to the implementors of the validation operator classes that descend from this base class.

property validation_operator_config(self)

This method builds the config dict of a particular validation operator. The “kwargs” key is what really distinguishes different validation operators.

e.g.:

{
    "class_name": "ActionListValidationOperator",
    "module_name": "great_expectations.validation_operators",
    "name": self.name,
    "kwargs": {
        "action_list": self.action_list
    },
}

{
    "class_name": "WarningAndFailureExpectationSuitesValidationOperator",
    "module_name": "great_expectations.validation_operators",
    "name": self.name,
    "kwargs": {
        "action_list": self.action_list,
        "base_expectation_suite_name": self.base_expectation_suite_name,
        "expectation_suite_name_suffixes": self.expectation_suite_name_suffixes,
        "stop_on_first_error": self.stop_on_first_error,
        "slack_webhook": self.slack_webhook,
        "notify_on": self.notify_on,
    },
}

abstract run(self, assets_to_validate, run_id=None, evaluation_parameters=None, run_name=None, run_time=None)
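
The contract described above (a config property plus an abstract run method) can be illustrated with a small standalone sketch. It does not import Great Expectations; the class names SketchValidationOperator and EchoOperator are hypothetical stand-ins, and the toy run method only echoes its input rather than validating anything.

```python
from abc import ABC, abstractmethod


class SketchValidationOperator(ABC):
    """Hypothetical stand-in for the base class; not the library's code."""

    def __init__(self, name):
        self.name = name

    @property
    @abstractmethod
    def validation_operator_config(self):
        """Return the config dict that describes this operator."""

    @abstractmethod
    def run(self, assets_to_validate, run_id=None, evaluation_parameters=None,
            run_name=None, run_time=None):
        """Process the given assets and return a result."""


class EchoOperator(SketchValidationOperator):
    """Toy descendant that only records which assets it was given."""

    def __init__(self, name, action_list):
        super().__init__(name)
        self.action_list = action_list

    @property
    def validation_operator_config(self):
        # The "kwargs" entry is the part that differs between operators.
        return {
            "class_name": type(self).__name__,
            "module_name": __name__,
            "name": self.name,
            "kwargs": {"action_list": self.action_list},
        }

    def run(self, assets_to_validate, run_id=None, evaluation_parameters=None,
            run_name=None, run_time=None):
        return {"run_id": run_id, "assets": list(assets_to_validate)}


operator = EchoOperator(name="echo", action_list=[])
config = operator.validation_operator_config
```

Because both members are abstract, instantiating SketchValidationOperator directly raises TypeError, which mirrors the idea that only descendants are meant to be used.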
class great_expectations.validation_operators.validation_operators.ActionListValidationOperator(data_context, action_list, name, result_format={'result_format': 'SUMMARY'})

Bases: great_expectations.validation_operators.validation_operators.ValidationOperator

ActionListValidationOperator validates each batch in its run method’s assets_to_validate argument against the Expectation Suite included within that batch.

Then it invokes a list of configured actions on every validation result.

Each action in the list must be an instance of the ValidationAction class (or its descendants). See the actions reference for the actions included in Great Expectations and how to configure them. You can also implement your own actions by extending the base class.

The init command includes this operator in the default configuration file.

Configuration

An instance of ActionListValidationOperator is included in the default configuration file, great_expectations.yml, that the great_expectations init command creates.

perform_action_list_operator:  # this is the name you will use when you invoke the operator
  class_name: ActionListValidationOperator

  # the operator will call the following actions on each validation result
  # you can remove or add actions to this list. See the details in the actions
  # reference
  action_list:
    - name: store_validation_result
      action:
        class_name: StoreValidationResultAction
        target_store_name: validations_store
    - name: send_slack_notification_on_validation_result
      action:
        class_name: SlackNotificationAction
        # put the actual webhook URL in the uncommitted/config_variables.yml file
        slack_webhook: ${validation_notification_slack_webhook}
        notify_on: all # possible values: "all", "failure", "success"
        notify_with: # optional list of DataDocs sites (e.g., local_site or gcs_site) to include in the Slack notification; defaults to all configured DataDocs sites
        renderer:
          module_name: great_expectations.render.renderer.slack_renderer
          class_name: SlackRenderer
    - name: update_data_docs
      action:
        class_name: UpdateDataDocsAction

Invocation

This is an example of invoking an instance of a Validation Operator from Python:

results = context.run_validation_operator(
    assets_to_validate=[batch0, batch1, ...],
    run_id=RunIdentifier(**{
      "run_name": "some_string_that_uniquely_identifies_this_run",
      "run_time": "2020-04-29T10:46:03.197008"  # optional run timestamp, defaults to current UTC datetime
    }),  # you may also pass in a dictionary with run_name and run_time keys
    validation_operator_name="operator_instance_name",
)
  • assets_to_validate - an iterable that specifies the data assets that the operator will validate. The members of the list can be either batches or triples that will allow the operator to fetch the batch: (data_asset_name, expectation_suite_name, batch_kwargs) using this method: get_batch()

  • run_id - pipeline run id of type RunIdentifier, consisting of a run_time (always assumed to be UTC time) and run_name string that is meaningful to you and will help you refer to the result of this operation later

  • validation_operator_name - the name under which the operator instance is configured; you can configure multiple instances of a class that implements a Validation Operator
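
The invocation comment notes that a dictionary with run_name and run_time keys may be passed instead of a RunIdentifier. A minimal sketch of building such a dictionary follows; make_run_id is a hypothetical helper, and passing run_time as an ISO 8601 string is an assumption based on the string used in the example above.

```python
from datetime import datetime, timezone


def make_run_id(run_name, run_time=None):
    """Build a run_id dictionary with run_name and run_time keys.

    Hypothetical helper: run_time defaults to the current UTC time,
    matching the default described for the run timestamp.
    """
    if run_time is None:
        run_time = datetime.now(timezone.utc)
    return {"run_name": run_name, "run_time": run_time.isoformat()}


run_id = make_run_id(
    "nightly_load",
    datetime(2020, 4, 29, 10, 46, 3, 197008, tzinfo=timezone.utc),
)
```

Using a timezone-aware UTC datetime keeps the timestamp unambiguous, since run_time is always assumed to be UTC.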

The run method returns a ValidationOperatorResult object:

{
    "run_id": {"run_time": "20200527T041833.074212Z", "run_name": "my_run_name"},
    "success": True,
    "evaluation_parameters": None,
    "validation_operator_config": {
        "class_name": "ActionListValidationOperator",
        "module_name": "great_expectations.validation_operators",
        "name": "action_list_operator",
        "kwargs": {
            "action_list": [
                {
                    "name": "store_validation_result",
                    "action": {"class_name": "StoreValidationResultAction"},
                },
                {
                    "name": "store_evaluation_params",
                    "action": {"class_name": "StoreEvaluationParametersAction"},
                },
                {
                    "name": "update_data_docs",
                    "action": {"class_name": "UpdateDataDocsAction"},
                },
            ]
        },
    },
    "run_results": {
        ValidationResultIdentifier: {
            "validation_result": ExpectationSuiteValidationResult object,
            "actions_results": {
                "store_validation_result": {},
                "store_evaluation_params": {},
                "update_data_docs": {},
            },
        }
    },
}
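
The result structure shown above can be walked to find which validations failed. The sketch below uses plain dictionaries and string keys as stand-ins; the real objects are a ValidationOperatorResult, ValidationResultIdentifier keys, and ExpectationSuiteValidationResult values, so the exact access pattern there is an assumption. failed_validations is a hypothetical helper.

```python
def failed_validations(operator_result):
    """Return the run_results keys whose validation did not succeed."""
    failed = []
    for identifier, run_result in operator_result["run_results"].items():
        if not run_result["validation_result"]["success"]:
            failed.append(identifier)
    return failed


# Stand-in data shaped like the structure shown above.
example_result = {
    "success": False,
    "run_results": {
        "suite_a/run_1": {
            "validation_result": {"success": True},
            "actions_results": {"store_validation_result": {}},
        },
        "suite_b/run_1": {
            "validation_result": {"success": False},
            "actions_results": {"store_validation_result": {}},
        },
    },
}

failed = failed_validations(example_result)
```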
property validation_operator_config(self)

This method builds the config dict of a particular validation operator. The “kwargs” key is what really distinguishes different validation operators.

e.g.:

{
    "class_name": "ActionListValidationOperator",
    "module_name": "great_expectations.validation_operators",
    "name": self.name,
    "kwargs": {
        "action_list": self.action_list
    },
}

{
    "class_name": "WarningAndFailureExpectationSuitesValidationOperator",
    "module_name": "great_expectations.validation_operators",
    "name": self.name,
    "kwargs": {
        "action_list": self.action_list,
        "base_expectation_suite_name": self.base_expectation_suite_name,
        "expectation_suite_name_suffixes": self.expectation_suite_name_suffixes,
        "stop_on_first_error": self.stop_on_first_error,
        "slack_webhook": self.slack_webhook,
        "notify_on": self.notify_on,
    },
}

_build_batch_from_item(self, item)
Internal helper method to take an asset to validate, which can be either:
  1. a DataAsset; or

  2. a tuple of data_asset_name, expectation_suite_name, and batch_kwargs (suitable for passing to get_batch)

Parameters

item – The item to convert to a batch (see above)

Returns

A batch of data
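
The two accepted item shapes can be handled roughly as follows. This is a sketch under stated assumptions, not the library's implementation: build_batch_from_item and fake_get_batch are hypothetical, and the argument order passed to the fetch function simply follows the order of the triple described above.

```python
def build_batch_from_item(item, get_batch):
    """Return a batch for either a ready batch or a 3-tuple describing one.

    get_batch is any callable that fetches a batch from the triple's parts.
    """
    if isinstance(item, tuple):
        if len(item) != 3:
            raise ValueError(
                "expected (data_asset_name, expectation_suite_name, batch_kwargs)"
            )
        data_asset_name, expectation_suite_name, batch_kwargs = item
        return get_batch(data_asset_name, expectation_suite_name, batch_kwargs)
    # Anything else is assumed to already be a batch and is passed through.
    return item


def fake_get_batch(data_asset_name, expectation_suite_name, batch_kwargs):
    """Stand-in fetcher that returns a dictionary instead of a real batch."""
    return {
        "asset": data_asset_name,
        "suite": expectation_suite_name,
        "kwargs": batch_kwargs,
    }


ready_batch = {"asset": "orders", "suite": "orders.warning", "kwargs": {}}
items = [ready_batch, ("users", "users.warning", {"path": "users.csv"})]
batches = [build_batch_from_item(item, fake_get_batch) for item in items]
```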

run(self, assets_to_validate, run_id=None, evaluation_parameters=None, run_name=None, run_time=None, catch_exceptions=None, result_format=None, checkpoint_identifier=None)
_run_actions(self, batch: Union[Batch, DataAsset], expectation_suite_identifier: ExpectationSuiteIdentifier, expectation_suite, batch_validation_result, run_id, validation_result_id=None, checkpoint_identifier=None)

Runs all actions configured for this operator on the result of validating one batch against one expectation suite.

If an action fails with an exception, the method does not continue.

Parameters
  • batch

  • expectation_suite

  • batch_validation_result

  • run_id

Returns

a dictionary: {action name -> result returned by the action}
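
The behavior described above (run each configured action in order, collect results by action name, stop on the first exception) can be sketched with plain callables. The run_actions function and the three toy actions are hypothetical; real actions are ValidationAction instances with their own call signature.

```python
def run_actions(action_list, validation_result):
    """Call each action in order and map action name to its return value."""
    results = {}
    for entry in action_list:
        # An exception raised here propagates, so later actions do not run.
        results[entry["name"]] = entry["action"](validation_result)
    return results


calls = []


def store(result):
    calls.append("store")
    return {"stored": True}


def broken(result):
    raise RuntimeError("action failed")


def update_docs(result):
    calls.append("docs")
    return {}


ok_results = run_actions(
    [{"name": "store_validation_result", "action": store},
     {"name": "update_data_docs", "action": update_docs}],
    validation_result={"success": True},
)
```

If broken is placed between the two working actions, the RuntimeError escapes run_actions and update_docs is never called.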

class great_expectations.validation_operators.validation_operators.WarningAndFailureExpectationSuitesValidationOperator(data_context, action_list, name, base_expectation_suite_name=None, expectation_suite_name_suffixes=None, stop_on_first_error=False, slack_webhook=None, notify_on='all', notify_with=None, result_format={'result_format': 'SUMMARY'})

Bases: great_expectations.validation_operators.validation_operators.ActionListValidationOperator

WarningAndFailureExpectationSuitesValidationOperator is a validation operator that accepts a list of batches of data assets (or the information necessary to fetch these batches). The operator retrieves two expectation suites for each data asset/batch: one containing the critical expectations (“failure”) and the other containing non-critical expectations (“warning”). By default, the operator assumes that the first is called “failure” and the second is called “warning”, but the “base_expectation_suite_name” attribute can be specified in the operator’s configuration so that it searches for the “{base_expectation_suite_name}.failure” and “{base_expectation_suite_name}.warning” expectation suites for each data asset.
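
The naming rule in the paragraph above can be written out as a short function. This is only an illustration of the rule as described; suite_names is a hypothetical helper and not part of the operator's API.

```python
def suite_names(base_expectation_suite_name=None,
                expectation_suite_name_suffixes=("failure", "warning")):
    """Return the expectation suite names the operator would look for."""
    if base_expectation_suite_name is None:
        # No base name: the suffixes themselves are the suite names.
        return list(expectation_suite_name_suffixes)
    return [
        f"{base_expectation_suite_name}.{suffix}"
        for suffix in expectation_suite_name_suffixes
    ]


default_names = suite_names()
orders_names = suite_names("orders")
```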

The operator validates each batch against its “failure” and “warning” expectation suites and invokes a list of actions on every validation result.

The list of these actions is specified in the operator’s configuration.

Each action in the list must be an instance of ValidationAction class (or its descendants).

The operator sends a Slack notification (if “slack_webhook” is present in its config). The “notify_on” config property controls whether the notification should be sent only in the case of failure (“failure”), only in the case of success (“success”), or always (“all”).
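
The three “notify_on” values reduce to a small decision function. The sketch below only restates the rule from the paragraph above; should_notify is a hypothetical name, and raising ValueError for other values is an assumption about how an unknown setting might be handled.

```python
def should_notify(notify_on, success):
    """Decide whether a notification is sent for a run with this outcome."""
    if notify_on == "all":
        return True
    if notify_on == "success":
        return success
    if notify_on == "failure":
        return not success
    raise ValueError(f"unsupported notify_on value: {notify_on!r}")


# Outcomes for a failed run under each setting.
failed_run = {mode: should_notify(mode, success=False)
              for mode in ("all", "success", "failure")}
```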

Configuration

Below is an example of this operator’s configuration:

run_warning_and_failure_expectation_suites:
    class_name: WarningAndFailureExpectationSuitesValidationOperator

    # the following two properties are optional - by default the operator looks for
    # expectation suites named "failure" and "warning".
    # You can use these two properties to override these names.
    # e.g., with base_expectation_suite_name: boo and
    # expectation_suite_name_suffixes: ["red", "green"], the operator
    # will look for expectation suites named "boo.red" and "boo.green"
    base_expectation_suite_name: null
    expectation_suite_name_suffixes: ["failure", "warning"]

    # optional - if true, the operator will stop and exit after first failed validation. false by default.
    stop_on_first_error: false

    # put the actual webhook URL in the uncommitted/config_variables.yml file
    slack_webhook: ${validation_notification_slack_webhook}
    # optional - if "all" - notify always, "success" - notify only on success, "failure" - notify only on failure
    notify_on: all

    # the operator will call the following actions on each validation result
    # you can remove or add actions to this list. See the details in the actions
    # reference
    action_list:
      - name: store_validation_result
        action:
          class_name: StoreValidationResultAction
          target_store_name: validations_store
      - name: store_evaluation_params
        action:
          class_name: StoreEvaluationParametersAction
          target_store_name: evaluation_parameter_store

Invocation

This is an example of invoking an instance of a Validation Operator from Python:

results = context.run_validation_operator(
    assets_to_validate=[batch0, batch1, ...],
    run_id=RunIdentifier(**{
      "run_name": "some_string_that_uniquely_identifies_this_run",
      "run_time": "2020-04-29T10:46:03.197008"  # optional run timestamp, defaults to current UTC datetime
    }),  # you may also pass in a dictionary with run_name and run_time keys
    validation_operator_name="operator_instance_name",
)
  • assets_to_validate - an iterable that specifies the data assets that the operator will validate. The members of the list can be either batches or triples that will allow the operator to fetch the batch: (data_asset_name, expectation_suite_name, batch_kwargs) using this method: get_batch()

  • run_id - pipeline run id of type RunIdentifier, consisting of a run_time (always assumed to be UTC time) and run_name string that is meaningful to you and will help you refer to the result of this operation later

  • validation_operator_name - the name under which the operator instance is configured; you can configure multiple instances of a class that implements a Validation Operator

The run method returns a ValidationOperatorResult object.

The value of “success” is True if no critical (“failure”) expectation suite failed to validate. Non-critical (“warning”) expectation suites are allowed to fail without affecting the success status of the run.

{
    "run_id": {"run_time": "20200527T041833.074212Z", "run_name": "my_run_name"},
    "success": True,
    "evaluation_parameters": None,
    "validation_operator_config": {
        "class_name": "WarningAndFailureExpectationSuitesValidationOperator",
        "module_name": "great_expectations.validation_operators",
        "name": "warning_and_failure_operator",
        "kwargs": {
            "action_list": [
                {
                    "name": "store_validation_result",
                    "action": {"class_name": "StoreValidationResultAction"},
                },
                {
                    "name": "store_evaluation_params",
                    "action": {"class_name": "StoreEvaluationParametersAction"},
                },
                {
                    "name": "update_data_docs",
                    "action": {"class_name": "UpdateDataDocsAction"},
                },
            ],
            "base_expectation_suite_name": ...,
            "expectation_suite_name_suffixes": ...,
            "stop_on_first_error": ...,
            "slack_webhook": ...,
            "notify_on": ...,
            "notify_with": ...,
        },
    },
    "run_results": {
        ValidationResultIdentifier: {
            "validation_result": ExpectationSuiteValidationResult object,
            "expectation_suite_severity_level": "warning",
            "actions_results": {
                "store_validation_result": {},
                "store_evaluation_params": {},
                "update_data_docs": {},
            },
        }
    }
}
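
The success rule stated above (only “failure” suites affect the overall status) can be sketched against the run_results layout. Plain dictionaries and string keys stand in for the real result objects, overall_success is a hypothetical helper, and treating “failure” as the severity value for critical suites is an assumption based on the suite naming.

```python
def overall_success(run_results):
    """True unless a validation at the "failure" severity level failed."""
    return all(
        run_result["validation_result"]["success"]
        for run_result in run_results.values()
        if run_result["expectation_suite_severity_level"] == "failure"
    )


# A failed warning suite alone does not affect the overall status.
warning_only_failure = {
    "orders.failure": {
        "validation_result": {"success": True},
        "expectation_suite_severity_level": "failure",
    },
    "orders.warning": {
        "validation_result": {"success": False},
        "expectation_suite_severity_level": "warning",
    },
}

# A failed critical suite makes the whole run unsuccessful.
critical_failure = {
    "orders.failure": {
        "validation_result": {"success": False},
        "expectation_suite_severity_level": "failure",
    },
}

status_warning_only = overall_success(warning_only_failure)
status_critical = overall_success(critical_failure)
```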
property validation_operator_config(self)

This method builds the config dict of a particular validation operator. The “kwargs” key is what really distinguishes different validation operators.

e.g.:

{
    "class_name": "ActionListValidationOperator",
    "module_name": "great_expectations.validation_operators",
    "name": self.name,
    "kwargs": {
        "action_list": self.action_list
    },
}

{
    "class_name": "WarningAndFailureExpectationSuitesValidationOperator",
    "module_name": "great_expectations.validation_operators",
    "name": self.name,
    "kwargs": {
        "action_list": self.action_list,
        "base_expectation_suite_name": self.base_expectation_suite_name,
        "expectation_suite_name_suffixes": self.expectation_suite_name_suffixes,
        "stop_on_first_error": self.stop_on_first_error,
        "slack_webhook": self.slack_webhook,
        "notify_on": self.notify_on,
    },
}

_build_slack_query(self, validation_operator_result: ValidationOperatorResult)
run(self, assets_to_validate, run_id=None, base_expectation_suite_name=None, evaluation_parameters=None, run_name=None, run_time=None, result_format=None)