great_expectations.validator.validator

Module Contents

Classes
great_expectations.validator.validator.logger

great_expectations.validator.validator.pd
class great_expectations.validator.validator.ValidationDependencies

    metric_configurations: Dict[str, MetricConfiguration]

    result_format: Dict[str, Any]

    set_metric_configuration(self, metric_name: str, metric_configuration: MetricConfiguration)
        Sets the specified MetricConfiguration for metric_name in the metric_configurations dependencies dictionary.

    get_metric_configuration(self, metric_name: str)
        Obtains the MetricConfiguration for the specified metric_name from the metric_configurations dependencies dictionary.

    remove_metric_configuration(self, metric_name: str)
        Removes the MetricConfiguration for the specified metric_name from the metric_configurations dependencies dictionary.

    get_metric_names(self)
        Returns the metric_name keys for which MetricConfiguration dependency objects have been specified.

    get_metric_configurations(self)
        Returns the MetricConfiguration dependency objects that have been specified.
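The methods above amount to a thin wrapper around a plain dictionary keyed by metric name. A minimal pure-Python sketch of that contract (the `MetricConfigurationStub` class is a hypothetical stand-in; the real MetricConfiguration carries domain and value kwargs):

```python
from typing import Dict, List


class MetricConfigurationStub:
    """Hypothetical stand-in for great_expectations' MetricConfiguration."""

    def __init__(self, metric_name: str) -> None:
        self.metric_name = metric_name


class ValidationDependenciesSketch:
    """Illustrates the documented dict-backed contract; not the real class."""

    def __init__(self) -> None:
        self.metric_configurations: Dict[str, MetricConfigurationStub] = {}

    def set_metric_configuration(
        self, metric_name: str, metric_configuration: MetricConfigurationStub
    ) -> None:
        # Sets the configuration for metric_name in the dependencies dictionary.
        self.metric_configurations[metric_name] = metric_configuration

    def get_metric_configuration(self, metric_name: str) -> MetricConfigurationStub:
        # Obtains the configuration for the specified metric_name.
        return self.metric_configurations[metric_name]

    def remove_metric_configuration(self, metric_name: str) -> None:
        # Removes the configuration for the specified metric_name.
        del self.metric_configurations[metric_name]

    def get_metric_names(self) -> List[str]:
        # Returns the metric_name keys that have been specified.
        return list(self.metric_configurations.keys())

    def get_metric_configurations(self) -> List[MetricConfigurationStub]:
        # Returns the configuration objects that have been specified.
        return list(self.metric_configurations.values())


deps = ValidationDependenciesSketch()
deps.set_metric_configuration("table.row_count", MetricConfigurationStub("table.row_count"))
print(deps.get_metric_names())  # ['table.row_count']
```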
great_expectations.validator.validator.ValidationStatistics
class great_expectations.validator.validator.Validator(execution_engine: ExecutionEngine, interactive_evaluation: bool = True, expectation_suite: Optional[ExpectationSuite] = None, expectation_suite_name: Optional[str] = None, data_context: Optional[AbstractDataContext] = None, batches: Optional[Union[List[Batch], Sequence[Union[Batch, XBatch]]]] = None, include_rendered_content: Optional[bool] = None, **kwargs)

    DEFAULT_RUNTIME_CONFIGURATION

    RUNTIME_KEYS
    property execution_engine(self)
        Returns the ExecutionEngine being used by the Validator at the given time.

    property metrics_calculator(self)
        Returns the MetricsCalculator object being used by the Validator to handle metrics computations.

    property data_context(self)
        Reference to the DataContext object handle.

    property expose_dataframe_methods(self)
        The expose_dataframe_methods getter property.

    property loaded_batch_ids(self)
        Getter for the IDs of loaded Batch objects (convenience property).

    property active_batch_data(self)
        Getter for the BatchData object from the currently-active Batch object (convenience property).

    property batch_cache(self)
        Getter for the dictionary of Batch objects (convenience property).

    property batches(self)
        Getter for the dictionary of Batch objects (alias convenience property, to be deprecated).

    property active_batch_id(self)
        Getter for the batch_id of the active Batch (convenience property).

    property active_batch(self)
        Getter for the active Batch (convenience property).

    property active_batch_spec(self)
        Getter for the batch_spec of the active Batch (convenience property).

    property active_batch_markers(self)
        Getter for the batch_markers of the active Batch (convenience property).

    property active_batch_definition(self)
        Getter for the batch_definition of the active Batch (convenience property).

    property expectation_suite(self)

    property expectation_suite_name(self)
        Gets the current expectation_suite name of this data_asset as stored in the expectations configuration.
    load_batch_list(self, batch_list: List[Batch])

    get_metric(self, metric: MetricConfiguration)
        Convenience method that returns the value of the requested metric. (To be deprecated in favor of using methods in the MetricsCalculator class.)

    get_metrics(self, metrics: Dict[str, MetricConfiguration])
        Convenience method that resolves requested metrics (specified as a dictionary, keyed by MetricConfiguration ID). (To be deprecated in favor of using methods in the MetricsCalculator class.)

        Parameters:
            metrics – Dictionary of desired metrics to be resolved; metric_name is the key and MetricConfiguration is the value.

        Returns:
            Dictionary with the requested metrics resolved, with metric_name as the key and the computed metric as the value.

    compute_metrics(self, metric_configurations: List[MetricConfiguration], runtime_configuration: Optional[dict] = None)
        Convenience method that computes requested metrics (specified as elements of a MetricConfiguration list). (To be deprecated in favor of using methods in the MetricsCalculator class.)

        Parameters:
            metric_configurations – List of desired MetricConfiguration objects to be resolved.
            runtime_configuration – Additional run-time settings (see Validator.DEFAULT_RUNTIME_CONFIGURATION).

        Returns:
            Dictionary with the requested metrics resolved, with the unique metric ID as the key and the computed metric as the value.

    columns(self, domain_kwargs: Optional[Dict[str, Any]] = None)
        Convenience method to obtain Batch columns. (To be deprecated in favor of using methods in the MetricsCalculator class.)

    head(self, n_rows: int = 5, domain_kwargs: Optional[Dict[str, Any]] = None, fetch_all: bool = False)
        Convenience method to obtain the first few rows of the Batch. (To be deprecated in favor of using methods in the MetricsCalculator class.)
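Note the difference in the two return shapes above: get_metrics re-keys results by the caller-supplied metric_name, while compute_metrics keys them by the unique metric ID. A plain-dictionary sketch of that relationship (the ID tuples and values below are illustrative only; real MetricConfiguration objects derive their IDs from domain and value kwargs):

```python
from typing import Any, Dict, Tuple

# Hypothetical resolved-metric store keyed by a unique metric ID tuple
# (metric_name, domain_kwargs_id, value_kwargs_id), as compute_metrics returns.
resolved_by_id: Dict[Tuple[str, str, str], Any] = {
    ("table.row_count", "batch_id=b1", ""): 1000,
    ("column.mean", "batch_id=b1,column=fare", ""): 32.7,
}

# get_metrics re-keys the same resolved values by the caller-supplied name.
requested: Dict[str, Tuple[str, str, str]] = {
    "row_count": ("table.row_count", "batch_id=b1", ""),
    "fare_mean": ("column.mean", "batch_id=b1,column=fare", ""),
}
resolved_by_name = {name: resolved_by_id[metric_id] for name, metric_id in requested.items()}
print(resolved_by_name)  # {'row_count': 1000, 'fare_mean': 32.7}
```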
    __dir__(self)
        This custom magic method is used to enable expectation tab completion on Validator objects. It also allows users to call pandas.DataFrame methods on Validator objects.

    _determine_progress_bars(self)

    __getattr__(self, name)

    validate_expectation(self, name: str)
        Given the name of an Expectation, obtains the class-first Expectation implementation and uses the Expectation's validate method to obtain a validation result. Also adds in the runtime configuration.

        Parameters:
            name (str) – The name of the Expectation being validated.

        Returns:
            The Expectation's validation result.
    _build_expectation_configuration(self, expectation_type: str, expectation_kwargs: dict, meta: dict, expectation_impl: Expectation)

    build_rule_based_profiler_for_expectation(self, expectation_type: str)
        Given the name of an Expectation (expectation_type), builds an effective RuleBasedProfiler object from configuration.

        Parameters:
            expectation_type (str) – Name of the Expectation for which a Rule-Based Profiler may be configured.

        Returns:
            Function that builds an effective RuleBasedProfiler object (for the specified expectation_type).

    _build_rule_based_profiler_from_config_and_runtime_args(self, expectation_type: str, expectation_kwargs: dict, success_keys: Tuple[str], profiler_config: RuleBasedProfilerConfig, override_profiler_config: Optional[RuleBasedProfilerConfig] = None)

    _validate_profiler_and_update_rules_properties(self, profiler: BaseRuleBasedProfiler, expectation_type: str, expectation_kwargs: dict, success_keys: Tuple[str])

    _update_metric_value_kwargs_for_success_keys(self, parameter_builder: ParameterBuilder, metric_value_kwargs: Optional[dict] = None)

    list_available_expectation_types(self)
        Returns a list of all Expectations available to the Validator.
    graph_validate(self, configurations: List[ExpectationConfiguration], runtime_configuration: Optional[dict] = None)
        Obtains validation dependencies for each metric using the implementation of their associated Expectation, then adds these dependencies to the validation graph, supplies readily available metric implementations to fulfill current metric requirements, and validates these metrics.

        Parameters:
            configurations (List[ExpectationConfiguration]) – A list of needed Expectation Configurations that will be used to supply domain and values for metrics.
            runtime_configuration (dict) – A dictionary of runtime keyword arguments controlling semantics, such as the result_format.

        Returns:
            A list of validation results, validating that all necessary metrics are available.

    _generate_metric_dependency_subgraphs_for_each_expectation_configuration(self, expectation_configurations: List[ExpectationConfiguration], processed_configurations: List[ExpectationConfiguration], catch_exceptions: bool, runtime_configuration: Optional[dict] = None)

    _generate_suite_level_graph_from_expectation_level_sub_graphs(self, expectation_validation_graphs: List[ExpectationValidationGraph])

    static _resolve_suite_level_graph_and_process_metric_evaluation_errors(graph: ValidationGraph, runtime_configuration: dict, expectation_validation_graphs: List[ExpectationValidationGraph], evrs: List[ExpectationValidationResult], processed_configurations: List[ExpectationConfiguration], show_progress_bars: bool)

    static _catch_exceptions_in_failing_expectation_validations(exception_traceback: str, exception: Exception, failing_expectation_configurations: List[ExpectationConfiguration], evrs: List[ExpectationValidationResult])
        Catches exceptions in failing Expectation validations and converts them to unsuccessful ExpectationValidationResult objects.

        Parameters:
            exception_traceback – Traceback related to the raised Exception.
            exception – The Exception raised.
            failing_expectation_configurations – ExpectationConfigurations that failed.
            evrs – List of ExpectationValidationResult objects to append failures to.

        Returns:
            List of ExpectationValidationResult objects with the unsuccessful ExpectationValidationResult objects appended.
    append_expectation(self, expectation_config: ExpectationConfiguration)
        This method is a thin wrapper for ExpectationSuite.append_expectation.

    find_expectation_indexes(self, expectation_configuration: ExpectationConfiguration, match_type: str = 'domain')
        This method is a thin wrapper for ExpectationSuite.find_expectation_indexes.

    find_expectations(self, expectation_configuration: ExpectationConfiguration, match_type: str = 'domain', ge_cloud_id: Optional[str] = None)
        This method is a thin wrapper for ExpectationSuite.find_expectations().

    remove_expectation(self, expectation_configuration: ExpectationConfiguration, match_type: str = 'domain', remove_multiple_matches: bool = False, ge_cloud_id: Optional[str] = None)

    discard_failing_expectations(self)
        Removes from the Validator any expectations whose validation has failed.

    get_default_expectation_arguments(self)
        Fetches the default expectation arguments for this data_asset.

        Returns:
            A dictionary containing all the current default expectation arguments for a data_asset. For example:

            { "include_config": True, "catch_exceptions": False, "result_format": "BASIC" }

        See also:
            set_default_expectation_arguments
    property cloud_mode(self)
        Wrapper around the cloud_mode property of the associated Data Context.

    property ge_cloud_mode(self)

    property default_expectation_args(self)
        A getter for default Expectation arguments.

    set_default_expectation_argument(self, argument: str, value)
        Sets a default expectation argument for this data_asset.

        Parameters:
            argument (string) – The argument to be replaced.
            value – The new value to use for the replacement.

        Returns:
            None

        See also:
            get_default_expectation_arguments
    get_expectations_config(self, discard_failed_expectations: bool = True, discard_result_format_kwargs: bool = True, discard_include_config_kwargs: bool = True, discard_catch_exceptions_kwargs: bool = True, suppress_warnings: bool = False)
        Returns an expectation configuration, providing options to discard failed expectations and to discard or include different result aspects, such as exceptions and result format.

    get_expectation_suite(self, discard_failed_expectations: bool = True, discard_result_format_kwargs: bool = True, discard_include_config_kwargs: bool = True, discard_catch_exceptions_kwargs: bool = True, suppress_warnings: bool = False, suppress_logging: bool = False)
        Returns _expectation_config as a JSON object, performing some cleaning along the way.

        Parameters:
            discard_failed_expectations (boolean) – Only include expectations with success_on_last_run=True in the exported config. Defaults to True.
            discard_result_format_kwargs (boolean) – In returned expectation objects, suppress the result_format parameter. Defaults to True.
            discard_include_config_kwargs (boolean) – In returned expectation objects, suppress the include_config parameter. Defaults to True.
            discard_catch_exceptions_kwargs (boolean) – In returned expectation objects, suppress the catch_exceptions parameter. Defaults to True.
            suppress_warnings (boolean) – If True, do not include warnings in logging information about the operation.
            suppress_logging (boolean) – If True, do not create a log entry (useful when using get_expectation_suite programmatically).

        Returns:
            An expectation suite.

        Note:
            get_expectation_suite does not affect the underlying expectation suite at all. The returned suite is a copy of _expectation_suite, not the original object.

    save_expectation_suite(self, filepath: Optional[str] = None, discard_failed_expectations: bool = True, discard_result_format_kwargs: bool = True, discard_include_config_kwargs: bool = True, discard_catch_exceptions_kwargs: bool = True, suppress_warnings: bool = False)
        Writes the DataAsset's expectation config (_expectation_config) to the specified JSON filepath. Failing expectations can be excluded from the JSON expectations config with discard_failed_expectations. The kwarg key-value pairs result_format, include_config, and catch_exceptions are optionally excluded from the JSON expectations config.

        Parameters:
            filepath (string) – The location and name to write the JSON config file to.
            discard_failed_expectations (boolean) – If True, excludes expectations that do not return success = True. If False, all expectations are written to the JSON config file.
            discard_result_format_kwargs (boolean) – If True, the result_format attribute for each expectation is not written to the JSON config file.
            discard_include_config_kwargs (boolean) – If True, the include_config attribute for each expectation is not written to the JSON config file.
            discard_catch_exceptions_kwargs (boolean) – If True, the catch_exceptions attribute for each expectation is not written to the JSON config file.
            suppress_warnings (boolean) – If True, all warnings raised by Great Expectations as a result of dropped expectations are suppressed.
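The discard options above are essentially filters over the suite's expectation list, applied to a copy rather than the suite itself. A pure-Python sketch of the discard_failed_expectations behavior (the field names below are illustrative; the real suite stores ExpectationConfiguration objects, not plain dicts):

```python
from typing import Any, Dict, List

# A toy "suite": one passing and one failing expectation from the last run.
suite: List[Dict[str, Any]] = [
    {"expectation_type": "expect_column_to_exist", "success_on_last_run": True},
    {"expectation_type": "expect_column_values_to_not_be_null", "success_on_last_run": False},
]


def export_suite(
    expectations: List[Dict[str, Any]], discard_failed_expectations: bool = True
) -> List[Dict[str, Any]]:
    """Returns a copy of the suite, optionally keeping only expectations that last succeeded."""
    if discard_failed_expectations:
        expectations = [e for e in expectations if e["success_on_last_run"]]
    # Copy each entry so the caller cannot mutate the underlying suite.
    return [dict(e) for e in expectations]


exported = export_suite(suite)
print([e["expectation_type"] for e in exported])  # ['expect_column_to_exist']
```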
    validate(self, expectation_suite=None, run_id=None, data_context: Optional[Any] = None, evaluation_parameters: Optional[dict] = None, catch_exceptions: bool = True, result_format: Optional[str] = None, only_return_failures: bool = False, run_name: Optional[str] = None, run_time: Optional[str] = None, checkpoint_name: Optional[str] = None)
        Generates a JSON-formatted report describing the outcome of all expectations.

        Use the default expectation_suite=None to validate the expectations config associated with the DataAsset.

        Parameters:
            expectation_suite (json or None) – If None, uses the expectations config generated with the DataAsset during the current session. If a JSON file, validates those expectations.
            run_id (str) – Used to identify this validation result as part of a collection of validations. See DataContext for more information.
            run_name (str) – Used to identify this validation result as part of a collection of validations. See DataContext for more information.
            run_time (str) – Used to identify this validation result as part of a collection of validations. See DataContext for more information.
            data_context (DataContext) – A DataContext object to use as part of validation for binding evaluation parameters and registering validation results.
            evaluation_parameters (dict or None) – If None, uses the evaluation_parameters from the expectation_suite provided or as part of the data_asset. If a dict, uses the evaluation parameters in the dictionary.
            catch_exceptions (boolean) – If True, exceptions raised by tests will not end validation and will be described in the returned report.
            result_format (string or None) – If None, uses the default value ('BASIC' or as specified). If a string, the returned expectation output follows the specified format ('BOOLEAN_ONLY', 'BASIC', etc.).
            only_return_failures (boolean) – If True, expectation results are only returned when success = False.
            checkpoint_name (string or None) – Name of the Checkpoint which invoked this Validator.validate() call against an Expectation Suite. It will be added to the meta field of the returned ExpectationSuiteValidationResult.

        Returns:
            A JSON-formatted dictionary containing a list of the validation results. An example of the returned format:
            {
              "results": [
                {
                  "unexpected_list": [unexpected_value_1, unexpected_value_2],
                  "expectation_type": "expect_*",
                  "kwargs": {
                    "column": "Column_Name",
                    "output_format": "SUMMARY"
                  },
                  "success": true,
                  "raised_exception": false,
                  "exception_traceback": null
                },
                { ... (second expectation results) },
                ... (more expectation results)
              ],
              "success": true,
              "statistics": {
                "evaluated_expectations": n,
                "successful_expectations": m,
                "unsuccessful_expectations": n - m,
                "success_percent": m / n
              }
            }
        Notes:
            A warning is issued if the configuration object was built with a different version of Great Expectations than the current environment, or if no version was found in the configuration file.
        Raises:
            AttributeError – if catch_exceptions=None and an expectation throws an AttributeError.
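The statistics block in the example return format follows directly from the per-expectation success flags. A short sketch of that arithmetic in plain Python, independent of the library (the example above reports success_percent as m / n; note that some versions of the library scale this value to 0–100 instead):

```python
from typing import Any, Dict, List

# Toy validation results: three evaluated expectations, two successful.
results: List[Dict[str, Any]] = [
    {"expectation_type": "expect_column_to_exist", "success": True},
    {"expectation_type": "expect_column_values_to_not_be_null", "success": True},
    {"expectation_type": "expect_column_values_to_be_unique", "success": False},
]

n = len(results)                             # evaluated_expectations
m = sum(1 for r in results if r["success"])  # successful_expectations
statistics = {
    "evaluated_expectations": n,
    "successful_expectations": m,
    "unsuccessful_expectations": n - m,
    "success_percent": m / n,
}
print(statistics["success_percent"])  # 0.6666666666666666
```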
    get_evaluation_parameter(self, parameter_name, default_value=None)
        Gets an evaluation parameter value that has been stored in meta.

        Parameters:
            parameter_name (string) – The name of the parameter to retrieve.
            default_value (any) – The default value to be returned if the parameter is not found.

        Returns:
            The current value of the evaluation parameter.

    set_evaluation_parameter(self, parameter_name, parameter_value)
        Provides a value to be stored in the data_asset evaluation_parameters object and used to evaluate parameterized expectations.

        Parameters:
            parameter_name (string) – The name of the kwarg to be replaced at evaluation time.
            parameter_value (any) – The value to be used.
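The two methods above amount to a keyed store with a default on lookup. A dict-based sketch of that contract (illustrative only; the real Validator stores these values in the suite's evaluation_parameters/meta):

```python
from typing import Any, Dict

evaluation_parameters: Dict[str, Any] = {}


def set_evaluation_parameter(parameter_name: str, parameter_value: Any) -> None:
    """Stores a value used later to evaluate parameterized expectations."""
    evaluation_parameters[parameter_name] = parameter_value


def get_evaluation_parameter(parameter_name: str, default_value: Any = None) -> Any:
    """Returns the stored value, or default_value if the parameter is not found."""
    return evaluation_parameters.get(parameter_name, default_value)


set_evaluation_parameter("min_row_count", 1000)
print(get_evaluation_parameter("min_row_count"))             # 1000
print(get_evaluation_parameter("missing", default_value=0))  # 0
```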
    add_citation(self, comment: str, batch_spec: Optional[dict] = None, batch_markers: Optional[dict] = None, batch_definition: Optional[dict] = None, citation_date: Optional[str] = None)
        Adds a citation to an existing Expectation Suite within the Validator.

    test_expectation_function(self, function: Callable, *args, **kwargs)
        Tests a generic expectation function.

        Parameters:
            function (func) – The function to be tested. (Must be a valid expectation function.)
            *args – Positional arguments to be passed to the function.
            **kwargs – Keyword arguments to be passed to the function.

        Returns:
            A JSON-serializable expectation result object.

        Notes:
            This function is a thin layer that allows quick testing of new expectation functions without having to define custom classes, etc. To use developed Expectations from the command-line tool, you will still need to define custom classes, etc.

            Check out How to create custom Expectations for more information.
    static _parse_validation_graph(validation_graph: ValidationGraph, metrics: Dict[Tuple[str, str, str], MetricValue])
        Given a validation graph, returns the ready and needed metrics necessary for validation, using a traversal of the validation graph's edges (a graph structure of metric IDs).

    _initialize_expectations(self, expectation_suite: Optional[ExpectationSuite] = None, expectation_suite_name: Optional[str] = None)
        Instantiates _expectation_suite as empty by default, or with a specified expectation config. In addition, this always sets default_expectation_args to:

            include_config: False, catch_exceptions: False, output_format: 'BASIC'

        By default, initializes data_asset_type to the name of the implementing class, but subclasses that have interoperable semantics (e.g. Dataset) may override that parameter to clarify their interoperability.

        Parameters:
            expectation_suite (json) – A JSON-serializable expectation config. If None, creates a default _expectation_suite with an empty list of expectations and data_asset_name as the data_asset_name key value.
            expectation_suite_name (string) – The name to assign to expectation_suite.expectation_suite_name.

        Returns:
            None

    _get_runtime_configuration(self, catch_exceptions: Optional[bool] = None, result_format: Optional[Union[dict, str]] = None)

    static _calc_validation_statistics(validation_results: List[ExpectationValidationResult])
        Calculates summary statistics for the validation results and returns ExpectationStatistics.
class great_expectations.validator.validator.BridgeValidator(batch, expectation_suite, expectation_engine=None, **kwargs)
    This is currently helping bridge APIs.

    get_dataset(self)
        Bridges between Execution Engines in providing access to the batch data. Validates that Dataset classes contain the proper type of data (i.e. a Pandas Dataset does not contain SqlAlchemy data).