great_expectations.data_context

Subpackages
great_expectations.data_context.store
great_expectations.data_context.store.database_store_backend
great_expectations.data_context.store.expectations_store
great_expectations.data_context.store.html_site_store
great_expectations.data_context.store.metric_store
great_expectations.data_context.store.query_store
great_expectations.data_context.store.store
great_expectations.data_context.store.store_backend
great_expectations.data_context.store.tuple_store_backend
great_expectations.data_context.store.validations_store
great_expectations.data_context.types
Submodules

Package Contents

Classes

- BaseDataContext: This class implements most of the functionality of DataContext, with a few exceptions.
- DataContext: A DataContext represents a Great Expectations project. It organizes storage and access for expectation suites, datasources, notification settings, and data fixtures.
- ExplorerDataContext: A DataContext represents a Great Expectations project. It organizes storage and access for expectation suites, datasources, notification settings, and data fixtures.
- class great_expectations.data_context.BaseDataContext(project_config, context_root_dir=None, runtime_environment=None)
Bases: object
This class implements most of the functionality of DataContext, with a few exceptions:
- BaseDataContext does not attempt to keep its project_config in sync with a file on disk.
- BaseDataContext does not attempt to "guess" paths or object types. Instead, that logic is pushed into the DataContext class.
Together, these changes make the BaseDataContext class more testable.
OS - Linux - How-to Guide
    Description: TODO: OS - Linux Description
    Maturity: Production
    Details: API Stability: N/A; Implementation Completeness: N/A; Unit Test Coverage: Complete; Integration Infrastructure/Test Coverage: Complete; Documentation Completeness: Complete; Bug Risk: Low
OS - MacOS - How-to Guide
    Description: TODO: OS - MacOS Description
    Maturity: Production
    Details: API Stability: N/A; Implementation Completeness: N/A; Unit Test Coverage: Complete (local only); Integration Infrastructure/Test Coverage: Complete (local only); Documentation Completeness: Complete; Bug Risk: Low
OS - Windows - How-to Guide
    Description: TODO: OS - Windows Description
    Maturity: Beta
    Details: API Stability: N/A; Implementation Completeness: N/A; Unit Test Coverage: Minimal; Integration Infrastructure/Test Coverage: Minimal; Documentation Completeness: Complete; Bug Risk: Moderate
Create and Edit Expectations - suite scaffold - How-to Guide
    Description: Creating Expectation Suites through an interactive development loop using suite scaffold
    Maturity: Experimental (expect exciting changes to Profiler capability)
    Details: API Stability: N/A; Implementation Completeness: N/A; Unit Test Coverage: N/A; Integration Infrastructure/Test Coverage: Partial; Documentation Completeness: Complete; Bug Risk: Low
Create and Edit Expectations - CLI - How-to Guide
    Description: Creating an Expectation Suite with the great_expectations suite new command
    Maturity: Experimental (expect exciting changes to Profiler and Suite Renderer capability)
    Details: API Stability: N/A; Implementation Completeness: N/A; Unit Test Coverage: N/A; Integration Infrastructure/Test Coverage: Partial; Documentation Completeness: Complete; Bug Risk: Low
Create and Edit Expectations - Json schema - How-to Guide
    Description: Creating a new Expectation Suite using the JsonSchemaProfiler function and a JSON schema file
    Maturity: Experimental (expect exciting changes to Profiler capability)
    Details: API Stability: N/A; Implementation Completeness: N/A; Unit Test Coverage: N/A; Integration Infrastructure/Test Coverage: Partial; Documentation Completeness: Complete; Bug Risk: Low
- PROFILING_ERROR_CODE_TOO_MANY_DATA_ASSETS = 2
- PROFILING_ERROR_CODE_SPECIFIED_DATA_ASSETS_NOT_FOUND = 3
- PROFILING_ERROR_CODE_NO_BATCH_KWARGS_GENERATORS_FOUND = 4
- PROFILING_ERROR_CODE_MULTIPLE_BATCH_KWARGS_GENERATORS_FOUND = 5
- UNCOMMITTED_DIRECTORIES = ['data_docs', 'validations']
- GE_UNCOMMITTED_DIR = uncommitted
- CHECKPOINTS_DIR = checkpoints
- BASE_DIRECTORIES
- NOTEBOOK_SUBDIRECTORIES = ['pandas', 'spark', 'sql']
- GE_DIR = great_expectations
- GE_YML = great_expectations.yml
- GE_EDIT_NOTEBOOK_DIR
- FALSEY_STRINGS = ['FALSE', 'false', 'False', 'f', 'F', '0']
- GLOBAL_CONFIG_PATHS
- classmethod validate_config(cls, project_config)
- _build_store(self, store_name, store_config)
- _init_stores(self, store_configs)
Initialize all Stores for this DataContext.
Stores are a good fit for reading/writing objects that:
- follow a clear key-value pattern, and
- are usually edited programmatically, using the Context.
In general, Stores should take over most of the reading and writing to disk that DataContext previously did. As of 9/21/2019, the following had not yet been implemented as Stores:
- great_expectations.yml
- expectations
- data documentation
- config_variables
- anything accessed via write_resource
Note that Stores do NOT manage plugins.
- _apply_global_config_overrides(self)
- _get_global_config_value(self, environment_variable=None, conf_file_section=None, conf_file_option=None)
- _check_global_usage_statistics_opt_out(self)
- _initialize_usage_statistics(self, usage_statistics_config: AnonymizedUsageStatisticsConfig)
Initialize the usage statistics system.
- add_store(self, store_name, store_config)
Add a new Store to the DataContext and (for convenience) return the instantiated Store object.
- Parameters
store_name (str) – a key for the new Store in self._stores
store_config (dict) – a config for the Store to add
- Returns
store (Store)
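A minimal usage sketch, given an instantiated DataContext (context); the store name, backend class, and base_directory below are illustrative, with TupleFilesystemStoreBackend taken from the store subpackages listed above:

    # Hypothetical example: register a second validations store backed by the local filesystem.
    new_store = context.add_store(
        store_name="local_validations_store",  # illustrative name
        store_config={
            "class_name": "ValidationsStore",
            "store_backend": {
                "class_name": "TupleFilesystemStoreBackend",
                "base_directory": "uncommitted/validations/",
            },
        },
    )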
- add_validation_operator(self, validation_operator_name, validation_operator_config)
Add a new ValidationOperator to the DataContext and (for convenience) return the instantiated object.
- Parameters
validation_operator_name (str) – a key for the new ValidationOperator in self._validation_operators
validation_operator_config (dict) – a config for the ValidationOperator to add
- Returns
validation_operator (ValidationOperator)
- _normalize_absolute_or_relative_path(self, path)
- _normalize_store_path(self, resource_store)
- get_docs_sites_urls(self, resource_identifier=None, site_name: Optional[str] = None, only_if_exists=True)
Get URLs for a resource for all data docs sites.
This function will return URLs for any configured site even if the sites have not been built yet.
- Parameters
resource_identifier (object) – optional. It can be an identifier of an ExpectationSuite, ValidationResult, or other resource that has a typed identifier. If not provided, the method will return the URLs of the index page.
site_name – Optionally specify which site to open. If not specified, return all URLs in the project.
- Returns
a list of URLs; each item is the URL for the resource for a data docs site
- Return type
list
- _load_site_builder_from_site_config(self, site_config)
- open_data_docs(self, resource_identifier: Optional[str] = None, site_name: Optional[str] = None, only_if_exists=True)
A stdlib cross-platform way to open a file in a browser.
- Parameters
resource_identifier – ExpectationSuiteIdentifier, ValidationResultIdentifier or any other type’s identifier. The argument is optional - when not supplied, the method returns the URL of the index page.
site_name – Optionally specify which site to open. If not specified, open all docs found in the project.
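A minimal sketch of typical use, given an instantiated DataContext (context) with at least one data docs site configured; building first ensures there is something to open:

    # Build all configured data docs sites, then open each site's index page in a browser.
    context.build_data_docs()
    context.open_data_docs()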
- property root_directory(self)
The root directory for configuration objects in the data context; the location in which great_expectations.yml is located.
- property plugins_directory(self)
The directory in which custom plugin modules should be placed.
- property _project_config_with_variables_substituted(self)
- property anonymous_usage_statistics(self)
- property notebooks(self)
- property stores(self)
A single holder for all Stores in this context.
- property datasources(self)
A single holder for all Datasources in this context.
- property expectations_store_name(self)
- property data_context_id(self)
- property instance_id(self)
- _load_config_variables_file(self)
Get all config variables from the default location.
- get_config_with_variables_substituted(self, config=None)
- save_config_variable(self, config_variable_name, value)
Save a config variable value.
- Parameters
config_variable_name – name of the property
value – the value to save for the property
- Returns
None
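A minimal sketch, given an instantiated DataContext (context); the variable name and value are illustrative. The saved value can then be referenced from great_expectations.yml as ${my_postgres_password}:

    # Persist a secret into the project's config variables file rather than great_expectations.yml.
    context.save_config_variable("my_postgres_password", "REPLACE_ME")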
- delete_datasource(self, datasource_name=None)
Delete a datasource.
- Parameters
datasource_name – The name of the datasource to delete.
- Raises
ValueError – If the datasource name isn't provided or cannot be found.
- get_available_data_asset_names(self, datasource_names=None, batch_kwargs_generator_names=None)
Inspect datasources and batch kwargs generators to provide available data_asset objects.
- Parameters
datasource_names – list of datasources for which to provide available data_asset_name objects. If None, return available data assets for all datasources.
batch_kwargs_generator_names – list of batch kwargs generators for which to provide available data_asset_name objects.
- Returns
Dictionary describing available data assets
{ datasource_name: { batch_kwargs_generator_name: [ data_asset_1, data_asset_2, ... ] ... } ... }
- Return type
data_asset_names (dict)
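A minimal sketch, given an instantiated DataContext (context) with at least one datasource configured:

    # Ask every datasource (and each of its batch kwargs generators) what assets it can see.
    available = context.get_available_data_asset_names()
    print(available)  # {datasource_name: {batch_kwargs_generator_name: [...]}} as described above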
- build_batch_kwargs(self, datasource, batch_kwargs_generator, data_asset_name=None, partition_id=None, **kwargs)
Builds batch kwargs using the provided datasource, batch kwargs generator, and batch_parameters.
- Parameters
datasource (str) – the name of the datasource for which to build batch_kwargs
batch_kwargs_generator (str) – the name of the batch kwargs generator to use to build batch_kwargs
data_asset_name (str) – an optional data asset name to pass as a batch_parameter
**kwargs – additional batch_parameters
- Returns
BatchKwargs
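A minimal sketch, given an instantiated DataContext (context); the datasource, generator, and data asset names are assumptions about the project configuration:

    # Let the named batch kwargs generator construct batch_kwargs for one data asset.
    batch_kwargs = context.build_batch_kwargs(
        datasource="my_pandas_datasource",
        batch_kwargs_generator="subdir_reader",
        data_asset_name="my_table",
    )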
- get_batch(self, batch_kwargs: Union[dict, BatchKwargs], expectation_suite_name: Union[str, ExpectationSuite], data_asset_type=None, batch_parameters=None)
Build a batch of data using batch_kwargs, and return a DataAsset with expectation_suite_name attached. If batch_parameters are included, they will be available as attributes of the batch.
- Parameters
batch_kwargs – the batch_kwargs to use; must include a datasource key
expectation_suite_name – The ExpectationSuite or the name of the expectation_suite to get
data_asset_type – the type of data_asset to build, with associated expectation implementations. This can generally be inferred from the datasource.
batch_parameters – optional parameters to store as the reference description of the batch. They should reflect parameters that would provide the passed BatchKwargs.
- Returns
DataAsset
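A minimal sketch for a Pandas-backed datasource, given an instantiated DataContext (context); the datasource name, file path, and suite name are illustrative:

    # Load a batch from explicit batch_kwargs and attach an existing expectation suite.
    batch = context.get_batch(
        batch_kwargs={"datasource": "my_pandas_datasource", "path": "data/my_file.csv"},
        expectation_suite_name="my_suite",
    )
    results = batch.validate()  # validate the batch against the attached suite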
- run_validation_operator(self, validation_operator_name, assets_to_validate, run_id=None, evaluation_parameters=None, run_name=None, run_time=None, result_format={'result_format': 'SUMMARY'}, **kwargs)
Run a validation operator to validate data assets and to perform the business logic around validation that the operator implements.
- Parameters
validation_operator_name – name of the operator, as appears in the context’s config file
assets_to_validate – a list that specifies the data assets that the operator will validate. The members of the list can be either batches, or a tuple that will allow the operator to fetch the batch: (batch_kwargs, expectation_suite_name)
run_name – The run_name for the validation; if None, a default value will be used
**kwargs – Additional kwargs to pass to the validation operator
- Returns
ValidationOperatorResult
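A minimal sketch using the action_list_operator configured by the default project template (an assumption) and a batch obtained as in get_batch above:

    # Validate one batch and run the operator's follow-up actions (storing results, updating docs, ...).
    operator_result = context.run_validation_operator(
        "action_list_operator",        # assumed operator name from the project config
        assets_to_validate=[batch],
        run_name="manual_smoke_test",  # illustrative run_name
    )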
- list_validation_operator_names(self)
- add_datasource(self, name, initialize=True, **kwargs)
Add a new datasource to the data context, with configuration provided as kwargs.
- Parameters
name – the name for the new datasource to add
initialize – if False, add the datasource to the config, but do not initialize it, for example if a user needs to debug database connectivity
kwargs (keyword arguments) – the configuration for the new datasource
- Returns
datasource (Datasource)
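A minimal sketch that registers a Pandas datasource reading files from a local directory, given an instantiated DataContext (context); the names, base_directory, and generator configuration are assumptions about a typical setup:

    # Add a file-based Pandas datasource with one subdirectory-reader batch kwargs generator.
    context.add_datasource(
        "my_pandas_datasource",
        class_name="PandasDatasource",
        batch_kwargs_generators={
            "subdir_reader": {
                "class_name": "SubdirReaderBatchKwargsGenerator",
                "base_directory": "../data",
            }
        },
    )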
- add_batch_kwargs_generator(self, datasource_name, batch_kwargs_generator_name, class_name, **kwargs)
Add a batch kwargs generator to the named datasource, using the provided configuration.
- Parameters
datasource_name – name of datasource to which to add the new batch kwargs generator
batch_kwargs_generator_name – name of the generator to add
class_name – class of the batch kwargs generator to add
**kwargs – batch kwargs generator configuration, provided as kwargs
Returns:
- get_config(self)
- _build_datasource_from_config(self, name, config)
- get_datasource(self, datasource_name: str = 'default')
Get the named datasource.
- Parameters
datasource_name (str) – the name of the datasource from the configuration
- Returns
datasource (Datasource)
- list_expectation_suites(self)
Return a list of available expectation suite names.
- list_datasources(self)
List currently-configured datasources on this context.
- Returns
each dictionary includes “name”, “class_name”, and “module_name” keys
- Return type
List(dict)
- list_stores(self)
List currently-configured Stores on this context.
- list_validation_operators(self)
List currently-configured Validation Operators on this context.
- create_expectation_suite(self, expectation_suite_name, overwrite_existing=False)
Build a new expectation suite and save it into the data_context expectation store.
- Parameters
expectation_suite_name – The name of the expectation_suite to create
overwrite_existing (boolean) – Whether to overwrite an expectation suite if one with the given name already exists.
- Returns
A new (empty) expectation suite.
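A minimal sketch of the surrounding suite lifecycle (create, re-load by name, save after editing), given an instantiated DataContext (context); the suite name is illustrative:

    suite = context.create_expectation_suite("my_suite")  # new, empty suite
    suite = context.get_expectation_suite("my_suite")     # load it back by name
    context.save_expectation_suite(suite)                 # persist any edits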
- delete_expectation_suite(self, expectation_suite_name)
Delete the specified expectation suite from the data_context expectation store.
- Parameters
expectation_suite_name – The name of the expectation_suite to delete
- Returns
True for Success and False for Failure.
- get_expectation_suite(self, expectation_suite_name)
Get a named expectation suite.
- Parameters
expectation_suite_name (str) – the name for the expectation suite
- Returns
expectation_suite
- list_expectation_suite_names(self)
Lists the available expectation suite names.
- save_expectation_suite(self, expectation_suite, expectation_suite_name=None)
Save the provided expectation suite into the DataContext.
- Parameters
expectation_suite – the suite to save
expectation_suite_name – the name of this expectation suite. If no name is provided the name will be read from the suite
- Returns
None
- _store_metrics(self, requested_metrics, validation_results, target_store_name)
requested_metrics is a dictionary like this:

    requested_metrics:
      *:  # The asterisk here matches *any* expectation suite name
        # use the 'kwargs' key to request metrics that are defined by kwargs,
        # for example because they are defined only for a particular column
        # - column:
        #     Age:
        #       - expect_column_min_to_be_between.result.observed_value
        - statistics.evaluated_expectations
        - statistics.successful_expectations

- Parameters
requested_metrics –
validation_results –
target_store_name –
Returns:
- store_validation_result_metrics(self, requested_metrics, validation_results, target_store_name)
- store_evaluation_parameters(self, validation_results, target_store_name=None)
- property evaluation_parameter_store(self)
- property evaluation_parameter_store_name(self)
- property validations_store_name(self)
- property validations_store(self)
- _compile_evaluation_parameter_dependencies(self)
- get_validation_result(self, expectation_suite_name, run_id=None, batch_identifier=None, validations_store_name=None, failed_only=False)
Get validation results from a configured store.
- Parameters
data_asset_name – name of data asset for which to get validation result
expectation_suite_name – expectation_suite name for which to get validation result (default: “default”)
run_id – run_id for which to get validation result (if None, fetch the latest result by alphanumeric sort)
validations_store_name – the name of the store from which to get validation results
failed_only – if True, filter the result to return only failed expectations
- Returns
validation_result
- update_return_obj(self, data_asset, return_obj)
Helper called by data_asset.
- Parameters
data_asset – The data_asset whose validation produced the current return object
return_obj – the return object to update
- Returns
the return object, potentially changed into a widget by the configured expectation explorer
- Return type
return_obj
- build_data_docs(self, site_names=None, resource_identifiers=None, dry_run=False)
Build Data Docs for your project.
Data Docs make it simple to visualize data quality in your project. They include Expectations, Validations, and Profiles, and are built for all Datasources from JSON artifacts in the local repo, including validations and profiles from the uncommitted directory.
- Parameters
site_names – if specified, build data docs only for these sites, otherwise, build all the sites specified in the context’s config
resource_identifiers – a list of resource identifiers (ExpectationSuiteIdentifier, ValidationResultIdentifier). If specified, rebuild HTML (or other views the data docs sites are rendering) only for the resources in this list. This supports incremental build of data docs sites (e.g., when a new validation result is created) and avoids full rebuild.
dry_run – a flag; if True, the method returns a structure containing the URLs of the sites that would be built, but it does not build these sites. The motivation for adding this flag was to allow the CLI to display the URLs before building and to let users confirm.
- Returns
A dictionary with the names of the updated data documentation sites as keys and the location info of their index.html files as values
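A minimal sketch, given an instantiated DataContext (context); the site name "local_site" matches the default project template and is an assumption:

    # Preview which sites would be built, then build just one of them.
    planned_urls = context.build_data_docs(dry_run=True)
    context.build_data_docs(site_names=["local_site"])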
- clean_data_docs(self, site_name=None)
- profile_datasource(self, datasource_name, batch_kwargs_generator_name=None, data_assets=None, max_data_assets=20, profile_all_data_assets=True, profiler=BasicDatasetProfiler, profiler_configuration=None, dry_run=False, run_id=None, additional_batch_kwargs=None, run_name=None, run_time=None)
Profile the named datasource using the named profiler.
- Parameters
datasource_name – the name of the datasource for which to profile data_assets
batch_kwargs_generator_name – the name of the batch kwargs generator to use to get batches
data_assets – list of data asset names to profile
max_data_assets – if the number of data assets the batch kwargs generator yields is greater than this max_data_assets, profile_all_data_assets=True is required to profile all
profile_all_data_assets – when True, all data assets are profiled, regardless of their number
profiler – the profiler class to use
profiler_configuration – Optional profiler configuration dict
dry_run – when true, the method checks arguments and reports whether it can profile, or specifies the arguments that are missing
additional_batch_kwargs – Additional keyword arguments to be provided to get_batch when loading the data asset.
- Returns
A dictionary:
{ "success": True/False, "results": List of (expectation_suite, EVR) tuples for each of the data_assets found in the datasource }
When success = False, the error details are under the "error" key
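A minimal sketch, given an instantiated DataContext (context); the datasource name is illustrative and the default BasicDatasetProfiler is used:

    # Profile up to five assets from one datasource and build suites/validations from the results.
    profiling_results = context.profile_datasource(
        "my_pandas_datasource",
        max_data_assets=5,
        profile_all_data_assets=False,
    )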
- profile_data_asset(self, datasource_name, batch_kwargs_generator_name=None, data_asset_name=None, batch_kwargs=None, expectation_suite_name=None, profiler=BasicDatasetProfiler, profiler_configuration=None, run_id=None, additional_batch_kwargs=None, run_name=None, run_time=None)
Profile a data asset.
- Parameters
datasource_name – the name of the datasource to which the profiled data asset belongs
batch_kwargs_generator_name – the name of the batch kwargs generator to use to get batches (only if batch_kwargs are not provided)
data_asset_name – the name of the profiled data asset
batch_kwargs – optional - if set, the method will use the value to fetch the batch to be profiled. If not passed, the batch kwargs generator (generator_name arg) will choose a batch
profiler – the profiler class to use
profiler_configuration – Optional profiler configuration dict
run_name – optional - if set, the validation result created by the profiler will be under the provided run_name
additional_batch_kwargs –
- Returns
A dictionary:
{ "success": True/False, "results": List of (expectation_suite, EVR) tuples for each of the data_assets found in the datasource }
When success = False, the error details are under the "error" key
- class great_expectations.data_context.DataContext(context_root_dir=None, runtime_environment=None)
Bases: great_expectations.data_context.data_context.BaseDataContext
A DataContext represents a Great Expectations project. It organizes storage and access for expectation suites, datasources, notification settings, and data fixtures.
The DataContext is configured via a yml file stored in a directory called great_expectations; the configuration file as well as managed expectation suites should be stored in version control.
Use the create classmethod to create a new empty config, or instantiate the DataContext by passing the path to an existing data context root directory.
DataContexts use data sources you’re already familiar with. BatchKwargGenerators help introspect data stores and data execution frameworks (such as airflow, Nifi, dbt, or dagster) to describe and produce batches of data ready for analysis. This enables fetching, validation, profiling, and documentation of your data in a way that is meaningful within your existing infrastructure and work environment.
DataContexts use a datasource-based namespace, where each accessible type of data has a three-part normalized data_asset_name, consisting of datasource/generator/data_asset_name.
The datasource actually connects to a source of materialized data and returns Great Expectations DataAssets connected to a compute environment and ready for validation.
The BatchKwargGenerator knows how to introspect datasources and produce identifying “batch_kwargs” that define particular slices of data.
The data_asset_name is a specific name – often a table name or other name familiar to users – that batch kwargs generators can slice into batches.
An expectation suite is a collection of expectations ready to be applied to a batch of data. Since in many projects it is useful to have different expectations evaluate in different contexts–profiling vs. testing; warning vs. error; high vs. low compute; ML model or dashboard–suites provide a namespace option for selecting which expectations a DataContext returns.
In many simple projects, the datasource or batch kwargs generator name may be omitted and the DataContext will infer the correct name when there is no ambiguity.
Similarly, if no expectation suite name is provided, the DataContext will assume the name “default”.
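A minimal sketch of instantiating a context for an existing project; the root directory path is illustrative:

    import great_expectations as ge

    # Point the context at an existing great_expectations/ directory.
    context = ge.data_context.DataContext("/path/to/project/great_expectations")
    print(context.list_datasources())
    print(context.list_expectation_suite_names())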
- classmethod create(cls, project_root_dir=None, usage_statistics_enabled=True, runtime_environment=None)
Build a new great_expectations directory and DataContext object in the provided project_root_dir.
create will create a new "great_expectations" directory in the provided folder, provided one does not already exist. Then, it will initialize a new DataContext in that folder and write the resulting config.
- Parameters
project_root_dir – path to the root directory in which to create a new great_expectations directory
runtime_environment – a dictionary of config variables that override both those set in config_variables.yml and the environment
- Returns
DataContext
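A minimal sketch; the project path is illustrative:

    from great_expectations.data_context import DataContext

    # Scaffold a new great_expectations/ directory (and template config) inside an existing project folder.
    context = DataContext.create("/path/to/project", usage_statistics_enabled=False)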
- classmethod all_uncommitted_directories_exist(cls, ge_dir)
Check if all uncommitted directories exist.
- classmethod config_variables_yml_exist(cls, ge_dir)
Check if the config_variables.yml exists.
- classmethod write_config_variables_template_to_disk(cls, uncommitted_dir)
- classmethod write_project_template_to_disk(cls, ge_dir, usage_statistics_enabled=True)
- classmethod scaffold_directories(cls, base_dir)
Safely create GE directories for a new project.
- classmethod scaffold_custom_data_docs(cls, plugins_dir)
Copy custom data docs templates.
- classmethod scaffold_notebooks(cls, base_dir)
Copy template notebooks into the notebooks directory for a project.
- _load_project_config(self)
Reads the project configuration from the project configuration file. The file may contain ${SOME_VARIABLE} variables - see self._project_config_with_variables_substituted for how these are substituted.
- Returns
the configuration object read from the file
- list_checkpoints(self)
List checkpoints. (Experimental)
- get_checkpoint(self, checkpoint_name: str)
Load a checkpoint. (Experimental)
- _list_ymls_in_checkpoints_directory(self)
- _save_project_config(self)
Save the current project to disk.
- add_store(self, store_name, store_config)
Add a new Store to the DataContext and (for convenience) return the instantiated Store object.
- Parameters
store_name (str) – a key for the new Store in self._stores
store_config (dict) – a config for the Store to add
- Returns
store (Store)
- add_datasource(self, name, **kwargs)
Add a new datasource to the data context, with configuration provided as kwargs.
- Parameters
name – the name for the new datasource to add
initialize – if False, add the datasource to the config, but do not initialize it, for example if a user needs to debug database connectivity
kwargs (keyword arguments) – the configuration for the new datasource
- Returns
datasource (Datasource)
- classmethod find_context_root_dir(cls)
- classmethod get_ge_config_version(cls, context_root_dir=None)
- classmethod set_ge_config_version(cls, config_version, context_root_dir=None, validate_config_version=True)
- classmethod find_context_yml_file(cls, search_start_dir=None)
Search for the yml file starting here and moving upward.
- classmethod does_config_exist_on_disk(cls, context_root_dir)
Return True if the great_expectations.yml exists on disk.
- classmethod is_project_initialized(cls, ge_dir)
Return True if the project is initialized.
To be considered initialized, all of the following must be true:
- all project directories exist (including uncommitted directories)
- a valid great_expectations.yml is on disk
- a config_variables.yml is on disk
- the project has at least one datasource
- the project has at least one suite
- classmethod does_project_have_a_datasource_in_config_file(cls, ge_dir)
- classmethod _does_context_have_at_least_one_datasource(cls, ge_dir)
- classmethod _does_context_have_at_least_one_suite(cls, ge_dir)
- classmethod _attempt_context_instantiation(cls, ge_dir)
- static _validate_checkpoint(checkpoint: dict, checkpoint_name: str)
- class great_expectations.data_context.ExplorerDataContext(context_root_dir=None, expectation_explorer=True)
Bases: great_expectations.data_context.data_context.DataContext
A DataContext represents a Great Expectations project. It organizes storage and access for expectation suites, datasources, notification settings, and data fixtures.
The DataContext is configured via a yml file stored in a directory called great_expectations; the configuration file as well as managed expectation suites should be stored in version control.
Use the create classmethod to create a new empty config, or instantiate the DataContext by passing the path to an existing data context root directory.
DataContexts use data sources you’re already familiar with. BatchKwargGenerators help introspect data stores and data execution frameworks (such as airflow, Nifi, dbt, or dagster) to describe and produce batches of data ready for analysis. This enables fetching, validation, profiling, and documentation of your data in a way that is meaningful within your existing infrastructure and work environment.
DataContexts use a datasource-based namespace, where each accessible type of data has a three-part normalized data_asset_name, consisting of datasource/generator/data_asset_name.
The datasource actually connects to a source of materialized data and returns Great Expectations DataAssets connected to a compute environment and ready for validation.
The BatchKwargGenerator knows how to introspect datasources and produce identifying “batch_kwargs” that define particular slices of data.
The data_asset_name is a specific name – often a table name or other name familiar to users – that batch kwargs generators can slice into batches.
An expectation suite is a collection of expectations ready to be applied to a batch of data. Since in many projects it is useful to have different expectations evaluate in different contexts–profiling vs. testing; warning vs. error; high vs. low compute; ML model or dashboard–suites provide a namespace option for selecting which expectations a DataContext returns.
In many simple projects, the datasource or batch kwargs generator name may be omitted and the DataContext will infer the correct name when there is no ambiguity.
Similarly, if no expectation suite name is provided, the DataContext will assume the name “default”.
- update_return_obj(self, data_asset, return_obj)
Helper called by data_asset.
- Parameters
data_asset – The data_asset whose validation produced the current return object
return_obj – the return object to update
- Returns
the return object, potentially changed into a widget by the configured expectation explorer
- Return type
return_obj