great_expectations¶
Subpackages¶
great_expectations.cli
great_expectations.cli.upgrade_helpers
great_expectations.cli.checkpoint
great_expectations.cli.checkpoint_script_template
great_expectations.cli.cli
great_expectations.cli.cli_logging
great_expectations.cli.cli_messages
great_expectations.cli.datasource
great_expectations.cli.docs
great_expectations.cli.init
great_expectations.cli.mark
great_expectations.cli.project
great_expectations.cli.python_subprocess
great_expectations.cli.store
great_expectations.cli.suite
great_expectations.cli.tap_template
great_expectations.cli.toolkit
great_expectations.cli.util
great_expectations.cli.validation_operator
great_expectations.core
great_expectations.data_asset
great_expectations.data_context
great_expectations.data_context.store
great_expectations.data_context.store.database_store_backend
great_expectations.data_context.store.expectations_store
great_expectations.data_context.store.html_site_store
great_expectations.data_context.store.metric_store
great_expectations.data_context.store.query_store
great_expectations.data_context.store.store
great_expectations.data_context.store.store_backend
great_expectations.data_context.store.tuple_store_backend
great_expectations.data_context.store.validations_store
great_expectations.data_context.types
great_expectations.data_context.data_context
great_expectations.data_context.templates
great_expectations.data_context.util
great_expectations.dataset
great_expectations.datasource
great_expectations.datasource.batch_kwargs_generator
great_expectations.datasource.batch_kwargs_generator.batch_kwargs_generator
great_expectations.datasource.batch_kwargs_generator.databricks_batch_kwargs_generator
great_expectations.datasource.batch_kwargs_generator.glob_reader_batch_kwargs_generator
great_expectations.datasource.batch_kwargs_generator.manual_batch_kwargs_generator
great_expectations.datasource.batch_kwargs_generator.query_batch_kwargs_generator
great_expectations.datasource.batch_kwargs_generator.s3_batch_kwargs_generator
great_expectations.datasource.batch_kwargs_generator.subdir_reader_batch_kwargs_generator
great_expectations.datasource.batch_kwargs_generator.table_batch_kwargs_generator
great_expectations.datasource.types
great_expectations.datasource.datasource
great_expectations.datasource.pandas_datasource
great_expectations.datasource.sparkdf_datasource
great_expectations.datasource.sqlalchemy_datasource
great_expectations.datasource.util
great_expectations.jupyter_ux
great_expectations.profile
great_expectations.profile.base
great_expectations.profile.basic_dataset_profiler
great_expectations.profile.basic_suite_builder_profiler
great_expectations.profile.columns_exist
great_expectations.profile.json_schema_profiler
great_expectations.profile.metrics_utils
great_expectations.profile.multi_batch_validation_meta_analysis
great_expectations.render
great_expectations.render.notebook_assets
great_expectations.render.renderer
great_expectations.render.renderer.content_block
great_expectations.render.renderer.call_to_action_renderer
great_expectations.render.renderer.column_section_renderer
great_expectations.render.renderer.other_section_renderer
great_expectations.render.renderer.page_renderer
great_expectations.render.renderer.renderer
great_expectations.render.renderer.site_builder
great_expectations.render.renderer.site_index_page_renderer
great_expectations.render.renderer.slack_renderer
great_expectations.render.renderer.suite_edit_notebook_renderer
great_expectations.render.renderer.suite_scaffold_notebook_renderer
great_expectations.render.types
great_expectations.render.view
great_expectations.render.exceptions
great_expectations.render.page_renderer_util
great_expectations.render.util
great_expectations.types
great_expectations.validation_operators
great_expectations.validator
Package Contents¶
Classes¶
DataContext | A DataContext represents a Great Expectations project. It organizes storage and access for expectation suites, datasources, notification settings, and data fixtures.
Functions¶
get_versions() | Get version information or return default if unable to do so.
-
great_expectations.
get_versions
()¶ Get version information or return default if unable to do so.
-
great_expectations.
__version__
¶
-
class
great_expectations.
DataContext
(context_root_dir=None, runtime_environment=None)¶ Bases:
great_expectations.data_context.data_context.BaseDataContext
A DataContext represents a Great Expectations project. It organizes storage and access for expectation suites, datasources, notification settings, and data fixtures.
The DataContext is configured via a yml file stored in a directory called great_expectations; the configuration file as well as managed expectation suites should be stored in version control.
Use the create classmethod to create a new empty config, or instantiate the DataContext by passing the path to an existing data context root directory.
DataContexts use data sources you’re already familiar with. BatchKwargsGenerators help introspect data stores and data execution frameworks (such as Airflow, NiFi, dbt, or Dagster) to describe and produce batches of data ready for analysis. This enables fetching, validation, profiling, and documentation of your data in a way that is meaningful within your existing infrastructure and work environment.
DataContexts use a datasource-based namespace, where each accessible type of data has a three-part normalized data_asset_name, consisting of datasource/generator/data_asset_name.
The datasource actually connects to a source of materialized data and returns Great Expectations DataAssets connected to a compute environment and ready for validation.
The BatchKwargsGenerator knows how to introspect datasources and produce identifying “batch_kwargs” that define particular slices of data.
The data_asset_name is a specific name – often a table name or other name familiar to users – that batch kwargs generators can slice into batches.
An expectation suite is a collection of expectations ready to be applied to a batch of data. In many projects it is useful to have different expectations evaluated in different contexts (profiling vs. testing; warning vs. error; high vs. low compute; ML model or dashboard), so suites provide a namespace option for selecting which expectations a DataContext returns.
In many simple projects, the datasource or batch kwargs generator name may be omitted and the DataContext will infer the correct name when there is no ambiguity.
Similarly, if no expectation suite name is provided, the DataContext will assume the name “default”.
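The naming rules above can be illustrated with a small standalone sketch. It does not use the library; `NormalizedName`, `split_data_asset_name`, and `resolve_suite_name` are hypothetical helpers, and treating a two-part name as generator/data_asset_name is an assumption of this sketch rather than documented behavior.

```python
from typing import NamedTuple, Optional


class NormalizedName(NamedTuple):
    """Hypothetical container for datasource/generator/data_asset_name."""
    datasource: Optional[str]
    generator: Optional[str]
    data_asset_name: str


def split_data_asset_name(name: str) -> NormalizedName:
    """Split a slash-separated name into up to three parts.

    Omitted leading parts are returned as None, so a caller could infer
    them when there is no ambiguity.
    """
    parts = name.split("/")
    if not 1 <= len(parts) <= 3 or not all(parts):
        raise ValueError(f"expected one to three non-empty parts, got {name!r}")
    # Left-pad with None so the last element is always the data asset name.
    padded = [None] * (3 - len(parts)) + parts
    return NormalizedName(*padded)


def resolve_suite_name(expectation_suite_name: Optional[str] = None) -> str:
    """Fall back to the suite name "default" when none is provided."""
    return expectation_suite_name or "default"
```

For example, `split_data_asset_name("users")` leaves both the datasource and the generator unset, which corresponds to the case where the DataContext is expected to infer them.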
-
classmethod
create
(cls, project_root_dir=None, usage_statistics_enabled=True, runtime_environment=None)¶ Build a new great_expectations directory and DataContext object in the provided project_root_dir.
create will create a new “great_expectations” directory in the provided folder, provided one does not already exist. Then, it will initialize a new DataContext in that folder and write the resulting config.
- Parameters
project_root_dir – path to the root directory in which to create a new great_expectations directory
runtime_environment – a dictionary of config variables that override both those set in config_variables.yml and the environment
- Returns
DataContext
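A minimal sketch of choosing between the two entry points follows. It uses only the signatures listed on this page (`DataContext(context_root_dir=...)`, `DataContext.create(project_root_dir=...)`, and `DataContext.does_config_exist_on_disk(context_root_dir)`); the wrapper `open_or_create_context` is a hypothetical helper, and the assumption that the context root is `<project_root_dir>/great_expectations` follows from the directory name given in the class description.

```python
import os


def open_or_create_context(project_root_dir):
    """Return a DataContext for project_root_dir, creating one if needed.

    Hypothetical convenience wrapper, not part of the library.
    """
    # Imported lazily so this module loads even where the package is absent.
    from great_expectations import DataContext

    # Assumption: the context lives in a "great_expectations" subdirectory.
    context_root_dir = os.path.join(project_root_dir, "great_expectations")

    if DataContext.does_config_exist_on_disk(context_root_dir):
        # An existing project: instantiate from its root directory.
        return DataContext(context_root_dir=context_root_dir)

    # No config yet: build the directory and write a new config.
    return DataContext.create(project_root_dir=project_root_dir)
```

Checking for the config first keeps the two paths explicit, so an existing project is opened without being re-scaffolded.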
-
classmethod
all_uncommitted_directories_exist
(cls, ge_dir)¶ Check if all uncommitted directories exist.
-
classmethod
config_variables_yml_exist
(cls, ge_dir)¶ Check if config_variables.yml exists.
-
classmethod
write_config_variables_template_to_disk
(cls, uncommitted_dir)¶
-
classmethod
write_project_template_to_disk
(cls, ge_dir, usage_statistics_enabled=True)¶
-
classmethod
scaffold_directories
(cls, base_dir)¶ Safely create GE directories for a new project.
-
classmethod
scaffold_custom_data_docs
(cls, plugins_dir)¶ Copy custom data docs templates.
-
classmethod
scaffold_notebooks
(cls, base_dir)¶ Copy template notebooks into the notebooks directory for a project.
-
_load_project_config
(self)¶ Reads the project configuration from the project configuration file. The file may contain ${SOME_VARIABLE} variables - see self._project_config_with_variables_substituted for how these are substituted.
- Returns
the configuration object read from the file
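The `${SOME_VARIABLE}` substitution can be pictured with a standalone sketch. This is not the library's implementation; `substitute_variables` is a hypothetical function. The only precedence taken from this page is that `runtime_environment` overrides both config_variables.yml and the environment; letting the environment win over config_variables.yml is an assumption of the sketch.

```python
import re

# Matches ${NAME} where NAME is letters, digits, or underscores.
_VARIABLE = re.compile(r"\$\{(\w+)\}")


def substitute_variables(text, config_variables=None, environ=None,
                         runtime_environment=None):
    """Replace ${NAME} placeholders in text using three layered mappings."""
    values = {}
    # Later mappings win, so runtime_environment overrides the other two.
    # Assumption: environ takes precedence over config_variables.
    for source in (config_variables, environ, runtime_environment):
        values.update(source or {})

    def replace(match):
        name = match.group(1)
        # Leave unknown placeholders untouched rather than guessing a value.
        return str(values.get(name, match.group(0)))

    return _VARIABLE.sub(replace, text)
```

In practice `environ` would typically be `dict(os.environ)` and `config_variables` the parsed contents of config_variables.yml.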
-
list_checkpoints
(self)¶ List checkpoints. (Experimental)
-
get_checkpoint
(self, checkpoint_name: str)¶ Load a checkpoint. (Experimental)
-
_list_ymls_in_checkpoints_directory
(self)¶
-
_save_project_config
(self)¶ Save the current project to disk.
-
add_store
(self, store_name, store_config)¶ Add a new Store to the DataContext and (for convenience) return the instantiated Store object.
- Parameters
store_name (str) – a key for the new Store in self._stores
store_config (dict) – a config for the Store to add
- Returns
store (Store)
-
add_datasource
(self, name, **kwargs)¶ Add a new datasource to the data context, with configuration provided as kwargs.
- Parameters
name – the name for the new datasource to add
initialize – if False, add the datasource to the config, but do not initialize it, for example if a user needs to debug database connectivity.
kwargs (keyword arguments) – the configuration for the new datasource
- Returns
datasource (Datasource)
-
classmethod
find_context_root_dir
(cls)¶
-
classmethod
get_ge_config_version
(cls, context_root_dir=None)¶
-
classmethod
set_ge_config_version
(cls, config_version, context_root_dir=None, validate_config_version=True)¶
-
classmethod
find_context_yml_file
(cls, search_start_dir=None)¶ Search for the yml file starting here and moving upward.
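An upward search of this kind can be sketched with the standard library alone. `find_yml_upward` is a hypothetical function, not the library's code; it assumes the target is a `great_expectations.yml` file inside a `great_expectations` directory, and it checks every ancestor up to the filesystem root, which may differ from how far the real method looks.

```python
from pathlib import Path
from typing import Optional


def find_yml_upward(search_start_dir) -> Optional[Path]:
    """Return the first great_expectations.yml found at or above the start.

    Each candidate directory is checked for a "great_expectations"
    subdirectory containing the yml file; None is returned if no
    ancestor has one.
    """
    start = Path(search_start_dir).resolve()
    # Path.parents yields each ancestor, nearest first, up to the root.
    for directory in [start, *start.parents]:
        candidate = directory / "great_expectations" / "great_expectations.yml"
        if candidate.is_file():
            return candidate
    return None
```

Calling `find_yml_upward(Path.cwd())` from a nested working directory would return the config of the nearest enclosing project, if there is one.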
-
classmethod
does_config_exist_on_disk
(cls, context_root_dir)¶ Return True if the great_expectations.yml exists on disk.
-
classmethod
is_project_initialized
(cls, ge_dir)¶ Return True if the project is initialized.
To be considered initialized, all of the following must be true:
- all project directories exist (including uncommitted directories)
- a valid great_expectations.yml is on disk
- a config_variables.yml is on disk
- the project has at least one datasource
- the project has at least one suite
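The initialization conditions amount to a conjunction of independent checks, which the following standalone sketch models. `ProjectState`, `missing_requirements`, and `is_initialized` are hypothetical names used only for illustration; the real method inspects the project on disk rather than taking precomputed values.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProjectState:
    """Hypothetical snapshot of the facts the checklist depends on."""
    directories_exist: bool
    config_yml_valid: bool
    config_variables_yml_exists: bool
    datasource_count: int
    suite_count: int


def missing_requirements(state: ProjectState) -> List[str]:
    """List the unmet conditions, in checklist order."""
    checks = [
        (state.directories_exist, "all project directories"),
        (state.config_yml_valid, "a valid great_expectations.yml"),
        (state.config_variables_yml_exists, "a config_variables.yml"),
        (state.datasource_count >= 1, "at least one datasource"),
        (state.suite_count >= 1, "at least one suite"),
    ]
    return [label for passed, label in checks if not passed]


def is_initialized(state: ProjectState) -> bool:
    """A project counts as initialized only when nothing is missing."""
    return not missing_requirements(state)
```

Returning the list of unmet conditions, rather than a bare boolean, makes it easier to tell a user which setup step remains.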
-
classmethod
does_project_have_a_datasource_in_config_file
(cls, ge_dir)¶
-
classmethod
_does_context_have_at_least_one_datasource
(cls, ge_dir)¶
-
classmethod
_does_context_have_at_least_one_suite
(cls, ge_dir)¶
-
classmethod
_attempt_context_instantiation
(cls, ge_dir)¶
-
static
_validate_checkpoint
(checkpoint: dict, checkpoint_name: str)¶
-
great_expectations.
rtd_url_ge_version
¶