great_expectations
Subpackages
great_expectations.checkpoint
great_expectations.cli
great_expectations.cli.upgrade_helpers
great_expectations.cli.v012
great_expectations.cli.v012.upgrade_helpers
great_expectations.cli.v012.checkpoint
great_expectations.cli.v012.checkpoint_script_template
great_expectations.cli.v012.cli
great_expectations.cli.v012.cli_logging
great_expectations.cli.v012.cli_messages
great_expectations.cli.v012.datasource
great_expectations.cli.v012.docs
great_expectations.cli.v012.init
great_expectations.cli.v012.mark
great_expectations.cli.v012.project
great_expectations.cli.v012.python_subprocess
great_expectations.cli.v012.store
great_expectations.cli.v012.suite
great_expectations.cli.v012.toolkit
great_expectations.cli.v012.util
great_expectations.cli.v012.validation_operator
great_expectations.cli.batch_request
great_expectations.cli.build_docs
great_expectations.cli.checkpoint
great_expectations.cli.checkpoint_script_template
great_expectations.cli.cli
great_expectations.cli.cli_logging
great_expectations.cli.cli_messages
great_expectations.cli.datasource
great_expectations.cli.docs
great_expectations.cli.init
great_expectations.cli.mark
great_expectations.cli.pretty_printing
great_expectations.cli.project
great_expectations.cli.python_subprocess
great_expectations.cli.store
great_expectations.cli.suite
great_expectations.cli.toolkit
great_expectations.cli.util
great_expectations.cli.validation_operator
great_expectations.core
great_expectations.core.expectation_diagnostics
great_expectations.core.usage_statistics
great_expectations.core.usage_statistics.anonymizers
great_expectations.core.usage_statistics.events
great_expectations.core.usage_statistics.execution_environment
great_expectations.core.usage_statistics.package_dependencies
great_expectations.core.usage_statistics.schemas
great_expectations.core.usage_statistics.usage_statistics
great_expectations.core.usage_statistics.util
great_expectations.core._docs_decorators
great_expectations.core.async_executor
great_expectations.core.batch
great_expectations.core.batch_manager
great_expectations.core.batch_spec
great_expectations.core.config_peer
great_expectations.core.config_provider
great_expectations.core.config_substitutor
great_expectations.core.configuration
great_expectations.core.data_context_key
great_expectations.core.domain
great_expectations.core.evaluation_parameters
great_expectations.core.expectation_configuration
great_expectations.core.expectation_suite
great_expectations.core.expectation_validation_result
great_expectations.core.http
great_expectations.core.id_dict
great_expectations.core.metric
great_expectations.core.metric_domain_types
great_expectations.core.profiler_types_mapping
great_expectations.core.run_identifier
great_expectations.core.serializer
great_expectations.core.urn
great_expectations.core.util
great_expectations.core.yaml_handler
great_expectations.data_asset
great_expectations.data_context
great_expectations.data_context.config_validator
great_expectations.data_context.data_context
great_expectations.data_context.data_context.abstract_data_context
great_expectations.data_context.data_context.base_data_context
great_expectations.data_context.data_context.cloud_data_context
great_expectations.data_context.data_context.data_context
great_expectations.data_context.data_context.ephemeral_data_context
great_expectations.data_context.data_context.explorer_data_context
great_expectations.data_context.data_context.file_data_context
great_expectations.data_context.migrator
great_expectations.data_context.store
great_expectations.data_context.store._store_backend
great_expectations.data_context.store.checkpoint_store
great_expectations.data_context.store.configuration_store
great_expectations.data_context.store.data_context_store
great_expectations.data_context.store.database_store_backend
great_expectations.data_context.store.datasource_store
great_expectations.data_context.store.expectations_store
great_expectations.data_context.store.ge_cloud_store_backend
great_expectations.data_context.store.gx_cloud_store_backend
great_expectations.data_context.store.html_site_store
great_expectations.data_context.store.in_memory_store_backend
great_expectations.data_context.store.inline_store_backend
great_expectations.data_context.store.json_site_store
great_expectations.data_context.store.metric_store
great_expectations.data_context.store.profiler_store
great_expectations.data_context.store.query_store
great_expectations.data_context.store.store
great_expectations.data_context.store.store_backend
great_expectations.data_context.store.tuple_store_backend
great_expectations.data_context.store.validations_store
great_expectations.data_context.types
great_expectations.data_context.cloud_constants
great_expectations.data_context.data_context_variables
great_expectations.data_context.templates
great_expectations.data_context.util
great_expectations.dataset
great_expectations.datasource
great_expectations.datasource.batch_kwargs_generator
great_expectations.datasource.batch_kwargs_generator.batch_kwargs_generator
great_expectations.datasource.batch_kwargs_generator.databricks_batch_kwargs_generator
great_expectations.datasource.batch_kwargs_generator.glob_reader_batch_kwargs_generator
great_expectations.datasource.batch_kwargs_generator.manual_batch_kwargs_generator
great_expectations.datasource.batch_kwargs_generator.query_batch_kwargs_generator
great_expectations.datasource.batch_kwargs_generator.s3_batch_kwargs_generator
great_expectations.datasource.batch_kwargs_generator.s3_subdir_reader_batch_kwargs_generator
great_expectations.datasource.batch_kwargs_generator.subdir_reader_batch_kwargs_generator
great_expectations.datasource.batch_kwargs_generator.table_batch_kwargs_generator
great_expectations.datasource.data_connector
great_expectations.datasource.data_connector.asset
great_expectations.datasource.data_connector.sorter
great_expectations.datasource.data_connector.batch_filter
great_expectations.datasource.data_connector.configured_asset_aws_glue_data_catalog_data_connector
great_expectations.datasource.data_connector.configured_asset_azure_data_connector
great_expectations.datasource.data_connector.configured_asset_dbfs_data_connector
great_expectations.datasource.data_connector.configured_asset_file_path_data_connector
great_expectations.datasource.data_connector.configured_asset_filesystem_data_connector
great_expectations.datasource.data_connector.configured_asset_gcs_data_connector
great_expectations.datasource.data_connector.configured_asset_s3_data_connector
great_expectations.datasource.data_connector.configured_asset_sql_data_connector
great_expectations.datasource.data_connector.data_connector
great_expectations.datasource.data_connector.file_path_data_connector
great_expectations.datasource.data_connector.inferred_asset_aws_glue_data_catalog_data_connector
great_expectations.datasource.data_connector.inferred_asset_azure_data_connector
great_expectations.datasource.data_connector.inferred_asset_dbfs_data_connector
great_expectations.datasource.data_connector.inferred_asset_file_path_data_connector
great_expectations.datasource.data_connector.inferred_asset_filesystem_data_connector
great_expectations.datasource.data_connector.inferred_asset_gcs_data_connector
great_expectations.datasource.data_connector.inferred_asset_s3_data_connector
great_expectations.datasource.data_connector.inferred_asset_sql_data_connector
great_expectations.datasource.data_connector.runtime_data_connector
great_expectations.datasource.data_connector.util
great_expectations.datasource.types
great_expectations.datasource.datasource
great_expectations.datasource.datasource_serializer
great_expectations.datasource.new_datasource
great_expectations.datasource.pandas_datasource
great_expectations.datasource.simple_sqlalchemy_datasource
great_expectations.datasource.sparkdf_datasource
great_expectations.datasource.sqlalchemy_datasource
great_expectations.exceptions
great_expectations.execution_engine
great_expectations.execution_engine.split_and_sample
great_expectations.execution_engine.split_and_sample.data_sampler
great_expectations.execution_engine.split_and_sample.data_splitter
great_expectations.execution_engine.split_and_sample.pandas_data_sampler
great_expectations.execution_engine.split_and_sample.pandas_data_splitter
great_expectations.execution_engine.split_and_sample.sparkdf_data_sampler
great_expectations.execution_engine.split_and_sample.sparkdf_data_splitter
great_expectations.execution_engine.split_and_sample.sqlalchemy_data_sampler
great_expectations.execution_engine.split_and_sample.sqlalchemy_data_splitter
great_expectations.execution_engine.bundled_metric_configuration
great_expectations.execution_engine.execution_engine
great_expectations.execution_engine.pandas_batch_data
great_expectations.execution_engine.pandas_execution_engine
great_expectations.execution_engine.sparkdf_batch_data
great_expectations.execution_engine.sparkdf_execution_engine
great_expectations.execution_engine.sqlalchemy_batch_data
great_expectations.execution_engine.sqlalchemy_dialect
great_expectations.execution_engine.sqlalchemy_execution_engine
great_expectations.execution_engine.util
great_expectations.expectations
great_expectations.expectations.core
great_expectations.expectations.core.expect_column_bootstrapped_ks_test_p_value_to_be_greater_than
great_expectations.expectations.core.expect_column_chisquare_test_p_value_to_be_greater_than
great_expectations.expectations.core.expect_column_distinct_values_to_be_in_set
great_expectations.expectations.core.expect_column_distinct_values_to_contain_set
great_expectations.expectations.core.expect_column_distinct_values_to_equal_set
great_expectations.expectations.core.expect_column_kl_divergence_to_be_less_than
great_expectations.expectations.core.expect_column_max_to_be_between
great_expectations.expectations.core.expect_column_mean_to_be_between
great_expectations.expectations.core.expect_column_median_to_be_between
great_expectations.expectations.core.expect_column_min_to_be_between
great_expectations.expectations.core.expect_column_most_common_value_to_be_in_set
great_expectations.expectations.core.expect_column_pair_cramers_phi_value_to_be_less_than
great_expectations.expectations.core.expect_column_pair_values_a_to_be_greater_than_b
great_expectations.expectations.core.expect_column_pair_values_to_be_equal
great_expectations.expectations.core.expect_column_pair_values_to_be_in_set
great_expectations.expectations.core.expect_column_parameterized_distribution_ks_test_p_value_to_be_greater_than
great_expectations.expectations.core.expect_column_proportion_of_unique_values_to_be_between
great_expectations.expectations.core.expect_column_quantile_values_to_be_between
great_expectations.expectations.core.expect_column_stdev_to_be_between
great_expectations.expectations.core.expect_column_sum_to_be_between
great_expectations.expectations.core.expect_column_to_exist
great_expectations.expectations.core.expect_column_unique_value_count_to_be_between
great_expectations.expectations.core.expect_column_value_lengths_to_be_between
great_expectations.expectations.core.expect_column_value_lengths_to_equal
great_expectations.expectations.core.expect_column_value_z_scores_to_be_less_than
great_expectations.expectations.core.expect_column_values_to_be_between
great_expectations.expectations.core.expect_column_values_to_be_dateutil_parseable
great_expectations.expectations.core.expect_column_values_to_be_decreasing
great_expectations.expectations.core.expect_column_values_to_be_in_set
great_expectations.expectations.core.expect_column_values_to_be_in_type_list
great_expectations.expectations.core.expect_column_values_to_be_increasing
great_expectations.expectations.core.expect_column_values_to_be_json_parseable
great_expectations.expectations.core.expect_column_values_to_be_null
great_expectations.expectations.core.expect_column_values_to_be_of_type
great_expectations.expectations.core.expect_column_values_to_be_unique
great_expectations.expectations.core.expect_column_values_to_match_json_schema
great_expectations.expectations.core.expect_column_values_to_match_like_pattern
great_expectations.expectations.core.expect_column_values_to_match_like_pattern_list
great_expectations.expectations.core.expect_column_values_to_match_regex
great_expectations.expectations.core.expect_column_values_to_match_regex_list
great_expectations.expectations.core.expect_column_values_to_match_strftime_format
great_expectations.expectations.core.expect_column_values_to_not_be_in_set
great_expectations.expectations.core.expect_column_values_to_not_be_null
great_expectations.expectations.core.expect_column_values_to_not_match_like_pattern
great_expectations.expectations.core.expect_column_values_to_not_match_like_pattern_list
great_expectations.expectations.core.expect_column_values_to_not_match_regex
great_expectations.expectations.core.expect_column_values_to_not_match_regex_list
great_expectations.expectations.core.expect_compound_columns_to_be_unique
great_expectations.expectations.core.expect_multicolumn_sum_to_equal
great_expectations.expectations.core.expect_multicolumn_values_to_be_unique
great_expectations.expectations.core.expect_select_column_values_to_be_unique_within_record
great_expectations.expectations.core.expect_table_column_count_to_be_between
great_expectations.expectations.core.expect_table_column_count_to_equal
great_expectations.expectations.core.expect_table_columns_to_match_ordered_list
great_expectations.expectations.core.expect_table_columns_to_match_set
great_expectations.expectations.core.expect_table_row_count_to_be_between
great_expectations.expectations.core.expect_table_row_count_to_equal
great_expectations.expectations.core.expect_table_row_count_to_equal_other_table
great_expectations.expectations.metrics
great_expectations.expectations.metrics.column_aggregate_metrics
great_expectations.expectations.metrics.column_map_metrics
great_expectations.expectations.metrics.column_pair_map_metrics
great_expectations.expectations.metrics.multicolumn_map_metrics
great_expectations.expectations.metrics.query_metrics
great_expectations.expectations.metrics.table_metrics
great_expectations.expectations.metrics.column_aggregate_metric
great_expectations.expectations.metrics.column_aggregate_metric_provider
great_expectations.expectations.metrics.import_manager
great_expectations.expectations.metrics.map_metric
great_expectations.expectations.metrics.map_metric_provider
great_expectations.expectations.metrics.meta_metric_provider
great_expectations.expectations.metrics.metric_provider
great_expectations.expectations.metrics.query_metric_provider
great_expectations.expectations.metrics.table_metric
great_expectations.expectations.metrics.table_metric_provider
great_expectations.expectations.metrics.util
great_expectations.expectations.expectation
great_expectations.expectations.regex_based_column_map_expectation
great_expectations.expectations.registry
great_expectations.expectations.row_conditions
great_expectations.expectations.set_based_column_map_expectation
great_expectations.expectations.sql_tokens_and_types
great_expectations.expectations.util
great_expectations.expectations.validation_handlers
great_expectations.experimental
great_expectations.experimental.datasources
great_expectations.experimental.datasources.config
great_expectations.experimental.datasources.experimental_base_model
great_expectations.experimental.datasources.interfaces
great_expectations.experimental.datasources.metadatasource
great_expectations.experimental.datasources.postgres_datasource
great_expectations.experimental.datasources.sources
great_expectations.experimental.datasources.type_lookup
great_expectations.experimental.context
great_expectations.experimental.logger
great_expectations.jupyter_ux
great_expectations.profile
great_expectations.profile.base
great_expectations.profile.basic_dataset_profiler
great_expectations.profile.basic_suite_builder_profiler
great_expectations.profile.columns_exist
great_expectations.profile.json_schema_profiler
great_expectations.profile.metrics_utils
great_expectations.profile.multi_batch_validation_meta_analysis
great_expectations.profile.user_configurable_profiler
great_expectations.render
great_expectations.render.notebook_assets
great_expectations.render.renderer
great_expectations.render.renderer.content_block
great_expectations.render.renderer.call_to_action_renderer
great_expectations.render.renderer.checkpoint_new_notebook_renderer
great_expectations.render.renderer.column_section_renderer
great_expectations.render.renderer.datasource_new_notebook_renderer
great_expectations.render.renderer.email_renderer
great_expectations.render.renderer.inline_renderer
great_expectations.render.renderer.microsoft_teams_renderer
great_expectations.render.renderer.notebook_renderer
great_expectations.render.renderer.opsgenie_renderer
great_expectations.render.renderer.page_renderer
great_expectations.render.renderer.profiling_results_overview_section_renderer
great_expectations.render.renderer.renderer
great_expectations.render.renderer.site_builder
great_expectations.render.renderer.site_index_page_renderer
great_expectations.render.renderer.slack_renderer
great_expectations.render.renderer.suite_edit_notebook_renderer
great_expectations.render.renderer.suite_scaffold_notebook_renderer
great_expectations.render.types
great_expectations.render.v3
great_expectations.render.view
great_expectations.render.components
great_expectations.render.exceptions
great_expectations.render.page_renderer_util
great_expectations.render.renderer_configuration
great_expectations.render.util
great_expectations.rule_based_profiler
great_expectations.rule_based_profiler.altair
great_expectations.rule_based_profiler.config
great_expectations.rule_based_profiler.data_assistant
great_expectations.rule_based_profiler.data_assistant.data_assistant
great_expectations.rule_based_profiler.data_assistant.data_assistant_dispatcher
great_expectations.rule_based_profiler.data_assistant.data_assistant_runner
great_expectations.rule_based_profiler.data_assistant.onboarding_data_assistant
great_expectations.rule_based_profiler.data_assistant.volume_data_assistant
great_expectations.rule_based_profiler.data_assistant_result
great_expectations.rule_based_profiler.data_assistant_result.data_assistant_result
great_expectations.rule_based_profiler.data_assistant_result.onboarding_data_assistant_result
great_expectations.rule_based_profiler.data_assistant_result.plot_components
great_expectations.rule_based_profiler.data_assistant_result.plot_result
great_expectations.rule_based_profiler.data_assistant_result.volume_data_assistant_result
great_expectations.rule_based_profiler.domain_builder
great_expectations.rule_based_profiler.domain_builder.categorical_column_domain_builder
great_expectations.rule_based_profiler.domain_builder.column_domain_builder
great_expectations.rule_based_profiler.domain_builder.column_pair_domain_builder
great_expectations.rule_based_profiler.domain_builder.domain_builder
great_expectations.rule_based_profiler.domain_builder.map_metric_column_domain_builder
great_expectations.rule_based_profiler.domain_builder.multi_column_domain_builder
great_expectations.rule_based_profiler.domain_builder.table_domain_builder
great_expectations.rule_based_profiler.estimators
great_expectations.rule_based_profiler.estimators.bootstrap_numeric_range_estimator
great_expectations.rule_based_profiler.estimators.exact_numeric_range_estimator
great_expectations.rule_based_profiler.estimators.kde_numeric_range_estimator
great_expectations.rule_based_profiler.estimators.numeric_range_estimation_result
great_expectations.rule_based_profiler.estimators.numeric_range_estimator
great_expectations.rule_based_profiler.estimators.quantiles_numeric_range_estimator
great_expectations.rule_based_profiler.expectation_configuration_builder
great_expectations.rule_based_profiler.helpers
great_expectations.rule_based_profiler.helpers.cardinality_checker
great_expectations.rule_based_profiler.helpers.configuration_reconciliation
great_expectations.rule_based_profiler.helpers.runtime_environment
great_expectations.rule_based_profiler.helpers.simple_semantic_type_filter
great_expectations.rule_based_profiler.helpers.util
great_expectations.rule_based_profiler.parameter_builder
great_expectations.rule_based_profiler.parameter_builder.histogram_single_batch_parameter_builder
great_expectations.rule_based_profiler.parameter_builder.mean_table_columns_set_match_multi_batch_parameter_builder
great_expectations.rule_based_profiler.parameter_builder.mean_unexpected_map_metric_multi_batch_parameter_builder
great_expectations.rule_based_profiler.parameter_builder.metric_multi_batch_parameter_builder
great_expectations.rule_based_profiler.parameter_builder.metric_single_batch_parameter_builder
great_expectations.rule_based_profiler.parameter_builder.numeric_metric_range_multi_batch_parameter_builder
great_expectations.rule_based_profiler.parameter_builder.parameter_builder
great_expectations.rule_based_profiler.parameter_builder.regex_pattern_string_parameter_builder
great_expectations.rule_based_profiler.parameter_builder.simple_date_format_string_parameter_builder
great_expectations.rule_based_profiler.parameter_builder.value_counts_single_batch_parameter_builder
great_expectations.rule_based_profiler.parameter_builder.value_set_multi_batch_parameter_builder
great_expectations.rule_based_profiler.rule
great_expectations.rule_based_profiler.attributed_resolved_metrics
great_expectations.rule_based_profiler.builder
great_expectations.rule_based_profiler.metric_computation_result
great_expectations.rule_based_profiler.parameter_container
great_expectations.rule_based_profiler.rule_based_profiler
great_expectations.rule_based_profiler.rule_based_profiler_result
great_expectations.rule_based_profiler.semantic_type_filter
great_expectations.self_check
great_expectations.types
great_expectations.validation_operators
great_expectations.validator
Package Contents
Classes
A DataContext represents a Great Expectations project. It is the primary entry point for a Great Expectations deployment, with configurations and methods for all supporting components.
Functions
Get version information or return default if unable to do so.
Read a Pandas data frame and return a great_expectations dataset.
Method to return the appropriate DataContext depending on parameters and environment.
Read a file using Pandas read_csv and return a great_expectations dataset.
Read a file using Pandas read_excel and return a great_expectations dataset.
Read a file using Pandas read_feather and return a great_expectations dataset.
Read a file using Pandas read_json and return a great_expectations dataset.
Read a file using Pandas read_parquet and return a great_expectations dataset.
Read a file using Pandas read_pickle and return a great_expectations dataset.
Read a file using Pandas read_sas and return a great_expectations dataset.
Read a file using Pandas read_table and return a great_expectations dataset.
Validate the provided data asset. Validate can accept an optional data_asset_name to apply, data_context to use
-
great_expectations.
get_versions
()¶ Get version information or return default if unable to do so.
-
great_expectations.
__version__
¶
-
class
great_expectations.
CloudMigrator
(context: BaseDataContext, cloud_base_url: Optional[str] = None, cloud_access_token: Optional[str] = None, cloud_organization_id: Optional[str] = None)¶ -
classmethod
migrate
(cls, context: BaseDataContext, test_migrate: bool, cloud_base_url: Optional[str] = None, cloud_access_token: Optional[str] = None, cloud_organization_id: Optional[str] = None)¶ Migrate your Data Context to GX Cloud.
- Parameters
context – The Data Context you wish to migrate.
test_migrate – True if this is a test, False if you want to perform the migration.
cloud_base_url – Optional; may instead be provided via the environment variable GX_CLOUD_BASE_URL
cloud_access_token – Optional; may instead be provided via the environment variable GX_CLOUD_ACCESS_TOKEN
cloud_organization_id – Optional; may instead be provided via the environment variable GX_CLOUD_ORGANIZATION_ID
- Returns
CloudMigrator instance
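The three optional cloud parameters above can each come from an argument or from an environment variable. The following is a minimal sketch of that fallback, not the library's implementation: the helper name resolve_cloud_setting is hypothetical, and the assumption that an explicit argument takes precedence over the environment variable is not stated in the docstring.

```python
import os
from typing import Mapping, Optional


def resolve_cloud_setting(
    explicit_value: Optional[str],
    env_var_name: str,
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the explicit argument if given, otherwise the environment variable.

    Hypothetical helper for illustration only; it is not part of great_expectations.
    Assumption: an explicit argument wins over the environment variable.
    """
    env = os.environ if environ is None else environ
    if explicit_value is not None:
        return explicit_value
    return env.get(env_var_name)


# A simulated environment keeps the example independent of the real one.
fake_env = {"GX_CLOUD_BASE_URL": "https://example.invalid/api"}

base_url = resolve_cloud_setting(None, "GX_CLOUD_BASE_URL", fake_env)
token = resolve_cloud_setting("explicit-token", "GX_CLOUD_ACCESS_TOKEN", fake_env)
org_id = resolve_cloud_setting(None, "GX_CLOUD_ORGANIZATION_ID", fake_env)
```

Here base_url comes from the simulated environment, token comes from the explicit argument, and org_id stays None because neither source supplies it.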
-
retry_migrate_validation_results
(self)¶
-
_migrate_to_cloud
(self, test_migrate: bool)¶
-
_emit_log_stmts
(self, configuration_bundle: ConfigurationBundle, test_migrate: bool)¶
-
_log_about_test_migrate
(self)¶
-
_log_about_usage_stats_disabled
(self)¶
-
_log_about_bundle_contains_datasources
(self)¶
-
_print_configuration_bundle_summary
(self, configuration_bundle: ConfigurationBundle)¶
-
_print_object_summary
(self, obj_name: str, obj_collection: List[AbstractConfig])¶
-
_serialize_configuration_bundle
(self, configuration_bundle: ConfigurationBundle)¶
-
_prepare_validation_results
(self, serialized_bundle: dict)¶
-
_send_configuration_bundle
(self, serialized_bundle: dict, test_migrate: bool)¶
-
_send_validation_results
(self, serialized_validation_results: Dict[str, dict], test_migrate: bool)¶
-
_process_validation_results
(self, serialized_validation_results: Dict[str, dict], test_migrate: bool)¶
-
_post_to_cloud_backend
(self, resource_name: str, resource_type: str, attributes_key: str, attributes_value: dict)¶
-
_print_unsuccessful_validation_message
(self)¶
-
_print_migration_introduction_message
(self)¶
-
_print_migration_conclusion_message
(self, test_migrate: bool)¶
-
class
great_expectations.
DataContext
(context_root_dir: Optional[str] = None, runtime_environment: Optional[dict] = None, cloud_mode: bool = False, cloud_base_url: Optional[str] = None, cloud_access_token: Optional[str] = None, cloud_organization_id: Optional[str] = None, ge_cloud_mode: bool = False, ge_cloud_base_url: Optional[str] = None, ge_cloud_access_token: Optional[str] = None, ge_cloud_organization_id: Optional[str] = None)¶ Bases:
great_expectations.data_context.data_context.base_data_context.BaseDataContext
A DataContext represents a Great Expectations project. It is the primary entry point for a Great Expectations deployment, with configurations and methods for all supporting components.
The DataContext is configured via a yml file stored in a directory called great_expectations; this configuration file, as well as managed Expectation Suites, should be stored in version control. There are other ways to create a Data Context that may be better suited to your particular deployment, for example an ephemeral Data Context or one backed by GX Cloud (coming soon). Please refer to our documentation for more details.
You can Validate data or generate Expectations using Execution Engines including:
SQL (multiple dialects supported)
Spark
Pandas
Your data can be stored in common locations including:
databases / data warehouses
files in s3, GCS, Azure, local storage
dataframes (spark and pandas) loaded into memory
Please see our documentation for examples on how to set up Great Expectations, connect to your data, create Expectations, and Validate data.
Other configuration options you can apply to a DataContext besides how to access data include things like where to store Expectations, Profilers, Checkpoints, Metrics, Validation Results and Data Docs and how those Stores are configured. Take a look at our documentation for more configuration options.
–Public API–
-
classmethod
create
(cls, project_root_dir: Optional[str] = None, usage_statistics_enabled: bool = True, runtime_environment: Optional[dict] = None)¶ Build a new great_expectations directory and DataContext object in the provided project_root_dir.
create makes a new “great_expectations” directory in the given folder if one does not already exist. It then initializes a new DataContext in that folder and writes the resulting config.
- Parameters
project_root_dir – path to the root directory in which to create a new great_expectations directory
usage_statistics_enabled – boolean directive specifying whether or not to gather usage statistics
runtime_environment – a dictionary of config variables that override both those set in config_variables.yml and the environment
- Returns
DataContext
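The "create only if missing" behaviour described for create can be illustrated with the standard library alone. This is a sketch under stated assumptions rather than the library's code: ensure_project_dir is a hypothetical helper, and it only creates the directory; it does not initialize a DataContext or write any config.

```python
import tempfile
from pathlib import Path
from typing import Tuple


def ensure_project_dir(project_root_dir: Path) -> Tuple[Path, bool]:
    """Create <project_root_dir>/great_expectations if it is missing.

    Hypothetical helper for illustration only. Returns the directory path
    and a flag that is True when this call created the directory.
    """
    ge_dir = project_root_dir / "great_expectations"
    already_existed = ge_dir.is_dir()
    ge_dir.mkdir(parents=True, exist_ok=True)
    return ge_dir, not already_existed


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    first_dir, first_created = ensure_project_dir(root)
    second_dir, second_created = ensure_project_dir(root)
    directory_name = first_dir.name
```

Calling the helper twice shows the idempotence: the first call creates the directory, the second leaves the existing one untouched.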
-
classmethod
all_uncommitted_directories_exist
(cls, ge_dir: str)¶ Check if all uncommitted directories exist.
-
classmethod
config_variables_yml_exist
(cls, ge_dir: str)¶ Check if config_variables.yml exists.
-
classmethod
write_config_variables_template_to_disk
(cls, uncommitted_dir: str)¶
-
classmethod
write_project_template_to_disk
(cls, ge_dir: str, usage_statistics_enabled: bool = True)¶
-
classmethod
scaffold_directories
(cls, base_dir: str)¶ Safely create GX directories for a new project.
-
classmethod
scaffold_custom_data_docs
(cls, plugins_dir: str)¶ Copy custom data docs templates
-
_save_project_config
(self)¶ See parent `AbstractDataContext._save_project_config()` for more information.
Explicitly override base class implementation to retain legacy behavior.
-
_attach_datasource_to_context
(self, datasource: XDatasource)¶
-
property
sources
(self)¶
-
_init_cloud_config
(self, cloud_mode: bool, cloud_base_url: Optional[str], cloud_access_token: Optional[str], cloud_organization_id: Optional[str])¶
-
_init_context_root_directory
(self, context_root_dir: Optional[str])¶
-
_check_for_usage_stats_sync
(self, project_config: DataContextConfig)¶ If there are differences between the DataContextConfig used to instantiate the DataContext and the DataContextConfig assigned to self.config, we want to save those changes to disk so that subsequent instantiations will utilize the same values.
A small caveat is that if that difference stems from a global override (env var or conf file), we don’t want to write to disk. This is due to the fact that those mechanisms allow for dynamic values and saving them will make them static.
- Parameters
project_config – The DataContextConfig used to instantiate the DataContext.
- Returns
A boolean signifying whether or not the current DataContext’s config needs to be persisted in order to recognize changes made to usage statistics.
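The decision described above reduces to two questions: did the config change, and did the change come from a global override? A minimal sketch of that rule follows. The UsageStatsSettings class and the needs_persisting function are hypothetical stand-ins, and how the library detects a global override is not shown in the source, so it is passed in as a plain boolean here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageStatsSettings:
    """Hypothetical stand-in for the usage statistics part of a config."""

    enabled: bool
    data_context_id: str


def needs_persisting(
    original: UsageStatsSettings,
    effective: UsageStatsSettings,
    changed_by_global_override: bool,
) -> bool:
    """Return True when the effective settings should be written to disk.

    Assumption: the caller already knows whether the difference stems from
    a global override (environment variable or conf file).
    """
    if original == effective:
        return False  # nothing changed, nothing to save
    if changed_by_global_override:
        return False  # dynamic values must not be frozen on disk
    return True


on_disk = UsageStatsSettings(enabled=True, data_context_id="abc")
in_memory = UsageStatsSettings(enabled=False, data_context_id="abc")

save_local_change = needs_persisting(on_disk, in_memory, changed_by_global_override=False)
save_override_change = needs_persisting(on_disk, in_memory, changed_by_global_override=True)
save_no_change = needs_persisting(on_disk, on_disk, changed_by_global_override=False)
```

Only the first case returns True: a real difference that did not originate from an override.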
-
_load_project_config
(self)¶ Reads the project configuration from the project configuration file. The file may contain ${SOME_VARIABLE} variables - see self.project_config_with_variables_substituted for how these are substituted.
For Data Contexts in GX Cloud mode, a user-specific template is retrieved from the Cloud API - see CloudDataContext.retrieve_data_context_config_from_cloud for more details.
- Returns
the configuration object read from the file or template
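To illustrate how ${SOME_VARIABLE} placeholders in a loaded configuration might be resolved, the sketch below uses Python's string.Template as a stand-in. It is not the library's substitution logic: the substitute function, the sample config keys, and the variable names are all hypothetical.

```python
from string import Template
from typing import Any, Mapping


def substitute(value: Any, variables: Mapping[str, str]) -> Any:
    """Recursively replace ${NAME} placeholders in strings.

    Hypothetical helper for illustration only. Unknown placeholders are
    left untouched because safe_substitute does not raise on missing keys.
    """
    if isinstance(value, str):
        return Template(value).safe_substitute(variables)
    if isinstance(value, dict):
        return {key: substitute(item, variables) for key, item in value.items()}
    if isinstance(value, list):
        return [substitute(item, variables) for item in value]
    return value  # numbers, booleans, None pass through unchanged


# Sample data invented for this example.
raw_config = {
    "config_version": 3,
    "connection_string": "postgresql://${DB_USER}@${DB_HOST}/analytics",
    "plugins": ["${PLUGIN_DIR}/custom"],
}

resolved = substitute(raw_config, {"DB_USER": "reader", "DB_HOST": "db.internal"})
```

In the result, the connection string has both placeholders filled in, while ${PLUGIN_DIR} is preserved because no value was supplied for it.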
-
add_store
(self, store_name, store_config)¶ Add a new Store to the DataContext and (for convenience) return the instantiated Store object.
- Parameters
store_name (str) – a key for the new Store in self._stores
store_config (dict) – a config for the Store to add
- Returns
store (Store)
-
add_datasource
(self, name: str, **kwargs: dict)¶ Add named datasource, with options to initialize (and return) the datasource and save_config.
The current version calls super(), which preserves the usage_statistics decorator in the current method. A subsequent refactor will migrate the usage_statistics to parent and sibling classes.
- Parameters
name (str) – Name of Datasource
initialize (bool) – Should GX add and initialize the Datasource? If true then current method will return initialized Datasource
save_changes (Optional[bool]) – should GX save the Datasource config?
**kwargs (Optional[dict]) – Additional keyword arguments that define how the Datasource is initialized
- Returns
Datasource that was added
-
update_datasource
(self, datasource: Union[LegacyDatasource, BaseDatasource])¶ See parent BaseDataContext.update_datasource for more details. Note that this method persists changes using an underlying Store.
-
delete_datasource
(self, name: str)¶ Delete a data source.
- Parameters
datasource_name – The name of the datasource to delete.
save_changes – Whether or not to save changes to disk.
- Raises
ValueError – If the datasource name isn’t provided or cannot be found.
-
classmethod
find_context_root_dir
(cls)¶
-
classmethod
get_ge_config_version
(cls, context_root_dir: Optional[str] = None)¶
-
classmethod
set_ge_config_version
(cls, config_version: Union[int, float], context_root_dir: Optional[str] = None, validate_config_version: bool = True)¶
-
classmethod
find_context_yml_file
(cls, search_start_dir: Optional[str] = None)¶ Search for the yml file starting here and moving upward.
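The upward search described here can be sketched with pathlib. This is a minimal illustration, not the method's implementation: the helper name find_yml_upward is hypothetical, and looking for great_expectations.yml directly inside each directory on the way up is an assumption of the sketch.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_yml_upward(
    start_dir: str, filename: str = "great_expectations.yml"
) -> Optional[str]:
    """Hypothetical helper: check start_dir, then each parent, for filename."""
    current = Path(start_dir).resolve()
    for directory in (current, *current.parents):
        candidate = directory / filename
        if candidate.is_file():
            return str(candidate)
    return None  # reached the filesystem root without a match


# Usage: build a throwaway tree and search from a nested directory.
with tempfile.TemporaryDirectory() as root:
    nested = Path(root) / "project" / "notebooks" / "drafts"
    nested.mkdir(parents=True)
    (Path(root) / "project" / "great_expectations.yml").touch()
    found = find_yml_upward(str(nested))
    found_name = Path(found).name if found else None
    found_parent = Path(found).parent.name if found else None
    missing = find_yml_upward(str(nested), filename="no_such_config_file.yml")
```

The search stops at the first match, so the file closest to the starting directory wins.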
-
classmethod
does_config_exist_on_disk
(cls, context_root_dir: str)¶ Return True if the great_expectations.yml exists on disk.
-
classmethod
is_project_initialized
(cls, ge_dir: str)¶ Return True if the project is initialized.
To be considered initialized, all of the following must be true:
- all project directories exist (including uncommitted directories)
- a valid great_expectations.yml is on disk
- a config_variables.yml is on disk
- the project has at least one datasource
- the project has at least one suite
-
classmethod
does_project_have_a_datasource_in_config_file
(cls, ge_dir: str)¶
-
classmethod
_does_context_have_at_least_one_datasource
(cls, ge_dir: str)¶
-
classmethod
_does_context_have_at_least_one_suite
(cls, ge_dir: str)¶
-
classmethod
_attempt_context_instantiation
(cls, ge_dir: str)¶
-
great_expectations.
from_pandas
(pandas_df, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None)¶ Read a Pandas data frame and return a great_expectations dataset.
- Parameters
pandas_df (Pandas df) – Pandas data frame
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (profiler class) – The profiler that should be run on the dataset to establish a baseline expectation suite.
- Returns
great_expectations dataset
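The interplay between dataset_class, class_name and module_name can be sketched with importlib. The helper resolve_dataset_class is hypothetical, and standard-library classes stand in for real dataset classes so that the example runs without great_expectations installed; the precedence it encodes follows the parameter description above.

```python
import collections
import importlib
from typing import Optional


def resolve_dataset_class(
    class_name: str, module_name: str, dataset_class: Optional[type] = None
) -> type:
    """Hypothetical helper: prefer an explicit class, else load one by name."""
    if dataset_class is not None:
        # An explicitly supplied class takes precedence over the names.
        return dataset_class
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Stand-ins: "collections.OrderedDict" plays the role of a dataset class.
loaded = resolve_dataset_class("OrderedDict", "collections")
explicit = resolve_dataset_class("OrderedDict", "collections", dataset_class=dict)
```

In this sketch, a module that cannot be imported raises ImportError and a missing class name raises AttributeError.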
-
great_expectations.
get_context
(project_config: Optional[Union['DataContextConfig', Mapping]] = None, context_root_dir: Optional[str] = None, runtime_environment: Optional[dict] = None, cloud_base_url: Optional[str] = None, cloud_access_token: Optional[str] = None, cloud_organization_id: Optional[str] = None, cloud_mode: Optional[bool] = None, ge_cloud_base_url: Optional[str] = None, ge_cloud_access_token: Optional[str] = None, ge_cloud_organization_id: Optional[str] = None, ge_cloud_mode: Optional[bool] = None) → Union['DataContext', 'BaseDataContext', 'CloudDataContext']¶ Method to return the appropriate DataContext depending on parameters and environment.
- Usage:
import great_expectations as gx
my_context = gx.get_context([parameters])
- If gx.get_context() is run in a filesystem where great_expectations init has been run, then it will return a DataContext.
- If gx.get_context() is passed a context_root_dir (which contains great_expectations.yml), then it will return a DataContext.
- If gx.get_context() is passed an in-memory project_config, then it will return a BaseDataContext. context_root_dir can also be passed in, but the configurations from the in-memory config will override the configurations in the great_expectations.yml file.
- If GX is being run in the cloud, and the information needed for ge_cloud_config (i.e. ge_cloud_base_url, ge_cloud_access_token, ge_cloud_organization_id) is passed in as parameters to get_context(), configured as environment variables, or set in a .conf file, then get_context() will return a CloudDataContext.
get_context params | Env Not Config’d | Env Config’d
() | Local | Cloud
(cloud_mode=True) | Exception! | Cloud
(cloud_mode=False) | Local | Local
TODO: This method will eventually return FileDataContext and EphemeralDataContext, rather than DataContext and Base
- Parameters
project_config (dict or DataContextConfig) – In-memory configuration for DataContext.
context_root_dir (str) – Path to directory that contains great_expectations.yml file
runtime_environment (dict) – A dictionary of values can be passed to a DataContext when it is instantiated. These values will override both values from the config variables file and from environment variables.
The following parameters are relevant when running ge_cloud:
cloud_base_url (str) – url for ge_cloud endpoint.
cloud_access_token (str) – access_token for ge_cloud account.
cloud_organization_id (str) – org_id for ge_cloud account.
cloud_mode (bool) – bool flag to specify whether to run GX in cloud mode (default is None).
- Returns
DataContext. Either a DataContext, BaseDataContext, or CloudDataContext depending on environment and/or parameters
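The get_context params table in the description can be read as a small decision function: rows are the cloud_mode argument, columns are whether the cloud environment is configured. The sketch below encodes that reading. The helper name resolve_context_kind, the string return values, and the use of ValueError for the "Exception!" cell are all assumptions made for illustration; they are not part of the library.

```python
from typing import Optional


def resolve_context_kind(cloud_mode: Optional[bool], env_configured: bool) -> str:
    """Hypothetical helper: return "Cloud" or "Local" for one table cell."""
    if cloud_mode is False:
        # Explicit opt-out always yields a local context.
        return "Local"
    if env_configured:
        # Cloud settings are available and cloud mode was not disabled.
        return "Cloud"
    if cloud_mode is True:
        # Cloud was requested but nothing is configured (ValueError is assumed).
        raise ValueError("cloud_mode=True but no cloud configuration was found")
    return "Local"  # cloud_mode=None and nothing configured


default_unconfigured = resolve_context_kind(None, env_configured=False)
default_configured = resolve_context_kind(None, env_configured=True)
```

Passing cloud_mode=False is the only way, in this reading, to get a local context when cloud settings are present in the environment.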
-
great_expectations.
read_csv
(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None, *args, **kwargs)¶ Read a file using Pandas read_csv and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset
-
great_expectations.
read_excel
(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None, *args, **kwargs)¶ Read a file using Pandas read_excel and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset or ordered dict of great_expectations datasets, if multiple worksheets are imported
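Because the return value is either a single dataset or an ordered dict of datasets, calling code may want to normalize both shapes. The following sketch shows one way to do that; iter_sheets and its single_name default are hypothetical, and plain strings stand in for dataset objects so the example runs on its own.

```python
from collections import OrderedDict
from typing import Any, Iterator, Tuple


def iter_sheets(result: Any, single_name: str = "sheet") -> Iterator[Tuple[str, Any]]:
    """Hypothetical helper: yield (sheet_name, dataset) for either return shape."""
    if isinstance(result, dict):  # OrderedDict is a dict subclass
        yield from result.items()
    else:
        # A single worksheet has no name in the return value, so one is supplied.
        yield single_name, result


# Stand-in strings are used here instead of real datasets.
single = list(iter_sheets("dataset_a"))
multiple = list(
    iter_sheets(OrderedDict([("Q1", "dataset_q1"), ("Q2", "dataset_q2")]))
)
```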
-
great_expectations.
read_feather
(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None, *args, **kwargs)¶ Read a file using Pandas read_feather and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset
-
great_expectations.
read_json
(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, accessor_func=None, profiler=None, *args, **kwargs)¶ Read a file using Pandas read_json and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
accessor_func (Callable) – function to transform the json object in the file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset
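To illustrate what an accessor_func might look like, the sketch below loads a JSON file and applies a callable that picks out the list of records. This reference does not state how read_json applies accessor_func internally; the sketch assumes it receives the parsed JSON object before any tabular conversion, and load_json_with_accessor is a hypothetical stand-in rather than a library function.

```python
import json
import os
import tempfile
from typing import Any, Callable, Optional


def load_json_with_accessor(
    filename: str, accessor_func: Optional[Callable[[Any], Any]] = None
) -> Any:
    """Hypothetical helper: parse the file, then apply accessor_func if given."""
    with open(filename, "r", encoding="utf-8") as handle:
        json_obj = json.load(handle)
    if accessor_func is not None:
        json_obj = accessor_func(json_obj)
    return json_obj


# Usage: the records of interest are nested under a "records" key.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "payload.json")
    with open(path, "w", encoding="utf-8") as handle:
        json.dump({"meta": {"page": 1}, "records": [{"id": 1}, {"id": 2}]}, handle)
    records = load_json_with_accessor(path, accessor_func=lambda obj: obj["records"])
    untouched = load_json_with_accessor(path)
```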
-
great_expectations.
read_parquet
(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None, *args, **kwargs)¶ Read a file using Pandas read_parquet and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset
-
great_expectations.
read_pickle
(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None, *args, **kwargs)¶ Read a file using Pandas read_pickle and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset
-
great_expectations.
read_sas
(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None, *args, **kwargs)¶ Read a file using Pandas read_sas and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset
-
great_expectations.
read_table
(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None, *args, **kwargs)¶ Read a file using Pandas read_table and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset
-
great_expectations.
validate
(data_asset, expectation_suite=None, data_asset_name=None, expectation_suite_name=None, data_context=None, data_asset_class_name=None, data_asset_module_name='great_expectations.dataset', data_asset_class=None, *args, **kwargs)¶ Validate the provided data asset. Validate can accept an optional data_asset_name to apply, data_context to use to fetch an expectation_suite if one is not provided, and data_asset_class_name/data_asset_module_name or data_asset_class to use to provide custom expectations.
- Parameters
data_asset – the asset to validate
expectation_suite – the suite to use, or None to fetch one using a DataContext
data_asset_name – the name of the data asset to use
expectation_suite_name – the name of the expectation_suite to use
data_context – data context to use to fetch an expectation suite, or the path from which to obtain one
data_asset_class_name – the name of a class to dynamically load a DataAsset class
data_asset_module_name – the name of the module to dynamically load a DataAsset class
data_asset_class – a class to use. Overrides data_asset_class_name/data_asset_module_name if provided
*args –
**kwargs –
Returns:
-
great_expectations.
rtd_url_ge_version
¶