great_expectations.rule_based_profiler
Subpackages
great_expectations.rule_based_profiler.altair
great_expectations.rule_based_profiler.config
great_expectations.rule_based_profiler.data_assistant
great_expectations.rule_based_profiler.data_assistant.data_assistant
great_expectations.rule_based_profiler.data_assistant.data_assistant_dispatcher
great_expectations.rule_based_profiler.data_assistant.data_assistant_runner
great_expectations.rule_based_profiler.data_assistant.onboarding_data_assistant
great_expectations.rule_based_profiler.data_assistant.volume_data_assistant
great_expectations.rule_based_profiler.data_assistant_result
great_expectations.rule_based_profiler.data_assistant_result.data_assistant_result
great_expectations.rule_based_profiler.data_assistant_result.onboarding_data_assistant_result
great_expectations.rule_based_profiler.data_assistant_result.plot_components
great_expectations.rule_based_profiler.data_assistant_result.plot_result
great_expectations.rule_based_profiler.data_assistant_result.volume_data_assistant_result
great_expectations.rule_based_profiler.domain_builder
great_expectations.rule_based_profiler.domain_builder.categorical_column_domain_builder
great_expectations.rule_based_profiler.domain_builder.column_domain_builder
great_expectations.rule_based_profiler.domain_builder.column_pair_domain_builder
great_expectations.rule_based_profiler.domain_builder.domain_builder
great_expectations.rule_based_profiler.domain_builder.map_metric_column_domain_builder
great_expectations.rule_based_profiler.domain_builder.multi_column_domain_builder
great_expectations.rule_based_profiler.domain_builder.table_domain_builder
great_expectations.rule_based_profiler.estimators
great_expectations.rule_based_profiler.estimators.bootstrap_numeric_range_estimator
great_expectations.rule_based_profiler.estimators.exact_numeric_range_estimator
great_expectations.rule_based_profiler.estimators.kde_numeric_range_estimator
great_expectations.rule_based_profiler.estimators.numeric_range_estimation_result
great_expectations.rule_based_profiler.estimators.numeric_range_estimator
great_expectations.rule_based_profiler.estimators.quantiles_numeric_range_estimator
great_expectations.rule_based_profiler.expectation_configuration_builder
great_expectations.rule_based_profiler.helpers
great_expectations.rule_based_profiler.helpers.cardinality_checker
great_expectations.rule_based_profiler.helpers.configuration_reconciliation
great_expectations.rule_based_profiler.helpers.runtime_environment
great_expectations.rule_based_profiler.helpers.simple_semantic_type_filter
great_expectations.rule_based_profiler.helpers.util
great_expectations.rule_based_profiler.parameter_builder
great_expectations.rule_based_profiler.parameter_builder.histogram_single_batch_parameter_builder
great_expectations.rule_based_profiler.parameter_builder.mean_table_columns_set_match_multi_batch_parameter_builder
great_expectations.rule_based_profiler.parameter_builder.mean_unexpected_map_metric_multi_batch_parameter_builder
great_expectations.rule_based_profiler.parameter_builder.metric_multi_batch_parameter_builder
great_expectations.rule_based_profiler.parameter_builder.metric_single_batch_parameter_builder
great_expectations.rule_based_profiler.parameter_builder.numeric_metric_range_multi_batch_parameter_builder
great_expectations.rule_based_profiler.parameter_builder.parameter_builder
great_expectations.rule_based_profiler.parameter_builder.regex_pattern_string_parameter_builder
great_expectations.rule_based_profiler.parameter_builder.simple_date_format_string_parameter_builder
great_expectations.rule_based_profiler.parameter_builder.value_counts_single_batch_parameter_builder
great_expectations.rule_based_profiler.parameter_builder.value_set_multi_batch_parameter_builder
great_expectations.rule_based_profiler.rule
Submodules
great_expectations.rule_based_profiler.attributed_resolved_metrics
great_expectations.rule_based_profiler.builder
great_expectations.rule_based_profiler.metric_computation_result
great_expectations.rule_based_profiler.parameter_container
great_expectations.rule_based_profiler.rule_based_profiler
great_expectations.rule_based_profiler.rule_based_profiler_result
great_expectations.rule_based_profiler.semantic_type_filter
Package Contents
Classes
RuleBasedProfilerResult | Immutable “dataclass” object, designed to hold results, with auxiliary information, of executing the “RuleBasedProfiler.run()” method.
BaseRuleBasedProfiler | Initialized from a RuleBasedProfilerConfig typed object; contains all functionality in the form of interface methods (which can be overwritten by subclasses) and their reference implementation.
RuleBasedProfiler | Serves to profile, or automatically evaluate a set of rules, upon a given batch / multiple batches of data.
-
class
great_expectations.rule_based_profiler.
RuleBasedProfilerResult
¶ Bases:
great_expectations.types.SerializableDictDot
“RuleBasedProfilerResult” is an immutable “dataclass” object, designed to hold results with auxiliary information of executing “RuleBasedProfiler.run()” method. Principal properties are: “fully_qualified_parameter_names_by_domain”, “parameter_values_for_fully_qualified_parameter_names_by_domain”, “expectation_configurations”, and “citation” (which represents configuration of effective Rule-Based Profiler, with all run-time overrides properly reconciled).
-
fully_qualified_parameter_names_by_domain
:Dict[Domain, List[str]]¶
-
parameter_values_for_fully_qualified_parameter_names_by_domain
:Optional[Dict[Domain, Dict[str, ParameterNode]]]¶
-
expectation_configurations
:List[ExpectationConfiguration]¶
-
citation
:dict¶
-
rule_domain_builder_execution_time
:Dict[str, float]¶
-
rule_execution_time
:Dict[str, float]¶
-
_usage_statistics_handler
:Optional[UsageStatisticsHandler]¶
-
to_dict
(self)¶ Returns: This RuleBasedProfilerResult as dictionary (JSON-serializable for RuleBasedProfilerResult objects).
-
to_json_dict
(self)¶ Returns: This RuleBasedProfilerResult as JSON-serializable dictionary.
-
get_expectation_suite
(self, expectation_suite_name: str)¶ Returns: “ExpectationSuite” object, built from properties, populated into this “RuleBasedProfilerResult” object.
-
-
class
great_expectations.rule_based_profiler.
BaseRuleBasedProfiler
(profiler_config: RuleBasedProfilerConfig, data_context: Optional[AbstractDataContext] = None, usage_statistics_handler: Optional[UsageStatisticsHandler] = None)¶ Bases:
great_expectations.core.config_peer.ConfigPeer
BaseRuleBasedProfiler class is initialized from RuleBasedProfilerConfig typed object and contains all functionality in the form of interface methods (which can be overwritten by subclasses) and their reference implementation.
-
EXPECTATION_SUCCESS_KEYS
:Set[str]¶
-
property
ge_cloud_id
(self)¶
-
_init_profiler_rules
(self, rules: Dict[str, Dict[str, Any]])¶
-
_init_rule
(self, rule_name: str, rule_config: Dict[str, Any])¶
-
static
_init_rule_domain_builder
(domain_builder_config: dict, data_context: Optional[AbstractDataContext] = None)¶
-
run
(self, variables: Optional[Dict[str, Any]] = None, rules: Optional[Dict[str, Dict[str, Any]]] = None, batch_list: Optional[List[Batch]] = None, batch_request: Optional[Union[BatchRequestBase, dict]] = None, recompute_existing_parameter_values: bool = False, reconciliation_directives: ReconciliationDirectives = DEFAULT_RECONCILATION_DIRECTIVES, variables_directives_list: Optional[List[RuntimeEnvironmentVariablesDirectives]] = None, domain_type_directives_list: Optional[List[RuntimeEnvironmentDomainTypeDirectives]] = None, comment: Optional[str] = None)¶ Executes and collects “RuleState” side-effect from all “Rule” objects of this “RuleBasedProfiler”.
- Parameters
variables – attribute name/value pairs (overrides), commonly-used in Builder objects
rules – name/(configuration-dictionary) (overrides)
batch_list – Explicit list of Batch objects to supply data at runtime
batch_request – Explicit batch_request used to supply data at runtime
recompute_existing_parameter_values – If “True”, recompute value if “fully_qualified_parameter_name” exists
reconciliation_directives – directives for how each rule component should be overwritten
variables_directives_list – additional/override runtime variables directives (modify “BaseRuleBasedProfiler”)
domain_type_directives_list – additional/override runtime domain directives (modify “BaseRuleBasedProfiler”)
comment – Optional comment for “citation” of “ExpectationSuite” returned as part of “RuleBasedProfilerResult”
- Returns
“RuleBasedProfilerResult” dataclass object, containing essential outputs of profiling.
-
get_expectation_configurations
(self)¶ - Returns
List of ExpectationConfiguration objects, accumulated from RuleState of every Rule executed.
-
get_fully_qualified_parameter_names_by_domain
(self)¶ - Returns
Dictionary of fully-qualified parameter names by Domain, accumulated from RuleState of every Rule executed.
-
get_fully_qualified_parameter_names_for_domain_id
(self, domain_id: str)¶ - Parameters
domain_id – ID of desired Domain object.
- Returns
List of fully-qualified parameter names for Domain with domain_id as specified, accumulated from RuleState of corresponding Rule executed.
-
get_parameter_values_for_fully_qualified_parameter_names_by_domain
(self)¶ - Returns
Dictionaries of values for fully-qualified parameter names by Domain, accumulated from RuleState of every Rule executed.
-
get_parameter_values_for_fully_qualified_parameter_names_for_domain_id
(self, domain_id: str)¶ - Parameters
domain_id – ID of desired Domain object.
- Returns
Dictionary of values for fully-qualified parameter names for Domain with domain_id as specified, accumulated from RuleState of corresponding Rule executed.
-
add_rule
(self, rule: Rule)¶ Add Rule object to existing profiler object by reconciling profiler rules and updating _profiler_config.
-
reconcile_profiler_variables
(self, variables: Optional[Dict[str, Any]] = None, reconciliation_strategy: ReconciliationStrategy = DEFAULT_RECONCILATION_DIRECTIVES.variables)¶ Profiler “variables” reconciliation involves combining the variables, instantiated from Profiler configuration (e.g., stored in a YAML file managed by the Profiler store), with the variables overrides, provided at run time.
The reconciliation logic for “variables” is of the “replace” nature: An override value complements the original on key “miss”, and replaces the original on key “hit” (or “collision”), because “variables” is a unique member.
- Parameters
variables – variables overrides, supplied in dictionary (configuration) form
reconciliation_strategy – one of update, nested_update, or overwrite ways of reconciling overwrites
- Returns
reconciled variables in their canonical ParameterContainer object form
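The “replace” behavior described above can be pictured with plain dictionaries. The following is a minimal sketch, not the library's implementation: `reconcile_variables` and the variable names are hypothetical, and the sketch returns a plain dict, whereas the real method returns a ParameterContainer.

```python
from typing import Any, Dict, Optional


def reconcile_variables(
    stored: Dict[str, Any],
    overrides: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper illustrating "replace" reconciliation."""
    # Copy first so the stored configuration is left untouched.
    reconciled = dict(stored)
    # Key "miss": the override is added. Key "hit": the override wins.
    reconciled.update(overrides or {})
    return reconciled


# Illustrative variable names only; they are not taken from a real configuration.
stored_variables = {"false_positive_rate": 0.05, "estimator": "bootstrap"}
runtime_overrides = {"false_positive_rate": 0.01, "mostly": 0.95}

result = reconcile_variables(stored_variables, runtime_overrides)
print(result)
```

Here “false_positive_rate” collides and takes the run-time value, “mostly” is a miss and is added, and “estimator” is carried over unchanged.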
-
_reconcile_profiler_variables_as_dict
(self, variables: Optional[Dict[str, Any]], reconciliation_strategy: ReconciliationStrategy = DEFAULT_RECONCILATION_DIRECTIVES.variables)¶
-
reconcile_profiler_rules
(self, rules: Optional[Dict[str, Dict[str, Any]]] = None, reconciliation_directives: ReconciliationDirectives = DEFAULT_RECONCILATION_DIRECTIVES)¶ Profiler “rules” reconciliation involves combining the rules, instantiated from Profiler configuration (e.g., stored in a YAML file managed by the Profiler store), with the rules overrides, provided at run time.
The reconciliation logic for “rules” is of the “procedural” nature: (1) Combine every rule override configuration with any instantiated rule into a reconciled configuration (2) Re-instantiate Rule objects from the reconciled rule configurations
- Parameters
rules – rules overrides, supplied in dictionary (configuration) form for each rule name as the key
reconciliation_directives – directives for how each rule component should be overwritten
- Returns
reconciled rules in their canonical List[Rule] object form
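The two-step “procedural” flow for rules can be sketched as follows. This is an illustration under stated assumptions: `SketchRule` is a hypothetical stand-in for Rule, `reconcile_rules` is not a library function, and the per-component reconciliation is reduced to a shallow dictionary merge.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class SketchRule:
    """Hypothetical stand-in for a Rule: just a name and its configuration."""

    name: str
    config: Dict[str, Any] = field(default_factory=dict)


def reconcile_rules(
    existing: List[SketchRule],
    overrides: Dict[str, Dict[str, Any]],
) -> List[SketchRule]:
    # Step 1: combine every override configuration with the matching
    # instantiated rule (if any) into a reconciled configuration.
    configs = {rule.name: dict(rule.config) for rule in existing}
    for rule_name, override_config in overrides.items():
        merged = configs.get(rule_name, {})
        merged.update(override_config)
        configs[rule_name] = merged
    # Step 2: re-instantiate rule objects from the reconciled configurations.
    return [SketchRule(name=name, config=config) for name, config in configs.items()]


existing_rules = [
    SketchRule("column_ranges", {"domain_builder": {"class_name": "ColumnDomainBuilder"}}),
]
rule_overrides = {
    "column_ranges": {"variables": {"mostly": 0.9}},
    "table_shape": {"domain_builder": {"class_name": "TableDomainBuilder"}},
}

reconciled = reconcile_rules(existing_rules, rule_overrides)
print([rule.name for rule in reconciled])
```

An override whose name matches an existing rule is merged into it; an override with a new name yields an additional rule.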
-
_reconcile_profiler_rules_as_dict
(self, rules: Optional[Dict[str, Dict[str, Any]]] = None, reconciliation_directives: ReconciliationDirectives = DEFAULT_RECONCILATION_DIRECTIVES)¶
-
static
_reconcile_rule_config
(existing_rules: Dict[str, Rule], rule_name: str, rule_config: dict, reconciliation_directives: ReconciliationDirectives = DEFAULT_RECONCILATION_DIRECTIVES)¶ A “rule configuration” reconciliation is the process of combining the configuration of a single candidate override rule with at most one configuration corresponding to the list of rules instantiated from Profiler configuration (e.g., stored in a YAML file managed by the Profiler store).
The reconciliation logic for “Rule configuration” employs the “by construction” principle:
(1) Find a common configuration between the variables configuration, possibly supplied as part of the candidate override Rule configuration, and the variables configuration of an instantiated Rule
(2) Find a common configuration between the domain builder configuration, possibly supplied as part of the candidate override Rule configuration, and the domain builder configuration of an instantiated Rule
(3) Find common configurations between parameter builder configurations, possibly supplied as part of the candidate override Rule configuration, and the parameter builder configurations of an instantiated Rule
(4) Find common configurations between expectation configuration builder configurations, possibly supplied as part of the candidate override Rule configuration, and the expectation configuration builder configurations of an instantiated Rule
(5) Construct the reconciled Rule configuration dictionary using the formal Rule properties (“domain_builder”, “parameter_builders”, and “expectation_configuration_builders”) as keys and their reconciled configuration dictionaries as values
In order to ensure successful instantiation of custom builder classes using “instantiate_class_from_config()”, candidate builder override configurations are required to supply both “class_name” and “module_name” attributes.
- Parameters
existing_rules – all currently instantiated rules represented as a dictionary, keyed by rule name
rule_name – name of the override rule candidate
rule_config – configuration of an override rule candidate, supplied in dictionary (configuration) form
reconciliation_directives – directives for how each rule component should be overwritten
- Returns
reconciled rule configuration, returned in dictionary (configuration) form
-
static
_reconcile_rule_domain_builder_config
(domain_builder: DomainBuilder, domain_builder_config: dict, reconciliation_strategy: ReconciliationStrategy = DEFAULT_RECONCILATION_DIRECTIVES.domain_builder)¶ Rule “domain builder” reconciliation involves combining the domain builder, instantiated from Rule configuration (e.g., stored in a YAML file managed by the Profiler store), with the domain builder override, possibly supplied as part of the candidate override rule configuration.
The reconciliation logic for “domain builder” is of the “replace” nature: An override value complements the original on key “miss”, and replaces the original on key “hit” (or “collision”), because “domain builder” is a unique member for a Rule.
- Parameters
domain_builder – existing domain builder of a Rule
domain_builder_config – domain builder configuration override, supplied in dictionary (configuration) form
reconciliation_strategy – one of update, nested_update, or overwrite ways of reconciling overwrites
- Returns
reconciled domain builder configuration, returned in dictionary (configuration) form
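The three strategy names listed for reconciliation_strategy (update, nested_update, overwrite) suggest different merge depths. The sketch below shows one plausible reading of those names using plain dictionaries; the semantics are an assumption inferred from the names, not confirmed by this page, and `reconcile` and `nested_update` here are hypothetical helpers rather than library functions.

```python
import copy
from typing import Any, Dict


def nested_update(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively merge override into a deep copy of base."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = nested_update(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def reconcile(base: Dict[str, Any], override: Dict[str, Any], strategy: str) -> Dict[str, Any]:
    """Assumed semantics for each strategy name (illustrative only)."""
    if strategy == "overwrite":
        # Discard the existing configuration entirely.
        return copy.deepcopy(override)
    if strategy == "update":
        # Shallow merge: colliding top-level keys are replaced wholesale.
        return {**base, **override}
    if strategy == "nested_update":
        # Deep merge: colliding nested dictionaries are merged key by key.
        return nested_update(base, override)
    raise ValueError(f"unknown strategy: {strategy}")


# Generic keys chosen for illustration; "options" is not a real builder argument.
existing_config = {"class_name": "ColumnDomainBuilder", "options": {"a": 1, "b": 2}}
override_config = {"options": {"b": 3}}

updated = reconcile(existing_config, override_config, "update")
deep_merged = reconcile(existing_config, override_config, "nested_update")
overwritten = reconcile(existing_config, override_config, "overwrite")
```

Under these assumptions, only nested_update preserves the untouched nested key “a”, while overwrite also drops “class_name”.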
-
static
_reconcile_rule_parameter_builder_configs
(rule: Rule, parameter_builder_configs: List[dict], reconciliation_strategy: ReconciliationStrategy = DEFAULT_RECONCILATION_DIRECTIVES.parameter_builder)¶ Rule “parameter builders” reconciliation involves combining the parameter builders, instantiated from Rule configuration (e.g., stored in a YAML file managed by the Profiler store), with the parameter builders overrides, possibly supplied as part of the candidate override rule configuration.
The reconciliation logic for “parameter builders” is of the “upsert” nature: A candidate override parameter builder configuration contributes to the parameter builders list of the rule if the corresponding parameter builder name does not exist in the list of instantiated parameter builders of the rule; otherwise, once instantiated, it replaces the configuration associated with the original parameter builder having the same name.
- Parameters
rule – Profiler “rule”, subject to parameter builder overrides
parameter_builder_configs – parameter builder configuration overrides, supplied in dictionary (configuration) form
reconciliation_strategy – one of update, nested_update, or overwrite ways of reconciling overwrites
- Returns
reconciled parameter builder configuration, returned in dictionary (configuration) form
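The “upsert” behavior amounts to a merge keyed on the builder name. A minimal sketch, assuming each configuration is a dictionary with a “name” key; `upsert_builder_configs` and the example entries are hypothetical and stand in for the library's own logic.

```python
from typing import Any, Dict, List


def upsert_builder_configs(
    existing: List[Dict[str, Any]],
    overrides: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Hypothetical helper: insert new builder configs, replace same-named ones."""
    # Dictionaries preserve insertion order, so existing builders keep their position.
    by_name = {config["name"]: config for config in existing}
    for override in overrides:
        # New name: appended to the end. Existing name: replaced in place.
        by_name[override["name"]] = override
    return list(by_name.values())


# Illustrative entries; the "note" key exists only to make the replacement visible.
existing_builders = [
    {"name": "row_count_range", "metric_name": "table.row_count"},
    {"name": "column_min", "metric_name": "column.min"},
]
override_builders = [
    {"name": "column_min", "metric_name": "column.min", "note": "overridden"},
    {"name": "column_max", "metric_name": "column.max"},
]

merged_builders = upsert_builder_configs(existing_builders, override_builders)
print([config["name"] for config in merged_builders])
```

The same name-keyed pattern applies to the expectation configuration builders described next.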
-
static
_reconcile_rule_expectation_configuration_builder_configs
(rule: Rule, expectation_configuration_builder_configs: List[dict], reconciliation_strategy: ReconciliationStrategy = DEFAULT_RECONCILATION_DIRECTIVES.expectation_configuration_builder)¶ Rule “expectation configuration builders” reconciliation involves combining the expectation configuration builders, instantiated from Rule configuration (e.g., stored in a YAML file managed by the Profiler store), with the expectation configuration builders overrides, possibly supplied as part of the candidate override rule configuration.
The reconciliation logic for “expectation configuration builders” is of the “upsert” nature: A candidate override expectation configuration builder configuration contributes to the expectation configuration builders list of the rule if the corresponding expectation configuration builder name does not exist in the list of instantiated expectation configuration builders of the rule; otherwise, once instantiated, it replaces the configuration associated with the original expectation configuration builder having the same name.
- Parameters
rule – Profiler “rule”, subject to expectations configuration builder overrides
expectation_configuration_builder_configs – expectation configuration builder configuration overrides, supplied in dictionary (configuration) form
reconciliation_strategy – one of update, nested_update, or overwrite ways of reconciling overwrites
- Returns
reconciled expectation configuration builder configuration, returned in dictionary (configuration) form
-
_get_rules_as_dict
(self)¶
-
_apply_runtime_environment
(self, variables: Optional[ParameterContainer] = None, rules: Optional[List[Rule]] = None, variables_directives_list: Optional[List[RuntimeEnvironmentVariablesDirectives]] = None, domain_type_directives_list: Optional[List[RuntimeEnvironmentDomainTypeDirectives]] = None)¶
- Parameters
variables – attribute name/value pairs, commonly-used in Builder objects, to modify using “runtime_environment”
rules – name/(configuration-dictionary) to modify using “runtime_environment”
variables_directives_list – additional/override runtime variables directives (modify “BaseRuleBasedProfiler”)
domain_type_directives_list – additional/override runtime domain directives (modify “BaseRuleBasedProfiler”)
-
static
_apply_variables_directives_runtime_environment
(rules: Optional[List[Rule]] = None, variables_directives_list: Optional[List[RuntimeEnvironmentVariablesDirectives]] = None)¶
- Parameters
rules – name/(configuration-dictionary) to modify using “runtime_environment”
variables_directives_list – additional/override runtime variables directives (modify “BaseRuleBasedProfiler”)
-
static
_apply_domain_type_directives_runtime_environment
(rules: Optional[List[Rule]] = None, domain_type_directives_list: Optional[List[RuntimeEnvironmentDomainTypeDirectives]] = None)¶
- Parameters
rules – name/(configuration-dictionary) to modify using “runtime_environment”
domain_type_directives_list – additional/override runtime domain directives (modify “BaseRuleBasedProfiler”)
-
static
_get_effective_domain_builder_property_value
(dest_property_value: Optional[Any] = None, source_property_value: Optional[Any] = None)¶
-
static
run_profiler
(data_context: AbstractDataContext, profiler_store: ProfilerStore, batch_list: Optional[List[Batch]] = None, batch_request: Optional[Union[BatchRequestBase, dict]] = None, name: Optional[str] = None, ge_cloud_id: Optional[str] = None, variables: Optional[dict] = None, rules: Optional[dict] = None)¶
-
static
run_profiler_on_data
(data_context: AbstractDataContext, profiler_store: ProfilerStore, batch_list: Optional[List[Batch]] = None, batch_request: Optional[Union[BatchRequestBase, dict]] = None, name: Optional[str] = None, ge_cloud_id: Optional[str] = None)¶
-
static
add_profiler
(config: RuleBasedProfilerConfig, data_context: AbstractDataContext, profiler_store: ProfilerStore)¶
-
static
_check_validity_of_batch_requests_in_config
(config: RuleBasedProfilerConfig)¶
-
static
get_profiler
(data_context: AbstractDataContext, profiler_store: ProfilerStore, name: Optional[str] = None, ge_cloud_id: Optional[str] = None)¶
-
static
delete_profiler
(profiler_store: ProfilerStore, name: Optional[str] = None, ge_cloud_id: Optional[str] = None)¶
-
static
list_profilers
(profiler_store: ProfilerStore, ge_cloud_mode: bool = False)¶
-
self_check
(self, pretty_print: bool = True)¶ Necessary to enable integration with AbstractDataContext.test_yaml_config.
- Parameters
pretty_print – flag to turn on verbose output
- Returns
Dictionary that contains RuleBasedProfiler state
-
property
config
(self)¶
-
property
name
(self)¶
-
property
config_version
(self)¶
-
property
variables
(self)¶
-
property
rules
(self)¶
-
property
rule_states
(self)¶
-
to_json_dict
(self)¶
-
__repr__
(self)¶ Return repr(self).
-
__str__
(self)¶ Return str(self).
-
-
class
great_expectations.rule_based_profiler.
RuleBasedProfiler
(name: str, config_version: float, variables: Optional[Dict[str, Any]] = None, rules: Optional[Dict[str, Dict[str, Any]]] = None, data_context: Optional[AbstractDataContext] = None, id: Optional[str] = None)¶ Bases:
great_expectations.rule_based_profiler.rule_based_profiler.BaseRuleBasedProfiler
RuleBasedProfiler object serves to profile, or automatically evaluate a set of rules, upon a given batch / multiple batches of data.
Rule-Based Profiler - How-to Guide
Use YAML to configure a flexible Profiler engine, which will then generate an ExpectationSuite for a data set
Maturity: Experimental
Details:
API Stability: Low (instantiation of Profiler and the signature of the run() method will change)
Implementation Completeness: Moderate (some augmentation and/or growth in capabilities is to be expected)
Unit Test Coverage: High (but not complete – additional unit tests will be added, commensurate with the upcoming new functionality)
Integration Infrastructure/Test Coverage: N/A -> TBD
Documentation Completeness: Moderate
Bug Risk: Low/Moderate
Expectation Completeness: Moderate
Domain Builders - How-to Guide
Use YAML to build domains for ExpectationConfiguration generator (table, column, semantic types, etc.)
Maturity: Experimental
Details:
API Stability: Moderate
Implementation Completeness: Moderate (additional DomainBuilder classes will be developed)
Unit Test Coverage: High (but not complete – additional unit tests will be added, commensurate with the upcoming new functionality)
Integration Infrastructure/Test Coverage: N/A -> TBD
Documentation Completeness: Moderate
Bug Risk: Low/Moderate
Expectation Completeness: Moderate
Parameter Builders - How-to Guide
Use YAML to configure single and multi batch based parameter computation modules for the use by ExpectationConfigurationBuilder classes
Maturity: Experimental
Details:
API Stability: Moderate
Implementation Completeness: Moderate (additional ParameterBuilder classes will be developed)
Unit Test Coverage: High (but not complete – additional unit tests will be added, commensurate with the upcoming new functionality)
Integration Infrastructure/Test Coverage: N/A -> TBD
Documentation Completeness: Moderate
Bug Risk: Low/Moderate
Expectation Completeness: Moderate
ExpectationConfiguration Builders - How-to Guide
Use YAML to configure ExpectationConfigurationBuilder classes, which emit lists of ExpectationConfiguration objects (e.g., as kwargs and meta arguments)
Maturity: Experimental
Details:
API Stability: Moderate
Implementation Completeness: Moderate (additional ExpectationConfigurationBuilder classes might be developed)
Unit Test Coverage: High (but not complete – additional unit tests will be added, commensurate with the upcoming new functionality)
Integration Infrastructure/Test Coverage: N/A -> TBD
Documentation Completeness: Moderate
Bug Risk: Low/Moderate
Expectation Completeness: Moderate