great_expectations.rule_based_profiler.rule_based_profiler

Module Contents

Classes

ReconciliationStrategy()

Generic enumeration.

ReconciliationDirectives()

A convenience class for migrating away from untyped dictionaries to stronger typed objects.

BaseRuleBasedProfiler(profiler_config: RuleBasedProfilerConfig, data_context: Optional['DataContext'] = None)

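The BaseRuleBasedProfiler class is initialized from a typed RuleBasedProfilerConfig object and contains all functionality in the form of interface methods (which subclasses can override) and their reference implementations.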

RuleBasedProfiler(name: str, config_version: float, variables: Optional[Dict[str, Any]] = None, rules: Optional[Dict[str, Dict[str, Any]]] = None, data_context: Optional['DataContext'] = None)

The RuleBasedProfiler object profiles data by automatically evaluating a set of rules against one or more batches of data.

Functions

_validate_builder_override_config(builder_config: dict)

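To ensure successful instantiation of custom builder classes using “instantiate_class_from_config()”, candidate builder override configurations are required to supply both “class_name” and “module_name” attributes.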

great_expectations.rule_based_profiler.rule_based_profiler.logger
great_expectations.rule_based_profiler.rule_based_profiler._validate_builder_override_config(builder_config: dict)

To ensure successful instantiation of custom builder classes using “instantiate_class_from_config()”, candidate builder override configurations are required to supply both “class_name” and “module_name” attributes.

Parameters

builder_config – candidate builder override configuration

Raises

ProfilerConfigurationError
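
The check described above can be sketched as follows. This is a minimal illustration, assuming the function only verifies that both keys are present and non-empty. The exception class is a local stand-in for the library's ProfilerConfigurationError, and the function name validate_builder_override_config_sketch is hypothetical; the actual implementation may differ.

```python
class ProfilerConfigurationError(Exception):
    """Local stand-in for the library's exception of the same name (assumption)."""


def validate_builder_override_config_sketch(builder_config: dict) -> None:
    """Raise if a candidate builder override lacks "class_name" or "module_name"."""
    required_keys = ("class_name", "module_name")
    # Treat a key that is absent or empty as missing.
    missing = [key for key in required_keys if not builder_config.get(key)]
    if missing:
        raise ProfilerConfigurationError(
            f"Builder override configuration is missing required keys: {missing}"
        )
```

A configuration such as {"class_name": "MyBuilder", "module_name": "my_pkg.builders"} passes silently, while one without "module_name" raises the error.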

class great_expectations.rule_based_profiler.rule_based_profiler.ReconciliationStrategy

Bases: enum.Enum

Generic enumeration.

Derive from this class to define new enumerations.

NESTED_UPDATE = nested_update
REPLACE = replace
UPDATE = update
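
The three members can be illustrated on plain dictionaries. The sketch below is not the library's code: Strategy is a stand-in enum that mirrors the member names listed above (string values are an assumption), and nested_merge and reconcile are hypothetical helpers showing one plausible reading of each strategy.

```python
import copy
from enum import Enum
from typing import Any, Dict


class Strategy(Enum):
    """Stand-in mirroring the member names listed above (values assumed to be strings)."""

    NESTED_UPDATE = "nested_update"
    REPLACE = "replace"
    UPDATE = "update"


def nested_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively merge override into a copy of base (hypothetical helper)."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = nested_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def reconcile(base: Dict[str, Any], override: Dict[str, Any], strategy: Strategy) -> Dict[str, Any]:
    """Combine base and override according to the chosen strategy."""
    if strategy is Strategy.REPLACE:
        # The override is taken wholesale; nothing from base survives.
        return copy.deepcopy(override)
    if strategy is Strategy.UPDATE:
        # Shallow merge: top-level keys of the override win on collision.
        return {**copy.deepcopy(base), **copy.deepcopy(override)}
    if strategy is Strategy.NESTED_UPDATE:
        # Deep merge: collisions are resolved key by key at every nesting level.
        return nested_merge(base, override)
    raise ValueError(f"Unsupported strategy: {strategy}")
```

With base {"a": {"x": 1, "y": 2}, "b": 1} and override {"a": {"x": 10}}, the shallow update drops "y" while the nested update keeps it.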
class great_expectations.rule_based_profiler.rule_based_profiler.ReconciliationDirectives

Bases: great_expectations.types.SerializableDictDot

A convenience class for migrating away from untyped dictionaries to stronger typed objects.

Can be instantiated with arguments:

    my_A = MyClassA(
        foo="a string",
        bar=1,
    )

Can be instantiated from a dictionary:

    my_A = MyClassA(
        **{
            "foo": "a string",
            "bar": 1,
        }
    )

Can be accessed using both dictionary and dot notation:

    my_A.foo == "a string"
    my_A.bar == 1

    my_A["foo"] == "a string"
    my_A["bar"] == 1

Pairs nicely with @dataclass:

    @dataclass()
    class MyClassA(DictDot):
        foo: str
        bar: int

Can be made immutable:

    @dataclass(frozen=True)
    class MyClassA(DictDot):
        foo: str
        bar: int

For more examples of usage, please see test_dataclass_serializable_dot_dict_pattern.py in the tests folder.
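
The pattern described in this docstring can be tried without the library. The sketch below uses a hypothetical DictDotSketch base class as a stand-in for DictDot; it only supports item access and conversion to a dictionary, which is an assumption about the minimum needed to illustrate the idea.

```python
from dataclasses import asdict, dataclass
from typing import Any


class DictDotSketch:
    """Hypothetical stand-in for the library's DictDot base class."""

    def __getitem__(self, key: str) -> Any:
        # Dictionary-style access delegates to attribute access.
        return getattr(self, key)

    def to_dict(self) -> dict:
        # Works because subclasses are expected to be dataclasses.
        return asdict(self)


@dataclass(frozen=True)
class MyClassA(DictDotSketch):
    foo: str
    bar: int


# Instantiate from a dictionary; keyword arguments work the same way.
my_a = MyClassA(**{"foo": "a string", "bar": 1})
```

Here my_a.foo and my_a["foo"] return the same value, and assigning to my_a.foo raises an error because the dataclass is frozen.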

variables :ReconciliationStrategy
domain_builder :ReconciliationStrategy
parameter_builder :ReconciliationStrategy
expectation_configuration_builder :ReconciliationStrategy
to_dict(self)
to_json_dict(self)

# TODO: <Alex>2/4/2022</Alex> A reference implementation can be provided, once circular import dependencies, caused by relative locations of the “great_expectations/types/__init__.py” and “great_expectations/core/util.py” modules are resolved.

class great_expectations.rule_based_profiler.rule_based_profiler.BaseRuleBasedProfiler(profiler_config: RuleBasedProfilerConfig, data_context: Optional['DataContext'] = None)

Bases: great_expectations.core.config_peer.ConfigPeer

The BaseRuleBasedProfiler class is initialized from a typed RuleBasedProfilerConfig object and contains all functionality in the form of interface methods (which subclasses can override) and their reference implementations.

DEFAULT_RECONCILATION_DIRECTIVES :ReconciliationDirectives
EXPECTATION_SUCCESS_KEYS :Set[str]
_init_profiler_rules(self, rules: Dict[str, Dict[str, Any]])
_init_rule(self, rule_name: str, rule_config: Dict[str, Any])
static _init_rule_domain_builder(domain_builder_config: dict, data_context: Optional['DataContext'] = None)
static _init_rule_parameter_builders(parameter_builder_configs: Optional[List[dict]] = None, data_context: Optional['DataContext'] = None)
static _init_parameter_builder(parameter_builder_config: dict, data_context: Optional['DataContext'] = None)
static _init_rule_expectation_configuration_builders(expectation_configuration_builder_configs: List[dict])
static _init_expectation_configuration_builder(expectation_configuration_builder_config: dict)
run(self, variables: Optional[Dict[str, Any]] = None, rules: Optional[Dict[str, Dict[str, Any]]] = None, reconciliation_directives: ReconciliationDirectives = DEFAULT_RECONCILATION_DIRECTIVES, expectation_suite_name: Optional[str] = None, include_citation: bool = True)

Parameters
  • variables – attribute name/value pairs (overrides)

  • rules – name/(configuration-dictionary) (overrides)

  • reconciliation_directives – directives for how each rule component should be overwritten

  • expectation_suite_name – a name for the returned ExpectationSuite

  • include_citation – whether or not to include the Profiler config in the metadata of the ExpectationSuite produced by the Profiler

Returns

Set of rule evaluation results in the form of an ExpectationSuite

reconcile_profiler_variables(self, variables: Optional[Dict[str, Any]] = None, reconciliation_strategy: ReconciliationStrategy = DEFAULT_RECONCILATION_DIRECTIVES.variables)

Profiler “variables” reconciliation involves combining the variables, instantiated from Profiler configuration (e.g., stored in a YAML file managed by the Profiler store), with the variables overrides, provided at run time.

The reconciliation logic for “variables” is of the “replace” nature: An override value complements the original on key “miss”, and replaces the original on key “hit” (or “collision”), because “variables” is a unique member.

Parameters
  • variables – variables overrides, supplied in dictionary (configuration) form

  • reconciliation_strategy – the strategy used to reconcile overrides: one of update, nested_update, or replace

Returns

reconciled variables in their canonical ParameterContainer object form

reconcile_profiler_rules(self, rules: Optional[Dict[str, Dict[str, Any]]] = None, reconciliation_directives: ReconciliationDirectives = DEFAULT_RECONCILATION_DIRECTIVES)

Profiler “rules” reconciliation involves combining the rules, instantiated from Profiler configuration (e.g., stored in a YAML file managed by the Profiler store), with the rules overrides, provided at run time.

The reconciliation logic for “rules” is of the “procedural” nature: (1) Combine every rule override configuration with any instantiated rule into a reconciled configuration (2) Re-instantiate Rule objects from the reconciled rule configurations

Parameters
  • rules – rules overrides, supplied in dictionary (configuration) form for each rule name as the key

  • reconciliation_directives – directives for how each rule component should be overwritten

Returns

reconciled rules in their canonical List[Rule] object form
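
The two procedural steps described for “rules” reconciliation can be sketched on simplified objects. RuleSketch is a hypothetical stand-in for the library's Rule class, and the shallow merge in step 1 is a simplifying assumption; the real method delegates to per-component reconciliation governed by the reconciliation directives.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class RuleSketch:
    """Hypothetical stand-in for the library's Rule class."""

    name: str
    config: Dict[str, Any] = field(default_factory=dict)


def reconcile_rules_sketch(
    existing: Dict[str, RuleSketch],
    overrides: Optional[Dict[str, Dict[str, Any]]] = None,
) -> List[RuleSketch]:
    # Step 1: combine every override configuration with the matching
    # instantiated rule (if any) into a reconciled configuration.
    reconciled = {name: dict(rule.config) for name, rule in existing.items()}
    for name, override in (overrides or {}).items():
        reconciled[name] = {**reconciled.get(name, {}), **override}
    # Step 2: re-instantiate rule objects from the reconciled configurations.
    return [RuleSketch(name=name, config=config) for name, config in reconciled.items()]
```

Rules without an override pass through unchanged, and an override whose name matches no existing rule produces a new rule.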

static _reconcile_rule_config(existing_rules: Dict[str, Rule], rule_name: str, rule_config: dict, reconciliation_directives: ReconciliationDirectives = DEFAULT_RECONCILATION_DIRECTIVES)

A “rule configuration” reconciliation is the process of combining the configuration of a single candidate override rule with at most one configuration corresponding to the list of rules instantiated from Profiler configuration (e.g., stored in a YAML file managed by the Profiler store).

The reconciliation logic for “rule configuration” employs the “by construction” principle:

(1) Find a common configuration between the domain builder configuration, possibly supplied as part of the candidate override rule configuration, and the domain builder configuration of an instantiated rule

(2) Find common configurations between parameter builder configurations, possibly supplied as part of the candidate override rule configuration, and the parameter builder configurations of an instantiated rule

(3) Find common configurations between expectation configuration builder configurations, possibly supplied as part of the candidate override rule configuration, and the expectation configuration builder configurations of an instantiated rule

(4) Construct the reconciled rule configuration dictionary using the formal rule properties (“domain_builder”, “parameter_builders”, and “expectation_configuration_builders”) as keys and their reconciled configuration dictionaries as values

To ensure successful instantiation of custom builder classes using “instantiate_class_from_config()”, candidate builder override configurations are required to supply both “class_name” and “module_name” attributes.

Parameters
  • existing_rules – all currently instantiated rules represented as a dictionary, keyed by rule name

  • rule_name – name of the override rule candidate

  • rule_config – configuration of an override rule candidate, supplied in dictionary (configuration) form

  • reconciliation_directives – directives for how each rule component should be overwritten

Returns

reconciled rule configuration, returned in dictionary (configuration) form
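
The “by construction” principle for a single rule can be sketched on plain dictionaries. In this illustration existing rules are represented by their configuration dictionaries rather than Rule objects, and prefer_override is a deliberately trivial placeholder for the per-component reconciliation of steps (1) to (3); both are assumptions made to keep the example short.

```python
from typing import Any, Dict

# The formal rule properties named in the description.
RULE_PROPERTIES = (
    "domain_builder",
    "parameter_builders",
    "expectation_configuration_builders",
)


def prefer_override(existing: Any, override: Any) -> Any:
    """Placeholder reconciler: take the override when supplied, else keep the existing value."""
    return existing if override is None else override


def reconcile_rule_config_sketch(
    existing_rule_configs: Dict[str, Dict[str, Any]],
    rule_name: str,
    rule_config: Dict[str, Any],
) -> Dict[str, Any]:
    # At most one existing configuration corresponds to the candidate rule name.
    existing = existing_rule_configs.get(rule_name, {})
    # Steps (1)-(3): reconcile each formal rule property independently.
    # Step (4): key the reconciled values by the formal property names.
    return {
        key: prefer_override(existing.get(key), rule_config.get(key))
        for key in RULE_PROPERTIES
    }
```

The result always carries exactly the three formal keys; a component that neither the existing rule nor the override supplies comes back as None in this sketch.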

static _reconcile_rule_domain_builder_config(domain_builder: DomainBuilder, domain_builder_config: dict, reconciliation_strategy: ReconciliationStrategy = DEFAULT_RECONCILATION_DIRECTIVES.domain_builder)

Rule “domain builder” reconciliation involves combining the domain builder, instantiated from Rule configuration (e.g., stored in a YAML file managed by the Profiler store), with the domain builder override, possibly supplied as part of the candidate override rule configuration.

The reconciliation logic for “domain builder” is of the “replace” nature: An override value complements the original on key “miss”, and replaces the original on key “hit” (or “collision”), because “domain builder” is a unique member for a rule.

Parameters
  • domain_builder – existing domain builder of a rule

  • domain_builder_config – domain builder configuration override, supplied in dictionary (configuration) form

  • reconciliation_strategy – the strategy used to reconcile overrides: one of update, nested_update, or replace

Returns

reconciled domain builder configuration, returned in dictionary (configuration) form

static _reconcile_rule_parameter_builder_configs(rule: Rule, parameter_builder_configs: List[dict], reconciliation_strategy: ReconciliationStrategy = DEFAULT_RECONCILATION_DIRECTIVES.parameter_builder)

Rule “parameter builders” reconciliation involves combining the parameter builders, instantiated from Rule configuration (e.g., stored in a YAML file managed by the Profiler store), with the parameter builders overrides, possibly supplied as part of the candidate override rule configuration.

The reconciliation logic for “parameter builders” is of the “upsert” nature: A candidate override parameter builder configuration contributes to the parameter builders list of the rule if the corresponding parameter builder name does not exist in the list of instantiated parameter builders of the rule; otherwise, once instantiated, it replaces the configuration associated with the original parameter builder having the same name.

Parameters
  • rule – Profiler “rule”, subject to parameter builder overrides

  • parameter_builder_configs – parameter builder configuration overrides, supplied in dictionary (configuration) form

  • reconciliation_strategy – the strategy used to reconcile overrides: one of update, nested_update, or replace

Returns

reconciled parameter builder configuration, returned in dictionary (configuration) form
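
The “upsert” behavior can be sketched for a list of configuration dictionaries. This is an illustration only: it assumes each configuration carries a "name" key, and it replaces a matching configuration wholesale, whereas the actual method combines configurations according to the chosen reconciliation strategy. The function name upsert_builder_configs is hypothetical.

```python
from typing import Any, Dict, List


def upsert_builder_configs(
    existing: List[Dict[str, Any]],
    overrides: List[Dict[str, Any]],
    key: str = "name",
) -> List[Dict[str, Any]]:
    """Insert override configs with new names; replace configs whose names already exist."""
    # Dictionaries preserve insertion order, so existing builders keep their
    # positions and new builders are appended at the end.
    by_name = {config[key]: config for config in existing}
    for override in overrides:
        by_name[override[key]] = override
    return list(by_name.values())
```

Given existing builders named "a" and "b" and overrides named "b" and "c", the result contains "a" unchanged, the overriding "b" in its original position, and "c" appended.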

static _reconcile_rule_expectation_configuration_builder_configs(rule: Rule, expectation_configuration_builder_configs: List[dict], reconciliation_strategy: ReconciliationStrategy = DEFAULT_RECONCILATION_DIRECTIVES.expectation_configuration_builder)

Rule “expectation configuration builders” reconciliation involves combining the expectation configuration builders, instantiated from Rule configuration (e.g., stored in a YAML file managed by the Profiler store), with the expectation configuration builders overrides, possibly supplied as part of the candidate override rule configuration.

The reconciliation logic for “expectation configuration builders” is of the “upsert” nature: A candidate override expectation configuration builder configuration contributes to the expectation configuration builders list of the rule if the corresponding expectation configuration builder name does not exist in the list of instantiated expectation configuration builders of the rule; otherwise, once instantiated, it replaces the configuration associated with the original expectation configuration builder having the same name.

Parameters
  • rule – Profiler “rule”, subject to expectation configuration builder overrides

  • expectation_configuration_builder_configs – expectation configuration builder configuration overrides, supplied in dictionary (configuration) form

  • reconciliation_strategy – the strategy used to reconcile overrides: one of update, nested_update, or replace

Returns

reconciled expectation configuration builder configuration, returned in dictionary (configuration) form

_get_rules_as_dict(self)
self_check(self, pretty_print=True)

Necessary to enable integration with DataContext.test_yaml_config.

Parameters

pretty_print – flag to turn on verbose output

Returns

Dictionary that contains RuleBasedProfiler state

property config(self)
property name(self)
property variables(self)
property rules(self)
to_json_dict(self)
__repr__(self)

Return repr(self).

__str__(self)

Return str(self).

class great_expectations.rule_based_profiler.rule_based_profiler.RuleBasedProfiler(name: str, config_version: float, variables: Optional[Dict[str, Any]] = None, rules: Optional[Dict[str, Dict[str, Any]]] = None, data_context: Optional['DataContext'] = None)

Bases: great_expectations.rule_based_profiler.rule_based_profiler.BaseRuleBasedProfiler

The RuleBasedProfiler object profiles data by automatically evaluating a set of rules against one or more batches of data.

Feature Maturity

Rule-Based Profiler - How-to Guide
Use YAML to configure a flexible Profiler engine, which will then generate an ExpectationSuite for a data set
Maturity: Experimental
Details:
API Stability: Low (instantiation of Profiler and the signature of the run() method will change)
Implementation Completeness: Moderate (some augmentation and/or growth in capabilities is to be expected)
Unit Test Coverage: High (but not complete – additional unit tests will be added, commensurate with the upcoming new functionality)
Integration Infrastructure/Test Coverage: N/A -> TBD
Documentation Completeness: Moderate
Bug Risk: Low/Moderate
Expectation Completeness: Moderate
Domain Builders - How-to Guide
Use YAML to build domains for ExpectationConfiguration generator (table, column, semantic types, etc.)
Maturity: Experimental
Details:
API Stability: Moderate
Implementation Completeness: Moderate (additional DomainBuilder classes will be developed)
Unit Test Coverage: High (but not complete – additional unit tests will be added, commensurate with the upcoming new functionality)
Integration Infrastructure/Test Coverage: N/A -> TBD
Documentation Completeness: Moderate
Bug Risk: Low/Moderate
Expectation Completeness: Moderate
Parameter Builders - How-to Guide
Use YAML to configure single and multi batch based parameter computation modules for the use by ExpectationConfigurationBuilder classes
Maturity: Experimental
Details:
API Stability: Moderate
Implementation Completeness: Moderate (additional ParameterBuilder classes will be developed)
Unit Test Coverage: High (but not complete – additional unit tests will be added, commensurate with the upcoming new functionality)
Integration Infrastructure/Test Coverage: N/A -> TBD
Documentation Completeness: Moderate
Bug Risk: Low/Moderate
Expectation Completeness: Moderate
ExpectationConfiguration Builders - How-to Guide
Use YAML to configure ExpectationConfigurationBuilder classes, which emit lists of ExpectationConfiguration objects (e.g., as kwargs and meta arguments)
Maturity: Experimental
Details:
API Stability: Moderate
Implementation Completeness: Moderate (additional ExpectationConfigurationBuilder classes might be developed)
Unit Test Coverage: High (but not complete – additional unit tests will be added, commensurate with the upcoming new functionality)
Integration Infrastructure/Test Coverage: N/A -> TBD
Documentation Completeness: Moderate
Bug Risk: Low/Moderate
Expectation Completeness: Moderate
static run_profiler(data_context: DataContext, profiler_store: ProfilerStore, name: Optional[str] = None, ge_cloud_id: Optional[str] = None, variables: Optional[dict] = None, rules: Optional[dict] = None, expectation_suite_name: Optional[str] = None, include_citation: bool = True)
static run_profiler_on_data(data_context: DataContext, profiler_store: ProfilerStore, batch_request: Union[BatchRequest, RuntimeBatchRequest, dict], name: Optional[str] = None, ge_cloud_id: Optional[str] = None, expectation_suite_name: Optional[str] = None, include_citation: bool = True)
_generate_rule_overrides_from_batch_request(self, batch_request: Union[BatchRequest, RuntimeBatchRequest, dict])

Iterates through the profiler’s builder attributes and generates a set of Rules that contain overrides from the input batch request. This only applies to ParameterBuilder and any DomainBuilder with a COLUMN MetricDomainType.

Note that all batches corresponding to the specified batch_request are passed to the ParameterBuilder objects. If not used carefully, bias may creep into the resulting estimates computed by these ParameterBuilder objects.

Users of this override should ensure that the batch request either has no notion of a “current/active” batch, or that such a batch is excluded.

Parameters

batch_request – Data used to override builder attributes

Returns

The dictionary representation of the Rules used as runtime arguments to run()
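
The idea of generating rule overrides from a batch request can be sketched on plain dictionaries. Everything in this example is an assumption made for illustration: rules are represented as configuration dictionaries, a hypothetical "domain_type" key with the value "column" marks a domain builder of the COLUMN type, and the batch request is a plain dictionary. The actual method operates on the profiler's builder objects.

```python
import copy
from typing import Any, Dict


def inject_batch_request_sketch(
    rule_configs: Dict[str, Dict[str, Any]],
    batch_request: Dict[str, Any],
) -> Dict[str, Dict[str, Any]]:
    """Return copies of the rule configs with batch_request set on eligible builders."""
    overrides = copy.deepcopy(rule_configs)
    for rule_config in overrides.values():
        domain_builder = rule_config.get("domain_builder", {})
        # Assumption: only column-type domain builders receive the override.
        if domain_builder.get("domain_type") == "column":
            domain_builder["batch_request"] = copy.deepcopy(batch_request)
        # Every parameter builder receives the override.
        for parameter_builder in rule_config.get("parameter_builders", []):
            parameter_builder["batch_request"] = copy.deepcopy(batch_request)
        # Expectation configuration builders are left untouched.
    return overrides
```

The returned dictionary has the same shape as the rules argument accepted by run(), and the input configurations are not modified.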

static add_profiler(config: RuleBasedProfilerConfig, data_context: DataContext, profiler_store: ProfilerStore, ge_cloud_id: Optional[str] = None)
static _check_validity_of_batch_requests_in_config(config: RuleBasedProfilerConfig)
static get_profiler(data_context: DataContext, profiler_store: ProfilerStore, name: Optional[str] = None, ge_cloud_id: Optional[str] = None)
static delete_profiler(profiler_store: ProfilerStore, name: Optional[str] = None, ge_cloud_id: Optional[str] = None)
static list_profilers(profiler_store: ProfilerStore, ge_cloud_mode: bool)