great_expectations.rule_based_profiler.domain_builder

Package Contents

Classes

DomainBuilder(batch_list: Optional[List[Batch]] = None, batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest, dict]] = None, data_context: Optional[‘DataContext’] = None)

A DomainBuilder provides methods to get domains based on one or more batches of data.

CategoricalColumnDomainBuilder(batch_list: Optional[List[Batch]] = None, batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest, dict]] = None, data_context: Optional[‘DataContext’] = None, limit_mode: Optional[Union[CardinalityLimitMode, str]] = None, max_unique_values: Optional[int] = None, max_proportion_unique: Optional[int] = None, exclude_columns: Optional[List[str]] = None)

This DomainBuilder uses column cardinality to identify domains.

ColumnDomainBuilder(batch_list: Optional[List[Batch]] = None, batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest, dict]] = None, data_context: Optional[‘DataContext’] = None, column_names: Optional[List[str]] = None)

A DomainBuilder provides methods to get domains based on one or more batches of data.

SimpleColumnSuffixDomainBuilder(batch_list: Optional[List[Batch]] = None, batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest, dict]] = None, data_context: Optional[‘DataContext’] = None, column_name_suffixes: Optional[List[str]] = None)

This DomainBuilder uses a column suffix to identify domains.

SimpleSemanticTypeColumnDomainBuilder(batch_list: Optional[List[Batch]] = None, batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest, dict]] = None, data_context: Optional[‘DataContext’] = None, semantic_types: Optional[Union[str, SemanticDomainTypes, List[Union[str, SemanticDomainTypes]]]] = None)

This DomainBuilder utilizes a “best-effort” semantic interpretation of (“storage”) columns of a table.

TableDomainBuilder(data_context: DataContext, batch_list: Optional[List[Batch]] = None, batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest, dict]] = None)

A DomainBuilder provides methods to get domains based on one or more batches of data.

class great_expectations.rule_based_profiler.domain_builder.DomainBuilder(batch_list: Optional[List[Batch]] = None, batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest, dict]] = None, data_context: Optional['DataContext'] = None)

Bases: great_expectations.rule_based_profiler.types.Builder, abc.ABC

A DomainBuilder provides methods to get domains based on one or more batches of data.

get_domains(self, variables: Optional[ParameterContainer] = None)

Note: Please do not overwrite the public “get_domains()” method. If a child class needs to check parameters, then please do so in its implementation of the (private) “_get_domains()” method, or in a utility method.

property domain_type(self)
property batch_request(self)
property batch_list(self)
property data_context(self)
abstract _get_domains(self, variables: Optional[ParameterContainer] = None)

_get_domains is the primary workhorse for the DomainBuilder

get_validator(self, variables: Optional[ParameterContainer] = None)
get_batch_ids(self, variables: Optional[ParameterContainer] = None)
get_batch_id(self, variables: Optional[ParameterContainer] = None)
class great_expectations.rule_based_profiler.domain_builder.CategoricalColumnDomainBuilder(batch_list: Optional[List[Batch]] = None, batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest, dict]] = None, data_context: Optional['DataContext'] = None, limit_mode: Optional[Union[CardinalityLimitMode, str]] = None, max_unique_values: Optional[int] = None, max_proportion_unique: Optional[int] = None, exclude_columns: Optional[List[str]] = None)

Bases: great_expectations.rule_based_profiler.domain_builder.DomainBuilder

This DomainBuilder uses column cardinality to identify domains.

property domain_type(self)
property exclude_columns(self)
_get_domains(self, variables: Optional[ParameterContainer] = None)

Return domains matching the selected limit_mode.

Parameters

variables – Optional variables to substitute when evaluating.

Returns

List of domains that match the desired cardinality.

_get_table_column_names_from_active_batch(self, validator: Validator, batch_id: str)

Retrieve table column names from the active batch.

Parameters

validator – Validator to use in retrieving columns.

Returns

List of column names from the active batch.

_generate_metric_configurations_to_check_cardinality(self, batch_ids: List[str], column_names: List[str])

Generate metric configurations used to compute metrics for checking cardinality.

Parameters
  • batch_ids – List of batch_ids used to create metric configurations.

  • column_names – List of column names used to create metric configurations.

Returns

List[MetricConfiguration]},…]

Return type

List of dicts of the form [{column_name

_columns_meeting_cardinality_limit(self, validator: Validator, table_column_names: List[str], metrics_for_cardinality_check: List[Dict[str, List[MetricConfiguration]]])

Compute cardinality and return column names meeting cardinality limit.

Parameters
  • validator – Validator used to compute column cardinality.

  • table_column_names – column names to verify cardinality.

  • metrics_for_cardinality_check – metric configurations used to compute cardinality.

Returns

List of column names meeting cardinality.

class great_expectations.rule_based_profiler.domain_builder.ColumnDomainBuilder(batch_list: Optional[List[Batch]] = None, batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest, dict]] = None, data_context: Optional['DataContext'] = None, column_names: Optional[List[str]] = None)

Bases: great_expectations.rule_based_profiler.domain_builder.DomainBuilder

A DomainBuilder provides methods to get domains based on one or more batches of data.

property domain_type(self)
property column_names(self)
_get_domains(self, variables: Optional[ParameterContainer] = None)

Obtains and returns domains for all columns of a table (or for configured columns, if they exist in the table).

class great_expectations.rule_based_profiler.domain_builder.SimpleColumnSuffixDomainBuilder(batch_list: Optional[List[Batch]] = None, batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest, dict]] = None, data_context: Optional['DataContext'] = None, column_name_suffixes: Optional[List[str]] = None)

Bases: great_expectations.rule_based_profiler.domain_builder.DomainBuilder

This DomainBuilder uses a column suffix to identify domains.

property domain_type(self)
property column_name_suffixes(self)
_get_domains(self, variables: Optional[ParameterContainer] = None)

Find the column suffix for each column and return all domains matching the specified suffix.

class great_expectations.rule_based_profiler.domain_builder.SimpleSemanticTypeColumnDomainBuilder(batch_list: Optional[List[Batch]] = None, batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest, dict]] = None, data_context: Optional['DataContext'] = None, semantic_types: Optional[Union[str, SemanticDomainTypes, List[Union[str, SemanticDomainTypes]]]] = None)

Bases: great_expectations.rule_based_profiler.domain_builder.DomainBuilder

This DomainBuilder utilizes a “best-effort” semantic interpretation of (“storage”) columns of a table.

property domain_type(self)
property semantic_types(self)
_get_domains(self, variables: Optional[ParameterContainer] = None)

Find the semantic column type for each column and return all domains matching the specified type or types.

infer_semantic_domain_type_from_table_column_type(self, column_types_dict_list: List[Dict[str, Any]], column_name: str)
class great_expectations.rule_based_profiler.domain_builder.TableDomainBuilder(data_context: DataContext, batch_list: Optional[List[Batch]] = None, batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest, dict]] = None)

Bases: great_expectations.rule_based_profiler.domain_builder.DomainBuilder

A DomainBuilder provides methods to get domains based on one or more batches of data.

property domain_type(self)
_get_domains(self, variables: Optional[ParameterContainer] = None)

_get_domains is the primary workhorse for the DomainBuilder