great_expectations.expectations.metrics.multicolumn_map_metrics.compound_columns_unique

Module Contents

Classes

CompoundColumnsUnique()

While the support for “PandasExecutionEngine” and “SparkDFExecutionEngine” is accomplished using a compact implementation, the support for “SqlAlchemyExecutionEngine” handles the “map” and “condition” parts separately.

class great_expectations.expectations.metrics.multicolumn_map_metrics.compound_columns_unique.CompoundColumnsUnique

Bases: great_expectations.expectations.metrics.map_metric_provider.MulticolumnMapMetricProvider

The support for “PandasExecutionEngine” and “SparkDFExecutionEngine” is accomplished using a compact implementation that combines the “map” and “condition” parts in a single step. The support for “SqlAlchemyExecutionEngine” is more detailed: its “map” and “condition” parts are handled separately, with the “condition” part relying on the “map” part as a metric dependency.

function_metric_name = compound_columns.count
condition_metric_name = compound_columns.unique
condition_domain_keys = ['batch_id', 'table', 'column_list', 'row_condition', 'condition_parser', 'ignore_row_if']
_pandas(cls, column_list, **kwargs)
_sqlalchemy_function(self, column_list, **kwargs)

Computes the “map” between the specified “column_list” (treated as a group, so as to model the “compound” aspect) and the number of occurrences of each distinct combination of the values of “column_list” among all rows of the table. In this context, “compound” means that the specified columns must be unique together (e.g., as a multi-column primary key). For example, suppose that all three columns (“A”, “B”, and “C”) of the table below are included in the “compound” columns list (i.e., column_list = [“A”, “B”, “C”]):

A  B  C  _num_rows
1  1  2  2
1  2  3  1
1  1  2  2
2  2  2  1
3  2  3  1

The fourth column, “_num_rows”, holds the value of the “map” function: the number of rows in which each group occurs.
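The “map” step above can be sketched in plain Python. This is an illustrative reimplementation using the standard library, not the metric provider's actual code; the row data is taken from the example table:

```python
from collections import Counter

# Rows of the example table; each tuple holds one row's ("A", "B", "C") values.
rows = [
    (1, 1, 2),
    (1, 2, 3),
    (1, 1, 2),
    (2, 2, 2),
    (3, 2, 3),
]

# "Map" step: count how many rows each compound-column combination occurs in.
group_counts = Counter(rows)

# Annotate every row with its group's count, mirroring the "_num_rows" column.
annotated = [(*row, group_counts[row]) for row in rows]
for row in annotated:
    print(row)
# The first and third rows share the combination (1, 1, 2), so their
# "_num_rows" value is 2; every other combination occurs exactly once.
```

A row then satisfies compound uniqueness precisely when its annotated count equals 1.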

_sqlalchemy_condition(cls, column_list, **kwargs)

Retrieves the value of the specified “map” metric dependency as the “FromClause” object “compound_columns_count_query” and extracts from it, using the supported SQLAlchemy column-access method, the “_num_rows” column. The uniqueness of the “compound” columns (as a group) is expressed by the returned “BinaryExpression” “row_wise_cond”.

Importantly, since “compound_columns_count_query” is a “FromClause” object that already incorporates all columns of the original table, no additional “FromClause” objects (“select_from”) are needed to augment this “condition” metric. Apart from boolean operations, column access, filtering arguments, and limits on the size of the result set, this “row_wise_cond”, which serves as the main component of the unexpected-condition logic, carries the entire object hierarchy with it, making any encapsulating query ready for execution against the database engine.
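The map-then-condition pattern described above can be illustrated in raw SQL. The sketch below uses the standard-library sqlite3 module with a window function (available in SQLite 3.25+) rather than the actual SQLAlchemy constructs; the table name “t” and the query shapes are illustrative assumptions, not the library's generated SQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (A INTEGER, B INTEGER, C INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 1, 2), (1, 2, 3), (1, 1, 2), (2, 2, 2), (3, 2, 3)],
)

# "Map" part: annotate each row with the size of its (A, B, C) group,
# analogous to the "compound_columns.count" metric.
map_query = """
    SELECT A, B, C, COUNT(*) OVER (PARTITION BY A, B, C) AS _num_rows
    FROM t
"""

# "Condition" part: a row is unexpected when its group occurs more than
# once, analogous to the "row_wise_cond" built by _sqlalchemy_condition.
unexpected = conn.execute(
    f"SELECT A, B, C FROM ({map_query}) WHERE _num_rows > 1"
).fetchall()
print(unexpected)  # the two (1, 1, 2) rows violate compound uniqueness
```

Because the inner query already selects from the full table, the outer “condition” query needs no further FROM clauses, mirroring the point made above about “select_from”.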

_spark(cls, column_list, **kwargs)
classmethod _get_evaluation_dependencies(cls, metric: MetricConfiguration, configuration: Optional[ExpectationConfiguration] = None, execution_engine: Optional[ExecutionEngine] = None, runtime_configuration: Optional[dict] = None)

Returns a dictionary mapping the given metric names to their corresponding configurations, specifying the metric types and their respective domains.