great_expectations.expectations.metrics.column_aggregate_metrics.column_histogram

Module Contents

Classes

ColumnHistogram()

Base class for all metric providers.

great_expectations.expectations.metrics.column_aggregate_metrics.column_histogram.logger
class great_expectations.expectations.metrics.column_aggregate_metrics.column_histogram.ColumnHistogram

Bases: great_expectations.expectations.metrics.column_aggregate_metric_provider.ColumnAggregateMetricProvider

Base class for all metric providers.

MetricProvider classes must have the following attributes set:
  1. metric_name: the name to use. The metric name must be globally unique within a great_expectations installation.

  2. domain_keys: a tuple of the keys used to determine the domain of the metric.

  3. value_keys: a tuple of the keys used to determine the value of the metric.

In some cases, subclasses of MetricProvider, such as TableMetricProvider, will already have correct values that may simply be inherited.

They may optionally override the default_kwarg_values attribute.
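The attribute requirements above can be illustrated with a minimal sketch. It deliberately does not import great_expectations: `SketchMetricProvider`, `SketchColumnProvider`, `SketchColumnHistogram`, and `missing_attributes` are hypothetical stand-ins, and the `domain_keys` value is an assumption chosen for illustration rather than the library's actual default.

```python
from typing import List, Tuple


class SketchMetricProvider:
    """Hypothetical stand-in for a metric provider base class (not the library's)."""

    metric_name: str = ""
    domain_keys: Tuple[str, ...] = ()
    value_keys: Tuple[str, ...] = ()
    default_kwarg_values: dict = {}


class SketchColumnProvider(SketchMetricProvider):
    # Assumed value for illustration; an intermediate class can set domain_keys
    # once so that concrete metrics simply inherit it.
    domain_keys = ("column",)


class SketchColumnHistogram(SketchColumnProvider):
    metric_name = "column.histogram"
    value_keys = ("bins",)  # a tuple, matching the "tuple of keys" requirement


def missing_attributes(provider: type) -> List[str]:
    """Return the names of required attributes that are unset or empty."""
    required = ("metric_name", "domain_keys", "value_keys")
    return [name for name in required if not getattr(provider, name)]


print(missing_attributes(SketchColumnHistogram))
print(missing_attributes(SketchMetricProvider))
```

Here the concrete class sets only `metric_name` and `value_keys`; `domain_keys` comes from its parent, which mirrors the inheritance case described above.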

MetricProvider classes must implement the following:

1. _get_evaluation_dependencies. Note that _get_evaluation_dependencies should often augment the dependencies provided by a parent class; consider calling super()._get_evaluation_dependencies.

In some cases, subclasses of MetricProvider, such as MapMetricProvider, will already have correct implementations that may simply be inherited.
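The pattern of augmenting a parent's dependencies through super() might look like the sketch below. The classes are hypothetical stand-ins, the method signature is simplified to a single argument (the library's method takes more parameters), and the dependency names and values are placeholders chosen only for illustration.

```python
from typing import Dict


class SketchBaseProvider:
    """Hypothetical parent class that declares a shared dependency."""

    @classmethod
    def _get_evaluation_dependencies(cls, metric_name: str) -> Dict[str, str]:
        # Placeholder dependency that every metric in this sketch needs.
        return {"table.columns": "dependency for " + metric_name}


class SketchHistogramProvider(SketchBaseProvider):
    """Hypothetical subclass that extends, rather than replaces, the parent's result."""

    @classmethod
    def _get_evaluation_dependencies(cls, metric_name: str) -> Dict[str, str]:
        # Start from the parent's dependencies, then add this metric's own.
        dependencies = super()._get_evaluation_dependencies(metric_name)
        dependencies["column.min"] = "dependency for " + metric_name
        return dependencies


deps = SketchHistogramProvider._get_evaluation_dependencies("column.histogram")
print(sorted(deps))
```

Calling super() first keeps the parent's entries intact, so a change to the parent's dependencies propagates to every subclass without further edits.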

Additionally, they may provide implementations of:

1. Data Docs rendering methods decorated with the @renderer decorator. See the guide “How to create renderers for custom expectations” for more information.

metric_name = 'column.histogram'
value_keys = ['bins']
_pandas(cls, execution_engine: PandasExecutionEngine, metric_domain_kwargs: Dict, metric_value_kwargs: Dict, metrics: Dict[Tuple, Any], runtime_configuration: Dict)
_sqlalchemy(cls, execution_engine: SqlAlchemyExecutionEngine, metric_domain_kwargs: Dict, metric_value_kwargs: Dict, metrics: Dict[Tuple, Any], runtime_configuration: Dict)

Return a list of counts corresponding to bins.

Parameters
  • column – the name of the column for which to get the histogram

  • bins – tuple of bin edges for which to get histogram values; must be a tuple to support caching
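To make "a list of counts corresponding to bins" concrete, here is a minimal pure-Python sketch. It is not the library's implementation: `histogram_counts` is a hypothetical helper, and it assumes the edge convention used by `numpy.histogram` (each bin is half-open, except the last, which also includes its upper edge) and that null values are skipped.

```python
from bisect import bisect_right
from typing import List, Optional, Sequence, Tuple


def histogram_counts(
    values: Sequence[Optional[float]], bins: Tuple[float, ...]
) -> List[int]:
    """Count values per bin; bins are [lo, hi) except the last, which is [lo, hi]."""
    if len(bins) < 2:
        raise ValueError("bins must contain at least two edges")
    counts = [0] * (len(bins) - 1)
    for value in values:
        # Assumption: nulls and out-of-range values are not counted.
        if value is None or value < bins[0] or value > bins[-1]:
            continue
        # bisect_right returns the index of the first edge strictly greater than value.
        index = bisect_right(bins, value) - 1
        # A value equal to the final edge belongs to the last bin.
        index = min(index, len(counts) - 1)
        counts[index] += 1
    return counts


column = [1, 2, 2, 3, 5, 8, 10, None]
bins = (0, 2, 5, 10)
counts = histogram_counts(column, bins)
print(counts)  # [1, 3, 3]

# A tuple is hashable, so it can be part of a cache key; a list cannot.
cache = {("column.histogram", bins): counts}
```

The result has one entry per pair of adjacent edges, so three counts for four edges. The last line shows why the parameter needs to be a tuple: a list of edges would raise TypeError if used inside a dictionary key.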

_spark(cls, execution_engine: SparkDFExecutionEngine, metric_domain_kwargs: Dict, metric_value_kwargs: Dict, metrics: Dict[Tuple, Any], runtime_configuration: Dict)