great_expectations.rule_based_profiler.parameter_builder.parameter_builder

Module Contents

Classes

ParameterBuilder(name: str, batch_list: Optional[List[Batch]] = None, batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest, dict]] = None, data_context: Optional['DataContext'] = None)

A ParameterBuilder implementation provides support for building Expectation Configuration Parameters suitable for use in other ParameterBuilders or in ConfigurationBuilders as part of profiling.

great_expectations.rule_based_profiler.parameter_builder.parameter_builder.MetricValue
great_expectations.rule_based_profiler.parameter_builder.parameter_builder.MetricValues
great_expectations.rule_based_profiler.parameter_builder.parameter_builder.MetricComputationDetails
great_expectations.rule_based_profiler.parameter_builder.parameter_builder.MetricComputationResult
class great_expectations.rule_based_profiler.parameter_builder.parameter_builder.ParameterBuilder(name: str, batch_list: Optional[List[Batch]] = None, batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest, dict]] = None, data_context: Optional['DataContext'] = None)

Bases: great_expectations.rule_based_profiler.types.Builder, abc.ABC

A ParameterBuilder implementation provides support for building Expectation Configuration Parameters suitable for use in other ParameterBuilders or in ConfigurationBuilders as part of profiling.

A ParameterBuilder is configured as part of a ProfilerRule. Its primary interface is the build_parameters method.
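The split between a public build_parameters method and an abstract _build_parameters method suggests a template-method layout. The following is a minimal sketch of that pattern only, not the library's implementation: the classes SketchParameterBuilder and ConstantParameterBuilder, the plain dict used as a parameter container, the string used as a domain, and the "$parameter.<name>" key format are all hypothetical stand-ins, and the delegation from build_parameters to _build_parameters is an assumption.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class SketchParameterBuilder(ABC):
    """Hypothetical stand-in; NOT the Great Expectations ParameterBuilder."""

    def __init__(self, name: str) -> None:
        self._name = name

    @property
    def name(self) -> str:
        return self._name

    def build_parameters(self, parameter_container: Dict[str, Any], domain: str) -> None:
        # Public entry point. Assumption: it simply delegates to the subclass hook.
        self._build_parameters(parameter_container=parameter_container, domain=domain)

    @abstractmethod
    def _build_parameters(self, parameter_container: Dict[str, Any], domain: str) -> None:
        """Subclasses store their computed parameter in parameter_container."""


class ConstantParameterBuilder(SketchParameterBuilder):
    """Toy subclass that stores a fixed value for the given domain."""

    def _build_parameters(self, parameter_container: Dict[str, Any], domain: str) -> None:
        # The key format below is illustrative only.
        parameter_container[f"$parameter.{self.name}"] = {"domain": domain, "value": 42}


container: Dict[str, Any] = {}
ConstantParameterBuilder(name="my_parameter_builder").build_parameters(container, "column_a")
```

In this sketch, callers only ever invoke build_parameters, while each concrete builder overrides _build_parameters.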

As part of a ProfilerRule, the following configuration will create a new parameter for each domain returned by the domain_builder, with an associated id.

```
parameter_builders:
  - name: my_parameter_builder
    class_name: MetricMultiBatchParameterBuilder
    metric_name: column.mean
```

exclude_field_names :Set[str]
build_parameters(self, parameter_container: ParameterContainer, domain: Domain, variables: Optional[ParameterContainer] = None, parameters: Optional[Dict[str, ParameterContainer]] = None)
property name(self)
property batch_request(self)
property batch_list(self)
property data_context(self)
abstract _build_parameters(self, parameter_container: ParameterContainer, domain: Domain, variables: Optional[ParameterContainer] = None, parameters: Optional[Dict[str, ParameterContainer]] = None)
get_validator(self, domain: Optional[Domain] = None, variables: Optional[ParameterContainer] = None, parameters: Optional[Dict[str, ParameterContainer]] = None)
get_batch_ids(self, domain: Optional[Domain] = None, variables: Optional[ParameterContainer] = None, parameters: Optional[Dict[str, ParameterContainer]] = None)
get_metrics(self, metric_name: str, metric_domain_kwargs: Optional[Union[str, dict]] = None, metric_value_kwargs: Optional[Union[str, dict]] = None, enforce_numeric_metric: Union[str, bool] = False, replace_nan_with_zero: Union[str, bool] = False, domain: Optional[Domain] = None, variables: Optional[ParameterContainer] = None, parameters: Optional[Dict[str, ParameterContainer]] = None)

General multi-batch metric computation facility.

Computes the specified metric (can be multi-dimensional, numeric, non-numeric, or mixed) and conditions (or “sanitizes”) the result according to two criteria: enforcing metric output to be numeric and handling NaN values.

:param metric_name: Name of the metric of interest, being computed.
:param metric_domain_kwargs: Metric Domain Kwargs is an essential parameter of the MetricConfiguration object.
:param metric_value_kwargs: Metric Value Kwargs is an essential parameter of the MetricConfiguration object.
:param enforce_numeric_metric: Flag controlling whether or not metric output must be numerically-valued.
:param replace_nan_with_zero: Directive controlling how NaN metric values, if encountered, should be handled.
:param domain: Domain object scoping “$variable”/“$parameter”-style references in configuration and runtime.
:param variables: Part of the “rule state” available for “$variable”-style references.
:param parameters: Part of the “rule state” available for “$parameter”-style references.
:return: MetricComputationResult object, containing both: data samples in the format “N x R^m”, where “N” (most significant dimension) is the number of measurements (e.g., one per Batch of data), while “R^m” is the multi-dimensional metric, whose values are being estimated, and details (to be used for metadata purposes).
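To make the “N x R^m” layout concrete, here is a small sketch that does not call Great Expectations at all. The batches, the helper compute_metric_per_batch, and the choice of per-column means as the metric are hypothetical. It only illustrates how one metric vector per Batch stacks into an array whose leading dimension is the number of batches.

```python
import numpy as np

# Hypothetical data: three batches, each a small table with two numeric columns.
batches = [
    np.array([[1.0, 10.0], [3.0, 30.0]]),
    np.array([[2.0, 20.0], [4.0, 40.0]]),
    np.array([[5.0, 50.0], [7.0, 70.0]]),
]


def compute_metric_per_batch(batch_arrays):
    """Stack one metric vector per batch into an array of shape (N, m)."""
    # The "metric" here is the vector of column means, so m = 2.
    return np.stack([batch.mean(axis=0) for batch in batch_arrays])


samples = compute_metric_per_batch(batches)
print(samples.shape)  # (3, 2): N = 3 batches, m = 2 metric components
```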

_sanitize_metric_computation(self, metric_name: str, metric_values: np.ndarray, enforce_numeric_metric: Union[str, bool] = False, replace_nan_with_zero: Union[str, bool] = False, domain: Optional[Domain] = None, variables: Optional[ParameterContainer] = None, parameters: Optional[Dict[str, ParameterContainer]] = None)

This method conditions (or “sanitizes”) data samples in the format “N x R^m”, where “N” (most significant dimension) is the number of measurements (e.g., one per Batch of data), while “R^m” is the multi-dimensional metric, whose values are being estimated. The “conditioning” operations are:
1. If the “enforce_numeric_metric” flag is set, raise an error if a non-numeric value is found in the sample vectors.
2. Further, if a NaN is encountered in a sample vector and “replace_nan_with_zero” is True, then replace those NaN values with the 0.0 floating point number; if “replace_nan_with_zero” is False, then raise an error.
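The two conditioning operations can be sketched with plain NumPy as follows. This is an illustration under stated assumptions, not the library's code: the function name sanitize_metric_values and the use of ValueError are hypothetical, and the NaN handling is assumed to apply only when the numeric check is enabled (one reading of “Further” in the description).

```python
import numpy as np


def sanitize_metric_values(
    metric_values: np.ndarray,
    enforce_numeric_metric: bool = False,
    replace_nan_with_zero: bool = False,
) -> np.ndarray:
    """Hypothetical stand-in for the conditioning steps described above."""
    if not enforce_numeric_metric:
        # Assumption: without the flag, samples are returned unchanged.
        return metric_values

    # Operation 1: reject arrays whose dtype is not numeric.
    if not np.issubdtype(metric_values.dtype, np.number):
        raise ValueError("Metric values must be numeric.")

    # Operation 2: replace NaN values with 0.0, or raise if that is not allowed.
    nan_mask = np.isnan(metric_values)
    if nan_mask.any():
        if not replace_nan_with_zero:
            raise ValueError("NaN encountered in metric values.")
        metric_values = np.where(nan_mask, 0.0, metric_values)

    return metric_values


samples = np.array([[1.0, np.nan], [3.0, 4.0]])
cleaned = sanitize_metric_values(
    samples, enforce_numeric_metric=True, replace_nan_with_zero=True
)
```

The sketch returns a new array rather than modifying its input, which keeps the original samples available for inspection.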