great_expectations.rule_based_profiler.parameter_builder.regex_pattern_string_parameter_builder
Module Contents
Classes
RegexPatternStringParameterBuilder: Detects the domain REGEX from a set of candidate REGEX strings.
-
great_expectations.rule_based_profiler.parameter_builder.regex_pattern_string_parameter_builder.logger
-
class great_expectations.rule_based_profiler.parameter_builder.regex_pattern_string_parameter_builder.RegexPatternStringParameterBuilder(name: str, metric_domain_kwargs: Optional[Union[str, dict]] = None, metric_value_kwargs: Optional[Union[str, dict]] = None, threshold: Union[float, str] = 1.0, candidate_regexes: Optional[Union[Iterable[str], str]] = None, data_context: Optional['DataContext'] = None, batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest, dict]] = None)
Bases: great_expectations.rule_based_profiler.parameter_builder.parameter_builder.ParameterBuilder
Detects the domain REGEX from a set of candidate REGEX strings by computing the column_values.match_regex_format.unexpected_count metric for each candidate regex and returning the regex that has the lowest unexpected_count ratio.
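The selection rule described above can be illustrated with a minimal standalone sketch. It does not use the Great Expectations metric machinery; the function names `unexpected_ratios` and `lowest_unexpected_regex` are hypothetical, and the use of `re.search` as the matching rule is an assumption made for illustration only.

```python
import re
from typing import Dict, Iterable, List


def unexpected_ratios(values: List[str], candidate_regexes: Iterable[str]) -> Dict[str, float]:
    """Return, for each candidate regex, the fraction of values that do NOT match it."""
    if not values:
        raise ValueError("at least one value is required to compute a ratio")
    ratios: Dict[str, float] = {}
    for pattern in candidate_regexes:
        compiled = re.compile(pattern)
        # Assumption: a value counts as matching if the pattern is found anywhere in it.
        unexpected_count = sum(1 for value in values if compiled.search(value) is None)
        ratios[pattern] = unexpected_count / len(values)
    return ratios


def lowest_unexpected_regex(values: List[str], candidate_regexes: Iterable[str]) -> str:
    """Return the candidate regex with the lowest unexpected ratio."""
    ratios = unexpected_ratios(values, candidate_regexes)
    return min(ratios, key=ratios.get)


values = ["2021-01-05", "2021-02-17", "n/a", "2021-03-30"]
candidates = [r"^\d{4}-\d{2}-\d{2}$", r"^\d+$"]
ratios = unexpected_ratios(values, candidates)
best = lowest_unexpected_regex(values, candidates)
```

Here the date-like pattern leaves one of four values unmatched, while the digits-only pattern matches none, so the date-like pattern is selected.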
-
CANDIDATE_REGEX: Set[str]
-
property metric_domain_kwargs(self)
-
property metric_value_kwargs(self)
-
property threshold(self)
-
property candidate_regexes(self)
-
_build_parameters(self, parameter_container: ParameterContainer, domain: Domain, variables: Optional[ParameterContainer] = None, parameters: Optional[Dict[str, ParameterContainer]] = None)
Checks the percentage of values matching each candidate REGEX string and returns the best fit, or None if no string exceeds the configured threshold.
Returns: ParameterContainer object that holds ParameterNode objects with attribute name-value pairs and optional details.
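The "best fit, or None" decision can be sketched in isolation as follows. This is a hedged illustration over precomputed success ratios, not the library's code: the function name `best_regex_or_none` is hypothetical, and the inclusive comparison is an assumption, chosen because a ratio can never be strictly greater than the default threshold of 1.0.

```python
from typing import Dict, Optional


def best_regex_or_none(success_ratios: Dict[str, float], threshold: float = 1.0) -> Optional[str]:
    """Return the regex with the highest success ratio, or None if it falls below the threshold."""
    if not success_ratios:
        return None
    best_regex = max(success_ratios, key=success_ratios.get)
    # Assumption: ">=" is used so that a perfect match can satisfy a threshold of 1.0.
    if success_ratios[best_regex] >= threshold:
        return best_regex
    return None


perfect = best_regex_or_none({"a": 0.75, "b": 1.0})
too_strict = best_regex_or_none({"a": 0.75, "b": 0.5}, threshold=0.9)
relaxed = best_regex_or_none({"a": 0.75, "b": 0.5}, threshold=0.7)
```

Lowering the threshold trades strictness for coverage: with a threshold of 0.7 the regex "a" is accepted even though a quarter of the values do not match it.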
-
_get_regex_matched_greater_than_threshold(self, regex_string_success_ratio_dict: dict, threshold: float)
Helper method that determines which regex strings have a success ratio greater than the threshold.
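A filter of this kind might look like the sketch below. The function name `regexes_at_or_above_threshold` is hypothetical, and treating the threshold as inclusive is an assumption (with the default threshold of 1.0, a strict comparison would reject every regex).

```python
from typing import Dict


def regexes_at_or_above_threshold(
    regex_string_success_ratio_dict: Dict[str, float], threshold: float
) -> Dict[str, float]:
    """Keep only the regex strings whose success ratio reaches the threshold."""
    return {
        regex: ratio
        for regex, ratio in regex_string_success_ratio_dict.items()
        # Assumption: the comparison is inclusive of the threshold itself.
        if ratio >= threshold
    }


kept = regexes_at_or_above_threshold({r"^\d+$": 0.5, r"^\w+$": 1.0, r"^.+$": 0.9}, 0.9)
```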
-
_get_sorted_regex_and_ratios(self, regex_string_success_ratio_dict: dict)
Helper method that sorts all evaluated regexes by their success ratio. Returns Tuple(ratio, sorted_strings).
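One way to produce such a pair is sketched below. The function name `sorted_regexes_and_ratios` is hypothetical; the descending order and the representation of the result as two parallel lists are assumptions, since the documentation above only states that a tuple of ratios and sorted strings is returned.

```python
from typing import Dict, List, Tuple


def sorted_regexes_and_ratios(
    regex_string_success_ratio_dict: Dict[str, float],
) -> Tuple[List[float], List[str]]:
    """Sort regexes by success ratio and return (ratios, sorted_strings) as parallel lists."""
    # Assumption: highest success ratio first.
    ordered = sorted(
        regex_string_success_ratio_dict.items(),
        key=lambda item: item[1],
        reverse=True,
    )
    sorted_strings = [regex for regex, _ in ordered]
    ratios = [ratio for _, ratio in ordered]
    return ratios, sorted_strings


ratios_desc, strings_desc = sorted_regexes_and_ratios({"a": 0.5, "b": 1.0, "c": 0.75})
```

Keeping the two lists aligned by index means the ratio at position i always belongs to the regex at position i.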
-