Evaluation Parameters

Often, the specific parameters associated with an expectation are derived from upstream steps in a processing pipeline. For example, we may want expect_table_row_count_to_equal to compare against a row count computed in a previous step.

Great Expectations makes it possible to accomplish that goal with “Evaluation Parameters”: we declare expectations using parameters whose values are supplied at validation time. During interactive development, we can even provide a temporary value to be used for the initial evaluation of the expectation.
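In the examples below, my_df is a Great Expectations dataset. As a minimal sketch (the DataFrame contents here are hypothetical), one can be created from a pandas DataFrame:

>>> import pandas as pd
>>> import great_expectations as ge
>>> my_df = ge.from_pandas(pd.DataFrame({"Title": ["Oliver Twist"] * 10}))  # ten rows, matching the examples below

With that in place, the temporary-value form looks like this: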

>>> my_df.expect_table_row_count_to_equal(
...     value={"$PARAMETER": "upstream_row_count", "$PARAMETER.upstream_row_count": 10},
...     result_format={"result_format": "BOOLEAN_ONLY"})
{
  'success': True
}

You can also store parameter values in a special dictionary called evaluation_parameters that is saved with the expectation_suite, making the values available to multiple expectations and to expectations declared later.

>>> my_df.set_evaluation_parameter("upstream_row_count", 10)
>>> my_df.get_evaluation_parameter("upstream_row_count")
10
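Stored parameters travel with the suite itself. For example (a sketch; the exact representation may vary by version):

>>> suite = my_df.get_expectation_suite()
>>> suite.evaluation_parameters
{'upstream_row_count': 10}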

If a parameter has been stored, then it does not need to be provided for a new expectation to be declared:

>>> my_df.set_evaluation_parameter("upstream_row_count", 10)
>>> my_df.expect_table_row_count_to_be_between(max_value={"$PARAMETER": "upstream_row_count"})

When validating expectations, you can provide evaluation parameters based on upstream results:

>>> my_df.validate(expectation_suite=my_dag_step_config, evaluation_parameters={"upstream_row_count": upstream_row_count})
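Putting the pieces together, a downstream pipeline step might look roughly like this (a sketch; the file name expectation_suite.json and the upstream DataFrame are hypothetical):

>>> import json
>>> import pandas as pd
>>> import great_expectations as ge
>>> upstream_df = pd.DataFrame({"Title": ["Oliver Twist"] * 10})  # produced by an earlier step
>>> upstream_row_count = len(upstream_df)
>>> with open("expectation_suite.json") as f:
...     my_dag_step_config = json.load(f)  # suite saved from an earlier step
>>> my_df = ge.from_pandas(upstream_df)
>>> results = my_df.validate(expectation_suite=my_dag_step_config,
...                          evaluation_parameters={"upstream_row_count": upstream_row_count})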

Finally, the command-line tool also allows you to provide a JSON file that contains parameters to use during evaluation:

$ cat my_parameters_file.json
{
    "upstream_row_count": 10
}
$ great_expectations validation csv --evaluation_parameters=my_parameters_file.json dataset_file.csv expectation_suite.json
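In a pipeline, an upstream task can write that file before the command runs; a minimal sketch:

>>> import json
>>> with open("my_parameters_file.json", "w") as f:
...     json.dump({"upstream_row_count": 10}, f)  # parameters computed by this step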

DataContext Evaluation Parameter Store

When a DataContext has a configured evaluation parameter store, it can automatically identify and store evaluation parameters that are referenced in other expectation suites. The evaluation parameter store uses a URN schema for identifying dependencies between expectation suites.

A DataContext-recognized URN must begin with the string urn:great_expectations:validations and have one of the following structures:

urn:great_expectations:validations:<expectation_suite_name>:<metric_name>
urn:great_expectations:validations:<expectation_suite_name>:<metric_name>:<metric_kwargs_id>

Replace each name in angle brackets with the desired value. For example:

urn:great_expectations:validations:dickens_data:expect_column_proportion_of_unique_values_to_be_between.result.observed_value:column=Title
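A URN like this can then be used wherever a $PARAMETER reference is accepted, letting one suite's expectation depend on another suite's most recent validation results. A sketch, assuming a DataContext with a configured evaluation parameter store and a previously validated dickens_data suite (the row-count metric here is hypothetical):

>>> my_df.expect_table_row_count_to_equal(
...     value={"$PARAMETER": "urn:great_expectations:validations:dickens_data:expect_table_row_count_to_equal.result.observed_value"})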
