How to run a Checkpoint in Python
This guide will help you run a Checkpoint in Python. This is useful if your pipeline environment or orchestration engine does not have shell access.
Prerequisites: This how-to guide assumes you have already:
- Set up a working deployment of Great Expectations
- Created a Checkpoint
First, generate the Python script with the command:
great_expectations checkpoint script my_checkpoint
Next, you will see a message telling you where the Python script was created, such as:
A python script was created that runs the checkpoint named: `my_checkpoint`
  - The script is located in `great_expectations/uncommitted/run_my_checkpoint.py`
  - The script can be run with `python great_expectations/uncommitted/run_my_checkpoint.py`
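The run command from this message can also be issued from Python itself without going through a shell, for example from a scheduler task. The sketch below is one possible way to do that. It assumes the generated script reports its outcome through its exit code (the script shown later in this guide exits with 0 on success and 1 on failure). `run_checkpoint_script` is a hypothetical helper name, not part of Great Expectations.

```python
import subprocess
import sys


def run_checkpoint_script(script_path):
    """Run the generated script in a child Python process.

    Returns True if the script exited with code 0, False otherwise.
    """
    # sys.executable is the interpreter running this code, so the child
    # process uses the same Python environment. No shell is involved.
    completed = subprocess.run([sys.executable, script_path])
    return completed.returncode == 0
```

A caller would pass the path from the message, such as `great_expectations/uncommitted/run_my_checkpoint.py`, and treat a `False` return value as a failed validation.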
Next, open the script, which should look like this:
""" This is a basic generated Great Expectations script that runs a checkpoint. A checkpoint is a list of one or more batches paired with one or more Expectation Suites and a configurable Validation Operator. Checkpoints can be run directly without this script using the `great_expectations checkpoint run` command. This script is provided for those who wish to run checkpoints via python. Data that is validated is controlled by BatchKwargs, which can be adjusted in the checkpoint file: great_expectations/checkpoints/my_checkpoint.yml. Data are validated by use of the `ActionListValidationOperator` which is configured by default. The default configuration of this Validation Operator saves validation results to your results store and then updates Data Docs. This makes viewing validation results easy for you and your team. Usage: - Run this file: `python great_expectations/uncommitted/run_my_checkpoint.py`. - This can be run manually or via a scheduler such as cron. - If your pipeline runner supports python snippets you can paste this into your pipeline. """ import sys from great_expectations import DataContext # checkpoint configuration context = DataContext("/home/ubuntu/my_project/great_expectations") checkpoint = context.get_checkpoint("my_checkpoint") # load batches of data batches_to_validate =  for batch in checkpoint["batches"]: batch_kwargs = batch["batch_kwargs"] for suite_name in batch["expectation_suite_names"]: suite = context.get_expectation_suite(suite_name) batch = context.get_batch(batch_kwargs, suite) batches_to_validate.append(batch) # run the validation operator results = context.run_validation_operator( checkpoint["validation_operator_name"], assets_to_validate=batches_to_validate, # TODO prepare for new RunID - checkpoint name and timestamp # run_id=RunID(checkpoint) ) # take action based on results if not results["success"]: print("Validation Failed!") sys.exit(1) print("Validation Succeeded!") sys.exit(0)
This Python script can then be invoked directly with `python great_expectations/uncommitted/run_my_checkpoint.py`, or the Python code can be embedded in your pipeline.
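When embedding the code in a pipeline, calling `sys.exit` is often undesirable because it stops the whole process. One possible adaptation, sketched below, wraps the same steps as the generated script in a function that returns a boolean instead. `run_checkpoint` is a hypothetical name introduced for illustration. The function assumes `context` is a `DataContext` created as in the script, and it uses only the calls that appear there.

```python
def run_checkpoint(context, checkpoint_name):
    """Run the named checkpoint and return True if validation succeeded.

    `context` is assumed to be a Great Expectations DataContext, created
    the same way as in the generated script.
    """
    checkpoint = context.get_checkpoint(checkpoint_name)

    # Load one batch for every (batch_kwargs, suite) pair in the checkpoint.
    batches_to_validate = []
    for batch_config in checkpoint["batches"]:
        batch_kwargs = batch_config["batch_kwargs"]
        for suite_name in batch_config["expectation_suite_names"]:
            suite = context.get_expectation_suite(suite_name)
            batches_to_validate.append(context.get_batch(batch_kwargs, suite))

    # Run the validation operator configured for this checkpoint.
    results = context.run_validation_operator(
        checkpoint["validation_operator_name"],
        assets_to_validate=batches_to_validate,
    )
    return bool(results["success"])
```

A pipeline task could then call `run_checkpoint(context, "my_checkpoint")` and raise its own exception, or mark the task as failed, when the result is `False`.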