Create a Data Context
A Data Context defines the storage location for metadata, such as your configurations for Data Sources, Expectation Suites, Checkpoints, and Data Docs. It also contains your Validation Results and the metrics associated with them, and it provides access to those objects in Python, along with other helper functions for the GX Python API.
All scripts that utilize GX Core should start with the creation of a Data Context.
The following are the available Data Context types:
- File Data Context: A persistent Data Context that stores metadata and configuration information as YAML files within a file system. File Data Contexts allow you to re-use previously configured Expectation Suites, Data Sources, and Checkpoints.
- Ephemeral Data Context: A temporary Data Context that stores metadata and configuration information in memory. This Data Context will not persist beyond the current Python session. Ephemeral Data Contexts are useful when you don’t have write permissions to a file system or if you are going to engage in data exploration without needing to save your results.
- GX Cloud Data Context: A Data Context that connects to a GX Cloud Account to retrieve and store GX Cloud metadata and configuration information. The GX Cloud Data Context lets you leverage GX Cloud to share your Expectation Suites, Data Sources, and Checkpoints with your organization.
Prerequisites
Request an available Data Context
- Run the following code to request a Data Context:
    import great_expectations as gx
    context = gx.get_context()
  If you don't specify parameters with the get_context() method, GX checks your project environment and returns the first Data Context using the following criteria:
  - get_context() instantiates and returns a GX Cloud Data Context if it finds the necessary credentials in your environment variables.
  - If a GX Cloud Data Context cannot be instantiated, get_context() will instantiate and return the first File Data Context it finds in the folder hierarchy of your current working directory.
  - If neither of the above options are viable, get_context() instantiates and returns an Ephemeral Data Context.
- Optional. Run the following code to verify the type of Data Context you received:
    print(type(context).__name__)
  The name of the Data Context class is displayed.
# Import great_expectations and request a Data Context.
import great_expectations as gx
context = gx.get_context()
# Optional. Check the type of Data Context that was returned.
print(type(context).__name__)
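The fallback order that get_context() follows when called without parameters can be sketched as a small pure-Python function. This is only an illustration of the criteria listed above, not GX's actual implementation: pick_context_type and its two boolean parameters are hypothetical, and the returned strings assume the class names CloudDataContext, FileDataContext, and EphemeralDataContext.

```python
def pick_context_type(has_cloud_credentials: bool, file_context_found: bool) -> str:
    """Mirror the documented fallback order of get_context() with no arguments.

    Hypothetical helper for illustration; the class names returned are assumed.
    """
    if has_cloud_credentials:
        # First choice: GX Cloud, when credentials are present in the environment.
        return "CloudDataContext"
    if file_context_found:
        # Second choice: an existing File Data Context in the folder hierarchy.
        return "FileDataContext"
    # Last resort: a temporary, in-memory context.
    return "EphemeralDataContext"


for creds, found in [(True, True), (False, True), (False, False)]:
    print(creds, found, "->", pick_context_type(creds, found))
```

If you need a specific type regardless of your environment, pass the mode parameter instead of relying on this order.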
Prerequisites
Create a File Data Context
- Run the following code to request a File Data Context:
    import great_expectations as gx
    context = gx.get_context(mode="file")
  When you specify mode="file", the get_context() method instantiates and returns the first File Data Context it finds in the folder hierarchy of your current working directory.
  If a File Data Context configuration is not found, get_context(mode="file") creates a new File Data Context in your current working directory and then instantiates and returns the newly created File Data Context.
  Alternatively, you can request a specific File Data Context by providing a folder path with the project_root_dir parameter. If a File Data Context exists in the specified folder, it will be instantiated and returned. If a File Data Context is not found in the specified folder, a new File Data Context will be created.
    context = gx.get_context(mode="file", project_root_dir="./new_context_folder")
- Optional. Run the following code to review the File Data Context configuration:
    print(context)
  The Data Context configuration, formatted as a Python dictionary, is displayed.
# Import great_expectations and request a Data Context.
import great_expectations as gx
context = gx.get_context(mode="file")
# Optional. Request a File Data Context from a specific folder.
context = gx.get_context(mode="file", project_root_dir="./new_context_folder")
# Optional. Review the configuration of the returned File Data Context.
print(context)
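If several scripts need to open the same File Data Context, it can help to wrap the two call forms shown above in one place. The following is a minimal sketch under stated assumptions: file_context_kwargs and get_file_context are hypothetical helper names, and resolving the folder to an absolute path is a design choice of this sketch, not something GX requires. Only get_context(), mode, and project_root_dir come from the instructions above.

```python
from pathlib import Path
from typing import Any, Dict, Optional


def file_context_kwargs(project_root_dir: Optional[str] = None) -> Dict[str, Any]:
    """Build keyword arguments for gx.get_context() that request a File Data Context."""
    kwargs: Dict[str, Any] = {"mode": "file"}
    if project_root_dir is not None:
        # Resolve to an absolute path so the target folder does not change
        # if the script later changes its working directory.
        kwargs["project_root_dir"] = str(Path(project_root_dir).resolve())
    return kwargs


def get_file_context(project_root_dir: Optional[str] = None):
    """Request (or create) a File Data Context. Requires GX Core to be installed."""
    # Imported lazily so file_context_kwargs() can be used without GX installed.
    import great_expectations as gx

    return gx.get_context(**file_context_kwargs(project_root_dir))


print(file_context_kwargs())
print(file_context_kwargs("./new_context_folder"))
```

Calling get_file_context("./new_context_folder") a second time should return the context created by the first call, because File Data Contexts persist on disk.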
Prerequisites
Create an Ephemeral Data Context
- Run the following code to request an Ephemeral Data Context:
    import great_expectations as gx
    context = gx.get_context(mode="ephemeral")
  Ephemeral Data Contexts are temporary and get_context(mode="ephemeral") always instantiates and returns a new Ephemeral Data Context.
- Optional. Run the following code to review the Ephemeral Data Context configuration:
    print(context)
  The Data Context configuration, formatted as a Python dictionary, is displayed.
# Import great_expectations and request a Data Context.
import great_expectations as gx
context = gx.get_context(mode="ephemeral")
# Optional. Review the configuration of the returned Ephemeral Data Context.
print(context)
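Because an Ephemeral Data Context is the natural fallback when you cannot write to the file system, a script can choose the mode at run time. The sketch below is one possible approach, not part of GX: choose_context_mode is a hypothetical helper that uses only the Python standard library, and it conservatively treats a missing folder as not writable.

```python
import os
import tempfile


def choose_context_mode(directory: str) -> str:
    """Return "file" if directory exists and is writable, otherwise "ephemeral"."""
    if os.path.isdir(directory) and os.access(directory, os.W_OK):
        return "file"
    return "ephemeral"


# Demonstrate both outcomes using a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    writable_mode = choose_context_mode(tmp)
    missing_mode = choose_context_mode(os.path.join(tmp, "does_not_exist"))

print(writable_mode, missing_mode)

# With GX Core installed, the result can be passed straight to get_context():
#   import great_expectations as gx
#   context = gx.get_context(mode=choose_context_mode(os.getcwd()))
```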
Prerequisites
- Python version 3.9 to 3.12
- An installation of GX Core
- A GX Cloud access token and organization ID set as environment variables. (See Configure credentials under Create a Cloud Data Context.)
Create a Cloud Data Context
- Run the following code to request a GX Cloud Data Context:
    import great_expectations as gx
    context = gx.get_context(mode="cloud")
  When you specify mode="cloud", the get_context() method uses the GX_CLOUD_ACCESS_TOKEN and GX_CLOUD_ORGANIZATION_ID environment variables to connect to your GX Cloud account.
- Optional. Run the following code to review the Cloud Data Context configuration:
    print(context)
  The Data Context configuration, formatted as a Python dictionary, is displayed.
# Import great_expectations and request a Data Context.
import great_expectations as gx
context = gx.get_context(mode="cloud")
# Optional. Review the configuration of the returned Cloud Data Context.
print(context)
Get your user access token and organization ID
You'll need your user access token and organization ID to set your environment variables. Don't commit your access tokens to your version control software.
- In GX Cloud, click Settings > Tokens.
- In the User access tokens pane, click Create user access token.
- In the Token name field, enter a name for the token that will help you quickly identify it.
- Click Create.
- Copy and then paste the user access token into a temporary file. The token can't be retrieved after you close the dialog.
- Click Close.
- Copy the value in the Organization ID field into the temporary file with your user access token and then save the file.
GX recommends deleting the temporary file after you set the environment variables.
Set the GX Cloud Organization ID and user access token as environment variables
Environment variables securely store your GX Cloud access credentials.
- Save your GX_CLOUD_ACCESS_TOKEN and GX_CLOUD_ORGANIZATION_ID as environment variables by entering export ENV_VAR_NAME=env_var_value in the terminal or adding the command to your ~/.bashrc or ~/.zshrc file. For example:
    export GX_CLOUD_ACCESS_TOKEN=<user_access_token>
    export GX_CLOUD_ORGANIZATION_ID=<organization_id>
- Optional. If you created a temporary file to record your user access token and Organization ID, delete it.
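Before requesting a Cloud Data Context, you may want to confirm that both environment variables are visible to your Python session. The following sketch uses only the standard library. The helper name missing_cloud_credentials is hypothetical; the two variable names are the ones given above.

```python
import os
from typing import List, Mapping

REQUIRED_VARS = ("GX_CLOUD_ACCESS_TOKEN", "GX_CLOUD_ORGANIZATION_ID")


def missing_cloud_credentials(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required GX Cloud variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_cloud_credentials()
if missing:
    # Report variable names only; never print the token value itself.
    print("Set these environment variables first:", ", ".join(missing))
else:
    print("GX Cloud credentials found.")
```

If the list is empty, gx.get_context(mode="cloud") has the values it needs to attempt a connection. The check does not verify that the token or organization ID is valid.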