How to load a database table, view, or query result as a batch

This guide shows how to get a batch of data that Great Expectations can validate from a SQL database. When you validate data using SQL, all compute is managed by your database, and only Validation Results are returned.

Prerequisites – this how-to guide assumes you have already:


  1. Construct Batch Kwargs that describe the data you plan to validate.

    For example, if your configured datasource is named “my_db” and you plan to validate data in a view called “my_view” located in the “my_schema” schema:

    • To manually build table or view-based Batch Kwargs, create a dictionary:

      batch_kwargs = {
          "datasource": "my_db",
          "schema": "my_schema",  # schema is optional; default schema will be used if it is omitted
          "table": "my_view"  # note that the "table" key is used even to validate a view
    • To use a Batch Kwargs Generator, call the build_batch_kwargs method:

      For example, if your configured generator is called my_generator and you wish to access the my_view asset, you can call:

      batch_kwargs = context.build_batch_kwargs("my_db", "my_generator", "my_view")
    • To manually build query-based Batch Kwargs, create a dictionary:

      batch_kwargs = {
          "datasource": "my_db",
          "query": "SELECT * FROM my_schema.my_table WHERE '1988-01-01' <= date AND date < '1989-01-01';
    • To use a Query Batch Kwargs Generator, call the build_batch_kwargs method:

      batch_kwargs = context.build_batch_kwargs(
              "start": "1988-01-01",
              "end": "1989-01-01"
  2. Obtain an Expectation Suite to use to validate your batch.

    expectation_suite_name = "npi.warning"  # choose an appropriate name for your suite

    If you have not already created a suite, you can do so now.

    # Note, you can add the "overwrite_existing" flag to the below command if the suite
    # exists but you would like to replace it.
  3. Get the batch to validate.

    batch = context.get_batch(

Now that you have a Batch, you can use it to create Expectations or validate the data.

Additional Notes

  • If you are using Snowflake, and you have lowercase table or column names: * If you are loading your batch with a table, you can use pass “use_quoted_name”:True into your batch_kwargs dictionary. This will use the SQL Alchemy quoted_name method to ensure case sensitivity for your table and column names. * If you are loading your batch with a query, if you have lowercase column names, you still need to pass “use_quoted_name”:True into your batch_kwargs dictionary. You will also need to wrap your query in single quotes, and your table or column name in double quotes like so:

    batch_kwargs = {
        "use_quoted_name": True,
        "query: 'select "lowercase_column_one", "lowercase_column_two" from "lowercase_table_name" limit 100'
  • For more information on configuring a Batch Kwargs generator, please see the relevant guides. The above code snippets use the following configuration:

      class_name: SqlAlchemyDatasource
      credentials: ${rds_movies_db}
        class_name: SqlAlchemyDataset
        module_name: great_expectations.dataset
          class_name: TableBatchKwargsGenerator
          class_name: QueryBatchKwargsGenerator
            class_name: TupleFilesystemStoreBackend
            filepath_suffix: .sql
            base_directory: queries
    SELECT * FROM movies WHERE '$start'::date <= release_date AND release_date <= '$end'::date;