great_expectations.datasource.batch_kwargs_generator.query_batch_kwargs_generator

Module Contents

Classes

QueryBatchKwargsGenerator(name='default', datasource=None, query_store_backend=None, queries=None)

Produce query-style batch_kwargs from sql files or defined queries.

great_expectations.datasource.batch_kwargs_generator.query_batch_kwargs_generator.logger
great_expectations.datasource.batch_kwargs_generator.query_batch_kwargs_generator.sqlalchemy
class great_expectations.datasource.batch_kwargs_generator.query_batch_kwargs_generator.QueryBatchKwargsGenerator(name='default', datasource=None, query_store_backend=None, queries=None)

Bases: great_expectations.datasource.batch_kwargs_generator.batch_kwargs_generator.BatchKwargsGenerator

Produce query-style batch_kwargs from sql files or defined queries.

By default, a QueryBatchKwargsGenerator looks for files ending in .sql in the datasources/datasource_name/generators/generator_name directory.

For example, a file stored at datasources/datasource_name/generators/generator_name/movies_by_date.sql makes an asset named movies_by_date available.
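The file-name-to-asset-name convention described above can be illustrated with a short standalone sketch. This is not the library's implementation (which goes through a store backend); asset_names_from_directory is a hypothetical helper that only assumes the asset name is the file name with the .sql suffix removed.

```python
import tempfile
from pathlib import Path


def asset_names_from_directory(base_directory, suffix=".sql"):
    """Hypothetical helper: list asset names implied by files in a directory.

    Assumption: the asset name is the file name with `suffix` stripped.
    """
    base = Path(base_directory)
    return sorted(
        path.name[: -len(suffix)]
        for path in base.iterdir()
        if path.is_file() and path.name.endswith(suffix)
    )


# Demonstration against a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "movies_by_date.sql").write_text("SELECT 1;")
    (Path(tmp) / "notes.txt").write_text("not a query")
    print(asset_names_from_directory(tmp))  # ['movies_by_date']
```

Files that do not end in the configured suffix are ignored, which mirrors the role of filepath_suffix in the store backend configuration.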

Queries can be parameterized using $substitution.

Example configuration:

    queries:
      class_name: QueryBatchKwargsGenerator
      query_store_backend:
        class_name: TupleFilesystemStoreBackend
        filepath_suffix: .sql
        base_directory: queries

Example query template, to be stored in queries/movies_by_date.sql

    SELECT * FROM movies where '$start'::date <= release_date AND release_date <= '$end'::date;
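How $substitution fills in a template like this one can be sketched with Python's standard string.Template. Treat this as an illustration under the assumption that the placeholders follow string.Template syntax; render_query is a hypothetical helper, not part of the generator's API.

```python
from string import Template


def render_query(template_text, query_parameters):
    """Hypothetical helper: fill $name placeholders from a dict.

    safe_substitute leaves unknown placeholders untouched instead of raising.
    """
    return Template(template_text).safe_substitute(query_parameters)


template_text = (
    "SELECT * FROM movies where '$start'::date <= release_date "
    "AND release_date <= '$end'::date;"
)

rendered = render_query(
    template_text, {"start": "2020-01-01", "end": "2020-02-01"}
)
print(rendered)
```

Because this is plain text substitution rather than bound-parameter passing, parameter values should come only from trusted sources.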

Example usage:

    context.build_batch_kwargs(
        "my_db",
        "query_generator",
        "movies_by_date",
        query_parameters={
            "start": "2020-01-01",
            "end": "2020-02-01",
        },
    )

recognized_batch_parameters
_get_raw_query(self, data_asset_name)
_get_iterator(self, data_asset_name, query_parameters=None)
add_query(self, generator_asset=None, query=None, data_asset_name=None)
get_available_data_asset_names(self)

Return the list of asset names known by this batch kwargs generator.

Returns

A list of available names

_build_batch_kwargs(self, batch_parameters)

Build batch kwargs from a partition id.

get_available_partition_ids(self, generator_asset=None, data_asset_name=None)

Applies the current _partitioner to the batches available on data_asset_name and returns a list of valid partition_id strings that can be used to identify batches of data.

Parameters

data_asset_name – the data asset whose partitions should be returned.

Returns

A list of partition_id strings