Module Contents


PandasDatasource(name='pandas', data_context=None, data_asset_type=None, batch_kwargs_generators=None, boto3_options=None, reader_method=None, reader_options=None, limit=None, **kwargs)

The PandasDatasource produces PandasDataset objects and supports generators capable of interacting with the local filesystem (the default subdir_reader generator) and of reading from existing in-memory dataframes.

great_expectations.datasource.pandas_datasource.HASH_THRESHOLD = 1000000000.0
class great_expectations.datasource.pandas_datasource.PandasDatasource(name='pandas', data_context=None, data_asset_type=None, batch_kwargs_generators=None, boto3_options=None, reader_method=None, reader_options=None, limit=None, **kwargs)

Bases: great_expectations.datasource.datasource.LegacyDatasource

The PandasDatasource produces PandasDataset objects and supports generators capable of interacting with the local filesystem (the default subdir_reader generator) and of reading from existing in-memory dataframes.

classmethod build_configuration(cls, data_asset_type=None, batch_kwargs_generators=None, boto3_options=None, reader_method=None, reader_options=None, limit=None, **kwargs)

Build a full configuration object for a datasource, potentially including generators with defaults.

Parameters

  • data_asset_type – A ClassConfig dictionary

  • batch_kwargs_generators – Generator configuration dictionary

  • boto3_options – Optional dictionary with key-value pairs to pass to boto3 during instantiation.

  • reader_method – Optional default reader_method for generated batches

  • reader_options – Optional default reader_options for generated batches

  • limit – Optional default limit for generated batches

  • **kwargs – Additional kwargs to be part of the datasource constructor’s initialization

Returns

A complete datasource configuration.
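
As a rough illustration of the shape such a configuration might take, the standalone sketch below assembles a dictionary from the same argument list. The function name sketch_build_configuration, the default data_asset_type value, and the rule that unset options are left out of the result are assumptions made for illustration; this is not the library's implementation.

```python
def sketch_build_configuration(
    data_asset_type=None,
    batch_kwargs_generators=None,
    boto3_options=None,
    reader_method=None,
    reader_options=None,
    limit=None,
    **kwargs,
):
    """Hypothetical stand-in that mirrors the documented argument list."""
    # Assumed default: a ClassConfig-style dictionary naming the dataset class.
    if data_asset_type is None:
        data_asset_type = {"class_name": "PandasDataset"}

    # Additional kwargs become part of the configuration itself.
    configuration = dict(kwargs)
    configuration["data_asset_type"] = data_asset_type

    # Assumption: options left as None are omitted from the result.
    optional = {
        "batch_kwargs_generators": batch_kwargs_generators,
        "boto3_options": boto3_options,
        "reader_method": reader_method,
        "reader_options": reader_options,
        "limit": limit,
    }
    for key, value in optional.items():
        if value is not None:
            configuration[key] = value
    return configuration


config = sketch_build_configuration(
    reader_method="read_csv", limit=100, class_name="PandasDatasource"
)
print(config)
```

In this sketch, only the options that were actually supplied appear next to data_asset_type, which keeps the resulting dictionary small enough to serialize into a project configuration file.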

process_batch_parameters(self, reader_method=None, reader_options=None, limit=None, dataset_options=None)

Use datasource-specific configuration to translate any batch parameters into batch kwargs at the datasource level.

Parameters

  • limit (int) – a parameter all datasources must accept to allow limiting a batch to a smaller number of rows.

  • dataset_options (dict) – a set of kwargs that will be passed to the constructor of a dataset built using these batch_kwargs

Returns

Result will include both parameters passed via argument and configured parameters.
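
The combination of passed and configured parameters can be pictured with a small standalone sketch. The function name sketch_process_batch_parameters, the configured dictionary standing in for the datasource's stored defaults, and the precedence rule (an explicit argument wins over a configured default) are all assumptions for illustration, not the library's actual behavior.

```python
def sketch_process_batch_parameters(
    configured,
    reader_method=None,
    reader_options=None,
    limit=None,
    dataset_options=None,
):
    """Hypothetical merge of call-time arguments with configured defaults."""
    passed = {
        "reader_method": reader_method,
        "reader_options": reader_options,
        "limit": limit,
        "dataset_options": dataset_options,
    }
    batch_kwargs = {}
    for key, value in passed.items():
        # Assumption: an explicit argument takes precedence over the default.
        chosen = value if value is not None else configured.get(key)
        if chosen is not None:
            batch_kwargs[key] = chosen
    return batch_kwargs


# Defaults as they might have been set when the datasource was configured.
defaults = {"reader_method": "read_csv", "reader_options": {"sep": ","}}
batch_kwargs = sketch_process_batch_parameters(
    defaults, reader_options={"sep": "\t"}, limit=50
)
print(batch_kwargs)
```

Here reader_method comes from the configured defaults, while reader_options and limit come from the call, so the result carries both kinds of parameter.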

Return type


get_batch(self, batch_kwargs, batch_parameters=None)

Get a batch of data from the datasource.

Parameters

  • batch_kwargs – the BatchKwargs to use to construct the batch

  • batch_parameters – optional parameters to store as the reference description of the batch. They should reflect parameters that would provide the passed BatchKwargs.



static guess_reader_method_from_path(path)
_infer_default_options(self, reader_fn: Callable, reader_options: dict)

Allows reader options to be customized based on file context before loading to a DataFrame

Parameters

  • reader_method (str) – pandas reader method

  • reader_options – Current options and defaults set to pass to the reader method

Returns

A copy of the reader options post-inference
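
A minimal sketch of this kind of inference follows. It assumes that the reader may arrive wrapped in functools.partial and that a text-based reader receives a default encoding. The names sketch_infer_default_options and the stand-in read_csv function, as well as the specific default chosen, are hypothetical and only illustrate the copy-then-fill pattern described above.

```python
from functools import partial


def sketch_infer_default_options(reader_fn, reader_options):
    """Hypothetical inference step that never mutates its input."""
    # Unwrap partial objects to reach the underlying reader function.
    while isinstance(reader_fn, partial):
        reader_fn = reader_fn.func

    # Work on a copy so the caller's dictionary is left untouched.
    inferred = dict(reader_options)

    # Assumption: CSV readers get a default encoding unless one was given.
    if reader_fn.__name__ == "read_csv":
        inferred.setdefault("encoding", "utf-8")
    return inferred


def read_csv(path, **options):
    """Stand-in for a pandas reader, used only to keep the sketch self-contained."""
    return path, options


original = {"sep": ";"}
result = sketch_infer_default_options(partial(read_csv, "data.csv"), original)
print(result)
```

Returning a copy means defaults inferred for one file do not leak into the options used for the next one.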

Return type


_get_reader_fn(self, reader_method=None, path=None)

Static helper for parsing reader types. If reader_method is not provided, path will be used to guess the correct reader_method.

Parameters

  • reader_method (str) – the name of the reader method to use, if available.

  • path (str) – the path to use to guess the reader_method

Returns

ReaderMethod to use for the filepath
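
The fallback from an explicit reader_method to a guess based on the path can be sketched as follows. The extension table, the function names sketch_guess_reader_method and sketch_get_reader_name, and the choice to raise ValueError for unknown extensions are assumptions for illustration; the library's own mapping and error type may differ.

```python
import os

# Assumed mapping from file extension to the name of a pandas reader.
_EXTENSION_TO_READER = {
    ".csv": "read_csv",
    ".tsv": "read_csv",
    ".parquet": "read_parquet",
    ".json": "read_json",
    ".xlsx": "read_excel",
    ".xls": "read_excel",
}


def sketch_guess_reader_method(path):
    """Guess a reader name from the file extension (case-insensitive)."""
    _, extension = os.path.splitext(path.lower())
    if extension not in _EXTENSION_TO_READER:
        raise ValueError(f"Unable to guess a reader method for path: {path}")
    return _EXTENSION_TO_READER[extension]


def sketch_get_reader_name(reader_method=None, path=None):
    """Prefer the explicit reader_method; otherwise fall back to the path."""
    if reader_method is not None:
        return reader_method
    if path is None:
        raise ValueError("Either reader_method or path must be provided.")
    return sketch_guess_reader_method(path)


print(sketch_get_reader_name(path="data/orders.CSV"))
print(sketch_get_reader_name(reader_method="read_json", path="export.csv"))
```

With pandas installed, a name returned this way could be resolved to a callable with getattr(pandas, name); the sketch stops at the name so that it runs with the standard library alone.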