great_expectations.core.batch

Module Contents

Classes

BatchDefinition(datasource_name: str, data_connector_name: str, data_asset_name: str, batch_identifiers: IDDict, batch_spec_passthrough: Optional[dict] = None)

A convenience class for migrating away from untyped dictionaries to more strongly typed objects.

BatchRequestBase(datasource_name: str, data_connector_name: str, data_asset_name: str, data_connector_query: Optional[dict] = None, limit: Optional[int] = None, runtime_parameters: Optional[dict] = None, batch_identifiers: Optional[dict] = None, batch_spec_passthrough: Optional[dict] = None)

This class is for internal inter-object protocol purposes only.

BatchRequest(datasource_name: str, data_connector_name: str, data_asset_name: str, data_connector_query: Optional[dict] = None, limit: Optional[int] = None, batch_spec_passthrough: Optional[dict] = None)

This class contains all attributes of a batch_request. See the comments in BatchRequestBase for design specifics.

RuntimeBatchRequest(datasource_name: str, data_connector_name: str, data_asset_name: str, runtime_parameters: dict, batch_identifiers: dict, batch_spec_passthrough: Optional[dict] = None)

This class is for internal inter-object protocol purposes only.

BatchMarkers(*args, **kwargs)

A BatchMarkers is a special type of BatchKwargs (so that it has a batch_fingerprint) but it generally does NOT require specific keys and instead captures information about the OUTPUT of a datasource's fetch process.

Batch(data, batch_request: Union[BatchRequest, RuntimeBatchRequest] = None, batch_definition: BatchDefinition = None, batch_spec: BatchSpec = None, batch_markers: BatchMarkers = None, data_context=None, datasource_name=None, batch_parameters=None, batch_kwargs=None)

A convenience class for migrating away from untyped dictionaries to more strongly typed objects.

Functions

materialize_batch_request(batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest, dict]] = None)

batch_request_contains_batch_data(batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest, dict]] = None)

batch_request_contains_runtime_parameters(batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest, dict]] = None)

get_batch_request_as_dict(batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest, dict]] = None)

get_batch_request_from_acceptable_arguments(datasource_name: Optional[str] = None, data_connector_name: Optional[str] = None, data_asset_name: Optional[str] = None, *, batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest]] = None, batch_data: Optional[Any] = None, data_connector_query: Optional[dict] = None, batch_identifiers: Optional[dict] = None, limit: Optional[int] = None, index: Optional[Union[int, list, tuple, slice, str]] = None, custom_filter_function: Optional[Callable] = None, batch_spec_passthrough: Optional[dict] = None, sampling_method: Optional[str] = None, sampling_kwargs: Optional[dict] = None, splitter_method: Optional[str] = None, splitter_kwargs: Optional[dict] = None, runtime_parameters: Optional[dict] = None, query: Optional[str] = None, path: Optional[str] = None, batch_filter_parameters: Optional[dict] = None, **kwargs)

Obtain a formal, typed BatchRequest object from the allowed attributes (supplied as arguments).

standardize_batch_request_display_ordering(batch_request: Dict[str, Union[str, int, Dict[str, Any]]])

great_expectations.core.batch.logger
class great_expectations.core.batch.BatchDefinition(datasource_name: str, data_connector_name: str, data_asset_name: str, batch_identifiers: IDDict, batch_spec_passthrough: Optional[dict] = None)

Bases: great_expectations.types.SerializableDictDot

A convenience class for migrating away from untyped dictionaries to more strongly typed objects.

Can be instantiated with arguments:

    my_A = MyClassA(
        foo="a string",
        bar=1,
    )

Can be instantiated from a dictionary:

    my_A = MyClassA(
        **{
            "foo": "a string",
            "bar": 1,
        }
    )

Can be accessed using both dictionary and dot notation:

    my_A.foo == "a string"
    my_A.bar == 1

    my_A["foo"] == "a string"
    my_A["bar"] == 1

Pairs nicely with @dataclass:

    @dataclass()
    class MyClassA(DictDot):
        foo: str
        bar: int

Can be made immutable:

    @dataclass(frozen=True)
    class MyClassA(DictDot):
        foo: str
        bar: int

For more examples of usage, please see test_dataclass_serializable_dot_dict_pattern.py in the tests folder.
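
As an illustration of the pattern above applied to this class, the following is a minimal sketch that constructs a BatchDefinition directly. It assumes IDDict can be imported from great_expectations.core.id_dict (the module housing BatchKwargs, referenced further below), and all names are illustrative:

    from great_expectations.core.batch import BatchDefinition
    from great_expectations.core.id_dict import IDDict  # assumed import path for IDDict

    batch_definition = BatchDefinition(
        datasource_name="my_datasource",          # illustrative names
        data_connector_name="my_data_connector",
        data_asset_name="my_data_asset",
        batch_identifiers=IDDict({"year": "2022", "month": "01"}),
    )

    # The documented properties are available via dot notation:
    print(batch_definition.data_asset_name)    # "my_data_asset"
    print(batch_definition.batch_identifiers)  # {"year": "2022", "month": "01"}
    print(batch_definition.id)                 # the id property documented below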

to_json_dict(self)

# TODO: <Alex>2/4/2022</Alex> A reference implementation can be provided, once circular import dependencies, caused by relative locations of the “great_expectations/types/__init__.py” and “great_expectations/core/util.py” modules are resolved.

__repr__(self)

Return repr(self).

static _validate_batch_definition(datasource_name: str, data_connector_name: str, data_asset_name: str, batch_identifiers: IDDict)
property datasource_name(self)
property data_connector_name(self)
property data_asset_name(self)
property batch_identifiers(self)
property batch_spec_passthrough(self)
property id(self)
__eq__(self, other)

Return self==value.

__str__(self)

Return str(self).

__hash__(self)

Overrides the default implementation

class great_expectations.core.batch.BatchRequestBase(datasource_name: str, data_connector_name: str, data_asset_name: str, data_connector_query: Optional[dict] = None, limit: Optional[int] = None, runtime_parameters: Optional[dict] = None, batch_identifiers: Optional[dict] = None, batch_spec_passthrough: Optional[dict] = None)

Bases: great_expectations.types.SerializableDictDot

This class is for internal inter-object protocol purposes only. As such, it contains all attributes of a batch_request, but does not validate them. See the BatchRequest class, which extends BatchRequestBase and validates the attributes.

BatchRequestBase is used for the internal protocol purposes exclusively, not part of API for the developer users.

Previously, the same BatchRequest class served both the internal protocol and the API exposed to developers. While convenient for internal data interchange, using a single class as the argument type for the externally exported DataContext.get_batch(), DataContext.get_batch_list(), and DataContext.get_validator() calls was not expressive enough to fulfill both needs. In the user-accessible API, BatchRequest must enforce that every member of the triple datasource_name, data_connector_name, and data_asset_name is not NULL, whereas for the internal protocol the request is used as a flexible bag of attributes in which any field may be NULL. Hence, BatchRequestBase is now dedicated to the internal protocol as that bag of attributes, where NULL values are allowed as internally needed, and the BatchRequest class extends BatchRequestBase with strong validation (described above, plus additional attribute validation) to formally validate user-specified fields.
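
A rough sketch of this distinction (the names are illustrative, and the exact exception type raised by BatchRequest's validation is not specified here):

    from great_expectations.core.batch import BatchRequest, BatchRequestBase

    # Internal protocol object: any field, including the datasource / data connector /
    # data asset triple, may be left as None.
    internal_request = BatchRequestBase(
        datasource_name=None,
        data_connector_name=None,
        data_asset_name=None,
    )

    # User-facing object: the triple must be supplied as non-null strings, otherwise
    # the init-parameter validation is expected to reject the arguments.
    try:
        BatchRequest(
            datasource_name=None,  # invalid in the user-facing API
            data_connector_name="my_data_connector",
            data_asset_name="my_data_asset",
        )
    except Exception as exc:
        print(f"validation rejected the request: {exc}")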

property datasource_name(self)
property data_connector_name(self)
property data_asset_name(self)
property data_connector_query(self)
property limit(self)
property runtime_parameters(self)
property batch_identifiers(self)
property batch_spec_passthrough(self)
property id(self)
to_dict(self)
to_json_dict(self)

# TODO: <Alex>2/4/2022</Alex> This implementation of “SerializableDictDot.to_json_dict()” occurs frequently and should ideally serve as the reference implementation in the “SerializableDictDot” class itself. However, the circular import dependencies, due to the location of the “great_expectations/types/__init__.py” and “great_expectations/core/util.py” modules, make this refactoring infeasible at the present time.

__deepcopy__(self, memo)
__eq__(self, other)

Return self==value.

__repr__(self)

# TODO: <Alex>2/4/2022</Alex> This implementation of a custom “__repr__()” occurs frequently and should ideally serve as the reference implementation in the “SerializableDictDot” class. However, the circular import dependencies, due to the location of the “great_expectations/types/__init__.py” and “great_expectations/core/util.py” modules make this refactoring infeasible at the present time.

__str__(self)

# TODO: <Alex>2/4/2022</Alex> This implementation of a custom “__str__()” occurs frequently and should ideally serve as the reference implementation in the “SerializableDictDot” class. However, the circular import dependencies, due to the location of the “great_expectations/types/__init__.py” and “great_expectations/core/util.py” modules make this refactoring infeasible at the present time.

static _validate_init_parameters(datasource_name: str, data_connector_name: str, data_asset_name: str, data_connector_query: Optional[dict] = None, limit: Optional[int] = None)
class great_expectations.core.batch.BatchRequest(datasource_name: str, data_connector_name: str, data_asset_name: str, data_connector_query: Optional[dict] = None, limit: Optional[int] = None, batch_spec_passthrough: Optional[dict] = None)

Bases: great_expectations.core.batch.BatchRequestBase

This class contains all attributes of a batch_request. See the comments in BatchRequestBase for design specifics. The limit attribute refers to the number of batches requested (not the number of rows per batch).

include_field_names: Set[str]
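
A minimal construction sketch, assuming a configured Datasource and Data Connector with the illustrative names below; the "index" key in data_connector_query mirrors the index argument accepted by get_batch_request_from_acceptable_arguments further down:

    from great_expectations.core.batch import BatchRequest

    batch_request = BatchRequest(
        datasource_name="my_datasource",
        data_connector_name="my_configured_data_connector",
        data_asset_name="my_data_asset",
        data_connector_query={"index": -1},  # e.g. select the most recent batch
        limit=10,                            # number of batches requested, not rows per batch
    )
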
class great_expectations.core.batch.RuntimeBatchRequest(datasource_name: str, data_connector_name: str, data_asset_name: str, runtime_parameters: dict, batch_identifiers: dict, batch_spec_passthrough: Optional[dict] = None)

Bases: great_expectations.core.batch.BatchRequestBase

This class is for internal inter-object protocol purposes only. As such, it contains all attributes of a batch_request, but does not validate them. See the BatchRequest class, which extends BatchRequestBase and validates the attributes.

BatchRequestBase is used for the internal protocol purposes exclusively, not part of API for the developer users.

Previously, the same BatchRequest class served both the internal protocol and the API exposed to developers. While convenient for internal data interchange, using a single class as the argument type for the externally exported DataContext.get_batch(), DataContext.get_batch_list(), and DataContext.get_validator() calls was not expressive enough to fulfill both needs. In the user-accessible API, BatchRequest must enforce that every member of the triple datasource_name, data_connector_name, and data_asset_name is not NULL, whereas for the internal protocol the request is used as a flexible bag of attributes in which any field may be NULL. Hence, BatchRequestBase is now dedicated to the internal protocol as that bag of attributes, where NULL values are allowed as internally needed, and the BatchRequest class extends BatchRequestBase with strong validation (described above, plus additional attribute validation) to formally validate user-specified fields.

include_field_names: Set[str]
static _validate_runtime_batch_request_specific_init_parameters(runtime_parameters: dict, batch_identifiers: dict, batch_spec_passthrough: Optional[dict] = None)
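
A sketch of constructing a RuntimeBatchRequest against an in-memory pandas DataFrame. The "batch_data" key in runtime_parameters corresponds to the batch_data argument of get_batch_request_from_acceptable_arguments further down, and the connector, asset, and identifier names are illustrative:

    import pandas as pd

    from great_expectations.core.batch import RuntimeBatchRequest

    df = pd.DataFrame({"a": [1, 2, 3]})

    runtime_batch_request = RuntimeBatchRequest(
        datasource_name="my_datasource",
        data_connector_name="my_runtime_data_connector",
        data_asset_name="my_in_memory_asset",
        runtime_parameters={"batch_data": df},       # alternatively {"query": ...} or {"path": ...}
        batch_identifiers={"run_id": "2022-02-04"},  # keys must match the data connector's configuration
    )
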
class great_expectations.core.batch.BatchMarkers(*args, **kwargs)

Bases: great_expectations.core.id_dict.BatchKwargs

A BatchMarkers is a special type of BatchKwargs (so that it has a batch_fingerprint) but it generally does NOT require specific keys and instead captures information about the OUTPUT of a datasource’s fetch process, such as the timestamp at which a query was executed.

property ge_load_time(self)
class great_expectations.core.batch.Batch(data, batch_request: Union[BatchRequest, RuntimeBatchRequest] = None, batch_definition: BatchDefinition = None, batch_spec: BatchSpec = None, batch_markers: BatchMarkers = None, data_context=None, datasource_name=None, batch_parameters=None, batch_kwargs=None)

Bases: great_expectations.types.SerializableDictDot

A convenience class for migrating away from untyped dictionaries to more strongly typed objects.

Can be instantiated with arguments:

    my_A = MyClassA(
        foo="a string",
        bar=1,
    )

Can be instantiated from a dictionary:

    my_A = MyClassA(
        **{
            "foo": "a string",
            "bar": 1,
        }
    )

Can be accessed using both dictionary and dot notation:

    my_A.foo == "a string"
    my_A.bar == 1

    my_A["foo"] == "a string"
    my_A["bar"] == 1

Pairs nicely with @dataclass:

    @dataclass()
    class MyClassA(DictDot):
        foo: str
        bar: int

Can be made immutable:

    @dataclass(frozen=True)
    class MyClassA(DictDot):
        foo: str
        bar: int

For more examples of usage, please see test_dataclass_serializable_dot_dict_pattern.py in the tests folder.

property data(self)
property batch_request(self)
property batch_definition(self)
property batch_spec(self)
property batch_markers(self)
property data_context(self)
property datasource_name(self)
property batch_parameters(self)
property batch_kwargs(self)
to_dict(self)
to_json_dict(self)

# TODO: <Alex>2/4/2022</Alex> A reference implementation can be provided, once circular import dependencies, caused by relative locations of the “great_expectations/types/__init__.py” and “great_expectations/core/util.py” modules are resolved.

property id(self)
__str__(self)

Return str(self).

head(self, n_rows=5, fetch_all=False)
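
A small usage sketch for head(), assuming a Batch has already been obtained elsewhere (for example via the DataContext.get_batch_list() call mentioned in the BatchRequestBase notes above):

    from great_expectations.core.batch import Batch

    def preview_batch(batch: Batch) -> None:
        """Print a short preview of a Batch using the head() method documented above."""
        print(batch.id)               # the id property of the batch
        print(batch.head(n_rows=10))  # preview rows of the underlying data (n_rows defaults to 5)
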
great_expectations.core.batch.materialize_batch_request(batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest, dict]] = None) → Optional[Union[BatchRequest, RuntimeBatchRequest]]
great_expectations.core.batch.batch_request_contains_batch_data(batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest, dict]] = None) → bool
great_expectations.core.batch.batch_request_contains_runtime_parameters(batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest, dict]] = None) → bool
great_expectations.core.batch.get_batch_request_as_dict(batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest, dict]] = None) → Optional[dict]
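
A sketch of round-tripping between the plain-dict form and the typed form using the two helpers above; per batch_request_contains_runtime_parameters, a dict that carries runtime_parameters would presumably materialize as a RuntimeBatchRequest instead (all names below are illustrative):

    from great_expectations.core.batch import (
        get_batch_request_as_dict,
        materialize_batch_request,
    )

    batch_request_dict = {
        "datasource_name": "my_datasource",
        "data_connector_name": "my_data_connector",
        "data_asset_name": "my_data_asset",
        "data_connector_query": {"index": -1},
    }

    # dict -> typed BatchRequest
    typed_request = materialize_batch_request(batch_request=batch_request_dict)

    # typed BatchRequest -> plain dict (e.g. for serialization)
    round_tripped = get_batch_request_as_dict(batch_request=typed_request)
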
great_expectations.core.batch.get_batch_request_from_acceptable_arguments(datasource_name: Optional[str] = None, data_connector_name: Optional[str] = None, data_asset_name: Optional[str] = None, *, batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest]] = None, batch_data: Optional[Any] = None, data_connector_query: Optional[dict] = None, batch_identifiers: Optional[dict] = None, limit: Optional[int] = None, index: Optional[Union[int, list, tuple, slice, str]] = None, custom_filter_function: Optional[Callable] = None, batch_spec_passthrough: Optional[dict] = None, sampling_method: Optional[str] = None, sampling_kwargs: Optional[dict] = None, splitter_method: Optional[str] = None, splitter_kwargs: Optional[dict] = None, runtime_parameters: Optional[dict] = None, query: Optional[str] = None, path: Optional[str] = None, batch_filter_parameters: Optional[dict] = None, **kwargs) → BatchRequest

Obtain a formal, typed BatchRequest object from the allowed attributes (supplied as arguments). This method applies only to the new (V3) Datasource schema.

Parameters
  • datasource_name

  • data_connector_name

  • data_asset_name

  • batch_request

  • batch_data

  • query

  • path

  • runtime_parameters

  • data_connector_query

  • batch_identifiers

  • batch_filter_parameters

  • limit

  • index

  • custom_filter_function

  • sampling_method

  • sampling_kwargs

  • splitter_method

  • splitter_kwargs

  • batch_spec_passthrough

  • **kwargs

Returns

(BatchRequest) The formal BatchRequest object
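
A sketch of two common call shapes, one building a standard (V3) request from the datasource / data connector / data asset triple and one building a runtime-style request from in-memory batch_data (all names are illustrative, and the treatment of index as part of data_connector_query is inferred from the parameter list above):

    import pandas as pd

    from great_expectations.core.batch import get_batch_request_from_acceptable_arguments

    # Standard request built from the triple:
    batch_request = get_batch_request_from_acceptable_arguments(
        datasource_name="my_datasource",
        data_connector_name="my_data_connector",
        data_asset_name="my_data_asset",
        index=-1,
        limit=10,
    )

    # Runtime-style request from in-memory data; batch_identifiers label the ephemeral batch:
    runtime_request = get_batch_request_from_acceptable_arguments(
        datasource_name="my_datasource",
        data_connector_name="my_runtime_data_connector",
        data_asset_name="my_in_memory_asset",
        batch_data=pd.DataFrame({"a": [1, 2, 3]}),
        batch_identifiers={"run_id": "2022-02-04"},
    )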

great_expectations.core.batch.standardize_batch_request_display_ordering(batch_request: Dict[str, Union[str, int, Dict[str, Any]]]) → Dict[str, Union[str, Dict[str, Any]]]