great_expectations.core.batch
¶
Module Contents¶
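Classes¶
| BatchDefinition | A convenience class for migrating away from untyped dictionaries to stronger typed objects. |
| BatchRequestBase | This class is for internal inter-object protocol purposes only. |
| BatchRequest | This class contains all attributes of a batch_request. See the comments in BatchRequestBase for design specifics. |
| RuntimeBatchRequest | This class is for internal inter-object protocol purposes only. |
| BatchMarkers | A BatchMarkers is a special type of BatchKwargs (so that it has a batch_fingerprint) but it generally does NOT require specific keys and instead captures information about the OUTPUT of a datasource’s fetch process, such as the timestamp at which a query was executed. |
| Batch | A convenience class for migrating away from untyped dictionaries to stronger typed objects. |
Functions¶
| materialize_batch_request | |
| batch_request_contains_batch_data | |
| batch_request_contains_runtime_parameters | |
| get_batch_request_as_dict | |
| get_batch_request_from_acceptable_arguments | Obtain formal BatchRequest typed object from allowed attributes (supplied as arguments). |
| standardize_batch_request_display_ordering | |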
-
great_expectations.core.batch.
logger
¶
-
class
great_expectations.core.batch.
BatchDefinition
(datasource_name: str, data_connector_name: str, data_asset_name: str, batch_identifiers: IDDict, batch_spec_passthrough: Optional[dict] = None)¶ Bases:
great_expectations.types.SerializableDictDot
A convenience class for migrating away from untyped dictionaries to stronger typed objects.
Can be instantiated with arguments:
    my_A = MyClassA(
        foo="a string",
        bar=1,
    )
Can be instantiated from a dictionary:
    my_A = MyClassA(
        **{
            "foo": "a string",
            "bar": 1,
        }
    )
Can be accessed using both dictionary and dot notation:
    my_A.foo == "a string"
    my_A.bar == 1
    my_A["foo"] == "a string"
    my_A["bar"] == 1
Pairs nicely with @dataclass:
    @dataclass()
    class MyClassA(DictDot):
        foo: str
        bar: int
Can be made immutable:
    @dataclass(frozen=True)
    class MyClassA(DictDot):
        foo: str
        bar: int
For more examples of usage, please see test_dataclass_serializable_dot_dict_pattern.py in the tests folder.
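The dictionary-and-dot access pattern described in this docstring can be tried out without the library. The following is a minimal sketch, assuming a hypothetical `DictDotSketch` mixin that stands in for `DictDot`; it is not the library's implementation, and `MyFrozenA` is an illustrative name.

```python
from dataclasses import FrozenInstanceError, dataclass


class DictDotSketch:
    """Hypothetical stand-in for DictDot: adds dictionary-style access to attributes."""

    def __getitem__(self, key):
        # Delegate item access to attribute lookup; unknown keys raise KeyError.
        try:
            return getattr(self, key)
        except AttributeError:
            raise KeyError(key) from None

    def __setitem__(self, key, value):
        # Item assignment goes through normal attribute assignment.
        setattr(self, key, value)


@dataclass()
class MyClassA(DictDotSketch):
    foo: str
    bar: int


@dataclass(frozen=True)
class MyFrozenA(DictDotSketch):
    foo: str
    bar: int


# Instantiate from a dictionary, then read with both notations.
my_a = MyClassA(**{"foo": "a string", "bar": 1})
assert my_a.foo == my_a["foo"] == "a string"

# A frozen dataclass rejects assignment, whichever notation is used.
frozen_a = MyFrozenA(foo="a string", bar=1)
try:
    frozen_a["bar"] = 2
    mutated = True
except FrozenInstanceError:
    mutated = False
```

In this sketch, immutability comes entirely from `@dataclass(frozen=True)`; the mixin only translates item access into attribute access.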
-
to_json_dict
(self)¶ TODO (Alex, 2/4/2022): A reference implementation can be provided once the circular import dependencies, caused by the relative locations of the “great_expectations/types/__init__.py” and “great_expectations/core/util.py” modules, are resolved.
-
__repr__
(self)¶ Return repr(self).
-
static
_validate_batch_definition
(datasource_name: str, data_connector_name: str, data_asset_name: str, batch_identifiers: IDDict)¶
-
property
datasource_name
(self)¶
-
property
data_connector_name
(self)¶
-
property
data_asset_name
(self)¶
-
property
batch_identifiers
(self)¶
-
property
batch_spec_passthrough
(self)¶
-
property
id
(self)¶
-
__eq__
(self, other)¶ Return self==value.
-
__str__
(self)¶ Return str(self).
-
__hash__
(self)¶ Overrides the default implementation
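BatchDefinition exposes an `id` property together with custom `__eq__` and `__hash__` methods. One common way to keep the three consistent is to derive all of them from a digest of the object's serialized fields. The sketch below illustrates that idea only; the class name, the use of SHA-256, and the JSON layout are assumptions, not the library's actual scheme.

```python
import hashlib
import json


class BatchDefinitionSketch:
    """Hypothetical stand-in; not the library's BatchDefinition."""

    def __init__(self, datasource_name, data_connector_name, data_asset_name, batch_identifiers):
        self.datasource_name = datasource_name
        self.data_connector_name = data_connector_name
        self.data_asset_name = data_asset_name
        self.batch_identifiers = dict(batch_identifiers)

    def to_json_dict(self):
        return {
            "datasource_name": self.datasource_name,
            "data_connector_name": self.data_connector_name,
            "data_asset_name": self.data_asset_name,
            "batch_identifiers": self.batch_identifiers,
        }

    @property
    def id(self):
        # Assumption: the identifier is a digest of the key-sorted JSON form.
        payload = json.dumps(self.to_json_dict(), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def __eq__(self, other):
        if not isinstance(other, type(self)):
            return NotImplemented
        return self.id == other.id

    def __hash__(self):
        # Defining __eq__ alone would make the class unhashable, so override both.
        return hash(self.id)


first = BatchDefinitionSketch("my_datasource", "my_connector", "my_asset", {"month": "01"})
same = BatchDefinitionSketch("my_datasource", "my_connector", "my_asset", {"month": "01"})
other = BatchDefinitionSketch("my_datasource", "my_connector", "my_asset", {"month": "02"})
unique = {first, same, other}
```

Because equality and hashing share the same digest, equal definitions collapse to one entry when placed in a set or used as dictionary keys.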
-
class
great_expectations.core.batch.
BatchRequestBase
(datasource_name: str, data_connector_name: str, data_asset_name: str, data_connector_query: Optional[dict] = None, limit: Optional[int] = None, runtime_parameters: Optional[dict] = None, batch_identifiers: Optional[dict] = None, batch_spec_passthrough: Optional[dict] = None)¶ Bases:
great_expectations.types.SerializableDictDot
This class is for internal inter-object protocol purposes only. As such, it contains all attributes of a batch_request but does not validate them. See the BatchRequest class, which extends BatchRequestBase and validates the attributes.
BatchRequestBase is used exclusively for internal protocol purposes; it is not part of the API for developer users.
Previously, the same BatchRequest class was used both for internal protocol purposes and as part of the API exposed to developers. While convenient for internal data interchange, using the same BatchRequest class as the argument to the externally exported DataContext.get_batch(), DataContext.get_batch_list(), and DataContext.get_validator() API calls for obtaining batches and/or validators was insufficiently expressive to fulfill the needs of both. In the user-accessible API, BatchRequest must enforce that all members of the triple consisting of datasource_name, data_connector_name, and data_asset_name are not NULL, whereas for the internal protocol, BatchRequest was used as a flexible bag of attributes in which any field is allowed to be NULL. Hence, BatchRequestBase is now dedicated to serving as the bag of attributes for the internal protocol, where NULL values are allowed as internal needs require. The BatchRequest class extends BatchRequestBase and adds strong validation (described above, plus additional attribute validation) so as to formally validate user-specified fields.
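The split described here, a permissive base class plus a validating subclass, can be sketched in a few lines. The class names `RequestBaseSketch` and `ValidatedRequestSketch`, the reduced field set, and the choice of `TypeError` are all assumptions made for illustration; the library's own validation rules may differ.

```python
from typing import Optional


class RequestBaseSketch:
    """Hypothetical permissive bag of attributes: any field may be None."""

    def __init__(
        self,
        datasource_name: Optional[str] = None,
        data_connector_name: Optional[str] = None,
        data_asset_name: Optional[str] = None,
        limit: Optional[int] = None,
    ):
        self.datasource_name = datasource_name
        self.data_connector_name = data_connector_name
        self.data_asset_name = data_asset_name
        self.limit = limit


class ValidatedRequestSketch(RequestBaseSketch):
    """Hypothetical user-facing variant: validates before storing anything."""

    def __init__(self, datasource_name, data_connector_name, data_asset_name, limit=None):
        self._validate(datasource_name, data_connector_name, data_asset_name, limit)
        super().__init__(datasource_name, data_connector_name, data_asset_name, limit)

    @staticmethod
    def _validate(datasource_name, data_connector_name, data_asset_name, limit):
        # Every member of the identifying triple must be a non-empty string.
        triple = {
            "datasource_name": datasource_name,
            "data_connector_name": data_connector_name,
            "data_asset_name": data_asset_name,
        }
        for name, value in triple.items():
            if not isinstance(value, str) or not value:
                raise TypeError(f"{name} must be a non-empty string, got {value!r}")
        if limit is not None and not isinstance(limit, int):
            raise TypeError(f"limit must be an integer, got {limit!r}")


# The base class accepts missing values; the validated subclass does not.
internal = RequestBaseSketch()
try:
    ValidatedRequestSketch(None, "my_connector", "my_asset")
    rejected = False
except TypeError:
    rejected = True
```

Keeping validation in a static helper called from the subclass constructor leaves the base class usable for internal interchange, where partially filled requests are expected.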
-
property
datasource_name
(self)¶
-
property
data_connector_name
(self)¶
-
property
data_asset_name
(self)¶
-
property
data_connector_query
(self)¶
-
property
limit
(self)¶
-
property
runtime_parameters
(self)¶
-
property
batch_identifiers
(self)¶
-
property
batch_spec_passthrough
(self)¶
-
property
id
(self)¶
-
to_dict
(self)¶
-
to_json_dict
(self)¶ TODO (Alex, 2/4/2022): This implementation of “SerializableDictDot.to_json_dict()” occurs frequently and should ideally serve as the reference implementation in the “SerializableDictDot” class itself. However, circular import dependencies, due to the location of the “great_expectations/types/__init__.py” and “great_expectations/core/util.py” modules, make this refactoring infeasible at present.
-
__deepcopy__
(self, memo)¶
-
__eq__
(self, other)¶ Return self==value.
-
__repr__
(self)¶ TODO (Alex, 2/4/2022): This implementation of a custom “__repr__()” occurs frequently and should ideally serve as the reference implementation in the “SerializableDictDot” class. However, circular import dependencies, due to the location of the “great_expectations/types/__init__.py” and “great_expectations/core/util.py” modules, make this refactoring infeasible at present.
-
__str__
(self)¶ TODO (Alex, 2/4/2022): This implementation of a custom “__str__()” occurs frequently and should ideally serve as the reference implementation in the “SerializableDictDot” class. However, circular import dependencies, due to the location of the “great_expectations/types/__init__.py” and “great_expectations/core/util.py” modules, make this refactoring infeasible at present.
-
static
_validate_init_parameters
(datasource_name: str, data_connector_name: str, data_asset_name: str, data_connector_query: Optional[dict] = None, limit: Optional[int] = None)¶
-
class
great_expectations.core.batch.
BatchRequest
(datasource_name: str, data_connector_name: str, data_asset_name: str, data_connector_query: Optional[dict] = None, limit: Optional[int] = None, batch_spec_passthrough: Optional[dict] = None)¶ Bases:
great_expectations.core.batch.BatchRequestBase
This class contains all attributes of a batch_request. See the comments in BatchRequestBase for design specifics.
limit: refers to the number of batches requested (not rows per batch).
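The distinction between limiting batches and limiting rows is easy to misread, so a toy illustration may help. The `select_batches` helper and the sample data below are hypothetical; which batches a real data connector keeps, and in what order, is not specified here.

```python
from typing import Dict, List, Optional


def select_batches(batches: List[Dict], limit: Optional[int] = None) -> List[Dict]:
    """Return at most `limit` batches; the rows inside each batch are untouched."""
    if limit is None:
        return list(batches)
    return batches[:limit]


# Three batches of three rows each (illustrative data only).
batches = [
    {"name": "2020-01", "rows": [1, 2, 3]},
    {"name": "2020-02", "rows": [4, 5, 6]},
    {"name": "2020-03", "rows": [7, 8, 9]},
]

# limit=2 reduces the number of batches, not the number of rows per batch.
selected = select_batches(batches, limit=2)
```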
-
include_field_names
:Set[str]¶
-
class
great_expectations.core.batch.
RuntimeBatchRequest
(datasource_name: str, data_connector_name: str, data_asset_name: str, runtime_parameters: dict, batch_identifiers: dict, batch_spec_passthrough: Optional[dict] = None)¶ Bases:
great_expectations.core.batch.BatchRequestBase
This class is for internal inter-object protocol purposes only. As such, it contains all attributes of a batch_request but does not validate them. See the BatchRequest class, which extends BatchRequestBase and validates the attributes.
BatchRequestBase is used exclusively for internal protocol purposes; it is not part of the API for developer users.
Previously, the same BatchRequest class was used both for internal protocol purposes and as part of the API exposed to developers. While convenient for internal data interchange, using the same BatchRequest class as the argument to the externally exported DataContext.get_batch(), DataContext.get_batch_list(), and DataContext.get_validator() API calls for obtaining batches and/or validators was insufficiently expressive to fulfill the needs of both. In the user-accessible API, BatchRequest must enforce that all members of the triple consisting of datasource_name, data_connector_name, and data_asset_name are not NULL, whereas for the internal protocol, BatchRequest was used as a flexible bag of attributes in which any field is allowed to be NULL. Hence, BatchRequestBase is now dedicated to serving as the bag of attributes for the internal protocol, where NULL values are allowed as internal needs require. The BatchRequest class extends BatchRequestBase and adds strong validation (described above, plus additional attribute validation) so as to formally validate user-specified fields.
-
include_field_names
:Set[str]¶
-
static
_validate_runtime_batch_request_specific_init_parameters
(runtime_parameters: dict, batch_identifiers: dict, batch_spec_passthrough: Optional[dict] = None)¶
-
class
great_expectations.core.batch.
BatchMarkers
(*args, **kwargs)¶ Bases:
great_expectations.core.id_dict.BatchKwargs
A BatchMarkers is a special type of BatchKwargs (so that it has a batch_fingerprint) but it generally does NOT require specific keys and instead captures information about the OUTPUT of a datasource’s fetch process, such as the timestamp at which a query was executed.
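As a rough picture of a markers object that requires no specific keys and records facts about a fetch, consider the sketch below. `MarkersSketch` is a hypothetical plain `dict` subclass, not the library's BatchMarkers, and the timestamp format is an assumption chosen for the example.

```python
import datetime


class MarkersSketch(dict):
    """Hypothetical stand-in: a dictionary with no required keys."""

    @property
    def ge_load_time(self):
        # Returns None when the fetch process recorded no load time.
        return self.get("ge_load_time")


# Record when the data was fetched (timestamp format is illustrative).
now = datetime.datetime.now(datetime.timezone.utc)
markers = MarkersSketch({"ge_load_time": now.strftime("%Y%m%dT%H%M%S.%fZ")})

# An empty markers object is still valid because no key is mandatory.
empty = MarkersSketch()
```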
-
property
ge_load_time
(self)¶
-
class
great_expectations.core.batch.
Batch
(data, batch_request: Union[BatchRequest, RuntimeBatchRequest] = None, batch_definition: BatchDefinition = None, batch_spec: BatchSpec = None, batch_markers: BatchMarkers = None, data_context=None, datasource_name=None, batch_parameters=None, batch_kwargs=None)¶ Bases:
great_expectations.types.SerializableDictDot
A convenience class for migrating away from untyped dictionaries to stronger typed objects.
Can be instantiated with arguments:
    my_A = MyClassA(
        foo="a string",
        bar=1,
    )
Can be instantiated from a dictionary:
    my_A = MyClassA(
        **{
            "foo": "a string",
            "bar": 1,
        }
    )
Can be accessed using both dictionary and dot notation:
    my_A.foo == "a string"
    my_A.bar == 1
    my_A["foo"] == "a string"
    my_A["bar"] == 1
Pairs nicely with @dataclass:
    @dataclass()
    class MyClassA(DictDot):
        foo: str
        bar: int
Can be made immutable:
    @dataclass(frozen=True)
    class MyClassA(DictDot):
        foo: str
        bar: int
For more examples of usage, please see test_dataclass_serializable_dot_dict_pattern.py in the tests folder.
-
property
data
(self)¶
-
property
batch_request
(self)¶
-
property
batch_definition
(self)¶
-
property
batch_spec
(self)¶
-
property
batch_markers
(self)¶
-
property
data_context
(self)¶
-
property
datasource_name
(self)¶
-
property
batch_parameters
(self)¶
-
property
batch_kwargs
(self)¶
-
to_dict
(self)¶
-
to_json_dict
(self)¶ TODO (Alex, 2/4/2022): A reference implementation can be provided once the circular import dependencies, caused by the relative locations of the “great_expectations/types/__init__.py” and “great_expectations/core/util.py” modules, are resolved.
-
property
id
(self)¶
-
__str__
(self)¶ Return str(self).
-
head
(self, n_rows=5, fetch_all=False)¶
-
great_expectations.core.batch.
materialize_batch_request
(batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest, dict]] = None) → Optional[Union[BatchRequest, RuntimeBatchRequest]]¶
-
great_expectations.core.batch.
batch_request_contains_batch_data
(batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest, dict]] = None) → bool¶
-
great_expectations.core.batch.
batch_request_contains_runtime_parameters
(batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest, dict]] = None) → bool¶
-
great_expectations.core.batch.
get_batch_request_as_dict
(batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest, dict]] = None) → Optional[dict]¶
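The four helper signatures above all accept a typed request, a plain dictionary, or None. One plausible way such helpers fit together is sketched below, inferred from the signatures alone: normalize the input to a dictionary, check for runtime parameters, then build the matching typed object. Every name in this sketch is hypothetical and the logic is an assumption, not the library's code.

```python
from typing import Optional, Union


class RequestSketch:
    """Hypothetical typed request that simply stores its fields."""

    def __init__(self, **fields):
        self.fields = dict(fields)

    def to_dict(self) -> dict:
        return dict(self.fields)


class RuntimeRequestSketch(RequestSketch):
    """Hypothetical typed request for data supplied at runtime."""


def as_dict(batch_request: Optional[Union[RequestSketch, dict]] = None) -> Optional[dict]:
    # Normalize None, a dict, or a typed object to a plain dict (or None).
    if batch_request is None:
        return None
    if isinstance(batch_request, dict):
        return dict(batch_request)
    return batch_request.to_dict()


def contains_runtime_parameters(batch_request=None) -> bool:
    fields = as_dict(batch_request)
    return bool(fields) and fields.get("runtime_parameters") is not None


def materialize(batch_request=None) -> Optional[RequestSketch]:
    # Assumption: the presence of runtime_parameters selects the runtime class.
    fields = as_dict(batch_request)
    if not fields:
        return None
    if contains_runtime_parameters(fields):
        return RuntimeRequestSketch(**fields)
    return RequestSketch(**fields)


plain = materialize({"datasource_name": "my_datasource"})
runtime = materialize(
    {"datasource_name": "my_datasource", "runtime_parameters": {"query": "SELECT 1"}}
)
```

Routing every input through one normalization step means the remaining helpers only ever deal with a single representation.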
-
great_expectations.core.batch.
get_batch_request_from_acceptable_arguments
(datasource_name: Optional[str] = None, data_connector_name: Optional[str] = None, data_asset_name: Optional[str] = None, *, batch_request: Optional[Union[BatchRequest, RuntimeBatchRequest]] = None, batch_data: Optional[Any] = None, data_connector_query: Optional[dict] = None, batch_identifiers: Optional[dict] = None, limit: Optional[int] = None, index: Optional[Union[int, list, tuple, slice, str]] = None, custom_filter_function: Optional[Callable] = None, batch_spec_passthrough: Optional[dict] = None, sampling_method: Optional[str] = None, sampling_kwargs: Optional[dict] = None, splitter_method: Optional[str] = None, splitter_kwargs: Optional[dict] = None, runtime_parameters: Optional[dict] = None, query: Optional[str] = None, path: Optional[str] = None, batch_filter_parameters: Optional[dict] = None, **kwargs) → BatchRequest¶ Obtain formal BatchRequest typed object from allowed attributes (supplied as arguments). This method applies only to the new (V3) Datasource schema.
- Parameters
datasource_name –
data_connector_name –
data_asset_name –
batch_request –
batch_data –
query –
path –
runtime_parameters –
data_connector_query –
batch_identifiers –
batch_filter_parameters –
limit –
index –
custom_filter_function –
sampling_method –
sampling_kwargs –
splitter_method –
splitter_kwargs –
batch_spec_passthrough –
**kwargs –
- Returns
(BatchRequest) The formal BatchRequest object
-
great_expectations.core.batch.
standardize_batch_request_display_ordering
(batch_request: Dict[str, Union[str, int, Dict[str, Any]]]) → Dict[str, Union[str, Dict[str, Any]]]¶