great_expectations.datasource.data_connector.configured_asset_gcs_data_connector
Module Contents
Classes
ConfiguredAssetGCSDataConnector | Extension of ConfiguredAssetFilePathDataConnector used to connect to GCS
great_expectations.datasource.data_connector.configured_asset_gcs_data_connector.logger
great_expectations.datasource.data_connector.configured_asset_gcs_data_connector.storage
class great_expectations.datasource.data_connector.configured_asset_gcs_data_connector.ConfiguredAssetGCSDataConnector(name: str, datasource_name: str, bucket_or_name: str, assets: dict, execution_engine: Optional[ExecutionEngine] = None, default_regex: Optional[dict] = None, sorters: Optional[list] = None, prefix: Optional[str] = None, delimiter: Optional[str] = None, max_results: Optional[int] = None, gcs_options: Optional[dict] = None, batch_spec_passthrough: Optional[dict] = None)
Extension of ConfiguredAssetFilePathDataConnector used to connect to GCS
DataConnectors produce identifying information, called a “batch_spec”, that ExecutionEngines can use to get individual batches of data. They add flexibility in how data is obtained, for example through time-based partitioning, splitting and sampling, or other techniques appropriate for obtaining batches of data.
The ConfiguredAssetGCSDataConnector is one of two classes designed for connecting to data on GCS; the other is InferredAssetGCSDataConnector.
A ConfiguredAssetGCSDataConnector requires an explicit specification of each DataAsset you want to connect to. This allows more fine-tuning, but also requires more setup. Note that, to stay consistent with Google’s official SDK, the connector uses terms like “bucket_or_name” and “max_results”. These keys are converted from YAML to Python and passed directly to the GCS connection object, so they must match the names the SDK expects.
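As a rough illustration of what an explicit specification might look like, the sketch below writes a connector configuration as a Python dict. The top-level keys mirror the constructor parameters listed above. The "class_name" key and the "pattern" and "group_names" sub-keys of default_regex are assumptions not stated on this page, and the bucket, prefix, and asset names are made up. The helper identifiers_for is hypothetical and only shows how a regex with group names could map a GCS object name to batch identifiers; it is not the library's own matching code.

```python
import re
from typing import Dict, Optional

# Hypothetical connector configuration; bucket, prefix, and asset names are made up.
connector_config = {
    "class_name": "ConfiguredAssetGCSDataConnector",  # assumed config key
    "bucket_or_name": "my-example-bucket",  # name follows the GCS SDK
    "prefix": "data/taxi/",
    "delimiter": "/",
    "max_results": 1000,  # name follows the GCS SDK
    "default_regex": {
        # "pattern" and "group_names" are assumed sub-keys of default_regex.
        "pattern": r"data/taxi/yellow_tripdata_(\d{4})-(\d{2})\.csv",
        "group_names": ["year", "month"],
    },
    # Each DataAsset is declared explicitly rather than inferred.
    "assets": {"yellow_tripdata": {}},
}


def identifiers_for(blob_name: str, regex_config: dict) -> Optional[Dict[str, str]]:
    """Illustrative helper (not part of the library).

    Returns a dict of group name to captured value, or None when the
    object name does not match the pattern.
    """
    match = re.fullmatch(regex_config["pattern"], blob_name)
    if match is None:
        return None
    return dict(zip(regex_config["group_names"], match.groups()))


result = identifiers_for(
    "data/taxi/yellow_tripdata_2019-01.csv",
    connector_config["default_regex"],
)
```

Keeping the SDK's own key names in the configuration means the values can be forwarded unchanged when listing objects in the bucket.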
This DataConnector supports the following methods of authentication:
- Standard gcloud auth / GOOGLE_APPLICATION_CREDENTIALS environment variable workflow
- Manual creation of credentials from google.oauth2.service_account.Credentials.from_service_account_file
- Manual creation of credentials from google.oauth2.service_account.Credentials.from_service_account_info
Because most of the interaction with the SDK goes through a GCS Storage Client, refer to the official docs for more detail on the supported authentication methods and general functionality: https://googleapis.dev/python/google-api-core/latest/auth.html
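The three authentication routes above could be selected from a gcs_options dict along the lines of the following sketch. It assumes that a "filename" key points at a service account key file and an "info" key holds the parsed key contents; these key names are assumptions, not something this page confirms. Both function names are hypothetical. build_storage_client imports the Google libraries lazily, so it only runs where google-cloud-storage is installed.

```python
from typing import Any, Dict, Optional, Tuple


def split_gcs_options(
    gcs_options: Optional[Dict[str, Any]],
) -> Tuple[str, Any, Dict[str, Any]]:
    """Decide which authentication route a gcs_options dict implies.

    Returns (method, credential_source, remaining_client_kwargs).
    The "filename" and "info" keys are assumed names.
    """
    options = dict(gcs_options or {})  # copy so the caller's dict is not mutated
    if "filename" in options:
        return "service_account_file", options.pop("filename"), options
    if "info" in options:
        return "service_account_info", options.pop("info"), options
    # No explicit credentials: rely on gcloud auth / GOOGLE_APPLICATION_CREDENTIALS.
    return "default", None, options


def build_storage_client(gcs_options: Optional[Dict[str, Any]] = None):
    """Create a google.cloud.storage.Client (requires google-cloud-storage)."""
    from google.cloud import storage
    from google.oauth2 import service_account

    method, source, client_kwargs = split_gcs_options(gcs_options)
    if method == "service_account_file":
        credentials = service_account.Credentials.from_service_account_file(source)
    elif method == "service_account_info":
        credentials = service_account.Credentials.from_service_account_info(source)
    else:
        credentials = None  # the client falls back to default credentials
    return storage.Client(credentials=credentials, **client_kwargs)
```

Separating the decision from the client construction keeps the branching logic checkable without network access or installed Google packages.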
build_batch_spec(self, batch_definition: BatchDefinition)
Build BatchSpec from batch_definition by calling DataConnector’s build_batch_spec function.
- Parameters
batch_definition (BatchDefinition) – to be used to build batch_spec
- Returns
BatchSpec built from batch_definition
_get_data_reference_list_for_asset(self, asset: Optional[Asset])
_get_full_file_path_for_asset(self, path: str, asset: Optional[Asset] = None)