great_expectations.datasource.data_connector.configured_asset_dbfs_data_connector

Module Contents

Classes

ConfiguredAssetDBFSDataConnector(name: str, datasource_name: str, base_directory: str, assets: dict, execution_engine: ExecutionEngine, default_regex: Optional[dict] = None, glob_directive: str = '**/*', sorters: Optional[list] = None, batch_spec_passthrough: Optional[dict] = None)

Extension of ConfiguredAssetFilesystemDataConnector used to connect to the Databricks File System (DBFS). Note: This works for the current implementation of DBFS. If DBFS diverges from a filesystem-like implementation in the future, this class should instead inherit from ConfiguredAssetFilePathDataConnector or another DataConnector.

great_expectations.datasource.data_connector.configured_asset_dbfs_data_connector.logger
class great_expectations.datasource.data_connector.configured_asset_dbfs_data_connector.ConfiguredAssetDBFSDataConnector(name: str, datasource_name: str, base_directory: str, assets: dict, execution_engine: ExecutionEngine, default_regex: Optional[dict] = None, glob_directive: str = '**/*', sorters: Optional[list] = None, batch_spec_passthrough: Optional[dict] = None)

Bases: great_expectations.datasource.data_connector.ConfiguredAssetFilesystemDataConnector

Extension of ConfiguredAssetFilesystemDataConnector used to connect to the Databricks File System (DBFS). Note: This works for the current implementation of DBFS. If DBFS diverges from a filesystem-like implementation in the future, this class should instead inherit from ConfiguredAssetFilePathDataConnector or another DataConnector.

DataConnectors produce identifying information, called a "batch_spec", that ExecutionEngines can use to get individual batches of data. They add flexibility in how data is obtained, for example through time-based partitioning, splitting and sampling, or other techniques appropriate for obtaining batches of data.

The ConfiguredAssetDBFSDataConnector is one of two classes (InferredAssetDBFSDataConnector being the other one) designed for connecting to data on DBFS.

A ConfiguredAssetDBFSDataConnector requires an explicit specification of each DataAsset you want to connect to. This allows more fine-tuning, but also requires more setup.
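The explicit per-asset specification described above might look like the following minimal sketch. The directory, asset name, and file-naming scheme are hypothetical, and the `pattern` / `group_names` keys are assumed to follow the regex convention used by the filesystem data connectors. The constructor arguments `name`, `datasource_name`, and `execution_engine` are left out on the assumption that the enclosing Datasource supplies them. The helper `batch_identifiers_for` is not part of Great Expectations; it only illustrates how a pattern's capture groups line up with the group names.

```python
import re

# Hypothetical data connector configuration; paths and asset names are
# illustrative only and do not come from the library documentation.
data_connector_config = {
    "class_name": "ConfiguredAssetDBFSDataConnector",
    "base_directory": "/dbfs/example/taxi_data",  # hypothetical DBFS mount path
    "glob_directive": "**/*",  # same value as the default in the signature
    "default_regex": {
        # Fallback used by assets that do not define their own pattern.
        "pattern": r"(.*)\.csv",
        "group_names": ["file_name"],
    },
    "assets": {
        # Each DataAsset is listed explicitly, with its own regex.
        "yellow_tripdata": {
            "pattern": r"yellow_tripdata_(\d{4})-(\d{2})\.csv",
            "group_names": ["year", "month"],
        },
    },
}


def batch_identifiers_for(asset_name, file_name):
    """Illustrative helper (not library code): pair capture groups with group names."""
    asset = data_connector_config["assets"][asset_name]
    match = re.fullmatch(asset["pattern"], file_name)
    if match is None:
        return None  # the file does not belong to this asset
    return dict(zip(asset["group_names"], match.groups()))


print(batch_identifiers_for("yellow_tripdata", "yellow_tripdata_2019-01.csv"))
```

Listing each asset this way is the extra setup the description refers to; in return, every asset can carry its own pattern and group names instead of relying on a single inferred regex.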

_get_full_file_path_for_asset(self, path: str, asset: Optional[Asset] = None)