great_expectations.execution_engine.sqlalchemy_execution_engine
¶
Module Contents¶
Classes¶
SqlAlchemyExecutionEngine | Helper class that provides a standard way to create an ABC using inheritance.
Functions¶
_get_dialect_type_module(dialect) | Given a dialect, returns the dialect type, which defines the engine/system that is used to communicate with the database/database implementation.
-
great_expectations.execution_engine.sqlalchemy_execution_engine.
__version__
¶
-
great_expectations.execution_engine.sqlalchemy_execution_engine.
logger
¶
-
great_expectations.execution_engine.sqlalchemy_execution_engine.
make_url
¶
-
great_expectations.execution_engine.sqlalchemy_execution_engine.
reflection
¶
-
great_expectations.execution_engine.sqlalchemy_execution_engine.
sqlalchemy_psycopg2
¶
-
great_expectations.execution_engine.sqlalchemy_execution_engine.
sqlalchemy_redshift
¶
-
great_expectations.execution_engine.sqlalchemy_execution_engine.
sqlalchemy_dremio
¶
-
great_expectations.execution_engine.sqlalchemy_execution_engine.
snowflake
¶
-
great_expectations.execution_engine.sqlalchemy_execution_engine.
_BIGQUERY_MODULE_NAME
= sqlalchemy_bigquery¶
-
great_expectations.execution_engine.sqlalchemy_execution_engine.
bigquery_types_tuple
¶
-
great_expectations.execution_engine.sqlalchemy_execution_engine.
teradatasqlalchemy
¶
-
great_expectations.execution_engine.sqlalchemy_execution_engine.
_get_dialect_type_module
(dialect)¶ Given a dialect, returns the dialect type, which defines the engine/system that is used to communicate with the database/database implementation. Currently checks for RedShift/BigQuery dialects.
-
class
great_expectations.execution_engine.sqlalchemy_execution_engine.
SqlAlchemyExecutionEngine
(name: Optional[str] = None, credentials: Optional[dict] = None, data_context: Optional[Any] = None, engine=None, connection_string: Optional[str] = None, url: Optional[str] = None, batch_data_dict: Optional[dict] = None, create_temp_table: bool = True, concurrency: Optional[ConcurrencyConfig] = None, **kwargs)¶ Bases:
great_expectations.execution_engine.ExecutionEngine
Helper class that provides a standard way to create an ABC using inheritance.
-
property
credentials
(self)¶
-
property
connection_string
(self)¶
-
property
url
(self)¶
-
_build_engine
(self, credentials: dict, **kwargs)¶ Using a set of given credentials, constructs an Execution Engine, connecting to a database using a URL or a private key path.
-
_get_sqlalchemy_key_pair_auth_url
(self, drivername: str, credentials: dict)¶ Utilizing a private key path and a passphrase in a given credentials dictionary, attempts to encode the provided values into a private key. If the passphrase is incorrect, this fails and an exception is raised.
- Parameters
drivername (str) –
credentials (dict) –
- Returns
a tuple consisting of a url with the serialized key-pair authentication, and a dictionary of engine kwargs.
-
get_domain_records
(self, domain_kwargs: Dict)¶ Uses the given domain kwargs (which include row_condition, condition_parser, and ignore_row_if directives) to obtain and/or query a batch. Returns the result as a SqlAlchemy table/column(s) object.
- Parameters
domain_kwargs (dict) –
- Returns
An SqlAlchemy table/column(s) (the selectable object for obtaining data on which to compute)
-
get_compute_domain
(self, domain_kwargs: Dict, domain_type: Union[str, MetricDomainTypes], accessor_keys: Optional[Iterable[str]] = None)¶ Uses a given batch dictionary and domain kwargs to obtain a SqlAlchemy column object.
- Parameters
domain_kwargs (dict) –
domain_type (str or MetricDomainTypes) – An enum value indicating which metric domain the user would like to be using, or a corresponding string value representing it. String types include "identity", "column", "column_pair", "table" and "other". Enum types include capitalized versions of these from the class MetricDomainTypes.
accessor_keys (str iterable) – Keys that are part of the compute domain but should be ignored when describing the domain and simply transferred with their associated values into accessor_domain_kwargs.
- Returns
SqlAlchemy column
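The role of accessor_keys described above can be illustrated with a minimal, library-free sketch. The helper name split_domain_kwargs and the sample keys are hypothetical and not part of great_expectations; the sketch only shows the idea of moving accessor keys, with their values, out of the domain kwargs into a separate dictionary.

```python
from typing import Dict, Iterable, Optional, Tuple


def split_domain_kwargs(
    domain_kwargs: Dict,
    accessor_keys: Optional[Iterable[str]] = None,
) -> Tuple[Dict, Dict]:
    """Hypothetical helper: separate compute-domain keys from accessor keys."""
    # Copy so that the caller's dictionary is left untouched.
    compute_domain_kwargs = dict(domain_kwargs)
    accessor_domain_kwargs = {}
    for key in accessor_keys or []:
        if key in compute_domain_kwargs:
            # Transfer the key and its value unchanged.
            accessor_domain_kwargs[key] = compute_domain_kwargs.pop(key)
    return compute_domain_kwargs, accessor_domain_kwargs


compute, accessor = split_domain_kwargs(
    {"column": "price", "row_condition": "price > 0", "batch_id": "abc"},
    accessor_keys=["column"],
)
```

Here compute keeps the keys that describe which rows to use, while accessor holds the key naming the column to read from them.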
-
resolve_metric_bundle
(self, metric_fn_bundle: Iterable[Tuple[MetricConfiguration, Any, dict, dict]])¶ For every metric in a set of Metrics to resolve, obtains necessary metric keyword arguments and builds bundles of the metrics into one large query dictionary so that they are all executed simultaneously. Will fail if bundling the metrics together is not possible.
- Args:
- metric_fn_bundle (Iterable[Tuple[MetricConfiguration, Callable, dict]]): An iterable of tuples, each containing a MetricProvider’s MetricConfiguration (its unique identifier), its metric provider function
(the function that actually executes the metric), and the arguments to pass to the metric provider function. A dictionary of metrics defined in the registry and corresponding arguments.
- Returns:
A dictionary of metric names and their corresponding now-queried values.
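The bundling idea can be sketched with the standard-library sqlite3 module. This is an illustration under stated assumptions, not the engine's actual implementation: the metrics mapping, the items table, and the metric names are all hypothetical. Several aggregate expressions are combined into one SELECT so that a single query returns every value.

```python
import sqlite3

# Hypothetical metric definitions: metric name -> SQL aggregate expression.
metrics = {
    "row_count": "COUNT(*)",
    "price_min": "MIN(price)",
    "price_max": "MAX(price)",
}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (price REAL)")
conn.executemany("INSERT INTO items VALUES (?)", [(1.5,), (4.0,), (2.5,)])

# Build one SELECT that computes every metric in a single query.
select_list = ", ".join(f"{expr} AS {name}" for name, expr in metrics.items())
row = conn.execute(f"SELECT {select_list} FROM items").fetchone()

# Map each metric name back to its queried value.
resolved = dict(zip(metrics.keys(), row))
conn.close()
```

If one expression in the bundle is invalid, the whole query fails, which mirrors the note that resolution fails when the metrics cannot be bundled together.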
-
close
(self)¶ Note: Will 20210729
This is a helper function that will close and dispose Sqlalchemy objects that are used to connect to a database. Databases like Snowflake require the connection and engine to be instantiated and closed separately, and not doing so has caused problems with hanging connections.
Currently the ExecutionEngine does not support handling the connection and engine separately, and will actually override the engine with a connection in some cases, obscuring which object the ExecutionEngine actually uses to connect to the external database. This will be handled in an upcoming refactor, which will allow this function to eventually become:
self.connection.close()
self.engine.dispose()
More background can be found here: https://github.com/great-expectations/great_expectations/pull/3104/
-
_split_on_whole_table
(self, table_name: str, batch_identifiers: dict)¶ ‘Split’ by returning the whole table
-
_split_on_column_value
(self, table_name: str, column_name: str, batch_identifiers: dict)¶ Split using the values in the named column
-
_split_on_converted_datetime
(self, table_name: str, column_name: str, batch_identifiers: dict, date_format_string: str = '%Y-%m-%d')¶ Convert the values in the named column to the given date_format, and split on that
-
_split_on_divided_integer
(self, table_name: str, column_name: str, divisor: int, batch_identifiers: dict)¶ Divide the values in the named column by divisor, and split on that
-
_split_on_mod_integer
(self, table_name: str, column_name: str, mod: int, batch_identifiers: dict)¶ Take the values in the named column modulo mod, and split on that
-
_split_on_multi_column_values
(self, table_name: str, column_names: List[str], batch_identifiers: dict)¶ Split on the joint values in the named columns
-
_split_on_hashed_column
(self, table_name: str, column_name: str, hash_digits: int, batch_identifiers: dict)¶ Split on the hashed value of the named column
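The splitting strategies above can be pictured as WHERE clauses that select the rows of one batch. The following sketch uses the standard-library sqlite3 module; the events table, the ids helper, and the chosen batch identifier values are hypothetical, and the SQL is only an approximation of what a real dialect would receive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, kind TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "a"), (2, "b"), (11, "a"), (12, "b"), (23, "a")],
)


def ids(where_clause: str, params: tuple) -> list:
    """Return the ids of the rows selected by one split condition."""
    sql = f"SELECT id FROM events WHERE {where_clause} ORDER BY id"
    return [r[0] for r in conn.execute(sql, params)]


# Split on column value: rows whose column equals the batch identifier.
by_value = ids("kind = ?", ("a",))
# Split on divided integer: rows whose id / 10 (integer division) equals 1.
by_divided = ids("id / 10 = ?", (1,))
# Split on mod integer: rows whose id % 10 equals 2.
by_mod = ids("id % 10 = ?", (2,))
conn.close()
```

Each distinct value of the split expression corresponds to one batch, and the batch identifier picks which of those values to keep.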
-
_sample_using_mod
(self, column_name: str, mod: int, value: int)¶ Take the mod of the named column, and only keep rows that match the given value
-
_sample_using_a_list
(self, column_name: str, value_list: list)¶ Match the values in the named column against value_list, and only keep the matches
-
_sample_using_md5
(self, column_name: str, hash_digits: int = 1, hash_value: str = 'f')¶ Hash the values in the named column with MD5, and only keep rows whose last hash_digits characters of the hash match hash_value
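The three sampling strategies can be sketched as row filters using the standard-library sqlite3 and hashlib modules. Everything here is an assumption made for illustration: the table t, the md5_hex helper, and the registration of an md5 SQL function (SQLite has none built in). The MD5 filter assumes that a row is kept when the tail of its hex digest equals the chosen value.

```python
import hashlib
import sqlite3


def md5_hex(value) -> str:
    """MD5 hex digest of the value's string form."""
    return hashlib.md5(str(value).encode()).hexdigest()


conn = sqlite3.connect(":memory:")
conn.create_function("md5", 1, md5_hex)  # expose md5_hex to SQL as md5()
conn.execute("CREATE TABLE t (id INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(20)])


def sample(where_clause: str) -> list:
    """Return the ids of the rows kept by one sampling condition."""
    sql = f"SELECT id FROM t WHERE {where_clause} ORDER BY id"
    return [r[0] for r in conn.execute(sql)]


# Mod sampling: keep rows where id % 5 equals 0.
mod_sample = sample("id % 5 = 0")
# List sampling: keep rows whose id appears in the value list.
list_sample = sample("id IN (3, 7, 99)")
# MD5 sampling: keep rows whose digest ends with the hex digit 'f'.
md5_sample = sample("substr(md5(id), -1) = 'f'")
conn.close()
```

With one hex digit, the MD5 filter keeps roughly one row in sixteen on well-distributed data, and the selection is repeatable because it depends only on the column values.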
-
_build_selectable_from_batch_spec
(self, batch_spec: BatchSpec)¶
-
get_batch_data_and_markers
(self, batch_spec: BatchSpec)¶
-
property