great_expectations.core.util¶
Module Contents¶
Classes¶
- AzureUrl — Parses an Azure Blob Storage URL into its separate components.
- GCSUrl — Parses a Google Cloud Storage URL into its separate components.
- S3Url
- DBFSPath — Methods for converting Databricks Filesystem (DBFS) paths.
Functions¶
- nested_update — Update d with items from u, recursively and joining elements. By default, list values are concatenated.
- in_jupyter_notebook
- in_databricks — Tests whether we are in a Databricks environment.
- convert_to_json_serializable — Helper function to convert an object to one that is json serializable.
- ensure_json_serializable — Helper function to convert an object to one that is json serializable.
- requires_lossy_conversion
- substitute_all_strftime_format_strings — Iterates over input data and, for all strings, replaces any strftime format elements.
- get_datetime_string_from_strftime_format — Takes a string with strftime format elements and substitutes those elements.
- parse_string_to_datetime
- datetime_to_int
- sniff_s3_compression — Attempts to get read_csv compression from s3_url.
- get_or_create_spark_application
- get_or_create_spark_session
- spark_restart_required
- get_sql_dialect_floating_point_infinity_value
-
great_expectations.core.util.
logger
¶
-
great_expectations.core.util.
sqlalchemy
¶
-
great_expectations.core.util.
SCHEMAS
¶
-
great_expectations.core.util.
pyspark
¶
-
great_expectations.core.util.
_SUFFIX_TO_PD_KWARG
¶
-
great_expectations.core.util.
nested_update
(d: Union[Iterable, dict], u: Union[Iterable, dict], dedup: bool = False, concat_lists: bool = True)¶ Update d with items from u, recursively and joining elements. By default, list values are concatenated without de-duplication. If concat_lists is set to False, lists in u (new dict) will replace those in d (base dict).
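The merge behaviour described above can be sketched with plain dicts. This is an illustrative reimplementation of the documented semantics, not the library's exact code:

```python
def nested_update(d, u, dedup=False, concat_lists=True):
    # Recursively merge u into d: nested dicts are merged key-by-key,
    # lists are concatenated (optionally de-duplicated), and other
    # values in u overwrite those in d.
    for key, value in u.items():
        if isinstance(value, dict) and isinstance(d.get(key), dict):
            nested_update(d[key], value, dedup=dedup, concat_lists=concat_lists)
        elif isinstance(value, list) and isinstance(d.get(key), list):
            if concat_lists:
                merged = d[key] + value
                d[key] = list(dict.fromkeys(merged)) if dedup else merged
            else:
                d[key] = value  # concat_lists=False: u's list replaces d's
        else:
            d[key] = value
    return d
```

For example, `nested_update({"a": {"x": [1]}}, {"a": {"x": [2], "y": 3}})` yields `{"a": {"x": [1, 2], "y": 3}}`.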
-
great_expectations.core.util.
in_jupyter_notebook
()¶
-
great_expectations.core.util.
in_databricks
() → bool¶ Tests whether we are in a Databricks environment.
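A minimal sketch of one common detection heuristic. The environment variable used here is an assumption, not necessarily the library's exact check:

```python
import os

def in_databricks() -> bool:
    # Assumed heuristic: Databricks runtimes export
    # DATABRICKS_RUNTIME_VERSION into the environment.
    return "DATABRICKS_RUNTIME_VERSION" in os.environ
```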
- Returns
bool
-
great_expectations.core.util.
convert_to_json_serializable
(data)¶ Helper function to convert an object to one that is json serializable :param data: an object to attempt to convert to a corresponding json-serializable object
- Returns
(dict) The converted object
Warning
data may also be converted in place.
-
great_expectations.core.util.
ensure_json_serializable
(data)¶ Helper function to convert an object to one that is json serializable :param data: an object to attempt to convert to a corresponding json-serializable object
- Returns
(dict) The converted object
Warning
data may also be converted in place.
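The conversion strategy behind both helpers above can be sketched with stdlib types only. This illustrative version recurses through containers and maps common non-JSON types onto JSON-friendly ones; the real helpers also handle numpy/pandas scalars and arrays:

```python
import datetime
import decimal

def convert_to_json_serializable(data):
    # Recurse through containers, converting non-JSON types along the way.
    if isinstance(data, dict):
        return {str(k): convert_to_json_serializable(v) for k, v in data.items()}
    if isinstance(data, (list, tuple, set)):
        return [convert_to_json_serializable(v) for v in data]
    if isinstance(data, (datetime.datetime, datetime.date)):
        return data.isoformat()
    if isinstance(data, decimal.Decimal):
        return float(data)  # may be lossy; cf. requires_lossy_conversion
    return data  # str/int/float/bool/None are already serializable
```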
-
great_expectations.core.util.
requires_lossy_conversion
(d)¶
-
great_expectations.core.util.
substitute_all_strftime_format_strings
(data: Union[dict, list, str, Any], datetime_obj: Optional[datetime.datetime] = None) → Union[str, Any]¶ This utility function will iterate over input data and for all strings, replace any strftime format elements using either the provided datetime_obj or the current datetime
-
great_expectations.core.util.
get_datetime_string_from_strftime_format
(format_str: str, datetime_obj: Optional[datetime.datetime] = None) → str¶ This utility function takes a string with strftime format elements and substitutes those elements using either the provided datetime_obj or current datetime
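The two strftime utilities above can be sketched together. This illustrative reimplementation assumes only the behaviour stated in the docstrings — walk the input, and run strftime substitution on every string found:

```python
import datetime

def substitute_all_strftime_format_strings(data, datetime_obj=None):
    # Replace strftime elements in every string in data, using the
    # provided datetime_obj or the current datetime.
    datetime_obj = datetime_obj or datetime.datetime.now()
    if isinstance(data, dict):
        return {k: substitute_all_strftime_format_strings(v, datetime_obj)
                for k, v in data.items()}
    if isinstance(data, list):
        return [substitute_all_strftime_format_strings(v, datetime_obj) for v in data]
    if isinstance(data, str):
        return datetime_obj.strftime(data)
    return data
```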
-
great_expectations.core.util.
parse_string_to_datetime
(datetime_string: str, datetime_format_string: Optional[str] = None) → datetime.date¶
-
great_expectations.core.util.
datetime_to_int
(dt: datetime.date) → int¶
-
class
great_expectations.core.util.
AzureUrl
(url: str)¶ Parses an Azure Blob Storage URL into its separate components. Formats:
- WASBS (for Spark): "wasbs://<CONTAINER>@<ACCOUNT_NAME>.blob.core.windows.net/<BLOB>"
- HTTP(S) (for Pandas): "<ACCOUNT_NAME>.blob.core.windows.net/<CONTAINER>/<BLOB>"
Reference: WASBS – Windows Azure Storage Blob (https://datacadamia.com/azure/wasb).
-
AZURE_BLOB_STORAGE_PROTOCOL_DETECTION_REGEX_PATTERN
:str = ^[^@]+@.+\.blob\.core\.windows\.net\/.+$¶
-
AZURE_BLOB_STORAGE_HTTPS_URL_REGEX_PATTERN
:str = ^(https?:\/\/)?(.+?)\.blob\.core\.windows\.net/([^/]+)/(.+)$¶
-
AZURE_BLOB_STORAGE_HTTPS_URL_TEMPLATE
:str = {account_name}.blob.core.windows.net/{container}/{path}¶
-
AZURE_BLOB_STORAGE_WASBS_URL_REGEX_PATTERN
:str = ^(wasbs?:\/\/)?([^/]+)@(.+?)\.blob\.core\.windows\.net/(.+)$¶
-
AZURE_BLOB_STORAGE_WASBS_URL_TEMPLATE
:str = wasbs://{container}@{account_name}.blob.core.windows.net/{path}¶
-
property
protocol
(self)¶
-
property
account_name
(self)¶
-
property
account_url
(self)¶
-
property
container
(self)¶
-
property
blob
(self)¶
-
-
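The WASBS regex pattern listed above can be applied directly with the stdlib `re` module; the pattern string is taken verbatim from AZURE_BLOB_STORAGE_WASBS_URL_REGEX_PATTERN, and the sample URL is illustrative:

```python
import re

# The WASBS pattern from AZURE_BLOB_STORAGE_WASBS_URL_REGEX_PATTERN above.
WASBS_PATTERN = r"^(wasbs?:\/\/)?([^/]+)@(.+?)\.blob\.core\.windows\.net/(.+)$"

match = re.match(WASBS_PATTERN, "wasbs://data@myaccount.blob.core.windows.net/raw/2021/file.csv")
protocol, container, account_name, blob = match.groups()
```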
class
great_expectations.core.util.
GCSUrl
(url: str)¶ Parses a Google Cloud Storage URL into its separate components. Format: gs://<BUCKET_OR_NAME>/<BLOB>
-
URL_REGEX_PATTERN
:str = ^gs://([^/]+)/(.+)$¶
-
OBJECT_URL_TEMPLATE
:str = gs://{bucket_or_name}/{path}¶
-
property
bucket
(self)¶
-
property
blob
(self)¶
-
-
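The URL_REGEX_PATTERN listed above can likewise be exercised directly; the sample URL is illustrative:

```python
import re

# The URL_REGEX_PATTERN from the GCSUrl class above.
GCS_PATTERN = r"^gs://([^/]+)/(.+)$"

bucket_or_name, blob = re.match(GCS_PATTERN, "gs://my-bucket/folder/file.csv").groups()
```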
class
great_expectations.core.util.
S3Url
(url)¶
>>> s = S3Url("s3://bucket/hello/world")
>>> s.bucket
'bucket'
>>> s.key
'hello/world'
>>> s.url
's3://bucket/hello/world'
>>> s = S3Url("s3://bucket/hello/world?qwe1=3#ddd")
>>> s.bucket
'bucket'
>>> s.key
'hello/world?qwe1=3#ddd'
>>> s.url
's3://bucket/hello/world?qwe1=3#ddd'
>>> s = S3Url("s3://bucket/hello/world#foo?bar=2")
>>> s.key
'hello/world#foo?bar=2'
>>> s.url
's3://bucket/hello/world#foo?bar=2'
-
OBJECT_URL_TEMPLATE
:str = s3a://{bucket}/{path}¶
-
property
bucket
(self)¶
-
property
key
(self)¶
-
property
suffix
(self)¶ Attempts to get a file suffix from the S3 key. If can’t find one returns None.
-
property
url
(self)¶
-
-
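The parsing behaviour shown in the S3Url doctests can be sketched in a few lines; this is a minimal stand-alone illustration, not the class's actual implementation:

```python
def parse_s3_url(url: str):
    # Per the doctests above: the bucket is everything between "s3://" and
    # the first "/", and the key is the remainder verbatim (query strings
    # and fragments included).
    rest = url.split("://", 1)[1]
    bucket, _, key = rest.partition("/")
    return bucket, key
```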
class
great_expectations.core.util.
DBFSPath
¶ Methods for converting Databricks Filesystem (DBFS) paths
-
static
convert_to_protocol_version
(path: str)¶
-
static
convert_to_file_semantics_version
(path: str)¶
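A sketch of what these two conversions plausibly do, based on the two standard DBFS path spellings. The assumed semantics: the "protocol" form is `dbfs:/...` and the "file semantics" form is the FUSE-mount path `/dbfs/...`; this may not match the library's exact edge-case handling:

```python
def convert_to_protocol_version(path: str) -> str:
    # Assumed: rewrite a mount path "/dbfs/..." to the protocol form "dbfs:/...".
    if path.startswith("/dbfs"):
        return "dbfs:" + path[len("/dbfs"):]
    return path

def convert_to_file_semantics_version(path: str) -> str:
    # Assumed: rewrite "dbfs:/..." to the "/dbfs/..." mount path.
    if path.startswith("dbfs:"):
        return "/dbfs" + path[len("dbfs:"):]
    return path
```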
-
great_expectations.core.util.
sniff_s3_compression
(s3_url: S3Url) → str¶ Attempts to get read_csv compression from s3_url
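The idea behind this sniffing can be sketched as a suffix lookup. The table below is hypothetical and merely illustrates the role of the `_SUFFIX_TO_PD_KWARG` mapping listed earlier; the library's actual table and fallback may differ:

```python
# Hypothetical suffix-to-kwarg table (illustrative, not the library's).
SUFFIX_TO_COMPRESSION = {".gz": "gzip", ".bz2": "bz2", ".zip": "zip", ".xz": "xz"}

def sniff_s3_compression(s3_key: str) -> str:
    # Pick a pandas read_csv compression= value from the key's suffix,
    # falling back to "infer" when no known suffix matches.
    for suffix, kwarg in SUFFIX_TO_COMPRESSION.items():
        if s3_key.endswith(suffix):
            return kwarg
    return "infer"
```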
-
great_expectations.core.util.
get_or_create_spark_application
(spark_config: Optional[Dict[str, str]] = None, force_reuse_spark_context: bool = False)¶
-
great_expectations.core.util.
get_or_create_spark_session
(spark_config: Optional[Dict[str, str]] = None)¶
-
great_expectations.core.util.
spark_restart_required
(current_spark_config: List[tuple], desired_spark_config: dict) → bool¶
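A simplified sketch of the comparison this signature implies: a restart is flagged whenever a desired setting differs from the active one. This is an assumption about the logic; the real check may exempt settings Spark can change at runtime:

```python
def spark_restart_required(current_spark_config, desired_spark_config) -> bool:
    # current_spark_config is a list of (key, value) tuples, as in the
    # signature above; desired_spark_config is a plain dict.
    current = dict(current_spark_config)
    return any(current.get(k) != v for k, v in desired_spark_config.items())
```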
-
great_expectations.core.util.
get_sql_dialect_floating_point_infinity_value
(schema: str, negative: bool = False) → float¶