great_expectations.util
Module Contents
-
great_expectations.util.
black
¶
-
great_expectations.util.
logger
¶
-
great_expectations.util.
sa
¶
-
great_expectations.util.
p1
¶
-
great_expectations.util.
p2
¶
-
class
great_expectations.util.
bidict
(*args: List[Any], **kwargs: Dict[str, Any])¶ Bases:
dict
Bi-directional hashmap: https://stackoverflow.com/a/21894086
-
__setitem__
(self, key: str, value: Any)¶ Set self[key] to value.
-
__delitem__
(self, key: str)¶ Delete self[key].
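The idea behind a bi-directional hashmap can be sketched as a `dict` subclass that maintains a reverse index alongside the forward mapping. The class name `BiDictSketch` and the `inverse` attribute below are illustrative assumptions, not the definition used by `great_expectations.util.bidict`.

```python
class BiDictSketch(dict):
    """Hypothetical sketch: a dict that also tracks value -> [keys]."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Build the reverse index from the initial contents.
        self.inverse = {}
        for key, value in self.items():
            self.inverse.setdefault(value, []).append(key)

    def _unlink(self, key):
        # Remove `key` from the reverse index entry of its current value.
        old_value = self[key]
        keys = self.inverse[old_value]
        keys.remove(key)
        if not keys:
            del self.inverse[old_value]

    def __setitem__(self, key, value):
        if key in self:
            self._unlink(key)
        super().__setitem__(key, value)
        self.inverse.setdefault(value, []).append(key)

    def __delitem__(self, key):
        self._unlink(key)
        super().__delitem__(key)


mapping = BiDictSketch({"a": 1, "b": 1})
mapping["a"] = 2
del mapping["b"]
```

Overriding both `__setitem__` and `__delitem__` is what keeps the two directions consistent; values must be hashable for this to work.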
-
-
great_expectations.util.
camel_to_snake
(name: str) → str¶
-
great_expectations.util.
underscore
(word: str) → str¶ Borrowed from inflection.underscore. Make an underscored, lowercase form from the expression in the string.
Example:
>>> underscore("DeviceType")
'device_type'
As a rule of thumb you can think of underscore() as the inverse of camelize(), though there are cases where that does not hold:
>>> camelize(underscore("IOError"))
'IoError'
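A minimal sketch of this kind of conversion, written with two regular-expression passes in the style of the `inflection` package. The function name `underscore_sketch` is hypothetical, and the exact patterns are an assumption about how such a helper is typically written.

```python
import re


def underscore_sketch(word: str) -> str:
    # Split an acronym from a following capitalized word: "IOError" -> "IO_Error".
    word = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", word)
    # Split a lowercase letter or digit from a following capital:
    # "DeviceType" -> "Device_Type".
    word = re.sub(r"([a-z\d])([A-Z])", r"\1_\2", word)
    # Treat hyphens as separators, then lowercase the result.
    return word.replace("-", "_").lower()
```

Because the acronym "IO" is lowercased to "io", converting back to camel case cannot recover the original capitalization, which is why the round trip is not always an inverse.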
-
great_expectations.util.
hyphen
(txt: str)¶
-
great_expectations.util.
profile
(func: Callable) → Callable¶
-
great_expectations.util.
measure_execution_time
(execution_time_holder_object_reference_name: str = 'execution_time_holder', execution_time_property_name: str = 'execution_time', method: str = 'process_time', pretty_print: bool = True, include_arguments: bool = True) → Callable¶ Parameterizes template “execution_time_decorator” function with options, supplied as arguments.
- Parameters
execution_time_holder_object_reference_name – Handle, provided in “kwargs”, holds execution time property setter.
execution_time_property_name – Property attribute name, provided in “kwargs”, sets execution time value.
method – Name of method in “time” module (default: “process_time”) to be used for recording timestamps.
pretty_print – If True (default), prints execution time summary to standard output; if False, “silent” mode.
include_arguments – If True (default), prints arguments of function, whose execution time is measured.
Note: Method “time.perf_counter()” keeps going during sleep, while method “time.process_time()” does not. “time.process_time()” is therefore the better suited method for measuring the computational efficiency of code.
- Returns
Callable – configured “execution_time_decorator” function.
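The difference between the two timers mentioned in the note can be seen with a small parameterized decorator. This is a sketch only: `measure_time_sketch` and its `last_elapsed` attribute are hypothetical names and do not reproduce the options of `measure_execution_time`.

```python
import functools
import time


def measure_time_sketch(method: str = "process_time", pretty_print: bool = True):
    """Hypothetical decorator factory; `method` names a function in the `time` module."""
    timer = getattr(time, method)

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = timer()
            try:
                return func(*args, **kwargs)
            finally:
                # Record the elapsed time on the wrapper for later inspection.
                wrapper.last_elapsed = timer() - start
                if pretty_print:
                    print(f"{func.__name__} took {wrapper.last_elapsed:.6f} s ({method})")

        return wrapper

    return decorator


@measure_time_sketch(method="perf_counter", pretty_print=False)
def nap_wall():
    time.sleep(0.05)


@measure_time_sketch(method="process_time", pretty_print=False)
def nap_cpu():
    time.sleep(0.05)


nap_wall()  # wall-clock timer: the sleep is counted
nap_cpu()   # CPU timer: the sleep is (almost entirely) not counted
```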
-
great_expectations.util.
get_project_distribution
() → Optional[Distribution]¶
-
great_expectations.util.
get_currently_executing_function
() → Callable¶
-
great_expectations.util.
get_currently_executing_function_call_arguments
(include_module_name: bool = False, include_caller_names: bool = False, **kwargs) → dict¶ - Parameters
include_module_name – bool If True, module name will be determined and included in output dictionary (default is False)
include_caller_names – bool If True, arguments, such as “self” and “cls”, if present, will be included in output dictionary (default is False)
kwargs –
- Returns
dict Output dictionary, consisting of call arguments as attribute “name: value” pairs.
Example usage:
# Gather the call arguments of the present function (include the “module_name” and add the “class_name”), filter
# out the Falsy values, and set the instance “_config” variable equal to the resulting dictionary.
self._config = get_currently_executing_function_call_arguments(
)
filter_properties_dict(properties=self._config, clean_falsy=True, inplace=True)
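One way such a helper can collect the caller's arguments is by inspecting the calling frame. The sketch below uses only the standard `inspect` module; `call_arguments_sketch` and the `Example` class are hypothetical and are not the library's implementation.

```python
import inspect


def call_arguments_sketch(include_caller_names: bool = False) -> dict:
    # Frame of the function that called this helper.
    frame = inspect.currentframe().f_back
    info = inspect.getargvalues(frame)
    arguments = {
        name: info.locals[name]
        for name in info.args
        if include_caller_names or name not in ("self", "cls")
    }
    # Merge in any **kwargs captured by the caller.
    if info.keywords:
        arguments.update(info.locals[info.keywords])
    return arguments


class Example:
    def __init__(self, name, retries=3, **options):
        self._config = call_arguments_sketch()
```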
-
great_expectations.util.
verify_dynamic_loading_support
(module_name: str, package_name: Optional[str] = None) → None¶ - Parameters
module_name – a possibly-relative name of a module
package_name – the name of a package, to which the given module belongs
-
great_expectations.util.
import_library_module
(module_name: str) → Optional[ModuleType]¶ - Parameters
module_name – a fully-qualified name of a module (e.g., “great_expectations.dataset.sqlalchemy_dataset”)
- Returns
the imported module object, or None if the module cannot be imported
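The `Optional[ModuleType]` return annotation suggests a "try to import, otherwise return None" pattern. A minimal sketch of that pattern with the standard `importlib` module follows; the function name `import_module_or_none` is hypothetical.

```python
import importlib
from types import ModuleType
from typing import Optional


def import_module_or_none(module_name: str) -> Optional[ModuleType]:
    # Return the module object, or None when the import fails.
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None
```

A caller can then test the result against `None` to decide whether an optional dependency is available.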
-
great_expectations.util.
is_library_loadable
(library_name: str) → bool¶
-
great_expectations.util.
load_class
(class_name: str, module_name: str)¶
-
great_expectations.util.
_convert_to_dataset_class
(df, dataset_class, expectation_suite=None, profiler=None)¶ Convert a (pandas) dataframe to a great_expectations dataset, with (optional) expectation_suite
- Parameters
df – the DataFrame object to convert
dataset_class – the class to which to convert the existing DataFrame
expectation_suite – the expectation suite that should be attached to the resulting dataset
profiler – the profiler to use to generate baseline expectations, if any
- Returns
A new Dataset object
-
great_expectations.util.
_load_and_convert_to_dataset_class
(df, class_name, module_name, expectation_suite=None, profiler=None)¶ Convert a (pandas) dataframe to a great_expectations dataset, with (optional) expectation_suite
- Parameters
df – the DataFrame object to convert
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
expectation_suite – the expectation suite that should be attached to the resulting dataset
profiler – the profiler to use to generate baseline expectations, if any
- Returns
A new Dataset object
-
great_expectations.util.
read_csv
(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None, *args, **kwargs)¶ Read a file using Pandas read_csv and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset
-
great_expectations.util.
read_json
(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, accessor_func=None, profiler=None, *args, **kwargs)¶ Read a file using Pandas read_json and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
accessor_func (Callable) – functions to transform the json object in the file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset
-
great_expectations.util.
read_excel
(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None, *args, **kwargs)¶ Read a file using Pandas read_excel and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset or ordered dict of great_expectations datasets, if multiple worksheets are imported
-
great_expectations.util.
read_table
(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None, *args, **kwargs)¶ Read a file using Pandas read_table and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset
-
great_expectations.util.
read_feather
(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None, *args, **kwargs)¶ Read a file using Pandas read_feather and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset
-
great_expectations.util.
read_parquet
(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None, *args, **kwargs)¶ Read a file using Pandas read_parquet and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset
-
great_expectations.util.
from_pandas
(pandas_df, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None)¶ Read a Pandas data frame and return a great_expectations dataset.
- Parameters
pandas_df (Pandas df) – Pandas data frame
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (profiler class) – The profiler that should be run on the dataset to establish a baseline expectation suite.
- Returns
great_expectations dataset
-
great_expectations.util.
read_pickle
(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None, *args, **kwargs)¶ Read a file using Pandas read_pickle and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset
-
great_expectations.util.
read_sas
(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None, *args, **kwargs)¶ Read a file using Pandas read_sas and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset
-
great_expectations.util.
build_in_memory_runtime_context
() → 'BaseDataContext'¶ Create generic in-memory “BaseDataContext” context for manipulations as required by tests.
-
great_expectations.util.
validate
(data_asset, expectation_suite=None, data_asset_name=None, expectation_suite_name=None, data_context=None, data_asset_class_name=None, data_asset_module_name='great_expectations.dataset', data_asset_class=None, *args, **kwargs)¶ Validate the provided data asset. Validate can accept an optional data_asset_name to apply, data_context to use to fetch an expectation_suite if one is not provided, and data_asset_class_name/data_asset_module_name or data_asset_class to use to provide custom expectations.
- Parameters
data_asset – the asset to validate
expectation_suite – the suite to use, or None to fetch one using a DataContext
data_asset_name – the name of the data asset to use
expectation_suite_name – the name of the expectation_suite to use
data_context – data context to use to fetch an expectation suite, or the path from which to obtain one
data_asset_class_name – the name of a class to dynamically load a DataAsset class
data_asset_module_name – the name of the module to dynamically load a DataAsset class
data_asset_class – a class to use. overrides data_asset_class_name/ data_asset_module_name if provided
*args –
**kwargs –
Returns:
-
great_expectations.util.
gen_directory_tree_str
(startpath)¶ Print the structure of directory as a tree:
Ex:
project_dir0/
    AAA/
    BBB/
        aaa.txt
        bbb.txt
Note: files and directories are sorted alphabetically, so that this method can be used for testing.
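A tree string of this shape can be produced with `os.walk`, sorting directories and files so that the output is deterministic. This is a sketch under the assumption of four-space indentation per level; `directory_tree_sketch` is a hypothetical name.

```python
import os


def directory_tree_sketch(startpath: str) -> str:
    startpath = os.path.normpath(startpath)
    lines = []
    for root, dirs, files in os.walk(startpath):
        # Sorting in place controls the order in which os.walk descends.
        dirs.sort()
        rel = os.path.relpath(root, startpath)
        level = 0 if rel == os.curdir else rel.count(os.sep) + 1
        lines.append(" " * 4 * level + os.path.basename(root) + "/")
        for name in sorted(files):
            lines.append(" " * 4 * (level + 1) + name)
    return "\n".join(lines) + "\n"
```

Because the ordering does not depend on the file system's listing order, the returned string can be compared against a fixed expected value in a test.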
-
great_expectations.util.
lint_code
(code: str) → str¶ Lint strings of code passed in. Optional dependency “black” must be installed.
-
great_expectations.util.
convert_json_string_to_be_python_compliant
(code: str) → str¶ Cleans JSON-formatted string to adhere to Python syntax
Substitute instances of ‘null’ with ‘None’ in string representations of Python dictionaries. Additionally, substitutes instances of ‘true’ or ‘false’ with their Python equivalents.
- Parameters
code – JSON string to update
- Returns
Clean, Python-compliant string
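The substitution described above can be sketched with a single word-boundary regular expression. The function name is hypothetical, and this simplified version would also rewrite a bare word such as `null` inside a quoted string value, which is an assumption worth keeping in mind.

```python
import re

_REPLACEMENTS = {"null": "None", "true": "True", "false": "False"}
_PATTERN = re.compile(r"\b(null|true|false)\b")


def json_to_python_literals_sketch(code: str) -> str:
    # Replace whole-word JSON literals with their Python equivalents.
    return _PATTERN.sub(lambda match: _REPLACEMENTS[match.group(1)], code)
```

The word boundaries keep identifiers such as `nullable` untouched.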
-
great_expectations.util.
_convert_nulls_to_None
(code: str) → str¶
-
great_expectations.util.
_convert_json_bools_to_python_bools
(code: str) → str¶
-
great_expectations.util.
filter_properties_dict
(properties: Optional[dict] = None, keep_fields: Optional[Set[str]] = None, delete_fields: Optional[Set[str]] = None, clean_nulls: bool = True, clean_falsy: bool = False, keep_falsy_numerics: bool = True, inplace: bool = False) → Optional[dict]¶ Filter the entries of the source dictionary according to directives concerning the existing keys and values.
- Parameters
properties – source dictionary to be filtered according to the supplied filtering directives
keep_fields – list of keys that must be retained, with the understanding that all other entries will be deleted
delete_fields – list of keys that must be deleted, with the understanding that all other entries will be retained
clean_nulls – If True, then in addition to other filtering directives, delete entries, whose values are None
clean_falsy – If True, then in addition to other filtering directives, delete entries, whose values are Falsy. (If the "clean_falsy" argument is specified as "True", then "clean_nulls" is assumed to be "True" as well.)
inplace – If True, then modify the source properties dictionary; otherwise, make a copy for filtering purposes
keep_falsy_numerics – If True, then in addition to other filtering directives, do not delete zero-valued numerics
- Returns
The (possibly) filtered properties dictionary (or None if no entries remain after filtering is performed)
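How the filtering directives interact can be illustrated with a simplified, non-recursive stand-in. `filter_dict_sketch` is hypothetical; in particular, treating `bool` values as non-numeric and rejecting simultaneous `keep_fields` and `delete_fields` are assumptions of this sketch, not documented behavior.

```python
from numbers import Number
from typing import Optional, Set


def filter_dict_sketch(
    properties: dict,
    keep_fields: Optional[Set[str]] = None,
    delete_fields: Optional[Set[str]] = None,
    clean_nulls: bool = True,
    clean_falsy: bool = False,
    keep_falsy_numerics: bool = True,
) -> Optional[dict]:
    if keep_fields and delete_fields:
        raise ValueError("Specify keep_fields or delete_fields, not both.")
    if clean_falsy:
        clean_nulls = True  # cleaning falsy values implies cleaning None
    result = {}
    for key, value in properties.items():
        if keep_fields is not None and key not in keep_fields:
            continue
        if delete_fields is not None and key in delete_fields:
            continue
        if clean_nulls and value is None:
            continue
        if clean_falsy and not value:
            is_numeric = isinstance(value, Number) and not isinstance(value, bool)
            if not (keep_falsy_numerics and is_numeric):
                continue
        result[key] = value
    # Mirror the documented contract: None when nothing remains.
    return result or None
```

For example, with `clean_falsy=True` the entry `"b": 0` survives while `"c": ""` is dropped, because zero-valued numerics are kept by default.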
-
great_expectations.util.
deep_filter_properties_iterable
(properties: dict, keep_fields: Optional[Set[str]] = ..., delete_fields: Optional[Set[str]] = ..., clean_nulls: bool = ..., clean_falsy: bool = ..., keep_falsy_numerics: bool = ..., inplace: bool = ...) → dict¶
-
great_expectations.util.
deep_filter_properties_iterable
(properties: list, keep_fields: Optional[Set[str]] = ..., delete_fields: Optional[Set[str]] = ..., clean_nulls: bool = ..., clean_falsy: bool = ..., keep_falsy_numerics: bool = ..., inplace: bool = ...) → list
-
great_expectations.util.
deep_filter_properties_iterable
(properties: set, keep_fields: Optional[Set[str]] = ..., delete_fields: Optional[Set[str]] = ..., clean_nulls: bool = ..., clean_falsy: bool = ..., keep_falsy_numerics: bool = ..., inplace: bool = ...) → set
-
great_expectations.util.
deep_filter_properties_iterable
(properties: tuple, keep_fields: Optional[Set[str]] = ..., delete_fields: Optional[Set[str]] = ..., clean_nulls: bool = ..., clean_falsy: bool = ..., keep_falsy_numerics: bool = ..., inplace: bool = ...) → tuple
-
great_expectations.util.
deep_filter_properties_iterable
(properties: None, keep_fields: Optional[Set[str]] = ..., delete_fields: Optional[Set[str]] = ..., clean_nulls: bool = ..., clean_falsy: bool = ..., keep_falsy_numerics: bool = ..., inplace: bool = ...) → None
-
great_expectations.util.
deep_filter_properties_iterable
(properties: Union[dict, list, set, tuple, None] = None, keep_fields: Optional[Set[str]] = None, delete_fields: Optional[Set[str]] = None, clean_nulls: bool = True, clean_falsy: bool = False, keep_falsy_numerics: bool = True, inplace: bool = False) → Union[dict, list, set, tuple, None]
-
great_expectations.util.
_is_to_be_removed_from_deep_filter_properties_iterable
(value: Any, clean_nulls: bool, clean_falsy: bool, keep_falsy_numerics: bool) → bool¶
-
great_expectations.util.
is_truthy
(value: Any) → bool¶
-
great_expectations.util.
is_numeric
(value: Any) → bool¶
-
great_expectations.util.
is_int
(value: Any) → bool¶
-
great_expectations.util.
is_float
(value: Any) → bool¶
-
great_expectations.util.
is_nan
(value: Any) → bool¶ If value is an array, test element-wise for NaN and return result as a boolean array. If value is a scalar, return boolean.
- Parameters
value – The value to test
- Returns
The results of the test
-
great_expectations.util.
convert_decimal_to_float
(d: decimal.Decimal) → float¶ This method converts “decimal.Decimal” to standard “float” type.
-
great_expectations.util.
requires_lossy_conversion
(d: decimal.Decimal) → bool¶ This method determines whether or not conversion from “decimal.Decimal” to standard “float” type would lose precision (that is, cannot be lossless).
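One possible way to detect a lossy conversion is to round the value to the number of decimal digits a `float` can represent faithfully (`sys.float_info.dig`) and compare it with the original. This is a sketch of that idea and an assumption about the approach, not the library's definition; the function name is hypothetical.

```python
import decimal
import sys


def requires_lossy_conversion_sketch(d: decimal.Decimal) -> bool:
    # Context whose precision matches the digits a float preserves.
    float_precision = decimal.Context(prec=sys.float_info.dig)
    # If rounding to that precision changes the value, digits would be lost.
    return d != float_precision.create_decimal(d)
```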
-
great_expectations.util.
isclose
(operand_a: Union[datetime.datetime, datetime.timedelta, Number], operand_b: Union[datetime.datetime, datetime.timedelta, Number], rtol: float = 1e-05, atol: float = 1e-08, equal_nan: bool = False) → bool¶ Checks whether or not two numbers (or timestamps) are approximately close to one another.
According to “https://numpy.org/doc/stable/reference/generated/numpy.isclose.html”, for finite values, isclose uses the following equation to test whether two floating point values are equivalent: “absolute(a - b) <= (atol + rtol * absolute(b))”.
This translates to: “absolute(operand_a - operand_b) <= (atol + rtol * absolute(operand_b))”, where “operand_a” is the “target” quantity under evaluation for being close to a “control” value, and “operand_b” serves as the “control” (“reference”) value.
The value of the absolute tolerance (“atol”) parameter is chosen as a sufficiently small constant for most floating point machine representations (e.g., 1.0e-8), so that even if the “control” value is small in magnitude and “target” and “control” are close in absolute value, then the accuracy of the assessment can still be high up to the precision of the “atol” value (here, 8 digits as the default). However, when the “control” value is large in magnitude, the relative tolerance (“rtol”) parameter carries a greater weight in the comparison assessment, because the acceptable deviation between the two quantities can be relatively larger for them to be deemed as “close enough” in this case.
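The inequality above can be written out directly for plain numbers. The sketch below (hypothetical name `isclose_sketch`, numeric operands only) also shows that the test is not symmetric, because only the “control” operand scales the relative tolerance.

```python
def isclose_sketch(operand_a: float, operand_b: float,
                   rtol: float = 1e-05, atol: float = 1e-08) -> bool:
    # absolute(a - b) <= (atol + rtol * absolute(b)), with b as the reference.
    return abs(operand_a - operand_b) <= (atol + rtol * abs(operand_b))


# Small reference value: the absolute tolerance dominates.
near_zero = isclose_sketch(1e-9, 0.0)              # 1e-9 <= 1e-8
# Large reference value: the relative tolerance dominates.
large = isclose_sketch(1_000_000.0, 1_000_001.0)   # 1 <= ~10
# Swapping the operands changes the reference and can change the result.
forward = isclose_sketch(10.0, 20.0, rtol=0.5, atol=0.0)   # 10 <= 0.5 * 20
backward = isclose_sketch(20.0, 10.0, rtol=0.5, atol=0.0)  # 10 <= 0.5 * 10 fails
```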
-
great_expectations.util.
is_candidate_subset_of_target
(candidate: Any, target: Any) → bool¶ This method checks whether or not candidate object is subset of target object.
-
great_expectations.util.
is_parseable_date
(value: Any, fuzzy: bool = False) → bool¶
-
great_expectations.util.
is_ndarray_datetime_dtype
(data: np.ndarray, parse_strings_as_datetimes: bool = False, fuzzy: bool = False) → bool¶ Determine whether or not all elements of 1-D “np.ndarray” argument are “datetime.datetime” type objects.
-
great_expectations.util.
convert_ndarray_to_datetime_dtype_best_effort
(data: np.ndarray, datetime_detected: bool = False, parse_strings_as_datetimes: bool = False, fuzzy: bool = False) → Tuple[bool, bool, np.ndarray]¶ Attempt to parse all elements of 1-D “np.ndarray” argument into “datetime.datetime” type objects.
- Returns
Boolean flag – True if all elements of original “data” were “datetime.datetime” type objects; False, otherwise.
Boolean flag – True, if conversion was performed; False, otherwise.
Output “np.ndarray” (converted, if necessary).
-
great_expectations.util.
convert_ndarray_datetime_to_float_dtype_utc_timezone
(data: np.ndarray) → np.ndarray¶ Convert all elements of 1-D “np.ndarray” argument from “datetime.datetime” type to “timestamp” “float” type objects.
Note: Conversion of “datetime.datetime” to “float” uses “UTC” TimeZone to normalize all “datetime.datetime” values.
-
great_expectations.util.
convert_ndarray_float_to_datetime_dtype
(data: np.ndarray) → np.ndarray¶ Convert all elements of 1-D “np.ndarray” argument from “float” type to “datetime.datetime” type objects.
Note: Converts to “naive” “datetime.datetime” values (assumes “UTC” TimeZone based floating point timestamps).
-
great_expectations.util.
convert_ndarray_float_to_datetime_tuple
(data: np.ndarray) → Tuple[datetime.datetime, ...]¶ Convert all elements of 1-D “np.ndarray” argument from “float” type to “datetime.datetime” type tuple elements.
Note: Converts to “naive” “datetime.datetime” values (assumes “UTC” TimeZone based floating point timestamps).
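The UTC normalization described in these notes can be illustrated per element with the standard `datetime` module. The sketch works on plain lists rather than “np.ndarray” objects, and treating naive values as UTC is an assumption; both function names are hypothetical.

```python
from datetime import datetime, timezone
from typing import List


def datetimes_to_utc_timestamps(values: List[datetime]) -> List[float]:
    # Naive values are interpreted as UTC; aware values already denote an instant.
    return [
        (v if v.tzinfo is not None else v.replace(tzinfo=timezone.utc)).timestamp()
        for v in values
    ]


def utc_timestamps_to_naive_datetimes(values: List[float]) -> List[datetime]:
    # Build the UTC datetime, then drop tzinfo to obtain a "naive" value.
    return [
        datetime.fromtimestamp(v, tz=timezone.utc).replace(tzinfo=None)
        for v in values
    ]
```

Using an explicit UTC time zone in both directions keeps the round trip independent of the machine's local time zone.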
-
great_expectations.util.
is_ndarray_decimal_dtype
(data: npt.NDArray) → TypeGuard['npt.NDArray']¶ Determine whether or not all elements of 1-D “np.ndarray” argument are “decimal.Decimal” type objects.
-
great_expectations.util.
convert_ndarray_decimal_to_float_dtype
(data: np.ndarray) → np.ndarray¶ Convert all elements of N-D “np.ndarray” argument from “decimal.Decimal” type to “float” type objects.
-
great_expectations.util.
get_context
(project_config: Optional[Union['DataContextConfig', Mapping]] = None, context_root_dir: Optional[str] = None, runtime_environment: Optional[dict] = None, cloud_base_url: Optional[str] = None, cloud_access_token: Optional[str] = None, cloud_organization_id: Optional[str] = None, cloud_mode: Optional[bool] = None, ge_cloud_base_url: Optional[str] = None, ge_cloud_access_token: Optional[str] = None, ge_cloud_organization_id: Optional[str] = None, ge_cloud_mode: Optional[bool] = None) → Union['DataContext', 'BaseDataContext', 'CloudDataContext']¶ Method to return the appropriate DataContext depending on parameters and environment.
- Usage:
import great_expectations as gx
my_context = gx.get_context([parameters])
- If gx.get_context() is run in a filesystem where great_expectations init has been run, then it will return a DataContext.
- If gx.get_context() is passed in a context_root_dir (which contains great_expectations.yml), then it will return a DataContext.
- If gx.get_context() is passed in an in-memory project_config, then it will return a BaseDataContext. context_root_dir can also be passed in, but the configurations from the in-memory config will override the configurations in the great_expectations.yml file.
- If GX is being run in the cloud, and the information needed for ge_cloud_config (i.e., ge_cloud_base_url, ge_cloud_access_token, ge_cloud_organization_id) is passed in as parameters to get_context(), configured as environment variables, or in a .conf file, then get_context() will return a CloudDataContext.
get_context params | Env Not Config’d | Env Config’d
() | Local | Cloud
(cloud_mode=True) | Exception! | Cloud
(cloud_mode=False) | Local | Local
TODO: This method will eventually return FileDataContext and EphemeralDataContext, rather than DataContext and Base
- Parameters
project_config (dict or DataContextConfig) – In-memory configuration for DataContext.
context_root_dir (str) – Path to directory that contains great_expectations.yml file
runtime_environment (dict) – A dictionary of values can be passed to a DataContext when it is instantiated. These values will override both values from the config variables file and from environment variables.
The following parameters are relevant when running ge_cloud:
cloud_base_url (str) – url for ge_cloud endpoint.
cloud_access_token (str) – access_token for ge_cloud account.
cloud_organization_id (str) – org_id for ge_cloud account.
cloud_mode (bool) – bool flag to specify whether to run GX in cloud mode (default is None).
- Returns
DataContext. Either a DataContext, BaseDataContext, or CloudDataContext depending on environment and/or parameters
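The decision table in the description of get_context can be read as a small function of two inputs: the cloud_mode argument and whether the cloud environment is configured. The sketch below encodes only that table; the function name, the returned strings, and the exception type are hypothetical and do not call great_expectations.

```python
from typing import Optional


def resolve_context_kind_sketch(cloud_mode: Optional[bool], env_configured: bool) -> str:
    """Return "Local" or "Cloud" following the documented decision table."""
    if cloud_mode is False:
        # Explicitly disabled: always local, even if the environment is configured.
        return "Local"
    if env_configured:
        # Default or explicitly enabled, with cloud configuration available.
        return "Cloud"
    if cloud_mode is True:
        # Cloud requested, but nothing is configured (the "Exception!" cell).
        raise ValueError("cloud_mode=True requires cloud configuration")
    return "Local"
```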
-
great_expectations.util.
is_sane_slack_webhook
(url: str) → bool¶ Really basic sanity checking.
-
great_expectations.util.
is_list_of_strings
(_list) → TypeGuard[List[str]]¶
-
great_expectations.util.
generate_library_json_from_registered_expectations
()¶ Generate the JSON object used to populate the public gallery
-
great_expectations.util.
delete_blank_lines
(text: str) → str¶
-
great_expectations.util.
generate_temporary_table_name
(default_table_name_prefix: str = 'ge_temp_', num_digits: int = 8) → str¶
-
great_expectations.util.
get_sqlalchemy_inspector
(engine)¶
-
great_expectations.util.
get_sqlalchemy_url
(drivername, **credentials)¶
-
great_expectations.util.
get_sqlalchemy_selectable
(selectable: Union[Table, Select]) → Union[Table, Select]¶ Beginning from SQLAlchemy 1.4, a select() can no longer be embedded inside of another select() directly, without explicitly turning the inner select() into a subquery first. This helper method ensures that this conversion takes place.
For versions of SQLAlchemy < 1.4 the implicit conversion to a subquery may not always work, so that also needs to be handled here, using the old equivalent method.
https://docs.sqlalchemy.org/en/14/changelog/migration_14.html#change-4617
-
great_expectations.util.
get_sqlalchemy_subquery_type
()¶ Beginning from SQLAlchemy 1.4, sqlalchemy.sql.Alias has been deprecated in favor of sqlalchemy.sql.Subquery. This helper method ensures that the appropriate type is returned.
https://docs.sqlalchemy.org/en/14/changelog/migration_14.html#change-4617
-
great_expectations.util.
get_sqlalchemy_domain_data
(domain_data)¶
-
great_expectations.util.
import_make_url
()¶ Beginning from SQLAlchemy 1.4, make_url is accessed from sqlalchemy.engine; earlier versions must still be accessed from sqlalchemy.engine.url to avoid import errors.
-
great_expectations.util.
get_pyathena_potential_type
(type_module, type_) → str¶
-
great_expectations.util.
get_trino_potential_type
(type_module: ModuleType, type_: str) → object¶ Leverage on Trino Package to return sqlalchemy sql type
-
great_expectations.util.
pandas_series_between_inclusive
(series: pd.Series, min_value: int, max_value: int) → pd.Series¶ As of Pandas 1.3.0, the ‘inclusive’ arg in between() is an enum: {“left”, “right”, “neither”, “both”}
-
great_expectations.util.
numpy_quantile
(a: np.ndarray, q: float, method: str, axis: Optional[int] = None) → Union[np.float64, np.ndarray]¶ As of NumPy 1.21.0, the ‘interpolation’ arg in quantile() has been renamed to method. Source: https://numpy.org/doc/stable/reference/generated/numpy.quantile.html