great_expectations.util
Module Contents
Functions
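pluralize | Pluralizes a Great Expectations singular noun
singularize | Singularizes a Great Expectations plural noun
underscore | Borrowed from inflection.underscore
hyphen |
profile |
measure_execution_time |
get_project_distribution |
get_currently_executing_function |
get_currently_executing_function_call_arguments |
verify_dynamic_loading_support |
import_library_module |
is_library_loadable |
load_class |
_convert_to_dataset_class | Convert a (pandas) dataframe to a great_expectations dataset, with (optional) expectation_suite
_load_and_convert_to_dataset_class | Convert a (pandas) dataframe to a great_expectations dataset, with (optional) expectation_suite
read_csv | Read a file using Pandas read_csv and return a great_expectations dataset.
read_json | Read a file using Pandas read_json and return a great_expectations dataset.
read_excel | Read a file using Pandas read_excel and return a great_expectations dataset.
read_table | Read a file using Pandas read_table and return a great_expectations dataset.
read_feather | Read a file using Pandas read_feather and return a great_expectations dataset.
read_parquet | Read a file using Pandas read_parquet and return a great_expectations dataset.
from_pandas | Read a Pandas data frame and return a great_expectations dataset.
read_pickle | Read a file using Pandas read_pickle and return a great_expectations dataset.
read_sas | Read a file using Pandas read_sas and return a great_expectations dataset.
validate | Validate the provided data asset.
gen_directory_tree_str | Print the structure of directory as a tree
lint_code | Lint strings of code passed in. Optional dependency “black” must be installed.
convert_json_string_to_be_python_compliant | Cleans JSON-formatted string to adhere to Python syntax
_convert_nulls_to_None |
_convert_json_bools_to_python_bools |
filter_properties_dict | Filter the entries of the source dictionary according to directives concerning the existing keys and values.
deep_filter_properties_iterable |
_is_to_be_removed_from_deep_filter_properties_iterable |
is_truthy |
is_numeric |
is_int |
is_float |
is_nan | If value is an array, test element-wise for NaN and return result as a boolean array.
is_parseable_date |
get_context |
is_sane_slack_webhook | Really basic sanity checking.
is_list_of_strings |
generate_library_json_from_registered_expectations | Generate the JSON object used to populate the public gallery
delete_blank_lines |
generate_temporary_table_name |
get_sqlalchemy_inspector |
get_sqlalchemy_url |
get_sqlalchemy_selectable | Beginning from SQLAlchemy 1.4, a select() can no longer be embedded inside of another select() directly, without explicitly turning the inner select() into a subquery first.
get_sqlalchemy_domain_data |
import_make_url | Beginning from SQLAlchemy 1.4, make_url is accessed from sqlalchemy.engine; earlier versions must still be accessed from sqlalchemy.engine.url to avoid import errors.
probabilistic_test |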
great_expectations.util.black
great_expectations.util.logger
great_expectations.util.sa
great_expectations.util.SINGULAR_TO_PLURAL_LOOKUP_DICT :dict
great_expectations.util.PLURAL_TO_SINGULAR_LOOKUP_DICT :dict
great_expectations.util.MAX_PROBABILISTIC_TEST_ASSERTION_RETRIES :int = 3
great_expectations.util.pluralize(singular_ge_noun)
Pluralizes a Great Expectations singular noun
great_expectations.util.singularize(plural_ge_noun)
Singularizes a Great Expectations plural noun
great_expectations.util.underscore(word: str) → str
Borrowed from inflection.underscore.
Make an underscored, lowercase form from the expression in the string.
Example:
>>> underscore("DeviceType")
'device_type'
As a rule of thumb you can think of underscore() as the inverse of camelize(), though there are cases where that does not hold:
>>> camelize(underscore("IOError"))
'IoError'
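The conversion described above can be approximated with two regular-expression passes, in the style of the inflection package. This is only a sketch under that assumption, not necessarily the code shipped in great_expectations.util; the name underscore_sketch is hypothetical.

```python
import re


def underscore_sketch(word: str) -> str:
    """Hypothetical stand-in that mirrors the documented behavior of underscore()."""
    # Split an acronym from a following capitalized word: "IOError" -> "IO_Error".
    word = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", word)
    # Split a lowercase letter or digit from a following capital: "DeviceType" -> "Device_Type".
    word = re.sub(r"([a-z\d])([A-Z])", r"\1_\2", word)
    # Treat hyphens as word separators as well (assumption borrowed from inflection).
    word = word.replace("-", "_")
    return word.lower()


print(underscore_sketch("DeviceType"))  # device_type
print(underscore_sketch("IOError"))     # io_error
```

The second call shows why a round trip through camelize() cannot restore "IOError": once the acronym is lowercased to "io", its original capitalization is lost.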
great_expectations.util.hyphen(txt: str)
great_expectations.util.profile(func: Callable = None) → Callable
great_expectations.util.measure_execution_time(pretty_print: bool = False) → Callable
great_expectations.util.get_project_distribution() → Optional[Distribution]
great_expectations.util.get_currently_executing_function() → Callable
great_expectations.util.get_currently_executing_function_call_arguments(include_module_name: bool = False, include_caller_names: bool = False, **kwargs) → dict
- Parameters
include_module_name – bool If True, module name will be determined and included in output dictionary (default is False)
include_caller_names – bool If True, arguments, such as “self” and “cls”, if present, will be included in output dictionary (default is False)
kwargs –
- Returns
dict Output dictionary, consisting of call arguments as attribute “name: value” pairs.
Example usage:
# Gather the call arguments of the present function (include the “module_name” and add the “class_name”), filter
# out the Falsy values, and set the instance “_config” variable equal to the resulting dictionary.
self._config = get_currently_executing_function_call_arguments(
)
filter_properties_dict(properties=self._config, clean_falsy=True, inplace=True)
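One way to gather a caller's arguments, as this entry describes, is to inspect the calling frame with the standard inspect module. The sketch below assumes that approach; call_arguments_sketch and the Example class are hypothetical names, and module-name handling is left out.

```python
import inspect


def call_arguments_sketch(include_caller_names: bool = False, **extra) -> dict:
    """Hypothetical helper: collect the arguments of the function that called it."""
    caller_frame = inspect.currentframe().f_back
    arg_info = inspect.getargvalues(caller_frame)
    # Named parameters of the caller, looked up in the caller's local variables.
    call_args = {name: arg_info.locals[name] for name in arg_info.args}
    # Fold in the caller's **kwargs, if it declared any.
    if arg_info.keywords:
        call_args.update(arg_info.locals[arg_info.keywords])
    # Drop "self" and "cls" unless the caller explicitly asks for them.
    if not include_caller_names:
        call_args.pop("self", None)
        call_args.pop("cls", None)
    # Extra keyword arguments are added to the output as-is.
    call_args.update(extra)
    return call_args


class Example:
    def __init__(self, name, retries=3, **options):
        self._config = call_arguments_sketch(class_name=type(self).__name__)


print(Example("demo", verbose=True)._config)
```

The helper has to be called directly from the function whose arguments are wanted, because it looks exactly one frame up the call stack.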
great_expectations.util.verify_dynamic_loading_support(module_name: str, package_name: str = None) → None
- Parameters
module_name – a possibly-relative name of a module
package_name – the name of a package, to which the given module belongs
great_expectations.util.import_library_module(module_name: str) → Optional[ModuleType]
- Parameters
module_name – a fully-qualified name of a module (e.g., “great_expectations.dataset.sqlalchemy_dataset”)
- Returns
the imported module object (if it can be retrieved)
great_expectations.util.is_library_loadable(library_name: str) → bool
great_expectations.util.load_class(class_name: str, module_name: str)
great_expectations.util._convert_to_dataset_class(df, dataset_class, expectation_suite=None, profiler=None)
Convert a (pandas) dataframe to a great_expectations dataset, with (optional) expectation_suite
- Parameters
df – the DataFrame object to convert
dataset_class – the class to which to convert the existing DataFrame
expectation_suite – the expectation suite that should be attached to the resulting dataset
profiler – the profiler to use to generate baseline expectations, if any
- Returns
A new Dataset object
great_expectations.util._load_and_convert_to_dataset_class(df, class_name, module_name, expectation_suite=None, profiler=None)
Convert a (pandas) dataframe to a great_expectations dataset, with (optional) expectation_suite
- Parameters
df – the DataFrame object to convert
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
expectation_suite – the expectation suite that should be attached to the resulting dataset
profiler – the profiler to use to generate baseline expectations, if any
- Returns
A new Dataset object
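The class_name and module_name parameters describe loading a class dynamically by name. A minimal sketch of that pattern, assuming the standard importlib machinery, is shown here; load_class_sketch is a hypothetical name, and the library's own load_class may validate its inputs differently.

```python
import importlib


def load_class_sketch(class_name: str, module_name: str) -> type:
    """Hypothetical helper: resolve a class from its name and its module's name."""
    # import_module raises ImportError (ModuleNotFoundError) for unknown modules.
    module = importlib.import_module(module_name)
    try:
        return getattr(module, class_name)
    except AttributeError:
        raise ImportError(
            f"Module {module_name!r} has no class named {class_name!r}"
        ) from None


# Standard-library stand-in for a dataset class, used only for illustration.
ordered_dict_class = load_class_sketch("OrderedDict", "collections")
print(ordered_dict_class(a=1))
```

Resolving the class from two strings lets callers pick a dataset class through configuration rather than through an import statement.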
great_expectations.util.read_csv(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None, *args, **kwargs)
Read a file using Pandas read_csv and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset
great_expectations.util.read_json(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, accessor_func=None, profiler=None, *args, **kwargs)
Read a file using Pandas read_json and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
accessor_func (Callable) – functions to transform the json object in the file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset
great_expectations.util.read_excel(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None, *args, **kwargs)
Read a file using Pandas read_excel and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset or ordered dict of great_expectations datasets, if multiple worksheets are imported
great_expectations.util.read_table(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None, *args, **kwargs)
Read a file using Pandas read_table and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset
great_expectations.util.read_feather(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None, *args, **kwargs)
Read a file using Pandas read_feather and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset
great_expectations.util.read_parquet(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None, *args, **kwargs)
Read a file using Pandas read_parquet and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset
great_expectations.util.from_pandas(pandas_df, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None)
Read a Pandas data frame and return a great_expectations dataset.
- Parameters
pandas_df (Pandas df) – Pandas data frame
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (profiler class) – The profiler that should be run on the dataset to establish a baseline expectation suite.
- Returns
great_expectations dataset
great_expectations.util.read_pickle(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None, *args, **kwargs)
Read a file using Pandas read_pickle and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset
great_expectations.util.read_sas(filename, class_name='PandasDataset', module_name='great_expectations.dataset', dataset_class=None, expectation_suite=None, profiler=None, *args, **kwargs)
Read a file using Pandas read_sas and return a great_expectations dataset.
- Parameters
filename (string) – path to file to read
class_name (str) – class to which to convert resulting Pandas df
module_name (str) – dataset module from which to try to dynamically load the relevant module
dataset_class (Dataset) – If specified, the class to which to convert the resulting Dataset object; if not specified, try to load the class named via the class_name and module_name parameters
expectation_suite (string) – path to great_expectations expectation suite file
profiler (Profiler class) – profiler to use when creating the dataset (default is None)
- Returns
great_expectations dataset
great_expectations.util.validate(data_asset, expectation_suite=None, data_asset_name=None, expectation_suite_name=None, data_context=None, data_asset_class_name=None, data_asset_module_name='great_expectations.dataset', data_asset_class=None, *args, **kwargs)
Validate the provided data asset. Validate can accept an optional data_asset_name to apply, data_context to use to fetch an expectation_suite if one is not provided, and data_asset_class_name/data_asset_module_name or data_asset_class to use to provide custom expectations.
- Parameters
data_asset – the asset to validate
expectation_suite – the suite to use, or None to fetch one using a DataContext
data_asset_name – the name of the data asset to use
expectation_suite_name – the name of the expectation_suite to use
data_context – data context to use to fetch an expectation suite, or the path from which to obtain one
data_asset_class_name – the name of a class to dynamically load a DataAsset class
data_asset_module_name – the name of the module to dynamically load a DataAsset class
data_asset_class – a class to use; overrides data_asset_class_name/data_asset_module_name if provided
*args –
**kwargs –
Returns:
great_expectations.util.gen_directory_tree_str(startpath)
Print the structure of directory as a tree:
Ex:
project_dir0/
    AAA/
    BBB/
        aaa.txt
        bbb.txt
# Note: files and directories are sorted alphabetically, so that this method can be used for testing.
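A tree string of this kind can be produced by walking the directory and indenting each entry by its depth. The following is a sketch under stated assumptions: directory_tree_sketch is a hypothetical name, the four-space indent is a guess, and the demo layout assumes that aaa.txt and bbb.txt live inside BBB/.

```python
import os
import tempfile


def directory_tree_sketch(startpath: str) -> str:
    """Hypothetical helper: render a directory as an indented, alphabetically sorted tree."""
    lines = []
    for root, dirs, files in os.walk(startpath):
        # Sorting in place makes os.walk descend into subdirectories alphabetically.
        dirs.sort()
        rel = os.path.relpath(root, startpath)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        lines.append("    " * depth + os.path.basename(os.path.normpath(root)) + "/")
        for name in sorted(files):
            lines.append("    " * (depth + 1) + name)
    return "\n".join(lines)


# Build a small throwaway project directory and render it.
with tempfile.TemporaryDirectory() as tmp:
    project = os.path.join(tmp, "project_dir0")
    os.makedirs(os.path.join(project, "AAA"))
    os.makedirs(os.path.join(project, "BBB"))
    for file_name in ("bbb.txt", "aaa.txt"):
        open(os.path.join(project, "BBB", file_name), "w").close()
    demo_tree = directory_tree_sketch(project)

print(demo_tree)
```

Because both directories and files are sorted, the output is deterministic and can be compared against a fixed string in a test.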
great_expectations.util.lint_code(code: str) → str
Lint strings of code passed in. Optional dependency “black” must be installed.
great_expectations.util.convert_json_string_to_be_python_compliant(code: str) → str
Cleans JSON-formatted string to adhere to Python syntax
Substitute instances of ‘null’ with ‘None’ in string representations of Python dictionaries. Additionally, substitutes instances of ‘true’ or ‘false’ with their Python equivalents.
- Parameters
code – JSON string to update
- Returns
Clean, Python-compliant string
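The substitutions described for this function can be illustrated with a single whole-word regular expression. This is a sketch only; json_literals_to_python_sketch is a hypothetical name, and the library may perform the replacements differently (for example, in two separate passes).

```python
import re


def json_literals_to_python_sketch(code: str) -> str:
    """Hypothetical helper: swap JSON literals for their Python spellings."""
    replacements = {"null": "None", "true": "True", "false": "False"}
    # \b restricts matches to whole words, so "nullable" is left alone.
    return re.sub(
        r"\b(null|true|false)\b",
        lambda match: replacements[match.group(1)],
        code,
    )


print(json_literals_to_python_sketch('{"a": null, "b": true, "c": false, "nullable": 1}'))
# {"a": None, "b": True, "c": False, "nullable": 1}
```

A purely textual substitution like this one also rewrites the words null, true, and false when they appear inside quoted string values, so it suits generated code snippets better than arbitrary JSON.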
great_expectations.util._convert_nulls_to_None(code: str) → str
great_expectations.util._convert_json_bools_to_python_bools(code: str) → str
great_expectations.util.filter_properties_dict(properties: Optional[dict] = None, keep_fields: Optional[Set[str]] = None, delete_fields: Optional[Set[str]] = None, clean_nulls: bool = True, clean_falsy: bool = False, keep_falsy_numerics: bool = True, inplace: bool = False) → Optional[dict]
Filter the entries of the source dictionary according to directives concerning the existing keys and values.
- Parameters
properties – source dictionary to be filtered according to the supplied filtering directives
keep_fields – list of keys that must be retained, with the understanding that all other entries will be deleted
delete_fields – list of keys that must be deleted, with the understanding that all other entries will be retained
clean_nulls – If True, then in addition to other filtering directives, delete entries whose values are None
clean_falsy – If True, then in addition to other filtering directives, delete entries whose values are Falsy. (If the "clean_falsy" argument is specified as "True", then "clean_nulls" is assumed to be "True" as well.)
inplace – If True, then modify the source properties dictionary; otherwise, make a copy for filtering purposes
keep_falsy_numerics – If True, then in addition to other filtering directives, do not delete zero-valued numerics
- Returns
The (possibly) filtered properties dictionary (or None if no entries remain after filtering is performed)
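To make the interaction of these directives concrete, here is a simplified sketch that follows the parameter descriptions above. It is not the library's implementation: filter_dict_sketch is a hypothetical name, and rejecting keep_fields together with delete_fields, as well as treating booleans as non-numeric, are assumptions of this sketch.

```python
import copy
from typing import Optional, Set


def filter_dict_sketch(
    properties: Optional[dict] = None,
    keep_fields: Optional[Set[str]] = None,
    delete_fields: Optional[Set[str]] = None,
    clean_nulls: bool = True,
    clean_falsy: bool = False,
    keep_falsy_numerics: bool = True,
    inplace: bool = False,
) -> Optional[dict]:
    """Hypothetical, simplified reading of the filtering directives listed above."""
    if properties is None:
        properties = {}
    if keep_fields and delete_fields:
        # Assumption: the two key-based directives are mutually exclusive.
        raise ValueError("Specify keep_fields or delete_fields, not both.")
    if not inplace:
        properties = copy.deepcopy(properties)
    if clean_falsy:
        clean_nulls = True  # documented: clean_falsy implies clean_nulls

    def should_drop(key, value) -> bool:
        if keep_fields and key not in keep_fields:
            return True
        if delete_fields and key in delete_fields:
            return True
        if clean_nulls and value is None:
            return True
        if clean_falsy and not value:
            # Assumption: booleans do not count as numerics here.
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            return not (keep_falsy_numerics and is_number)
        return False

    for key in [k for k, v in properties.items() if should_drop(k, v)]:
        del properties[key]
    # Per the documented return value, an empty result is reported as None.
    return properties or None


config = {"name": "demo", "limit": 0, "tags": [], "owner": None, "debug": False}
print(filter_dict_sketch(config))                    # only "owner" is removed
print(filter_dict_sketch(config, clean_falsy=True))  # {'name': 'demo', 'limit': 0}
print(filter_dict_sketch(config, keep_fields={"name", "owner"}))  # {'name': 'demo'}
```

With the default inplace=False the source dictionary is left untouched, which is why the same config can be reused across the three calls.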
great_expectations.util.deep_filter_properties_iterable(properties: Optional[Union[dict, list, set, tuple]] = None, keep_fields: Optional[Set[str]] = None, delete_fields: Optional[Set[str]] = None, clean_nulls: bool = True, clean_falsy: bool = False, keep_falsy_numerics: bool = True, inplace: bool = False) → Optional[Union[dict, list, set]]
great_expectations.util._is_to_be_removed_from_deep_filter_properties_iterable(value: Any, clean_nulls: bool, clean_falsy: bool, keep_falsy_numerics: bool) → bool
great_expectations.util.is_truthy(value: Any) → bool
great_expectations.util.is_numeric(value: Any) → bool
great_expectations.util.is_int(value: Any) → bool
great_expectations.util.is_float(value: Any) → bool
great_expectations.util.is_nan(value: Any) → bool
If value is an array, test element-wise for NaN and return result as a boolean array. If value is a scalar, return boolean.
- Parameters
value – The value to test
- Returns
The results of the test
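The scalar versus element-wise behavior can be illustrated with the standard library alone. In this sketch plain lists and tuples stand in for arrays, which is an assumption made to keep the example dependency-free; is_nan_sketch is a hypothetical name, and the library function may rely on an array library instead.

```python
import math
from typing import Any


def is_nan_sketch(value: Any):
    """Hypothetical stdlib-only helper: NaN test for scalars and sequences."""
    if isinstance(value, (list, tuple)):
        # Element-wise test; the result has one boolean per input element.
        return [is_nan_sketch(item) for item in value]
    # Only floats can be NaN; every other scalar is reported as False.
    return isinstance(value, float) and math.isnan(value)


print(is_nan_sketch(float("nan")))                 # True
print(is_nan_sketch([1.0, float("nan"), "text"]))  # [False, True, False]
```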
great_expectations.util.is_parseable_date(value: Any, fuzzy: bool = False) → bool
great_expectations.util.get_context()
great_expectations.util.is_sane_slack_webhook(url: str) → bool
Really basic sanity checking.
great_expectations.util.is_list_of_strings(_list) → bool
great_expectations.util.generate_library_json_from_registered_expectations()
Generate the JSON object used to populate the public gallery
great_expectations.util.delete_blank_lines(text: str) → str
great_expectations.util.generate_temporary_table_name(default_table_name_prefix: str = 'ge_temp_', num_digits: int = 8) → str
great_expectations.util.get_sqlalchemy_inspector(engine)
great_expectations.util.get_sqlalchemy_url(drivername, **credentials)
great_expectations.util.
get_sqlalchemy_selectable
(selectable: Union[Table, Select]) → Union[Table, Select]¶ Beginning from SQLAlchemy 1.4, a select() can no longer be embedded inside of another select() directly, without explicitly turning the inner select() into a subquery first. This helper method ensures that this conversion takes place.
https://docs.sqlalchemy.org/en/14/changelog/migration_14.html#change-4617
-
great_expectations.util.
get_sqlalchemy_domain_data
(domain_data)¶
-
great_expectations.util.
import_make_url
()¶ Beginning from SQLAlchemy 1.4, make_url is accessed from sqlalchemy.engine; earlier versions must still be accessed from sqlalchemy.engine.url to avoid import errors.
-
great_expectations.util.
probabilistic_test
(func: Callable = None, max_num_retries: int = MAX_PROBABILISTIC_TEST_ASSERTION_RETRIES) → Callable¶