Profile Module

class great_expectations.profile.base.DataAssetProfiler

Bases: object

classmethod validate(data_asset)

class great_expectations.profile.base.DatasetProfiler

Bases: great_expectations.profile.base.DataAssetProfiler

classmethod validate(dataset)
classmethod add_expectation_meta(expectation)
classmethod add_meta(expectation_suite, batch_kwargs=None)
classmethod profile(data_asset, run_id=None)

class great_expectations.profile.basic_dataset_profiler.BasicDatasetProfilerBase

Bases: great_expectations.profile.base.DatasetProfiler

BasicDatasetProfilerBase provides the basic logic for inferring the type and cardinality of columns. The dataset profiler classes that extend it rely on this logic.

INT_TYPE_NAMES = {'BIGINT', 'BYTEINT', 'DECIMAL', 'INT', 'INTEGER', 'IntegerType', 'LongType', 'SMALLINT', 'TINYINT', 'int'}
FLOAT_TYPE_NAMES = {'DOUBLE_PRECISION', 'DoubleType', 'FLOAT', 'FLOAT4', 'FLOAT8', 'FloatType', 'NUMERIC', 'float'}
STRING_TYPE_NAMES = {'CHAR', 'StringType', 'TEXT', 'VARCHAR', 'str', 'string'}
BOOLEAN_TYPE_NAMES = {'BOOL', 'BOOLEAN', 'BooleanType', 'bool'}
DATETIME_TYPE_NAMES = {'DATE', 'DATETIME', 'DateType', 'TIMESTAMP', 'Timestamp', 'TimestampType', 'datetime64'}
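
As an illustration of how type-name sets like these can drive type inference, the sketch below maps a backend type name to a coarse label by set membership. The sets are copied from the listing above; the `classify_type_name` helper and the `TYPE_GROUPS` table are hypothetical names introduced for this example, and the library's actual inference logic is not shown on this page and may differ.

```python
# Type-name sets copied from the class attributes listed above.
INT_TYPE_NAMES = {'BIGINT', 'BYTEINT', 'DECIMAL', 'INT', 'INTEGER',
                  'IntegerType', 'LongType', 'SMALLINT', 'TINYINT', 'int'}
FLOAT_TYPE_NAMES = {'DOUBLE_PRECISION', 'DoubleType', 'FLOAT', 'FLOAT4',
                    'FLOAT8', 'FloatType', 'NUMERIC', 'float'}
STRING_TYPE_NAMES = {'CHAR', 'StringType', 'TEXT', 'VARCHAR', 'str', 'string'}
BOOLEAN_TYPE_NAMES = {'BOOL', 'BOOLEAN', 'BooleanType', 'bool'}
DATETIME_TYPE_NAMES = {'DATE', 'DATETIME', 'DateType', 'TIMESTAMP',
                       'Timestamp', 'TimestampType', 'datetime64'}

# Hypothetical lookup table: (label, set of type names) pairs.
TYPE_GROUPS = [
    ("int", INT_TYPE_NAMES),
    ("float", FLOAT_TYPE_NAMES),
    ("string", STRING_TYPE_NAMES),
    ("bool", BOOLEAN_TYPE_NAMES),
    ("datetime", DATETIME_TYPE_NAMES),
]


def classify_type_name(type_name):
    """Return a coarse label for a backend type name, or "unknown".

    Hypothetical helper for illustration only; not part of the library.
    Matching is exact and case-sensitive, as in the sets themselves.
    """
    for label, names in TYPE_GROUPS:
        if type_name in names:
            return label
    return "unknown"
```

The sets mix SQL-style names (`BIGINT`), Spark-style names (`LongType`), and Python or pandas-style names (`int`, `datetime64`), so a single lookup of this kind can cover several backends. A name that appears in none of the sets falls through to `"unknown"`.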

class great_expectations.profile.basic_dataset_profiler.BasicDatasetProfiler

Bases: great_expectations.profile.basic_dataset_profiler.BasicDatasetProfilerBase

BasicDatasetProfiler is inspired by the beloved pandas_profiling project.

The profiler examines a batch of data and creates a report that answers the basic questions most data practitioners would ask about a dataset during exploratory data analysis. For each column, it reports how unique the values are and what percentage of them are empty. Based on the column’s type, it then describes the column by computing statistics such as min, max, mean, and median for numeric columns and, when appropriate, the distribution of values.
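
To make the description above concrete, here is a minimal sketch of the kind of per-column summary it describes, written with the Python standard library only. It does not use or reproduce the profiler's implementation; the `summarize_column` function and the keys of the dictionary it returns are assumptions made for this example.

```python
import statistics
from collections import Counter


def summarize_column(values):
    """Summarize one column given as a list, with None marking empty values.

    Hypothetical helper for illustration; not part of great_expectations.
    """
    total = len(values)
    non_null = [v for v in values if v is not None]

    summary = {
        # Share of empty values among all rows.
        "percent_missing": 100.0 * (total - len(non_null)) / total if total else 0.0,
        # Share of distinct values among the non-empty rows.
        "percent_unique": 100.0 * len(set(non_null)) / len(non_null) if non_null else 0.0,
    }

    # Treat the column as numeric only if every non-empty value is an
    # int or float (bool is excluded even though it subclasses int).
    is_numeric = bool(non_null) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in non_null
    )

    if is_numeric:
        summary["min"] = min(non_null)
        summary["max"] = max(non_null)
        summary["mean"] = statistics.mean(non_null)
        summary["median"] = statistics.median(non_null)
    else:
        # For non-numeric columns, report the distribution of values.
        summary["value_counts"] = dict(Counter(non_null))

    return summary
```

For example, `summarize_column([1, 2, 2, None])` reports 25% missing values along with min, max, mean, and median, while `summarize_column(["a", "b", "a"])` reports value counts instead of numeric statistics.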

last updated: Aug 13, 2020