great_expectations.cli.v012.datasource

Module Contents

Classes
- DatasourceTypes: Generic enumeration.
- SupportedDatabases: Generic enumeration.

Functions
- datasource: Datasource operations
- datasource_new: Add a new datasource to the data context.
- delete_datasource: Delete the datasource specified as an argument
- datasource_list: List known datasources.
- datasource_profile: Profile a datasource (Experimental)
- add_datasource: Interactive flow for adding a datasource to an existing context.
- _should_hide_input: Workaround to help identify Windows and adjust the prompts accordingly
- get_batch_kwargs: Manages the interaction with the user necessary to obtain batch_kwargs for a batch of a data asset.
- profile_datasource: Profile a named datasource using the specified context
- great_expectations.cli.v012.datasource.parse_bigquery_url
- great_expectations.cli.v012.datasource.logger
- great_expectations.cli.v012.datasource.sqlalchemy
- class great_expectations.cli.v012.datasource.DatasourceTypes
  Bases: enum.Enum
  Generic enumeration. Derive from this class to define new enumerations.
  - PANDAS = pandas
  - SQL = sql
  - SPARK = spark
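The members above map CLI menu choices to datasource kinds. A minimal, self-contained sketch of the enum as documented (redefined here for illustration; the real class lives in this module):

```python
from enum import Enum

# Sketch of DatasourceTypes as documented above; the actual class is
# defined in great_expectations.cli.v012.datasource.
class DatasourceTypes(Enum):
    PANDAS = "pandas"
    SQL = "sql"
    SPARK = "spark"

# Value lookup lets the CLI turn a stored string back into a member.
choice = DatasourceTypes("sql")
print(choice.name, choice.value)  # SQL sql
```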
- great_expectations.cli.v012.datasource.MANUAL_GENERATOR_CLASSES
- class great_expectations.cli.v012.datasource.SupportedDatabases
  Bases: enum.Enum
  Generic enumeration. Derive from this class to define new enumerations.
  - MYSQL = MySQL
  - POSTGRES = Postgres
  - REDSHIFT = Redshift
  - SNOWFLAKE = Snowflake
  - BIGQUERY = BigQuery
  - OTHER = other - Do you have a working SQLAlchemy connection string?
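SupportedDatabases drives the database-selection prompt. A self-contained sketch, assuming the member values shown above, of how a numbered menu can be built by iterating the enum in definition order:

```python
from enum import Enum

# Sketch of SupportedDatabases; member values mirror those documented above.
class SupportedDatabases(Enum):
    MYSQL = "MySQL"
    POSTGRES = "Postgres"
    REDSHIFT = "Redshift"
    SNOWFLAKE = "Snowflake"
    BIGQUERY = "BigQuery"
    OTHER = "other - Do you have a working SQLAlchemy connection string?"

# Enums iterate in definition order, so a numbered menu is a simple join.
menu = "\n".join(
    f"{i}. {db.value}" for i, db in enumerate(SupportedDatabases, start=1)
)
print(menu)
```

The OTHER member serves as the fallback choice for any database reachable through a raw SQLAlchemy connection string.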
- great_expectations.cli.v012.datasource.datasource() → None
  Datasource operations
- great_expectations.cli.v012.datasource.datasource_new(directory) → None
  Add a new datasource to the data context.
- great_expectations.cli.v012.datasource.delete_datasource(directory, datasource) → None
  Delete the datasource specified as an argument.
- great_expectations.cli.v012.datasource.datasource_list(directory) → None
  List known datasources.
- great_expectations.cli.v012.datasource._build_datasource_intro_string(datasource_count)
- great_expectations.cli.v012.datasource.datasource_profile(datasource, batch_kwargs_generator_name, data_assets, profile_all_data_assets, directory, view, additional_batch_kwargs, assume_yes) → None
  Profile a datasource (Experimental)
  If the optional data_assets and profile_all_data_assets arguments are not specified, the profiler will check if the number of data assets in the datasource exceeds the internally defined limit. If it does, it will prompt the user to either specify the list of data assets to profile or to profile all. If the limit is not exceeded, the profiler will profile all data assets in the datasource.
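The selection logic just described can be sketched as follows. The helper name choose_assets_to_profile is hypothetical, and the limit of 20 mirrors the max_data_assets default documented for profile_datasource further down; the real CLI prompts interactively where this sketch returns None.

```python
# Hedged sketch of the documented decision flow, not the library's code.
def choose_assets_to_profile(available_assets, data_assets=None,
                             profile_all_data_assets=False, limit=20):
    """Decide which data assets to profile, mirroring the documented flow."""
    if data_assets:                      # an explicit list wins
        return list(data_assets)
    if profile_all_data_assets:          # the user opted in to everything
        return list(available_assets)
    if len(available_assets) <= limit:   # small datasource: profile all
        return list(available_assets)
    # Over the limit: the real CLI prompts the user to choose here.
    return None

print(choose_assets_to_profile(["orders", "users"]))  # ['orders', 'users']
```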
- great_expectations.cli.v012.datasource.add_datasource(context, choose_one_data_asset=False)
  Interactive flow for adding a datasource to an existing context.
  Parameters:
  - context –
  - choose_one_data_asset – optional; if True, signals that the intent is to let the user choose just one data asset (e.g., a file), so there is no need to configure a batch kwargs generator that comprehensively scans the datasource for data assets
  Returns:
  a tuple: (datasource_name, data_source_type)
- great_expectations.cli.v012.datasource._add_pandas_datasource(context, passthrough_generator_only=True, prompt_for_datasource_name=True)
- great_expectations.cli.v012.datasource._add_sqlalchemy_datasource(context, prompt_for_datasource_name=True)
- great_expectations.cli.v012.datasource._should_hide_input()
  This is a workaround to help identify Windows and adjust the prompts accordingly, since hidden prompts may freeze in certain Windows terminals.
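A minimal sketch of such a check, assuming a simple platform test (the actual implementation may inspect the environment more carefully):

```python
import platform

# Hidden (no-echo) prompts can freeze in certain Windows terminals, so
# input hiding is disabled there. Hedged sketch, not the real code.
def should_hide_input() -> bool:
    return platform.system() != "Windows"

print(should_hide_input())
```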
- great_expectations.cli.v012.datasource._collect_postgres_credentials(default_credentials=None)
- great_expectations.cli.v012.datasource._collect_snowflake_credentials(default_credentials=None)
- great_expectations.cli.v012.datasource._collect_snowflake_credentials_user_password()
- great_expectations.cli.v012.datasource._collect_snowflake_credentials_sso()
- great_expectations.cli.v012.datasource._collect_snowflake_credentials_key_pair()
- great_expectations.cli.v012.datasource._collect_bigquery_credentials(default_credentials=None)
- great_expectations.cli.v012.datasource._collect_mysql_credentials(default_credentials=None)
- great_expectations.cli.v012.datasource._collect_redshift_credentials(default_credentials=None)
- great_expectations.cli.v012.datasource._add_spark_datasource(context, passthrough_generator_only=True, prompt_for_datasource_name=True)
- great_expectations.cli.v012.datasource.select_batch_kwargs_generator(context, datasource_name, available_data_assets_dict=None)
- great_expectations.cli.v012.datasource.get_batch_kwargs(context, datasource_name=None, batch_kwargs_generator_name=None, data_asset_name=None, additional_batch_kwargs=None)
  This method manages the interaction with the user necessary to obtain batch_kwargs for a batch of a data asset.
  To get batch_kwargs, this method needs datasource_name, batch_kwargs_generator_name, and data_asset_name, which it combines into a fully qualified data asset identifier (datasource_name/batch_kwargs_generator_name/data_asset_name). All three arguments are optional: if present, their values are used; otherwise, the method prompts the user to enter them interactively. Since any of these three components may be passed in empty and obtained through user interaction, the method returns their values in case they changed.
  If the datasource has batch_kwargs_generators that can list available data asset names, the method lets the user choose a name from that list (note: if there are multiple batch_kwargs_generators, the user has to choose one first). If a name known to the chosen batch_kwargs_generator is selected, that generator will be able to yield batch_kwargs. Alternatively, the user can type in a name for their data asset; in this case a passthrough batch_kwargs_generator is used to construct a fully qualified data asset identifier (note: if the datasource has no passthrough batch_kwargs_generator configured, the method will exit with a failure). Since no batch_kwargs_generator can yield batch_kwargs for such a data asset name, the method prompts the user to specify batch_kwargs by choosing a file (if the datasource is pandas or spark) or by writing a SQL query (if the datasource points to a database).
  Parameters:
  - context –
  - datasource_name –
  - batch_kwargs_generator_name –
  - data_asset_name –
  - additional_batch_kwargs –
  Returns:
  a tuple: (datasource_name, batch_kwargs_generator_name, data_asset_name, batch_kwargs). The components of the tuple were passed into the method as optional arguments, but their values might have changed after this method's execution. If the returned batch_kwargs is None, the batch_kwargs_generator will know to yield batch_kwargs when called.
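The fully qualified identifier described above is a slash-joined triple of the three components. The helper below is illustrative, not part of the module:

```python
# Illustrative helper: combine the three components into the fully
# qualified data asset identifier format described above.
def fully_qualified_data_asset_id(datasource_name,
                                  batch_kwargs_generator_name,
                                  data_asset_name):
    return f"{datasource_name}/{batch_kwargs_generator_name}/{data_asset_name}"

print(fully_qualified_data_asset_id("my_db", "default", "public.orders"))
# my_db/default/public.orders
```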
- great_expectations.cli.v012.datasource._get_batch_kwargs_from_generator_or_from_file_path(context, datasource_name, batch_kwargs_generator_name=None, additional_batch_kwargs=None)
- great_expectations.cli.v012.datasource._get_default_schema(datasource)
- great_expectations.cli.v012.datasource._get_batch_kwargs_for_sqlalchemy_datasource(context, datasource_name, additional_batch_kwargs=None)
- great_expectations.cli.v012.datasource._verify_sqlalchemy_dependent_modules() → bool
- great_expectations.cli.v012.datasource._verify_mysql_dependent_modules() → bool
- great_expectations.cli.v012.datasource._verify_postgresql_dependent_modules() → bool
- great_expectations.cli.v012.datasource._verify_redshift_dependent_modules() → bool
- great_expectations.cli.v012.datasource._verify_snowflake_dependent_modules() → bool
- great_expectations.cli.v012.datasource._verify_bigquery_dependent_modules() → bool
- great_expectations.cli.v012.datasource._verify_pyspark_dependent_modules() → bool
- great_expectations.cli.v012.datasource.skip_prompt_message(skip_flag, prompt_message_text) → bool
- great_expectations.cli.v012.datasource.profile_datasource(context, datasource_name, batch_kwargs_generator_name=None, data_assets=None, profile_all_data_assets=False, max_data_assets=20, additional_batch_kwargs=None, open_docs=False, skip_prompt_flag=False)
  Profile a named datasource using the specified context.
- great_expectations.cli.v012.datasource.msg_prompt_choose_datasource =
  Configure a datasource:
  1. Pandas DataFrame
  2. Relational database (SQL)
  3. Spark DataFrame
  4. Skip datasource configuration
- great_expectations.cli.v012.datasource.msg_prompt_choose_database
- great_expectations.cli.v012.datasource.msg_prompt_filesys_enter_base_path =
  Enter the path (relative or absolute) of the root directory where the data files are stored.
- great_expectations.cli.v012.datasource.msg_prompt_datasource_name =
  Give your new Datasource a short name.
- great_expectations.cli.v012.datasource.msg_db_config =
  Next, we will configure database credentials and store them in the {0:s} section of this config file: great_expectations/uncommitted/config_variables.yml:
- great_expectations.cli.v012.datasource.msg_unknown_data_source =
  Do we not have the type of data source you want?
  Please create a GitHub issue here so we can discuss it!
  <blue>https://github.com/great-expectations/great_expectations/issues/new</blue>