I am currently working with Great Expectations (version) and want to build a dashboard on top of it so that expectation configurations can be provided interactively.
As an example, an expectation configuration written in Python might look like this:
from great_expectations.core.expectation_configuration import ExpectationConfiguration

expectation_configuration = ExpectationConfiguration(
    expectation_type="expect_column_values_to_be_in_set",
    kwargs={
        "column": "col_a",
        "value_set": ["test"],
    },
)
Instead of writing this code by hand, I want users to supply this information through a dashboard.
To avoid hardcoding the available options, I am trying to gather the following information programmatically:
- List of all expectation types.
- Which of these are available for Pandas and which for Spark.
- For each type, the possible arguments and their types (ultimately, derived programmatically for the same arguments that are listed in the expectation catalog, e.g., here).
One partial idea is to use ExpectationConfiguration.kwarg_lookup_dict, which contains part of this information, but it seems to be missing some of it:
{
    'domain_kwargs': ('column', 'row_condition', 'condition_parser'),
    'success_kwargs': ('value_set', 'mostly', 'parse_strings_as_datetimes'),
    'default_kwarg_values': {
        'row_condition': None,
        'condition_parser': 'pandas',
        'mostly': None,
        'parse_strings_as_datetimes': None,
        'result_format': 'BASIC',
        'include_config': True,
        'catch_exceptions': False
    }
}
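To make the gap concrete, here is a minimal sketch of how an entry of this shape could be flattened into one record per argument for a dashboard form. It works only on the dictionary shown above, copied in by hand; the helper name `flatten_kwarg_lookup` and the record keys are hypothetical and not part of Great Expectations.

```python
from typing import Any, Dict, List

# Copied by hand from the output shown above, not fetched from the library here.
kwarg_lookup: Dict[str, Any] = {
    "domain_kwargs": ("column", "row_condition", "condition_parser"),
    "success_kwargs": ("value_set", "mostly", "parse_strings_as_datetimes"),
    "default_kwarg_values": {
        "row_condition": None,
        "condition_parser": "pandas",
        "mostly": None,
        "parse_strings_as_datetimes": None,
        "result_format": "BASIC",
        "include_config": True,
        "catch_exceptions": False,
    },
}


def flatten_kwarg_lookup(lookup: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Hypothetical helper: one record per domain/success argument."""
    defaults = lookup.get("default_kwarg_values", {})
    fields = []
    for group in ("domain_kwargs", "success_kwargs"):
        for name in lookup.get(group, ()):
            fields.append(
                {
                    "name": name,
                    "group": group,
                    # An argument without a default is a candidate for "required".
                    "has_default": name in defaults,
                    "default": defaults.get(name),
                }
            )
    return fields


fields = flatten_kwarg_lookup(kwarg_lookup)
no_default = [f["name"] for f in fields if not f["has_default"]]
```

For this entry, `no_default` contains `column` and `value_set`, which hints at which arguments are required. The argument types, however, cannot be recovered from this structure, and that is the information I am still missing.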
I also tried playing around with the Validator class but it only returns the complete list of names and no further information:
from great_expectations.validator.validator import Validator
from great_expectations.execution_engine import SparkDFExecutionEngine, PandasExecutionEngine
validator_ = Validator(SparkDFExecutionEngine())
# 53 - the same for Pandas and Spark
print(len(validator_.list_available_expectation_types()))
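What I would ideally end up with is a per-engine availability table. A minimal sketch of that target structure, assuming a separate list of names can be obtained for each engine: the helper `availability_table` and the name `expect_hypothetical_pandas_only` are made up for illustration, and the input lists below are hand-written stand-ins rather than real output.

```python
from typing import Dict, Iterable


def availability_table(
    names_by_engine: Dict[str, Iterable[str]]
) -> Dict[str, Dict[str, bool]]:
    """Hypothetical helper: map each expectation name to per-engine availability."""
    engine_sets = {engine: set(names) for engine, names in names_by_engine.items()}
    all_names = sorted(set().union(*engine_sets.values()))
    return {
        name: {engine: name in names for engine, names in engine_sets.items()}
        for name in all_names
    }


# Stand-in input; the second pandas entry is an invented name, not a real expectation.
names_by_engine = {
    "pandas": ["expect_column_values_to_be_in_set", "expect_hypothetical_pandas_only"],
    "spark": ["expect_column_values_to_be_in_set"],
}
table = availability_table(names_by_engine)
```

With the two identical lists returned by `list_available_expectation_types()` above, every entry in such a table would be `True`, so this only becomes useful once there is a source that actually differs per engine.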