Metadata-Version: 2.1
Name: kedro-great
Version: 0.2.2
Summary: Kedro Great makes integrating Great Expectations with Kedro easy!
Home-page: https://github.com/tamsanh/kedro-great.git
Author: Tam-Sanh Nguyen
Author-email: tamsanh@gmail.com
License: MIT
Description: # Kedro Great
        
        As Seen on [DataEngineerOne](https://www.youtube.com/c/DataEngineerOne)  
        *Watch the Video*: [Kedro Great: Use Great Expectations with Ease!](https://www.youtube.com/watch?v=VY_AO0__oIE)
        
        <p align="center">
          <img width="255" src="https://github.com/tamsanh/kedro-great/blob/master/images/great.png">
        </p>
        
        
        Kedro Great is an easy-to-use plugin for kedro that makes integration with Great Expectations fast and simple.
        
        Hold yourself accountable to [Great Expectations](https://github.com/great-expectations/great_expectations).  
        Never have fear of data silently changing ever again.
        
        ## Quick Start
        
        #### Install
        Kedro Great is available on pypi, and is installed with [kedro hooks](https://kedro.readthedocs.io/en/latest/04_user_guide/15_hooks.html).
        
        ```console
        pip install kedro-great
        ```
        
        #### Setup
        
        Once installed, `kedro great` becomes available as a kedro command.
        
        You can use `kedro great init` to initialize a Great Expectations project, and then automatically generate its project context.
        
        Furthermore, by using `kedro great init`, you also generate Great Expectations `Datasource`s and `Suite`s to use with your `catalog.yml` DataSets.
        
        By default, expectation suites are named for the `catalog.yml` name and a `basic.json` is generated for each.
        
        ```console
        kedro great init
        ```
        
        #### Use
        
        After the Great Expectations project has been setup and configured, you can now use the `KedroGreat` hook to run all your data validations every time the pipeline runs.
        
        ```python
        # run.py
        from kedro_great import KedroGreat
        
        class ProjectContext(KedroContext):
            hooks = (
                KedroGreat(),
            )
        ```
        
        
        Then just run the kedro pipeline to run the suites.
        
        ```console
        kedro run
        ```
        
        #### Results
        
        Finally, you can use `great_expectations` itself to generate documentation and view the results of your pipeline.
        
        Love seeing those green ticks!
        
        ```console
        great_expectations docs build
        ```
        
        ## Hook Options
        
        The `KedroGreat` hook supports a few options currently. If you wish to 
        
        ### expectations_map: Dict[str, Union[str, List[str]]]
        
        If you have multiple expectation suites you wish to run, or expectation suites that do not have the same name
        as the catalog dataset, these mappings can be specified in the `expectations_map` argument for `KedroGreat`
        
        **Default:** The catalog name is the expectation name.
        
        *Note:* Specifying a suite type such as `.basic` will override all other suite types
        
        ```python
        KedroGreat(expectations_map={
            'pandas_iris_data': 'pandas_iris_data',
            'spark_iris_data': ['spark_iris_data',
                                'other_expectation',
                                'another_expectation.basic'],
        
        })
        ```
        
        ### suite_types: List[Optional[str]]
        
        If your suites have multiple types, you can choose exactly which types to run.
        
        A `None` means that a suite will not have the type appended to the name.
        
        **Default:** The `KedroGreat.DEFAULT_SUITE_TYPES`.
        
        *Node:* If a suite type is already specified in the `expectations_map`, that will override this list.
        
        ```python
        KedroGreat(suite_types=[
            'warning',
            'basic',
            None
        ])
        ```
        
        ### run_before_node:bool, run_after_node: bool
        
        You can decide when the suites run, before or after a node or both before and after a node.
        
        It will operate on the node `inputs` and `outputs` respectively.
        
        **Default:** Only runs before a node runs.
        
        ```python
        KedroGreat(run_before_node=True, run_after_node=False)
        ```
        
        ### fail_fast: bool, fail_after_pipeline_run: bool
        
        You can also have `KedroGreat` throw a `SuiteValidationFailure` when a Great Expectations validation fails.
        
        Either the exception can be throw immediately, or the exceptions can be aggregated over the whole pipeline run, and thrown at the end.
        
        This is useful for when you wish to run validation on your pipeline in a CI/CD way.
        
        **Default:** Neither are set
        
        ```python
        KedroGreat(fail_fast=True, fail_after_pipeline_run=True)
        ```
        
Platform: UNKNOWN
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Description-Content-Type: text/markdown
