Metadata-Version: 2.1
Name: ecco
Version: 0.1.0
Summary: Visualization tools for NLP machine learning models.
Home-page: https://github.com/jalammar/ecco
Author: Jay Alammar
Author-email: alammar@gmail.com
License: BSD-3-Clause
Project-URL: Changelog, https://github.com/jalammar/ecco/blob/master/CHANGELOG.rst
Project-URL: Issue Tracker, https://github.com/jalammar/ecco/issues
Description: 
        ..  image:: https://ar.pegg.io/img/ecco-logo-w-800.png
            :alt: Ecco Logo
        
        
        
        
        Ecco is a Python library for explaining Natural Language Processing models using interactive visualizations.
        
        It provides multiple interfaces to aid the explanation and intuition of `Transformer
        <https://jalammar.github.io/illustrated-transformer/>`_-based language models. Read: `Interfaces for Explaining Transformer Language Models <https://jalammar.github.io/explaining-transformers/>`_.
        
        Ecco runs inside Jupyter notebooks. It is built on top of `PyTorch
        <https://pytorch.org/>`_ and `transformers
        <https://github.com/huggingface/transformers>`_.
        
        The library is currently an alpha release of a research project and is not production ready. You're welcome to contribute to make it better!
        
        Installation
        ============
        
        
        .. code-block:: bash
        
            # Assumes PyTorch is already installed
            pip install ecco
        
        
        Documentation
        =============
        
        
        To use the project:
        
        .. code-block:: python
        
            import ecco
        
            # Load pre-trained language model. Setting 'activations' to True tells Ecco to capture neuron activations.
            lm = ecco.from_pretrained('distilgpt2', activations=True)
        
            # Input text
            text = "The countries of the European Union are:\n1. Austria\n2. Belgium\n3. Bulgaria\n4."
        
            # Generate 20 tokens to complete the input text.
            output = lm.generate(text, generate=20, do_sample=True)
            
            # Ecco will output each token as it is generated.
            
            # 'output' now contains the data captured from this run, including the input and output tokens
            # as well as neuron activations and input saliency values. 
            
            # To view the input saliency
            output.saliency()
        
        This does the following:
        
        1. It loads a pretrained Hugging Face DistilGPT2 model and wraps it in an Ecco ``LM`` object that does useful things (e.g. it calculates input saliency and can collect neuron activations).
        2. It tells the model to generate 20 tokens.
        3. The model returns an Ecco ``OutputSeq`` object. This object holds the output sequence along with the data produced by the generation run, including the input sequence and input saliency values. If ``activations=True`` is set in ``from_pretrained()``, as it is here, the object also contains neuron activation values.
        4. ``output`` can now produce various interactive explorables. Examples include:
        
        - ``output.saliency()`` to generate an input saliency explorable [`Input Saliency Colab Notebook <https://colab.research.google.com/github/jalammar/ecco/blob/main/notebooks/Ecco_Input_Saliency.ipynb>`_]
        - ``output.run_nmf()`` to explore non-negative matrix factorization of neuron activations [`Neuron Activation Colab Notebook <https://colab.research.google.com/github/jalammar/ecco/blob/main/notebooks/Ecco_Neuron_Factors.ipynb>`_]
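        
        As a rough illustration of what a per-token input saliency value can look like, the sketch below computes gradient-times-input scores for a toy linear scorer and normalizes them into percentages. This is not Ecco's implementation: the embeddings, the weight vector ``w``, and the choice of gradient times input are all assumptions made for the example.
        
        .. code-block:: python
        
            import numpy as np
        
            tokens = ["The", "countries", "of", "the", "European", "Union"]
        
            rng = np.random.default_rng(0)
            embeddings = rng.normal(size=(len(tokens), 4))  # hypothetical token embeddings
            w = rng.normal(size=4)                          # hypothetical linear "model"
        
            # For score = sum_t embeddings[t] @ w, the gradient w.r.t. each embedding is w.
            grad = np.tile(w, (len(tokens), 1))
        
            # Gradient times input, reduced to one non-negative number per token.
            raw = np.linalg.norm(grad * embeddings, axis=1)
        
            # Normalize so the values sum to 1 and can be displayed as percentages.
            saliency = raw / raw.sum()
        
            for token, value in zip(tokens, saliency):
                print(f"{token:>10}  {value:6.1%}")
        
        In a real language model the gradient comes from backpropagation rather than a closed form, but the final normalization step would look much the same.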
        
        
        .. code-block:: python
        
            # To view the input saliency explorable
            output.saliency()
            
            # To view input saliency with more detail (a bar and % value for each token)
            output.saliency(style="detailed")
            
            # output.activations contains the neuron activation values. It has the shape (layer, neuron, token position).
            
            # Run non-negative matrix factorization with run_nmf, passing the number of factors (components) to decompose the activations into
            nmf_1 = output.run_nmf(n_components=10) 
        
            # nmf_1 now contains the necessary data to create the interactive NMF explorable:
            nmf_1.explore()
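        
        To give a feel for what the factorization does, here is a minimal sketch of non-negative matrix factorization on an array shaped like ``output.activations``. It uses plain multiplicative updates written with NumPy and is not the code behind ``run_nmf``; the random ``activations`` array, the ``nmf`` helper, and the way layers are stacked into rows are assumptions for illustration only.
        
        .. code-block:: python
        
            import numpy as np
        
            def nmf(V, n_components, n_iter=200, eps=1e-9, seed=0):
                """Factor a non-negative matrix V (neurons x positions) into W @ H."""
                rng = np.random.default_rng(seed)
                n_rows, n_cols = V.shape
                W = rng.random((n_rows, n_components)) + eps
                H = rng.random((n_components, n_cols)) + eps
                for _ in range(n_iter):
                    # Multiplicative updates keep W and H non-negative.
                    H *= (W.T @ V) / (W.T @ W @ H + eps)
                    W *= (V @ H.T) / (W @ H @ H.T + eps)
                return W, H
        
            # Hypothetical activations with shape (layer, neuron, token position).
            rng = np.random.default_rng(0)
            activations = rng.random((2, 6, 5))
        
            # Stack layers so each row is one neuron; clip negatives because NMF needs V >= 0.
            V = np.maximum(activations.reshape(-1, activations.shape[-1]), 0)
        
            W, H = nmf(V, n_components=3)
            print(W.shape, H.shape)
        
        Each row of ``H`` is one factor's strength across token positions, and each column of ``W`` says how much every neuron contributes to that factor.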
        
        
        
        
        Changelog
        =========
        
        0.0.8 (2020-11-20)
        ------------------
        
        * Allowing the project some fresh air.
Keywords: Natural Language Processing,Explainable AI
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: Unix
Classifier: Operating System :: POSIX
Classifier: Operating System :: Microsoft :: Windows
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Classifier: Topic :: Utilities
Requires-Python: !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*
Provides-Extra: dev
