Metadata-Version: 1.1
Name: django-pandas
Version: 0.6.4
Summary: Tools for working with pydata.pandas in your Django projects
Home-page: https://github.com/chrisdev/django-pandas/
Author: Christopher Clarke
Author-email: cclarke@chrisdev.com
License: UNKNOWN
Description: ==============
        Django Pandas
        ==============
        .. image:: https://secure.travis-ci.org/chrisdev/django-pandas.png?branch=master
           :target: http://travis-ci.org/chrisdev/django-pandas
        
        .. image:: https://coveralls.io/repos/chrisdev/django-pandas/badge.png?branch=master
           :target: https://coveralls.io/r/chrisdev/django-pandas
        
        Tools for working with `pandas <http://pandas.pydata.org>`_ in your Django
        projects
        
        Contributors
        ============
        * `Christopher Clarke <https://github.com/chrisdev>`_
        * `Bertrand Bordage <https://github.com/BertrandBordage>`_
        * `Guillaume Thomas <https://github.com/gtnx>`_
        * `Parbhat Puri  <https://parbhatpuri.com/>`_
        * `Fredrik Burman (coachHIPPO) <https://www.coachhippo.com>`_
        * `Safe Hammad <http://safehammad.com>`_
        * `Jeff Sternber <https://www.linkedin.com/in/jeffsternberg>`_
        * `@MiddleFork <https://github.com/MiddleFork>`_
        * `Daniel Andrlik <https://github.com/andrlik>`_
        * `Kevin Abbot <https://github.com/kgabbott>`_
        * `Yousuf Jawwad <https://github.com/ysfjwd>`_
        * `@henhuy <https://github.com/henhuy>`_
        * `Hélio Meira Lins <https://github.com/meiralins>`_
        * `@utpyngo <https://github.com/utpyngo>`_
        * `Anthony Monthe <https://github.com/ZuluPro>`_
        * `Vincent Toupet <https://github.com/vtoupet>`_
        * `Anton Ian Sipos <https://github.com/aisipos>`_
        
        What's New
        ===========
        Version 0.6.4 supports the latest release of Pandas (1.2.1)
        
        Dependencies
        =============
        ``django-pandas`` supports `Django`_ (>=1.4.5) or later
        and requires `django-model-utils`_ (>= 1.4.0) and `Pandas`_ (>= 0.12.0).
        **Note** because of problems with the ``requires`` directive of setuptools
        you probably need to install ``numpy`` in your virtualenv  before you install
        this package or if you want to run the test suite ::
        
            pip install numpy
            python setup.py test
        
        Some ``pandas`` functionality requires parts of the Scipy stack.
        You may wish to consult http://www.scipy.org/install.html
        for more information on installing the ``Scipy`` stack.
        
        You need to install your preferred version of Django.
        as that Django 2 does not support Python 2.
        
        
        .. _Django: http://djangoproject.com/
        .. _django-model-utils: http://pypi.python.org/pypi/django-model-utils
        .. _Pandas: http://pandas.pydata.org
        
        Contributing
        ============
        
        Please file bugs and send pull requests to the `GitHub repository`_ and `issue
        tracker`_.
        
        .. _GitHub repository: https://github.com/chrisdev/django-pandas/
        .. _issue tracker: https://github.com/chrisdev/django-pandas/issues
        
        
        Installation
        =============
        Start by creating a new ``virtualenv`` for your project ::
        
            mkvirtualenv myproject
        
        Next install ``numpy`` and ``pandas`` and optionally ``scipy`` ::
        
            pip install numpy
            pip install pandas
        
        You may want to consult  the `scipy documentation`_ for more information
        on installing the ``Scipy`` stack.
        
        .. _scipy documentation: http://www.scipy.org/install.html
        
        Finally, install ``django-pandas`` using ``pip``::
        
            pip install django-pandas
        
        or install the development version from ``github`` ::
        
            pip install https://github.com/chrisdev/django-pandas/tarball/master
        
        Usage
        ======
        
        
        IO Module
        ----------
        The ``django-pandas.io`` module provides some convenience methods to
        facilitate the creation of DataFrames from Django QuerySets.
        
        read_frame
        ^^^^^^^^^^^
        
        **Parameters**
        
            - qs: A Django QuerySet.
        
            - fieldnames: A list of model field names to use in creating the ``DataFrame``.
                          You can span a relationship in the usual Django way
                          by using  double underscores to specify a related field
                          in another model
        
            - index_col: Use specify the field name to use  for the ``DataFrame`` index.
                         If the index
                         field is not in the field list it will be appended
        
            - coerce_float : Boolean, defaults to True
                             Attempt to convert values to non-string,
                             non-numeric objects (like decimal.Decimal)
                             to floating point.
        
            - verbose:  If  this is ``True`` then populate the DataFrame with the
                        human readable versions of any foreign key or choice fields
                        else use the actual values set in the model.
        
            - column_names: If not None, use to override the column names in the
                            DateFrame
        
        Examples
        ^^^^^^^^^
        Assume that this is your model::
        
            class MyModel(models.Model):
        
                full_name = models.CharField(max_length=25)
                age = models.IntegerField()
                department = models.CharField(max_length=3)
                wage = models.FloatField()
        
        First create a query set::
        
            from django_pandas.io import read_frame
            qs = MyModel.objects.all()
        
        To create a dataframe using all the fields in the underlying model ::
        
            df = read_frame(qs)
        
        The `df` will contain human readable column values for foreign key and choice
        fields. The `DataFrame` will include all the fields in the underlying
        model including the primary key.
        To create a DataFrame using specified field names::
        
             df = read_frame(qs, fieldnames=['age', 'wage', 'full_name'])
        
        To set ``full_name`` as the ``DataFrame`` index ::
        
            qs.to_dataframe(['age', 'wage'], index_col='full_name'])
        
        You can use filters and excludes ::
        
            qs.filter(age__gt=20, department='IT').to_dataframe(index_col='full_name')
        
        
        DataFrameManager
        -----------------
        ``django-pandas`` provides a custom manager to use with models that
        you want to render as Pandas Dataframes. The ``DataFrameManager``
        manager provides the ``to_dataframe`` method that returns
        your models queryset as a Pandas DataFrame. To use the DataFrameManager, first
        override the default manager (`objects`) in your model's definition
        as shown in the example below ::
        
            #models.py
        
            from django_pandas.managers import DataFrameManager
        
            class MyModel(models.Model):
        
                full_name = models.CharField(max_length=25)
                age = models.IntegerField()
                department = models.CharField(max_length=3)
                wage = models.FloatField()
        
                objects = DataFrameManager()
        
        
        This will give you access to the following QuerySet methods:
        
            - ``to_dataframe``
            - ``to_timeseries``
            - ``to_pivot_table``
        
        to_dataframe
        ^^^^^^^^^^^^^
        
        Returns a DataFrame from the QuerySet
        
        **Parameters**
        
            - fieldnames:  The model field names to utilise in creating the frame.
                        to span a relationship, use the field name of related
                        fields across models, separated by double underscores,
        
        
            - index: specify the field to use  for the index. If the index
                        field is not in the field list it will be appended
        
            - coerce_float: Attempt to convert the numeric non-string data
                            like object, decimal etc. to float if possible
        
            - verbose:  If  this is ``True`` then populate the DataFrame with the
                        human readable versions of any foreign key or choice fields
                        else use the actual value set in the model.
        
        Examples
        ^^^^^^^^^
        
        Create a dataframe using all the fields  in your model as follows ::
        
            qs = MyModel.objects.all()
        
            df = qs.to_dataframe()
        
        This will include your primary key. To create a DataFrame using specified
        field names::
        
             df = qs.to_dataframe(fieldnames=['age', 'department', 'wage'])
        
        To set ``full_name`` as the index ::
        
            qs.to_dataframe(['age', 'department', 'wage'], index='full_name'])
        
        You can use filters and excludes ::
        
            qs.filter(age__gt=20, department='IT').to_dataframe(index='full_name')
        
        to_timeseries
        --------------
        
        A convenience method for creating a time series i.e the
        DataFrame index is instance of a DateTime or PeriodIndex
        
        **Parameters**
        
            - fieldnames:  The model field names to utilise in creating the frame.
                to span a relationship, just use the field name of related
                fields across models, separated by double underscores,
        
            - index: specify the field to use  for the index. If the index
                field is not in the field list it will be appended. This
                is mandatory.
        
            - storage:  Specify if the queryset uses the `wide` or `long` format
                for data.
        
            -  pivot_columns: Required once the you specify `long` format
                storage. This could either be a list or string identifying
                the field name or combination of field. If the pivot_column
                is a single column then the unique values in this column become
                a new columns in the DataFrame
                If the pivot column is a list the values in these columns are
                concatenated (using the '-' as a separator)
                and these values are used for the new timeseries columns
        
            - values: Also required if you utilize the `long` storage the
                values column name is use for populating new frame values
        
            - freq: the offset string or object representing a target conversion
        
            - rs_kwargs: Arguments based on pandas.DataFrame.resample
        
            - verbose:  If  this is ``True`` then populate the DataFrame with the
                        human readable versions of any foreign key or choice fields
                        else use the actual value set in the model.
        
        Examples
        ^^^^^^^^^
        
        Using a *long* storage format ::
        
            #models.py
        
            class LongTimeSeries(models.Model):
                date_ix = models.DateTimeField()
                series_name = models.CharField(max_length=100)
                value = models.FloatField()
        
                objects = DataFrameManager()
        
        Some sample data:::
        
            ========   =====       =====
            date_ix    series_name value
            ========   =====       ======
            2010-01-01  gdp        204699
        
            2010-01-01  inflation  2.0
        
            2010-01-01  wages      100.7
        
            2010-02-01  gdp        204704
        
            2010-02-01  inflation  2.4
        
            2010-03-01  wages      100.4
        
            2010-02-01  gdp        205966
        
            2010-02-01  inflation  2.5
        
            2010-03-01  wages      100.5
            ==========  ========== ======
        
        
        Create a QuerySet ::
        
            qs = LongTimeSeries.objects.filter(date_ix__year__gte=2010)
        
        Create a timeseries dataframe ::
        
            df = qs.to_timeseries(index='date_ix',
                                  pivot_columns='series_name',
                                  values='value',
                                  storage='long')
            df.head()
        
            date_ix      gdp     inflation     wages
        
            2010-01-01   204966     2.0       100.7
        
            2010-02-01   204704      2.4       100.4
        
            2010-03-01   205966      2.5       100.5
        
        
        Using a *wide* storage format ::
        
            class WideTimeSeries(models.Model):
                date_ix = models.DateTimeField()
                col1 = models.FloatField()
                col2 = models.FloatField()
                col3 = models.FloatField()
                col4 = models.FloatField()
        
                objects = DataFrameManager()
        
            qs = WideTimeSeries.objects.all()
        
            rs_kwargs = {'how': 'sum', 'kind': 'period'}
            df = qs.to_timeseries(index='date_ix', pivot_columns='series_name',
                                  values='value', storage='long',
                                  freq='M', rs_kwargs=rs_kwargs)
        
        to_pivot_table
        --------------
        A convenience method for creating a pivot table from a QuerySet
        
        **Parameters**
        
           - fieldnames:  The model field names to utilise in creating the frame.
                to span a relationship, just use the field name of related
                fields across models, separated by double underscores,
           - values : column to aggregate, optional
           - rows : list of column names or arrays to group on
                Keys to group on the x-axis of the pivot table
           - cols : list of column names or arrays to group on
                Keys to group on the y-axis of the pivot table
           - aggfunc : function, default numpy.mean, or list of functions
                If list of functions passed, the resulting pivot table will have
                hierarchical columns whose top level are the function names
                (inferred from the function objects themselves)
           - fill_value : scalar, default None
                Value to replace missing values with
           - margins : boolean, default False
                Add all row / columns (e.g. for subtotal / grand totals)
           - dropna : boolean, default True
        
        **Example**
        ::
        
            # models.py
            class PivotData(models.Model):
                row_col_a = models.CharField(max_length=15)
                row_col_b = models.CharField(max_length=15)
                row_col_c = models.CharField(max_length=15)
                value_col_d = models.FloatField()
                value_col_e = models.FloatField()
                value_col_f = models.FloatField()
        
                objects = DataFrameManager()
        
        Usage ::
        
                rows = ['row_col_a', 'row_col_b']
                cols = ['row_col_c']
        
                pt = qs.to_pivot_table(values='value_col_d', rows=rows, cols=cols)
        
        
        .. end-here
        
        
        CHANGES
        ========
        
        0.6.4 (2021-02-08)
        ------------------
        Bumped version number as the previous release was incorrectly uploaded
        to pypi
        
        0.6.1 (2020-05-26)
        ------------------
        Supports the latest release of Pandas 1.0.3
        
        0.6.0 (2019-01-11)
        ------------------
        Removes compatibility with Django versions < 1.8
        
        
        0.5.2 (2019-01-3)
        -----------------
        **This is the last version that supports Django < 1.8**
        
        - Improved coerce_float option (thanks @ZuluPro )
        - Ensure compatibility with legacy versions of Django ( < 1.8)
        - Test pass with Django 2+ and python 3.7
        
        0.5.1 (2018-01-26)
        -------------------
        - Address Unicode decode error when installing with pip3 on docker (Thanks @utapyngo)
        
        0.5.0 (2018-01-20)
        ------------------
        - Django 2.0 compatibility (Thanks @meirains)
        
        0.4.5 (2017-10-4)
        -----------------
        - A Fix for fieldname deduplication bug thanks to @kgabbott
        
        0.4.4 (2017-07-16)
        -------------------
        - The `verbose` argument now handles more use cases (Thanks to @henhuy and
          Kevin Abbott)
        - Corece float argument add to ```to_timeseries()``` method (Thanks Yousuf Jawwad)
        
        0.4.3 (2017-06-02)
        --------------------
        - Fix doc typos and formatting
        - Prevent column duplication in read_frame (Thanks Kevin Abbott)
        
        0.4.2 (2017-05-22)
        --------------------
        - Compatibility with `pandas 0.20.1`
        - Support for Python 2.7 and 3.5 with Django versions 1.8+
        - Suport for Python 3.6 and Django 1.11
        - We still support legacy versions (Python 2.7 and Django 1.4)
        
        0.4.1 (2016-02-05)
        -------------------
        - Address the incompatibility with Django 1.9 due to the removal of
          specialized query sets like the
          `ValuesQuerySet <https://code.djangoproject.com/ticket/24211>`_
        - Address the removal of the ``PassThrougManager`` from  ``django-model-utils``
          version ``2.4``.  We've removed the dependency on django-model-utils and
          included the PassThroughManger (which was always a standalone tool
          distributed a part of django-model-utils) for compatibility with
          earlier versions of Django (<= 1.8). For more recent versions of
          Django we're using Django's built in ``QuerySet.as_manager()``.
        - Now supports Pandas 0.14.1 and above
        - The fall in Coverage in this release largely reflects the integration of
          the PassThroughManager into the code base. We'll add the required test
          coverage for the PassThroughManager in subsequent releases.
        
        0.3.1 (2015-10-25)
        -------------------
        - Extends the ability to span a ForeignKey relationship with double underscores
          to OneToOneField too thanks to Safe Hammad
        - Provide better support for  ManyToMany and OneToMany relations thanks to
          Jeff Sternberg and @MiddleFork
        
        0.3.0 (2015-06-16)
        ---------------------
        - This version supports Django 1.8
        - Support for Pandas 0.16
        
        0.2.2 (2015-03-02)
        ---------------------
        - Added Support for Django 1.7
        
        0.2.1 (2015-01-28)
        ---------------------
        - Added Support for Values QuerySets
        - Support for Python 2.6
        - Note we still have limited support for Django 1.7 but this will be coming in
          the next release
        
        0.2.0 (2014-06-15)
        --------------------
        
        - Added the ``io`` module so that DataFrames can be created from any
          queryset so you don't need to to add a ``DataFrame manager`` to your
          models. This is good for working with legacy projects.
        - added a Boolean ``verbose`` argument to all methods (which defaults to ``True``)
          This populate the DataFrame columns with the human readable versions of
          foreign key or choice fields.
        - Improved the performance DataFrame creation by removing dependency on
          ``np.core.records.fromrecords``
        - Loads of bug fixes, more tests and improved coverage and better
          documentation
        
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Web Environment
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3.3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Framework :: Django
