Metadata-Version: 1.1
Name: adenine
Version: 0.1.3
Summary: A Data ExploratioN pIpeliNE
Home-page: https://github.com/slipguru/adenine
Author: Samuele Fiorini, Federico Tomasi
Author-email: {samuele.fiorini, federico.tomasi}@dibris.unige.it
License: FreeBSD
Download-URL: https://github.com/slipguru/adenine/tarball/0.1.3
Description: =====================================
        ADENINE (A Data ExploratioN pIpeliNE)
        =====================================
        
        **ADENINE** is a machine learning and data mining Python pipeline that helps you answer this tedious question: are my data relevant to the problem I'm dealing with?
        
        The main structure of adenine can be summarized in the following 4 steps.
        
        1. **Imputing:** Does your dataset have missing entries? In the first step you can fill in the missing values by choosing among different strategies: feature-wise median, mean, or most frequent value, or a more stable k-NN imputation.
        
        2. **Preprocessing:** Have you ever wondered what would have changed if only your data had been preprocessed in a different way? Or whether data preprocessing is a good idea at all? ADENINE offers several preprocessing procedures, such as data centering, Min-Max scaling, standardization, and normalization, and lets you compare the results of analyses that start from different preprocessing steps.
        
        3. **Dimensionality Reduction:** In the context of data exploration, this phase becomes particularly helpful for high-dimensional data (e.g., in -omics scenarios). This step includes some manifold learning techniques (such as isomap, multidimensional scaling, etc.) and unsupervised dimensionality reduction techniques (principal component analysis, kernel PCA).
        
        4. **Clustering:** This step aims at grouping data into clusters in an unsupervised manner. Several techniques are offered, such as k-means, spectral, and hierarchical clustering.
        
        The final output of adenine is a compact and textual representation of the results obtained from the pipelines made with each possible combination of the algorithms implemented at each step.
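        
        As a rough illustration of that idea, the sketch below enumerates every combination of a few options for each of the four steps. It is not adenine's own API: it uses plain scikit-learn estimators and assumes a recent scikit-learn release in which ``SimpleImputer`` is available. The names ``STEPS``, ``build_pipelines`` and ``make_toy_data`` are hypothetical helpers introduced only for this example, and the toy data are random numbers, not a real dataset.
        
        .. code-block:: python
        
            from itertools import product
        
            import numpy as np
            from sklearn.cluster import AgglomerativeClustering, KMeans
            from sklearn.decomposition import PCA, KernelPCA
            from sklearn.impute import SimpleImputer
            from sklearn.pipeline import make_pipeline
            from sklearn.preprocessing import MinMaxScaler, StandardScaler
        
            # One dict per step (imputing, preprocessing, dimensionality
            # reduction, clustering). Each label maps to a factory, so every
            # pipeline gets its own fresh estimators.
            STEPS = [
                {"median": lambda: SimpleImputer(strategy="median"),
                 "mean": lambda: SimpleImputer(strategy="mean")},
                {"minmax": lambda: MinMaxScaler(),
                 "standardize": lambda: StandardScaler()},
                {"pca": lambda: PCA(n_components=2),
                 "kpca": lambda: KernelPCA(n_components=2, kernel="rbf")},
                {"kmeans": lambda: KMeans(n_clusters=3, n_init=10, random_state=0),
                 "hierarchical": lambda: AgglomerativeClustering(n_clusters=3)},
            ]
        
        
            def build_pipelines(steps=STEPS):
                """Return {label: pipeline} for every combination of step options."""
                pipelines = {}
                for combo in product(*(step.items() for step in steps)):
                    label = "/".join(name for name, _ in combo)
                    pipelines[label] = make_pipeline(*(make() for _, make in combo))
                return pipelines
        
        
            def make_toy_data(n_samples=30, n_features=5, missing_rate=0.1, seed=0):
                """Random data with some entries set to NaN to mimic missing values."""
                rng = np.random.RandomState(seed)
                X = rng.rand(n_samples, n_features)
                X[rng.rand(n_samples, n_features) < missing_rate] = np.nan
                return X
        
        
            if __name__ == "__main__":
                X = make_toy_data()
                for label, pipeline in build_pipelines().items():
                    # The last step is a clusterer, so fit_predict returns labels.
                    labels = pipeline.fit_predict(X)
                    print(label, np.bincount(labels))
        
        Two options per step give 2 x 2 x 2 x 2 = 16 pipelines. Each printed line shows a pipeline label followed by the size of each cluster it found, which is one simple way to compare the combinations side by side.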
        
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python
Classifier: License :: OSI Approved :: BSD License
Classifier: Topic :: Software Development
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Operating System :: POSIX
Classifier: Operating System :: Unix
Classifier: Operating System :: MacOS
Requires: numpy (>=1.10.1)
Requires: scipy (>=0.16.1)
Requires: sklearn (>=0.17)
Requires: matplotlib (>=1.5.1)
Requires: seaborn (>=0.7.0)
Requires: fastcluster (>=1.1.20)
