Metadata-Version: 2.1
Name: familyanalyzer
Version: 0.7.2
Summary: A tool to analyse gene family evolution from orthoxml
Home-page: UNKNOWN
Author: Adrian Altenhoff
Author-email: adrian.altenhoff@inf.ethz.ch
License: MIT
Description: 
        
        WARNING:
        ========
        Family-Analyzer is outdated and has been replaced by pyHam,
        available at https://github.com/DessimozLab/pyham.
        
        
        
        Family-Analyzer: summarize gene family evolution from orthoxml 
        ==============================================================
        
        
        Motivation 
        ----------
        Family-Analyzer is a tool to further analyze the hierarchical orthologous
        groups (HOGs) from an orthoXML file. More information on the orthoXML schema
        and some examples are available at http://orthoxml.org.
        
        Family-Analyzer reports a summary of the evolutionary history of the gene
        families. The summary is given with respect to one or two taxonomic levels:
        it describes what happened after the specified level or, if two levels are
        given, between them, i.e. which genes were maintained, lost, duplicated or
        gained in that period.
        
        
        Installation
        ------------
        Family-Analyzer is written in Python 3 and has few external dependencies;
        currently the only one is the lxml library. The setup script should resolve
        this dependency automatically.
        Consider using pip to install the package directly from a checked-out git
        repository:
        
        .. code-block:: sh
        
           pip install -e </path/to/family-analyzer-repo/>
        
        
        
        Running Family-Analyzer
        -----------------------
        Running Family-Analyzer on a specific dataset is relatively easy.
        The main entry point is the 'main' section in
        familyanalyzer/familyanalyzer.py.
        
        If this script is called with -h as argument, it prints a short description
        of the required and optional arguments and what they are used for. The usage
        output at the time of writing is reproduced here. Since the tool is still
        work in progress, check the current output of -h in case the usage has
        changed.
        
        .. code-block:: sh
        
           python familyanalyzer/familyanalyzer.py -h
           
           usage: familyanalyzer.py [-h] [--xreftag XREFTAG] [--show_levels] [-r]
                                    [--taxonomy TAXONOMY] [--propagate_top]
                                    [--show_taxonomy]
                                    [--store_augmented_xml STORE_AUGMENTED_XML]
                                    [--compare_second_level COMPARE_SECOND_LEVEL]
                                    orthoxml level species [species ...]
           
           Analyze Hierarchical OrthoXML families.
           
           positional arguments:
             orthoxml              path to orthoxml file to be analyzed
             level                 taxonomic level at which analysis should be done
             species               (list of) species to be analyzed. Note that only genes
                                   of the selected species are reported. In order for the
                                   output to make sense, the selected species all must be
                                   part of the linages specified in 'level' (and
                                   --compare_second_level).
           
           optional arguments:
             -h, --help            show this help message and exit
             --xreftag XREFTAG     xref tag of genes to report. OrthoXML allows to store
                                   multiple ids and xref annotations per gene as
                                   attributes in the species section. If not set, the
                                   internal (purely numerical) ids are reported.
             --show_levels         print the levels and species found in the orthoXML
                                   file and quit
             -r, --use-recursion   DEPRECATED: Use recursion to sample families that are
                                   a subset of the query
             --taxonomy TAXONOMY   Taxonomy used to reconstruct intermediate levels. Has
                                   to be either 'implicit' (default) or a path to a file
                                   in Newick format. The taxonomy might be
                                   multifurcating. If set to 'implicit', the taxonomy is
                                   extracted from the input OrthoXML file. The orthoXML
                                   level do not have to cover all the levels for all
                                   families. In order to infer gene losses Family-
                                   Analyzer needs to infer these skipped levels and
                                   reconcile each family with the complete taxonomy.
             --propagate_top       propagate taxonomy levels up to the toplevel. As an
                                   illustration, consider a gene family in an eukaryotic
                                   analysis that has only mammalian genes. Its topmost
                                   taxonomic level will therefor be 'Mammalia' and an
                                   ancestral gene was gained at that level. However, if
                                   '--propagete-top' is set, the family is assumed to
                                   have already be present in the topmost taxonomic
                                   level, i.e. Eukaryota in this example, and non-
                                   mammalian species have all lost this gene.
             --show_taxonomy       write the taxonomy used to standard out.
             --store_augmented_xml STORE_AUGMENTED_XML
                                   filename to which the input orthoxml file with
                                   augmented annotations is written. The augmented
                                   annotations include for example the additional
                                   taxonomic levels of orthologGroup and unique HOG IDs.
             --compare_second_level COMPARE_SECOND_LEVEL
                                   Compare secondary level with primary one, i.e. report
                                   what happend between the secondary and primary level
                                   to the individual histories. Note that the Second
                                   level needs to be younger than the primary.
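        
        As a rough illustration of the kind of information '--show_levels' reports,
        the following sketch lists the species and taxonomic levels found in a small
        orthoXML document using only the Python standard library. It does not use
        Family-Analyzer itself. The embedded document and the helper
        'species_and_levels' are hypothetical, and the sketch assumes that levels
        are stored as 'TaxRange' properties in the 2011 orthoXML namespace.
        
        .. code-block:: python
        
           import xml.etree.ElementTree as ET
        
           # Namespace of the orthoXML schema (assumed: the 2011 version).
           ORTHO_NS = "http://orthoXML.org/2011/"
        
           # Tiny hand-written document, for illustration only.
           EXAMPLE = """
           <orthoXML xmlns="http://orthoXML.org/2011/" version="0.3"
                     origin="example" originVersion="1">
             <species name="HUMAN" NCBITaxId="9606">
               <database name="example" version="1">
                 <genes><gene id="1" protId="HUMAN_G1"/></genes>
               </database>
             </species>
             <species name="MOUSE" NCBITaxId="10090">
               <database name="example" version="1">
                 <genes><gene id="2" protId="MOUSE_G1"/></genes>
               </database>
             </species>
             <species name="CANFA" NCBITaxId="9615">
               <database name="example" version="1">
                 <genes><gene id="3" protId="CANFA_G1"/></genes>
               </database>
             </species>
             <groups>
               <orthologGroup id="1">
                 <property name="TaxRange" value="Mammalia"/>
                 <orthologGroup id="2">
                   <property name="TaxRange" value="Euarchontoglires"/>
                   <geneRef id="1"/>
                   <geneRef id="2"/>
                 </orthologGroup>
                 <geneRef id="3"/>
               </orthologGroup>
             </groups>
           </orthoXML>
           """
        
        
           def species_and_levels(xml_text):
               """Return (species names, sorted TaxRange values) of a document."""
               root = ET.fromstring(xml_text.strip())
               # Species names, in document order.
               species = [sp.get("name")
                          for sp in root.iter("{%s}species" % ORTHO_NS)]
               # Distinct taxonomic levels annotated on any (nested) group.
               levels = {prop.get("value")
                         for prop in root.iter("{%s}property" % ORTHO_NS)
                         if prop.get("name") == "TaxRange"}
               return species, sorted(levels)
        
        
           species, levels = species_and_levels(EXAMPLE)
           print("species:", ", ".join(species))
           print("levels: ", ", ".join(levels))
        
        Any level listed this way is a candidate value for the 'level' argument, and
        the species names are candidates for the 'species' arguments.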
        
        
        Code organisation
        -----------------
        
        OrthoXMLParser
           Class which holds the orthoXML file, gives access to its data and keeps
           internal mappings to speed up lookups.
        
        Taxonomy
           Class which provides basic navigation through the species taxonomy.
           Objects are constructed using the TaxonomyFactory and can be based
           either on the orthoXML file or on a Newick tree.
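        
        To make the idea of taxonomy navigation more concrete, here is a minimal
        sketch of how the levels skipped between a younger level and one of its
        ancestors could be looked up. It is not the package's Taxonomy class: the
        simplified toy taxonomy 'PARENT' and the helpers 'lineage' and
        'skipped_levels' are hypothetical names chosen for this illustration.
        
        .. code-block:: python
        
           # Toy, simplified taxonomy stored as child -> parent (None marks the root).
           PARENT = {
               "HUMAN": "Primates",
               "MOUSE": "Rodentia",
               "CANFA": "Mammalia",
               "Primates": "Euarchontoglires",
               "Rodentia": "Euarchontoglires",
               "Euarchontoglires": "Mammalia",
               "Mammalia": None,
           }
        
        
           def lineage(node, parent=PARENT):
               """Return the path from node up to the root, both inclusive."""
               path = []
               while node is not None:
                   path.append(node)
                   node = parent[node]
               return path
        
        
           def skipped_levels(younger, older, parent=PARENT):
               """Return the levels strictly between younger and its ancestor older."""
               path = lineage(younger, parent)
               if older not in path:
                   raise ValueError("%s is not an ancestor of %s" % (older, younger))
               return path[1:path.index(older)]
        
        
           print(skipped_levels("HUMAN", "Mammalia"))
           print(skipped_levels("CANFA", "Mammalia"))
        
        A family annotated only at 'Mammalia' that contains a human gene implicitly
        passes through the intermediate levels returned by the first call; the second
        call returns an empty list because no level lies between the two.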
        
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Description-Content-Type: text/x-rst
