Metadata-Version: 2.1
Name: openGraphMatching
Version: 0.2.0
Summary: A subgraph matching programming library.
Home-page: https://github.com/chang2000/openGraphMatching
Author: WANG Tianchang, LI Yuxiang
Author-email: tianchang.wang.00@gmail.com
License: UNKNOWN
Description: 
        # openGraphMatching
        
        A Python Graph/Subgraph Matching programming library. Based on openGraphMatching and the framework provided, you can develop and test different subgraph matching algorithms efficiently. What's more, since the codebase is Python-based, algorithms with neural networks can be integrated easily with traditional subgraph matching algorithms. 
        
        # Usage
        
        Prerequisite `pytorch>=1.6, networkx, pytorch-geometric, deepsnap`.
        
        A detailed environment configration can be found in `env.yml`.
        
        The `BaseMatcher` is the minium implementation of subgraph matching algorithm.
        
        Here is the demo code for running GraphQL algorithm: 
        
        ```python
        import openGraphMatching.matcher as matcher
        
        # Prepare query graph q and target graph G in advance. q and G are networkx instance.
        
        m = matcher.GQLMatcher(G) # Initialize the object with targer graph G
        m.is_subgraph_match(q) # Run the check match process
        ```
        
        # Quickstart
        
        - Clone this repository: `git clone https://github.com/chang2000/openGraphMatching`
        - Install all prerequisite
          - [`networkx`](https://networkx.org/)
          - [`pytorch-geometric`](https://github.com/rusty1s/pytorch_geometric)
          - [`deepsnap`](https://github.com/snap-stanford/deepsnap)
        -  `pip install openGraphMatching` to install this package.
        - Go to `examples` and enjoy~
        
        
        
        For Conda users, the conda environment file `env.yml` is also provided.  This [link](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-from-an-environment-yml-file) provides the method to install conda environment by a environment file. In short, use this commend `conda env create -f env.yml`. A conda environment named `ogm` will be created.
        
        ## Internals
        
        Each matcher follows the `filtering->ordering->enumerating` design pattern. In the directory `matcher` there are three files: `filters.py`, `orders.py`,`enumeraters.py`, where each file contains the general algorithms. 
        
        For example, in the graphQL matcher, the code pipeline is like:
        
        ```python
        from openGraphMatching import BaseMatcher
        from openGraphMatching import filters as f
        from openGraphMatching import orders as o
        from openGraphMatching import enumeraters as e
        
        class GQLMatcher(BaseMatcher):
            def __init__(self, G):
                super().__init__(G)
        
            def filtering(self, q):
                prefilter = f.Filter(self.G)
                return prefilter.gql_filtering(q)
        
            def ordering(self, q, candidates):
                orderer = o.Order(self.G)
                return orderer.gql_order(q, candidates)
        
            def enumerate(self, q, imd, order, i):
                enu = e.Enumerater(self.G)
                enu.normal_enum(q, imd, order, i)
                return enu.res_getter()
        
            def is_subgraph_match(self, q):
              	# Some verbose and statistics are skipped here
                candidates = self.filtering(q)
                order = self.ordering(q, candidates)
                match_list = self.enumerate(q, candidates, order, 1)
                return match_list
        ```
        
        # Misc
        
        1. Validate the correctness of algorithm.
        
           - We use HPRD (9460 nodes, 34998 edges) to validate the correctness considering the time consuming and program performance.
           - We provides 200 quries(dense, 16 nodes).
           - A correctness checker is implemented in `utils.py`.
           - Expected result of all the matching are provided in `expected.res`. Refer to the dataset `validate` for more information.
        
        2. The data format of the datasets
        
           1. For each `.graph` file, the first line will always be `t x y` where `x` and `y` are two int indicates the number of nodes and edges
        
           2. Vertex data  `v v_id v_label v_degree`
        3. Edge data `e v_id v_id`
           
        
        
        
        ## Side word
        
        - Cannot handle yeast well even it has a smaller dataset size, since around 10% of nodes left for 3000 nodes.
        
        
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
