Metadata-Version: 2.1
Name: spydy
Version: 0.1.23
Summary: light-weight high-level web-crawling framework
Home-page: https://github.com/superjcd/spydy
Author: Jiang Chaodi
Author-email: 929760274@qq.com
License: BSD
Download-URL: https://github.com/superjcd/spydy/tarball/0.1.23
Description: 
        # Spydy
        
        spydy is a lightweight, high-level web-crawling framework for fast development and high performance, inspired by the Unix pipeline.
        
        ---
        
        [Code](https://github.com/superjcd/spydy)
        
        [Document](https://superjcd.github.io/spydy/)
        
        ---
        
        ## Install
        
        ```
        pip install spydy
        ```
        
        
        
        ## How to use
        
        There are two ways of running spydy:
        
        - one way is to prepare a configuration file and run spydy from the command line:
        
        ```
        spydy myconfig.cfg
        ```
        
        `myconfig.cfg` might look like this:
        
        ```
        [Globals]
        run_mode = async_forever
        nworkers = 4
        
        [PipeLine]
        url = DummyUrls
        request = AsyncHttpRequest
        parser = DmozParser
        log = MessageLog
        store = CsvStore
        
        [url]
        url = https://dmoz-odp.org
        repeat = 10
        
        [store]
        file_name = result.csv
        ```
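        
        To make the layout of this file easier to follow, here is a minimal sketch that reads the same text with Python's standard-library `configparser`. It does not show how spydy itself loads the file. The helper name `read_config` is hypothetical, and the reading that sections such as `[url]` and `[store]` hold the arguments for the step of the same name is an assumption inferred from the example above.
        
        ```python
        import configparser
        
        # Same content as the myconfig.cfg example above.
        CONFIG_TEXT = """
        [Globals]
        run_mode = async_forever
        nworkers = 4
        
        [PipeLine]
        url = DummyUrls
        request = AsyncHttpRequest
        parser = DmozParser
        log = MessageLog
        store = CsvStore
        
        [url]
        url = https://dmoz-odp.org
        repeat = 10
        
        [store]
        file_name = result.csv
        """
        
        
        def read_config(text):
            """Hypothetical helper: parse INI text into a dict of plain dicts."""
            parser = configparser.ConfigParser()
            parser.read_string(text)
            return {name: dict(parser[name]) for name in parser.sections()}
        
        
        config = read_config(CONFIG_TEXT)
        
        # The [PipeLine] section lists the steps in the order they are written.
        steps = list(config["PipeLine"].items())
        for step_name, class_name in steps:
            # Assumption: a section named after a step carries that step's arguments.
            step_args = config.get(step_name, {})
            print(step_name, class_name, step_args)
        ```
        
        Note that `configparser` returns every value as a string, so `nworkers` and `repeat` come back as `"4"` and `"10"`.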
        
        
        
        - or run it from a Python file (e.g. `spider.py`):
        
        ```
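        from spydy.engine import Engine
        from spydy.utils import check_configs
        from spydy import urls, request, parsers, logs, store
        
        # Output file for CsvStore (same name as in the config-file example)
        FILE_NAME = "result.csv"
        
        myconfig = {
            "Globals": {
                "run_mode": "async_forever",
                "nworkers": "4",
            },
            # Steps run in order: urls -> request -> parser -> log -> store
            "PipeLine": [
                urls.DummyUrls(url="https://dmoz-odp.org", repeat=10),
                request.AsyncHttpRequest(),
                parsers.DmozParser(),
                logs.MessageLog(),
                store.CsvStore(file_name=FILE_NAME),
            ],
        }
        
        # Validate the configuration, then build and run the spider
        check_configs(myconfig)
        spider = Engine.from_dict(myconfig)
        spider.run()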
        ```
        
        Then run it:
        
        ```
        $ python spider.py
        ```
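        
        The Unix-pipeline idea behind the `PipeLine` setting can be illustrated in plain Python: each step takes the previous step's output and hands its own result to the next one. This is only a conceptual sketch, not spydy's implementation. `make_pipeline`, `fake_request`, `fake_parser` and `fake_log` are hypothetical stand-ins, and no network request is made.
        
        ```python
        from functools import reduce
        
        
        def make_pipeline(*steps):
            """Compose steps so that each one receives the previous step's output."""
            def run(item):
                return reduce(lambda value, step: step(value), steps, item)
            return run
        
        
        # Hypothetical stand-ins for the request, parser and log steps.
        def fake_request(url):
            # Pretend to download the page; returns canned HTML instead.
            return {"url": url, "body": "<title>Example</title>"}
        
        
        def fake_parser(response):
            # Extract the text between <title> and </title>.
            body = response["body"]
            start = body.index("<title>") + len("<title>")
            end = body.index("</title>")
            return {"url": response["url"], "title": body[start:end]}
        
        
        def fake_log(item):
            # Report the item and pass it on unchanged.
            print("parsed:", item)
            return item
        
        
        pipeline = make_pipeline(fake_request, fake_parser, fake_log)
        result = pipeline("https://dmoz-odp.org")
        ```
        
        Because every step has the same shape (one value in, one value out), steps can be reordered or swapped without touching the others, much like commands joined with `|` in a shell.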
        
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Description-Content-Type: text/markdown
