Metadata-Version: 2.1
Name: os-spage
Version: 0.5.1
Summary: Read and write spage.
Home-page: https://github.com/cfhamlet/os-spage
Author: Ozzy
Author-email: cfhamlet@gmail.com
License: MIT License
Description: # os-spage
        
        [![Build Status](https://www.travis-ci.org/cfhamlet/os-spage.svg?branch=master)](https://www.travis-ci.org/cfhamlet/os-spage)
        [![codecov](https://codecov.io/gh/cfhamlet/os-spage/branch/master/graph/badge.svg)](https://codecov.io/gh/cfhamlet/os-spage)
        [![PyPI - Python Version](https://img.shields.io/pypi/pyversions/os-spage.svg)](https://pypi.python.org/pypi/os-spage)
        [![PyPI](https://img.shields.io/pypi/v/os-spage.svg)](https://pypi.python.org/pypi/os-spage)
        
        
        Read and write Spage.
        
        Spage is a non-compact data structure that describes a fetched record. It generally contains four sub-blocks: *url*, *inner_header*, *http_header*, and *data*.
        
        Spage:
        - __url__: the URL.
        - __inner_header__: key-value pairs that record fetch/process information, such as fetch-time, data-digest, record-type, etc.
        - __http_header__: key-value pairs from the server's HTTP response headers.
        - __data__: the fetched data, either plain or compressed HTML.
        
        Spage is implemented as a dict. A predefined [schema](https://github.com/cfhamlet/os-spage/blob/master/src/os_spage/default_schema.py) can be used for validation.
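        
        As a rough illustration of the dict layout described above, the sketch below builds a record by hand and checks that the four sub-blocks are present. The key names simply mirror the sub-block names, and both the sample values and the `missing_keys` helper are hypothetical; this is not the library's predefined schema.
        
        ```python
        # Illustrative only: the values are made up, and the key names mirror the
        # four sub-blocks listed above (an assumption, not the library's schema).
        spage = {
            "url": "http://www.example.com/",
            "inner_header": {"batchID": "test"},
            "http_header": {"Content-Type": "text/html"},
            "data": b"<html>Hello world!</html>",
        }
        
        REQUIRED_KEYS = ("url", "inner_header", "http_header", "data")
        
        
        def missing_keys(record):
            """Return the sub-block names that are absent from a record dict."""
            return [key for key in REQUIRED_KEYS if key not in record]
        
        
        # An empty list means all four sub-blocks are present.
        print(missing_keys(spage))
        ```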
        
        Spage is commonly written to size-rotated files, so we chose [os-rotatefile](https://github.com/cfhamlet/os-rotatefile.git) as the default back-end.
        
        __Notice__: 
        1. os-spage should not be used for strict serialization/deserialization: type information is lost on write, and all data is read back as strings (unicode in Python 2).
        2. The data is usually stored in compressed format. You can use ``zlib.decompress`` to decompress it.
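        
        The decompression step mentioned in the second note might look like the sketch below. It uses only the standard `zlib` module; the `decode_data` helper is hypothetical, and the fallback to the raw bytes assumes that some records may hold uncompressed data.
        
        ```python
        import zlib
        
        
        def decode_data(data):
            """Return decompressed bytes, or the input unchanged if it is not zlib data."""
            try:
                return zlib.decompress(data)
            except zlib.error:
                # Assumption: data that fails to decompress was stored uncompressed.
                return data
        
        
        raw = b"<html>Hello world!</html>"
        compressed = zlib.compress(raw)
        
        # Both the compressed and the plain form yield the original bytes.
        print(decode_data(compressed) == raw)
        print(decode_data(raw) == raw)
        ```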
         
        -------------------------
        Offpage:
        
        From v0.4, this library supports reading from offpage. Offpage is another data storage format that includes the URL, headers and series data. You can use the ``read``/``open_file`` methods with ``page_type="offpage"`` to read from offpage.
        
        
        
        From v0.5, transforming spage into offpage is supported. You can use the ``read``/``open_file`` methods with ``page_type="s2o"`` to read from spage and transform each record into offpage format. (Not fully tested yet.)
        
        
        
        Example:
        
        ```
        from os_spage import read
        
        # Open in binary mode; the with-block closes the file when done.
        with open('your_spage', 'rb') as f:
            for offpage in read(f, page_type='s2o'):
                print(offpage)
        ```
        
        
        
        
        
        # Install
        
        `pip install os-spage`
        
        # Usage
        
          * Write to size-rotate-file
        
          ```
            from os_spage import open_file
        
            url = 'http://www.google.com/'
            inner_header = {'User-Agent': 'Mozilla/5.0', 'batchID': 'test'}
            http_header = {'Content-Type': 'text/html'}
            data = b"Hello world!"
        
            f = open_file('file', 'w', roll_size='1G', compress=True)
            f.write(url, inner_header=inner_header, http_header=http_header, data=data, flush=True)
            f.close()
          ```
        
          * Read from size-rotate-file
        
          ```
            from os_spage import open_file
        
            f = open_file('file', 'r')
        
            for record in f.read():
                print(record)
            f.close()
          ```
        
          * R/W with other file-like objects
        
          ```
            from io import BytesIO
            from os_spage import read, write
        
            s = BytesIO()
            write(s, "http://www.google.com/")
        
            s.seek(0)
            for record in read(s):
                print(record)
          ```
        
        
        # Unit Tests
        
        `$ tox`
        
        # License
        
        MIT licensed.
        
Platform: UNKNOWN
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Description-Content-Type: text/markdown
