Metadata-Version: 2.1
Name: s3bz
Version: 0.1.19
Summary: for saving dictionaries using s3 with bz2 compression
Home-page: https://github.com/thanakijwanavit/s3bz/tree/master/
Author: nic wanavit
Author-email: nwanavit@hatari.cc
License: Apache Software License 2.0
Description: # S3Bz
        > Save and load dictionaries to S3 using bz2 compression
        
        
        Full docs: https://thanakijwanavit.github.io/s3bz/
        
        ## Install
        
        `pip install s3bz`
        
        ## How to use
        
        ### Create a bucket and make sure that it has transfer acceleration enabled
        #### create a bucket
        `aws s3 mb s3://<bucketname>`
        #### enable transfer acceleration
        `aws s3api put-bucket-accelerate-configuration --bucket <bucketname> --accelerate-configuration Status=Enabled`
        
        ## import package
        
        First, import the `S3` class from the `s3bz.s3bz` module.
        
        ```python
        from s3bz.s3bz import S3
        ```
        
        ### set up dummy data
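        
        The examples in this README use `sampleDict`, `bucket`, `key`, `USER` and `PW` without defining them. A minimal sketch of such a setup follows. The values are assumptions reconstructed from the outputs shown in this README: the record is the one printed by the load example (which indexes `result[0]`, so a list of records is assumed), the bucket and key are the ones visible in the printed outputs, and `USER`/`PW` are assumed to be an AWS access key ID and secret access key.
        
        ```python
        import os
        
        # Assumed sample data: a list holding the single record printed by the
        # load example in this README.
        sampleDict = [{'ib_prcode': '57450', 'ib_brcode': '1007',
                       'ib_cf_qty': '298', 'new_ib_vs_stock_cv': '939'}]
        
        bucket = 'pybz-test'  # bucket name that appears in the printed outputs
        key = 'test.dict'     # key that appears in the presigned URL output
        
        # Assumption: USER and PW are AWS credentials. They are read from the
        # standard AWS environment variables and are None when unset.
        USER = os.environ.get('AWS_ACCESS_KEY_ID')
        PW = os.environ.get('AWS_SECRET_ACCESS_KEY')
        ```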
        
        ## BZ2 compression
        
        ### save object using bz2 compression
        
        ```python
        result = S3.save(key=key,
                         objectToSave=sampleDict,
                         bucket=bucket,
                         user=USER,
                         pw=PW,
                         accelerate=True)
        # result is used as an index: falsy prints 'failed', truthy prints 'success'
        print(('failed', 'success')[result])
        ```
        
            success
        
        
        ### load object with bz2 compression
        
        ```python
        result = S3.load(key=key,
                         bucket=bucket,
                         user=USER,
                         pw=PW,
                         accelerate=True)
        print(result[0])
        ```
        
            {'ib_prcode': '57450', 'ib_brcode': '1007', 'ib_cf_qty': '298', 'new_ib_vs_stock_cv': '939'}
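        
        The traceback in the download section of this README shows the library decoding a payload with `bz2.decompress` followed by `json.loads`. The same round trip can be sketched locally, without S3, assuming the save side mirrors it with `json.dumps` and `bz2.compress`. This is an illustration only, not the library's actual implementation.
        
        ```python
        import bz2
        import json
        
        record = {'ib_prcode': '57450', 'ib_brcode': '1007'}
        
        # Encode to JSON bytes, then compress with bz2.
        compressed = bz2.compress(json.dumps(record).encode())
        
        # Reverse the steps: decompress, then parse the JSON.
        restored = json.loads(bz2.decompress(compressed))
        
        assert restored == record
        assert compressed[:3] == b'BZh'  # every bz2 stream starts with this magic
        ```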
        
        
        ## other compressions
        - `Zl` (`saveZl` / `loadZl`): zlib compression with JSON string encoding
        - `PklZl` (`savePklZl` / `loadPklZl`): zlib compression with pickle encoding
        
        ```python
        # %time is an IPython magic, so run this in IPython or Jupyter
        print(bucket)
        %time S3.saveZl(key, sampleDict, bucket)
        %time S3.loadZl(key, bucket)
        %time S3.savePklZl(key, sampleDict, bucket)
        %time result = S3.loadPklZl(key, bucket)
        ```
        
            pybz-test
            CPU times: user 18.8 ms, sys: 3.88 ms, total: 22.7 ms
            Wall time: 73.6 ms
            CPU times: user 32.2 ms, sys: 0 ns, total: 32.2 ms
            Wall time: 267 ms
            CPU times: user 17.8 ms, sys: 3.76 ms, total: 21.6 ms
            Wall time: 94.1 ms
            CPU times: user 31.2 ms, sys: 298 µs, total: 31.5 ms
            Wall time: 156 ms
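        
        The practical difference between the two encodings can be seen locally with the standard library alone. The sketch below does not call s3bz. It only assumes that "JSON string encoding" means `json.dumps` and "pickle encoding" means `pickle.dumps`, each followed by `zlib.compress`.
        
        ```python
        import json
        import pickle
        import zlib
        
        data = {1: (2, 3)}  # an int key and a tuple value
        
        # zlib + JSON: portable, but only JSON types survive the round trip.
        via_json = json.loads(zlib.decompress(zlib.compress(json.dumps(data).encode())))
        
        # zlib + pickle: keeps Python types; only unpickle data you trust.
        via_pickle = pickle.loads(zlib.decompress(zlib.compress(pickle.dumps(data))))
        
        print(via_json)    # {'1': [2, 3]}
        print(via_pickle)  # {1: (2, 3)}
        ```
        
        JSON turns the integer key into a string and the tuple into a list, so pickle is the safer choice when exact Python types matter and the data comes from a trusted source.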
        
        
        ## Bring your own compressor and encoder
        
        ```python
        import gzip, json
        compressor=lambda x: gzip.compress(x)
        encoder=lambda x: json.dumps(x).encode()
        decompressor=lambda x: gzip.decompress(x)
        decoder=lambda x: json.loads(x.decode())
        
        %time S3.generalSave(key, sampleDict, bucket=bucket, compressor=compressor, encoder=encoder)
        %time result = S3.generalLoad(key, bucket, decompressor=decompressor, decoder=decoder)
        assert result == sampleDict, 'not the same as sample dict'
        ```
        
            CPU times: user 29.2 ms, sys: 68 µs, total: 29.3 ms
            Wall time: 217 ms
            CPU times: user 31 ms, sys: 265 µs, total: 31.3 ms
            Wall time: 103 ms
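        
        Before handing a custom pair to `generalSave` and `generalLoad`, it can help to confirm locally that the decoder and decompressor really invert the encoder and compressor. The sketch below uses `lzma` as an example compressor. `round_trips` is a hypothetical helper written for this illustration and is not part of s3bz.
        
        ```python
        import json
        import lzma
        
        encoder = lambda x: json.dumps(x).encode()
        compressor = lambda x: lzma.compress(x)
        decompressor = lambda x: lzma.decompress(x)
        decoder = lambda x: json.loads(x.decode())
        
        def round_trips(obj):
            """Return True if obj survives encode -> compress -> decompress -> decode."""
            return decoder(decompressor(compressor(encoder(obj)))) == obj
        
        assert round_trips({'ib_prcode': '57450', 'qty': [1, 2, 3]})
        ```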
        
        
        ## check if an object exists
        
        ```python
        result = S3.exist('', bucket, user=USER, pw=PW, accelerate = True)
        print(('doesnt exist', 'exist')[result])
        ```
        
            exist
        
        
        ## presign download object
        
        ```python
        url = S3.presign(key=key,
                      bucket=bucket,
                      expiry = 1000,
                      user=USER,
                      pw=PW)
        print(url)
        ```
        
            https://pybz-test.s3-accelerate.amazonaws.com/test.dict?AWSAccessKeyId=AKIAVX4Z5TKDSNNNULGB&Signature=JNcnO2HKa%2FFf9JUklr5F8II7KS4%3D&Expires=1616656869
        
        
        ### download using signed link
        
        ```python
        from s3bz.s3bz import Requests
        result = Requests.getContentFromUrl(url)
        ```
        
        In the recorded run this call failed. `getContentFromUrl` decompresses the response with bz2 (see the traceback), while the object at `key` had most likely last been written with gzip by `generalSave` in the section above:
        
        
            ---------------------------------------------------------------------------
        
            OSError                                   Traceback (most recent call last)
        
            <ipython-input-16-238c0520ce77> in <module>
                  1 from s3bz.s3bz import Requests
            ----> 2 result = Requests.getContentFromUrl(url)
            
        
            /mnt/efs/pip/s3bz/s3bz/s3bz.py in getContentFromUrl(url)
                221         return result.content
                222       content = result.content
            --> 223       decompressedContent = bz2.decompress(content)
                224       contentDict = json.loads(decompressedContent)
                225       return contentDict
        
        
            ~/SageMaker/.persisted_conda/python38/lib/python3.8/bz2.py in decompress(data)
                348         decomp = BZ2Decompressor()
                349         try:
            --> 350             res = decomp.decompress(data)
                351         except OSError:
                352             if results:
        
        
            OSError: Invalid data stream
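        
        One way to guard against this kind of mismatch is to try several decompressors on the downloaded bytes. The sketch below is a hypothetical helper, not part of s3bz. It assumes the payload is JSON compressed with bz2, gzip or zlib, and it works on raw bytes such as the body of a response fetched from the presigned URL.
        
        ```python
        import bz2
        import gzip
        import json
        import zlib
        
        def decode_payload(content: bytes):
            """Try bz2, gzip and zlib in turn, then parse the result as JSON."""
            for decompress in (bz2.decompress, gzip.decompress, zlib.decompress):
                try:
                    raw = decompress(content)
                except (OSError, EOFError, ValueError, zlib.error):
                    continue  # wrong format, try the next decompressor
                return json.loads(raw)
            raise ValueError('payload is not bz2-, gzip- or zlib-compressed JSON')
        
        payload = {'ib_prcode': '57450'}
        encoded = json.dumps(payload).encode()
        assert decode_payload(gzip.compress(encoded)) == payload
        ```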
        
        
        ## File operations
        
        ### save without compression
        
        ```python
        inputPath = '/tmp/tmpFile.txt'
        key = 'tmpFile'
        downloadPath = '/tmp/downloadTmpFile.txt'
        with open(inputPath, 'w') as f:
            f.write('hello world')
        ```
        
        ```python
        S3.saveFile(key=key, path=inputPath, bucket=bucket)
        # test: the object should now exist
        S3.exist(key, bucket)
        ```
        
        ### load without compression
        
        ```python
        S3.loadFile(key=key, path=downloadPath, bucket=bucket)
        ```
        
        ```python
        # test: print the contents of the downloaded file
        with open(downloadPath, 'r') as f:
            print(f.read())
        ```
        
        ### delete
        
        ```python
        result = S3.deleteFile(key, bucket)
        # test: the object should no longer exist
        S3.exist(key, bucket)
        ```
        
        ## save and load pandas dataframe
        
        ```python
        # pandas must be installed separately; it is not listed in the
        # requirements, to keep the package small
        import pandas as pd
        df = pd.DataFrame({'test': [1, 2, 3, 4, 5], 'test2': [2, 3, 4, 5, 6]})
        S3.saveDataFrame(bucket, key, df)
        S3.loadDataFrame(bucket, key)
        ```
        
        ## presign post with conditions
        
        ```python
        from s3bz.s3bz import ExtraArgs, S3
        ```
        
        ```python
        bucket = 'pybz-test'
        key = 'test.dict'
        fields = {**ExtraArgs.jpeg}
        S3.presignUpload(bucket, key, fields=fields)
        ```
        
        
        
        
            {'url': 'https://pybz-test.s3-accelerate.amazonaws.com/',
             'fields': {'Content-Type': 'image/jpeg',
              'key': 'test.dict',
              'AWSAccessKeyId': 'AKIAVX4Z5TKDSNNNULGB',
              'policy': 'eyJleHBpcmF0aW9uIjogIjIwMjEtMDMtMjVUMDc6MjM6MDVaIiwgImNvbmRpdGlvbnMiOiBbeyJidWNrZXQiOiAicHliei10ZXN0In0sIHsia2V5IjogInRlc3QuZGljdCJ9XX0=',
              'signature': '4qdnfHf0AiLSOywEUAbukYTLNcw='}}
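        
        The `policy` field of a presigned POST is base64-encoded JSON that lists the conditions S3 enforces on the upload. As a small sketch, the value from the output above can be decoded with the standard library to inspect those conditions. Only the variable names are introduced here; the policy string is copied from the output.
        
        ```python
        import base64
        import json
        
        # 'policy' value copied from the presignUpload output above
        policy_b64 = 'eyJleHBpcmF0aW9uIjogIjIwMjEtMDMtMjVUMDc6MjM6MDVaIiwgImNvbmRpdGlvbnMiOiBbeyJidWNrZXQiOiAicHliei10ZXN0In0sIHsia2V5IjogInRlc3QuZGljdCJ9XX0='
        
        policy = json.loads(base64.b64decode(policy_b64))
        
        print(policy['expiration'])   # 2021-03-25T07:23:05Z
        for condition in policy['conditions']:
            print(condition)          # {'bucket': 'pybz-test'} then {'key': 'test.dict'}
        ```
        
        In this particular output the decoded policy holds only the bucket and key conditions, while `Content-Type` appears in `fields`.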
        
        
        
Keywords: s3 bz2
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Requires-Python: >=3.6
Description-Content-Type: text/markdown
