Metadata-Version: 2.1
Name: cas-manifest
Version: 0.1.1
Summary: cas-manifest allows developers to store artifacts in a _content-addressable_ store using a self-describing _manifest_
Home-page: https://github.com/danielhfrank/cas-manifest
License: MIT
Author: Dan Frank
Requires-Python: >=3.7,<4.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Requires-Dist: hashfs (>=0.7.2,<0.8.0)
Requires-Dist: pydantic (>=1.6.1,<2.0.0)
Project-URL: Repository, https://github.com/danielhfrank/cas-manifest
Description-Content-Type: text/markdown

# CAS-Manifest

This package facilitates storing artifacts in content-addressable storage (CAS) via the `hashfs` library. In a CAS regime, the hash of an artifact's contents is used as its key.
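To make the idea of content addressing concrete, here is a minimal sketch using only the standard library. `ToyCAS` is a hypothetical in-memory stand-in written for illustration; it is not `hashfs` and makes no claim about how `hashfs` lays out files. The choice of SHA-256 is likewise an assumption.

```python
import hashlib


class ToyCAS:
    """In-memory stand-in for a content-addressable store (hypothetical, not hashfs)."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The key is derived from the content itself, never chosen by the caller.
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]


store = ToyCAS()
key_a = store.put(b"a,b\n1,2\n")
key_b = store.put(b"a,b\n1,2\n")    # identical bytes -> identical key
key_c = store.put(b"a\tb\n1\t2\n")  # different bytes -> different key
```

Because the key depends only on the bytes, storing the same artifact twice is idempotent, and a key can never silently point at changed content.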

It further requires that artifacts be `pydantic` models. This allows for stable serialization of the artifacts and makes the data self-describing.

Consider an example: suppose your application works with datasets, some serialized as CSV files and others as TSV files. Some have header rows and some do not. Rather than writing data-loading code that tries to infer the correct way to deserialize a dataset file, `cas-manifest` serializes all relevant attributes of the dataset along with the data file itself. Your code might look like this:
```python
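from hashfs import HashFS
from cas_manifest.registry import Registry
# my_classes stands in for your own module defining the dataset models
from my_classes import CSVDataset, TSVDataset

# Root directory of the content-addressable store
fs = HashFS('/path/to/data')
# Hash of a dataset already in the store (placeholder value)
dataset_hash = '5fef4a'
# The registry is given the classes it may deserialize into
registry = Registry(fs, [CSVDataset, TSVDataset])
obj = registry.load(dataset_hash)
# obj is an instance of either CSVDataset or TSVDataset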
```
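The self-describing idea behind that snippet can be sketched without the library. The code below is a toy illustration under stated assumptions, not the `cas_manifest.registry.Registry` implementation: `ToyRegistry`, its `dump` method, the manifest layout (`type` plus `fields`), and the field names `data_hash` and `has_header` are all hypothetical, and standard-library dataclasses are swapped in for `pydantic` models so the example has no dependencies.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass
class CSVDataset:
    data_hash: str    # key of the raw data file in the store (hypothetical field)
    has_header: bool


@dataclass
class TSVDataset:
    data_hash: str
    has_header: bool


class ToyRegistry:
    """Hypothetical illustration; not the cas_manifest Registry."""

    def __init__(self, classes):
        self._classes = {cls.__name__: cls for cls in classes}
        self._store = {}  # hash -> serialized manifest bytes

    def dump(self, obj) -> str:
        # The manifest records its own type, so loaders never have to guess.
        manifest = {"type": type(obj).__name__, "fields": asdict(obj)}
        # sort_keys keeps serialization deterministic: equal objects hash equally.
        payload = json.dumps(manifest, sort_keys=True).encode("utf-8")
        key = hashlib.sha256(payload).hexdigest()
        self._store[key] = payload
        return key

    def load(self, key: str):
        manifest = json.loads(self._store[key])
        cls = self._classes[manifest["type"]]  # KeyError if the type is unregistered
        return cls(**manifest["fields"])


registry = ToyRegistry([CSVDataset, TSVDataset])
key = registry.dump(TSVDataset(data_hash="5fef4a", has_header=False))
obj = registry.load(key)
```

The caller of `load` supplies only a hash; the stored manifest decides which class comes back, which is what removes the need for format-inference code.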

