Metadata-Version: 2.4
Name: lazydag
Version: 0.2.0
Summary: A lazy data processing pipeline framework
License: MIT
License-File: LICENSE
Author: Majid Garoosi
Author-email: amoomajid99@gmail.com
Requires-Python: >=3.12,<4.0
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Requires-Dist: pytest (>=9.0.2,<10.0.0)
Requires-Dist: pyyaml (>=6.0.3,<7.0.0)
Requires-Dist: typer (>=0.21.0,<0.22.0)
Description-Content-Type: text/markdown

# LazyDAG

LazyDAG is a Python framework for managing data processing pipelines.
It focuses on incremental data processing and reactive execution:
if an input asset consists of multiple entries and one of them changes,
a process can recompute only the changed entry.
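
The idea of recomputing only changed entries can be illustrated without LazyDAG itself. The sketch below is plain Python and does not show LazyDAG's internals; `fingerprint` and `recompute_changed` are hypothetical helper names, and content hashing is just one assumed way to detect a changed entry.

```python
import hashlib


def fingerprint(entry: str) -> str:
    """Return a stable content hash for one entry."""
    return hashlib.sha256(entry.encode("utf-8")).hexdigest()


def recompute_changed(entries, seen, results, compute):
    """Recompute only the entries whose fingerprint differs from the last run.

    `seen` maps keys to the fingerprint recorded on the previous run and
    `results` holds the cached outputs. Returns the keys that were recomputed.
    """
    recomputed = []
    for key, value in entries.items():
        digest = fingerprint(value)
        if seen.get(key) != digest:
            results[key] = compute(value)
            seen[key] = digest
            recomputed.append(key)
    return recomputed


def times_ten(value: str) -> int:
    return int(value) * 10


entries = {"a": "1", "b": "2", "c": "3"}
seen, results = {}, {}

# First run: nothing is cached, so every entry is computed.
first = recompute_changed(entries, seen, results, times_ten)

# Change one entry; the second run touches only that entry.
entries["b"] = "5"
second = recompute_changed(entries, seen, results, times_ten)
```

After the second call, `second` contains only `"b"`, while the cached results for `"a"` and `"c"` are reused untouched.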

## Features
- **ObjectCollection**: Managed data storage with incremental updates and filesystem persistence.
- **Process**: Units of computation that react to changes in input collections.
- **Lazy/Reactive Execution**: Processes run only when their inputs change.
- **Daemons**: Background processes that continuously feed data into the pipeline.
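
To make the lazy/reactive behaviour concrete, here is a minimal sketch in plain Python. It is not LazyDAG's implementation: `Collection` and `Step` are hypothetical stand-ins for a collection and a process, and the version counter is an assumed change-detection mechanism chosen for brevity.

```python
class Collection:
    """Hypothetical stand-in for a collection: items plus a version counter."""

    def __init__(self, name):
        self.name = name
        self.items = []
        self.version = 0

    def replace(self, items):
        # Every write bumps the version so that readers can detect the change.
        self.items = list(items)
        self.version += 1


class Step:
    """Hypothetical stand-in for a process that runs only when its input changed."""

    def __init__(self, source, target, fn):
        self.source = source
        self.target = target
        self.fn = fn
        self.last_seen = 0  # version of `source` handled by the last run
        self.runs = 0

    def maybe_run(self):
        if self.source.version == self.last_seen:
            return False  # input unchanged, skip the work
        self.target.replace(self.fn(x) for x in self.source.items)
        self.last_seen = self.source.version
        self.runs += 1
        return True


numbers = Collection("Numbers")
doubled = Collection("Doubled")
step = Step(numbers, doubled, lambda x: x * 2)

ran_before_input = step.maybe_run()   # no input yet, nothing to do
numbers.replace([1, 2, 3])
ran_after_change = step.maybe_run()   # input changed, so the step runs
ran_again = step.maybe_run()          # input unchanged, so the step is skipped
```

In this sketch a scheduler would simply call `maybe_run()` on each step in dependency order; a daemon would be whatever keeps calling `numbers.replace(...)` in the background.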

## Usage

```python
from lazydag.schematic import define_process, define_collection
from lazydag.collections import ListObjectCollection
from lazydag.core import Process
from lazydag.scheduler import get_runtime

# Define your collections
oc1 = define_collection(ListObjectCollection, name="Numbers", save_path="./data/nums")
oc2 = define_collection(ListObjectCollection, name="Results", save_path="./data/results")

# Define your processes
class MyProcess(Process):
    inputs = ["nums"]
    outputs = ["results"]
    def run(self, nums: ListObjectCollection, results: ListObjectCollection):
        # ... logic ...
        pass

define_process(MyProcess, "my_proc", inputs={"nums": oc1}, outputs={"results": oc2})

# Start the scheduler
get_runtime().start()
```

