Metadata-Version: 2.1
Name: python-pyper
Version: 0.4.0
Summary: Concurrent Python made simple
Author-email: Richard Zhu <richard.zhu2@gmail.com>
Project-URL: Documentation, https://pyper-dev.github.io/pyper/
Project-URL: Repository, https://github.com/pyper-dev/pyper
Project-URL: Issues, https://github.com/pyper-dev/pyper/issues
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Software Development
Classifier: Framework :: AsyncIO
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: typing_extensions; python_version < "3.10"

<p align="center">
  <img src="https://raw.githubusercontent.com/pyper-dev/pyper/refs/heads/main/docs/src/assets/img/pyper.png" alt="Pyper" style="width: 500px;">
</p>
<p align="center" style="font-size: 1.5em;">
    <em>Concurrent Python made simple</em>
</p>

<p align="center">
<a href="https://github.com/pyper-dev/pyper/actions/workflows/test.yml" target="_blank">
    <img src="https://github.com/pyper-dev/pyper/actions/workflows/test.yml/badge.svg" alt="Test">
</a>
<a href="https://coveralls.io/github/pyper-dev/pyper" target="_blank">
    <img src="https://coveralls.io/repos/github/pyper-dev/pyper/badge.svg" alt="Coverage">
</a>
<a href="https://pypi.org/project/python-pyper" target="_blank">
    <img src="https://img.shields.io/pypi/v/python-pyper?color=%2334D058&label=pypi%20package" alt="Package version">
</a>
<a href="https://pypi.org/project/python-pyper" target="_blank">
    <img src="https://img.shields.io/pypi/pyversions/python-pyper.svg?color=%2334D058" alt="Supported Python versions">
</a>
</p>

---

Pyper is a comprehensive framework for concurrent and parallel data-processing, based on functional programming patterns. Used for 🌐 **Data Collection**, 🔀 **ETL Systems**, and general-purpose 🛠️ **Python Scripting**

See the [Documentation](https://pyper-dev.github.io/pyper/)

Key features:

* 💡**Intuitive API**: Easy to learn, easy to think about. Implements clean abstractions to seamlessly unify threaded, multiprocessed, and asynchronous work.
* 🚀 **Functional Paradigm**: Python functions are the building blocks of data pipelines. Let's you write clean, reusable code naturally.
* 🛡️ **Safety**: Hides the heavy lifting of underlying task execution and resource clean-up. No more worrying about race conditions, memory leaks, or thread-level error handling.
* ⚡ **Efficiency**: Designed from the ground up for lazy execution, using queues, workers, and generators.
* ✨ **Pure Python**: Lightweight, with zero sub-dependencies.

## Installation

Install the latest version using `pip`:

```console
$ pip install python-pyper
```

Note that `python-pyper` is the [pypi](https://pypi.org/project/python-pyper) registered package.

## Usage

In Pyper, the `task` decorator is used to transform functions into composable pipelines.

Let's simulate a pipeline that performs a series of transformations on some data. 

```python
import asyncio
import time

from pyper import task


def get_data(limit: int):
    for i in range(limit):
        yield i


async def step1(data: int):
    await asyncio.sleep(1)
    print("Finished async wait", data)
    return data


def step2(data: int):
    time.sleep(1)
    print("Finished sync wait", data)
    return data


def step3(data: int):
    for i in range(10_000_000):
        _ = i*i
    print("Finished heavy computation", data)
    return data


async def main():
    # Define a pipeline of tasks using `pyper.task`
    pipeline = task(get_data, branch=True) \
        | task(step1, workers=20) \
        | task(step2, workers=20) \
        | task(step3, workers=20, multiprocess=True)

    # Call the pipeline
    total = 0
    async for output in pipeline(limit=20):
        total += output
    print("Total:", total)


if __name__ == "__main__":
    asyncio.run(main())
```

Pyper provides an elegant abstraction of the execution of each function via `pyper.task`, allowing you to focus on building out the **logical** functions of your program. In the `main` function:

* `pipeline` defines a function; this takes the parameters of its first task (`get_data`) and yields each output from its last task (`step3`)
* Tasks are piped together using the `|` operator (motivated by Unix's pipe operator) as a syntactic representation of passing inputs/outputs between tasks.

In the pipeline, we are executing three different types of work:

* `task(step1, workers=20)` spins up 20 `asyncio.Task`s to handle asynchronous IO-bound work

* `task(step2, workers=20)` spins up 20 `threads` to handle synchronous IO-bound work 

* `task(step3, workers=20, multiprocess=True)` spins up 20 `processes` to handle synchronous CPU-bound work

`task` acts as one intuitive API for unifying the execution of each different type of function.

Each task submits their outputs to the next task within the pipeline via queue-based data structures, which is the mechanism underpinning how concurrency and parallelism are achieved. See the [docs](https://pyper-dev.github.io/pyper/docs/UserGuide/BasicConcepts) for a breakdown of what a pipeline looks like under the hood.

---

</details>

<details markdown="1">
<summary><u>See a non-async example</u></summary>

<br>

Pyper pipelines are by default non-async, as long as their tasks are defined as synchronous functions. For example:

```python
import time

from pyper import task


def get_data(limit: int):
    for i in range(limit):
        yield i

def step1(data: int):
    time.sleep(1)
    print("Finished sync wait", data)
    return data

def step2(data: int):
    for i in range(10_000_000):
        _ = i*i
    print("Finished heavy computation", data)
    return data


def main():
    pipeline = task(get_data, branch=True) \
        | task(step1, workers=20) \
        | task(step2, workers=20, multiprocess=True)
    total = 0
    for output in pipeline(limit=20):
        total += output
    print("Total:", total)


if __name__ == "__main__":
    main()
```

A pipeline consisting of _at least one asynchronous function_ becomes an `AsyncPipeline`, which exposes the same usage API, provided `async` and `await` syntax in the obvious places. This makes it effortless to combine synchronously defined and asynchronously defined functions where need be.

</details>

## Examples

To explore more of Pyper's features, see some further [examples](https://pyper-dev.github.io/pyper/docs/Examples)

## Dependencies

Pyper is implemented in pure Python, with no sub-dependencies. It is built on top of the well-established built-in Python modules:
* [threading](https://docs.python.org/3/library/threading.html) for thread-based concurrency
* [multiprocessing](https://docs.python.org/3/library/multiprocessing.html) for parallelism
* [asyncio](https://docs.python.org/3/library/asyncio.html) for async-based concurrency
* [concurrent.futures](https://docs.python.org/3/library/concurrent.futures.html) for unifying threads, processes, and async code

## License

This project is licensed under the terms of the MIT license.
