Metadata-Version: 2.1
Name: acl-anthology-py
Version: 0.4.0
Summary: A library for accessing the ACL Anthology
Home-page: https://github.com/mbollmann/acl-anthology-py
License: Apache-2.0
Author: Marcel Bollmann
Author-email: marcel@bollmann.me
Requires-Python: >=3.10, !=2.7.*, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*, !=3.5.*, !=3.6.*, !=3.7.*, !=3.8.*, !=3.9.*
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: POSIX
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Interface Engine/Protocol Translator
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Dist: PyYAML (>=6.0,<7.0)
Requires-Dist: app-paths (>=0.0.7,<0.0.8)
Requires-Dist: attrs (>=23.1.0,<24.0.0)
Requires-Dist: diskcache (>=5.6.1,<6.0.0)
Requires-Dist: docopt (>=0.6.2,<0.7.0)
Requires-Dist: latexcodec (>=2.0.1,<3.0.0)
Requires-Dist: lxml (>=4.9.2,<5.0.0)
Requires-Dist: numpy (>=1.26.0,<2.0.0)
Requires-Dist: omegaconf (>=2.3.0,<3.0.0)
Requires-Dist: python-slugify[unidecode] (>=8.0.1,<9.0.0)
Requires-Dist: rich (>=13.3.5,<14.0.0)
Requires-Dist: scipy (>=1.6.0,<2.0.0)
Requires-Dist: texsoup (>=0.3.1,<0.4.0)
Requires-Dist: typing-extensions (>=4.6.0,<5.0.0) ; python_version < "3.11"
Project-URL: Repository, https://github.com/mbollmann/acl-anthology-py
Description-Content-Type: text/markdown

# acl-anthology-py

[![License](https://img.shields.io/github/license/mbollmann/acl-anthology-py)](LICENSE)
[![Build Status](https://img.shields.io/github/actions/workflow/status/mbollmann/acl-anthology-py/code-quality.yml)](https://github.com/mbollmann/acl-anthology-py/actions/workflows/code-quality.yml)
[![Code Coverage](https://img.shields.io/codecov/c/gh/mbollmann/acl-anthology-py)](https://codecov.io/gh/mbollmann/acl-anthology-py)
![Supported Python Versions](https://img.shields.io/pypi/pyversions/acl-anthology-py)
![Development Status](https://img.shields.io/pypi/status/acl-anthology-py)
[![Package on PyPI](https://img.shields.io/pypi/v/acl-anthology-py)](https://pypi.org/project/acl-anthology-py/)

This package accesses data from the [ACL
Anthology](https://github.com/acl-org/acl-anthology).

API documentation can already be generated locally (see below for instructions);
more documentation (including a web-hosted version) is coming.

## How to use

Install via `pip`:

```bash
$ pip install acl-anthology-py
```

Clone the [ACL Anthology](https://github.com/acl-org/acl-anthology) repo to
obtain the data files _(there will be an option to automate this step in a
future version)_:

```bash
$ git clone https://github.com/acl-org/acl-anthology
```

Afterwards, the library can be instantiated as follows:

```python
from acl_anthology import Anthology

# "datadir" needs to point to the "data/" folder of the acl-anthology repo
anthology = Anthology(datadir="acl-anthology/data")
```
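
Since a wrong `datadir` is an easy mistake to make, it can help to check the path before handing it to `Anthology`. The following is a minimal sketch: the helper `find_datadir` is hypothetical and not part of this library, and it only checks that a `data/` folder exists inside the cloned repo.

```python
from pathlib import Path


def find_datadir(repo_root: str) -> Path:
    """Return the "data/" folder inside a cloned acl-anthology repo.

    Hypothetical helper for illustration; not part of acl-anthology-py.
    """
    datadir = Path(repo_root) / "data"
    if not datadir.is_dir():
        # Fail early with a readable message instead of a later loading error
        raise FileNotFoundError(f"No data folder found at {datadir}")
    return datadir
```

The result could then be passed on as `Anthology(datadir=str(find_datadir("acl-anthology")))`.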

Some usage examples:

```python
paper = anthology.get("C92-1025")

print(str(paper.title))
# Two-Level Morphology with Composition

print([author.name for author in paper.authors])
# [Name(first='Lauri', last='Karttunen'), Name(first='Ronald M.', last='Kaplan'), Name(first='Annie', last='Zaenen')]

from acl_anthology.people import Name
print(anthology.people.get_by_name(Name("Lauri", "Karttunen")))
# [Person(id='lauri-karttunen', names=[Name(first='Lauri', last='Karttunen')],
#         item_ids={('C94', '2', '206'), ('W05', '12', '6'), ('C69', '70', '1'),
#                   ('J83', '2', '5'), ('C86', '1', '16'), ('C92', '1', '25'), ...})]
```
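
The tuples in `item_ids` above appear to be Anthology IDs split into collection, volume, and paper parts; for example, `('C92', '1', '25')` matches the paper `C92-1025` fetched earlier. The sketch below illustrates that correspondence for IDs of this form only. The function `split_old_style_id` is hypothetical and written for illustration, not the library's own parser, and the two-digit volume rule for `W` collections and `C69` is an assumption inferred from the example output.

```python
def split_old_style_id(anthology_id: str) -> tuple[str, str, str]:
    """Split an ID such as "C92-1025" into (collection, volume, paper).

    Hypothetical helper for illustration; not part of acl-anthology-py.
    """
    collection, rest = anthology_id.split("-")
    # Assumption: "W" collections and "C69" use two-digit volume numbers,
    # all other collections use a single digit.
    if collection.startswith("W") or collection == "C69":
        volume, paper = rest[:2], rest[2:]
    else:
        volume, paper = rest[:1], rest[1:]
    # Drop leading zeros, but keep a single "0" if nothing else remains
    return collection, volume.lstrip("0") or "0", paper.lstrip("0") or "0"


print(split_old_style_id("C92-1025"))
# ('C92', '1', '25')
```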

## Developing

This package uses **Python 3.10+** with the
[**Poetry**](https://python-poetry.org/) packaging system.

Cloning the repository and running `make` will install all dependencies via
Poetry, run all style and type checks, run all tests, and generate the
documentation.

### Install dependencies and pre-commit hooks

`make setup` will install all package dependencies in development mode, as well
as install the pre-commit hooks that run on every attempted git commit.

If you only want the dependencies, but not the hooks, run `make dependencies`.

### Running checks

`make check` will run [black](https://github.com/psf/black),
[ruff](https://github.com/charliermarsh/ruff), and [some other pre-commit
hooks](.pre-commit-config.yaml), as well as the
[mypy](https://mypy.readthedocs.io/) type checker on all files in the repo.

### Running tests

`make test` will run Python unit tests and integration tests.

### Running benchmarks

The [`benchmarks/`](benchmarks/) folder collects some benchmarks intended to be
run with the [richbench](https://github.com/tonybaloney/rich-bench) tool:

```bash
$ poetry run richbench benchmarks/
```

### Generating and writing documentation

- `make docs` generates the documentation in `site/`.
- `make docs-serve` serves the documentation locally.

Docstrings are written in [Google
style](https://github.com/google/styleguide/blob/gh-pages/pyguide.md#38-comments-and-docstrings)
as this [supports the most
features](https://mkdocstrings.github.io/griffe/docstrings/#parsers-features)
with the mkdocstrings handler (particularly compared to Sphinx/reST).
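
As a rough illustration of the Google docstring style, a function might be documented as in the sketch below. The function `format_name` is hypothetical and exists only to show the `Args`, `Returns`, and `Raises` sections; it is not part of this package.

```python
from typing import Optional


def format_name(last: str, first: Optional[str] = None) -> str:
    """Format a person's name as a single string.

    Args:
        last: The last name.
        first: The first name, if the person has one.

    Returns:
        The name in "First Last" order, or only the last name if no
        first name is given.

    Raises:
        ValueError: If the last name is empty.
    """
    if not last:
        raise ValueError("last name must not be empty")
    return f"{first} {last}" if first else last
```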

