Metadata-Version: 2.1
Name: metaphor-connectors
Version: 0.14.2
Summary: A collection of Python-based 'connectors' that extract metadata from various sources to ingest into the Metaphor app.
Home-page: https://metaphor.io
License: Apache-2.0
Author: Metaphor
Author-email: dev@metaphor.io
Requires-Python: >=3.8.1,<3.12
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Provides-Extra: all
Provides-Extra: bigquery
Provides-Extra: confluence
Provides-Extra: datafactory
Provides-Extra: datahub
Provides-Extra: dbt
Provides-Extra: hive
Provides-Extra: kafka
Provides-Extra: looker
Provides-Extra: metabase
Provides-Extra: monday
Provides-Extra: monte-carlo
Provides-Extra: mssql
Provides-Extra: mysql
Provides-Extra: notion
Provides-Extra: postgresql
Provides-Extra: power-bi
Provides-Extra: redshift
Provides-Extra: s3
Provides-Extra: snowflake
Provides-Extra: static-web
Provides-Extra: synapse
Provides-Extra: tableau
Provides-Extra: thought-spot
Provides-Extra: trino
Provides-Extra: unity-catalog
Requires-Dist: GitPython (>=3.1.37,<4.0.0); extra == "all" or extra == "looker"
Requires-Dist: PyYAML (>=6.0,<7.0)
Requires-Dist: SQLAlchemy (>=1.4.46,<2.0.0); extra == "all" or extra == "mysql"
Requires-Dist: asyncpg (>=0.29.0,<0.30.0); extra == "all" or extra == "postgresql" or extra == "redshift"
Requires-Dist: avro (>=1.11.3,<2.0.0); extra == "all" or extra == "kafka"
Requires-Dist: aws-assume-role-lib (>=2.10.0,<3.0.0)
Requires-Dist: azure-identity (>=1.14.0,<2.0.0); extra == "all" or extra == "datafactory"
Requires-Dist: azure-mgmt-datafactory (>=6.0.0,<7.0.0); extra == "all" or extra == "datafactory"
Requires-Dist: beautifulsoup4 (>=4.12.3,<5.0.0); extra == "all" or extra == "static-web"
Requires-Dist: boto3 (>=1.34.64,<2.0.0)
Requires-Dist: botocore (>=1.34.64,<2.0.0)
Requires-Dist: canonicaljson (>=2.0.0,<3.0.0)
Requires-Dist: confluent-kafka (>=2.3.0,<3.0.0); extra == "all" or extra == "kafka"
Requires-Dist: databricks-sdk (>=0.14.0,<0.15.0); extra == "all" or extra == "unity-catalog"
Requires-Dist: databricks-sql-connector (>=3.0.0,<4.0.0); extra == "all" or extra == "unity-catalog"
Requires-Dist: fastavro (>=1.9.2,<2.0.0); extra == "all" or extra == "s3"
Requires-Dist: google-cloud-bigquery (>=3.1.0,<4.0.0); extra == "all" or extra == "bigquery"
Requires-Dist: google-cloud-logging (>=3.5.0,<4.0.0); extra == "all" or extra == "bigquery"
Requires-Dist: gql[requests] (>=3.4.1,<4.0.0); extra == "all" or extra == "datahub"
Requires-Dist: grpcio-tools (>=1.59.3,<2.0.0); extra == "all" or extra == "kafka"
Requires-Dist: jsonschema (>=4.18.6,<5.0.0)
Requires-Dist: lkml (>=1.3.1,<2.0.0); extra == "all" or extra == "looker"
Requires-Dist: llama-index (>=0.10.19,<0.11.0); extra == "all" or extra == "confluence" or extra == "monday" or extra == "notion" or extra == "static-web"
Requires-Dist: llama-index-embeddings-azure-openai (>=0.1.6,<0.2.0); extra == "all" or extra == "confluence" or extra == "monday" or extra == "notion" or extra == "static-web"
Requires-Dist: llama-index-readers-confluence (>=0.1.4,<0.2.0); extra == "all" or extra == "confluence"
Requires-Dist: llama-index-readers-notion (>=0.1.6,<0.2.0); extra == "all" or extra == "notion"
Requires-Dist: looker-sdk (>=24.2.0,<25.0.0); extra == "all" or extra == "looker"
Requires-Dist: lxml (>=5.0.0,<5.1.0); extra == "all" or extra == "static-web"
Requires-Dist: metaphor-models (==0.33.6)
Requires-Dist: more-itertools (>=10.1.0,<11.0.0); extra == "all" or extra == "s3"
Requires-Dist: msal (>=1.28.0,<2.0.0); extra == "all" or extra == "power-bi"
Requires-Dist: msgraph-beta-sdk (==1.2.0); extra == "all" or extra == "power-bi"
Requires-Dist: parse (>=1.20.0,<2.0.0); extra == "all" or extra == "s3"
Requires-Dist: pathvalidate (>=3.2.0,<4.0.0)
Requires-Dist: pyarrow[pandas] (>=14.0.1,<15.0.0)
Requires-Dist: pycarlo (>=0.8.1,<0.9.0); extra == "all" or extra == "monte-carlo"
Requires-Dist: pydantic[email] (==2.6.4)
Requires-Dist: pyhive (>=0.7.0,<0.8.0); extra == "all" or extra == "hive"
Requires-Dist: pymssql (>=2.2.11,<2.3.0); extra == "all" or extra == "mssql" or extra == "synapse"
Requires-Dist: pymysql (>=1.0.2,<2.0.0); extra == "all" or extra == "mysql"
Requires-Dist: python-dateutil (>=2.8.1,<3.0.0)
Requires-Dist: requests (>=2.28.1,<3.0.0)
Requires-Dist: sasl (>=0.3.1,<0.4.0); extra == "all" or extra == "hive"
Requires-Dist: setuptools (>=69.2.0,<70.0.0)
Requires-Dist: smart-open (>=7.0.1,<8.0.0)
Requires-Dist: snowflake-connector-python (>=3.7.1,<4.0.0); extra == "all" or extra == "snowflake"
Requires-Dist: sql-metadata (>=2.10.0,<3.0.0); extra == "all" or extra == "bigquery"
Requires-Dist: sqllineage (>=1.3.8,<1.4.0); extra == "all" or extra == "tableau" or extra == "thought-spot"
Requires-Dist: tableauserverclient (>=0.25,<0.26); extra == "all" or extra == "tableau"
Requires-Dist: thoughtspot_rest_api_v1 (==1.5.3); extra == "all" or extra == "thought-spot"
Requires-Dist: thrift (>=0.16.0,<0.17.0); extra == "all" or extra == "hive"
Requires-Dist: thrift-sasl (>=0.4.3,<0.5.0); extra == "all" or extra == "hive"
Requires-Dist: trino (>=0.327.0,<0.328.0); extra == "all" or extra == "trino"
Project-URL: Repository, https://github.com/MetaphorData/connectors
Description-Content-Type: text/markdown

<a href="https://metaphor.io"><img src="https://github.com/MetaphorData/connectors/raw/main/logo.png" width="300" /></a>

# Metaphor Connectors

[![Codecov](https://img.shields.io/codecov/c/github/MetaphorData/connectors)](https://app.codecov.io/gh/MetaphorData/connectors/tree/main)
[![CodeQL](https://github.com/MetaphorData/connectors/workflows/CodeQL/badge.svg)](https://github.com/MetaphorData/connectors/actions/workflows/codeql-analysis.yml)
[![PyPI Version](https://img.shields.io/pypi/v/metaphor-connectors)](https://pypi.org/project/metaphor-connectors/)
![Python version 3.8+](https://img.shields.io/badge/python-3.8%2B-blue)
![PyPI Downloads](https://img.shields.io/pypi/dm/metaphor-connectors)
[![Docker Pulls](https://img.shields.io/docker/pulls/metaphordata/connectors)](https://hub.docker.com/r/metaphordata/connectors)
[![License](https://img.shields.io/github/license/MetaphorData/connectors)](https://github.com/MetaphorData/connectors/blob/master/LICENSE)

This repository contains a collection of Python-based "connectors" that extract metadata from various sources to ingest into the [Metaphor](https://metaphor.io) platform.

## Installation

This package requires Python 3.8.1 or later, up to but not including 3.12. You can verify the version on your system by running the following command:

```shell
python -V  # or python3 on some systems
```
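The same check can be scripted. The following is a minimal sketch, assuming the bounds declared in this release's package metadata (`Requires-Python: >=3.8.1,<3.12`); the helper name `is_supported` is hypothetical and not part of the package.

```python
import sys

# Bounds copied from the package metadata: Requires-Python >=3.8.1,<3.12
MIN_VERSION = (3, 8, 1)
MAX_VERSION_EXCLUSIVE = (3, 12)


def is_supported(version=None):
    """Return True if a (major, minor, micro) tuple falls within the declared range.

    Hypothetical helper for illustration; defaults to the running interpreter.
    """
    v = tuple(version) if version is not None else tuple(sys.version_info[:3])
    return MIN_VERSION <= v < MAX_VERSION_EXCLUSIVE


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "within" if is_supported() else "outside"
    print(f"Python {current} is {status} the declared range")
```

Tuple comparison handles the exclusive upper bound: `(3, 12, 0)` compares greater than `(3, 12)`, so any 3.12 release is rejected.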

Once verified, you can install the package using [pip](https://docs.python.org/3/installing/index.html):

```shell
pip install "metaphor-connectors[all]"  # or pip3 on some systems
```

This installs all the connectors and their dependencies. To install only the dependencies for a specific connector, specify the corresponding [extra](https://packaging.python.org/tutorials/installing-packages/#installing-setuptools-extras), e.g.

```shell
pip install "metaphor-connectors[snowflake]"
```

You can also declare the package, with the same extras syntax, as a dependency in `requirements.txt` or `pyproject.toml`.
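To see which dependencies a given extra pulls in, you can inspect the package's requirement strings. The sketch below is illustrative only: `requirements_for_extra` is a hypothetical helper that does naive string matching on the environment marker (it is not a full PEP 508 parser), and `SAMPLE` holds three lines copied from this release's metadata as a fallback when the package is not installed.

```python
from importlib import metadata


def requirements_for_extra(requires, extra):
    """Return requirements that are unconditional or gated on `extra`.

    Hypothetical helper: matches the marker text literally rather than
    evaluating it, which is enough for simple `extra == "..."` markers.
    """
    needle = f'extra == "{extra}"'
    selected = []
    for line in requires:
        requirement, _, marker = line.partition(";")
        marker = marker.replace("'", '"')  # tolerate either quote style
        if not marker.strip() or needle in marker:
            selected.append(requirement.strip())
    return selected


# Three requirement lines copied from the 0.14.2 package metadata.
SAMPLE = [
    'PyYAML (>=6.0,<7.0)',
    'snowflake-connector-python (>=3.7.1,<4.0.0); extra == "all" or extra == "snowflake"',
    'trino (>=0.327.0,<0.328.0); extra == "all" or extra == "trino"',
]

if __name__ == "__main__":
    try:
        requires = metadata.requires("metaphor-connectors") or SAMPLE
    except metadata.PackageNotFoundError:
        requires = SAMPLE  # package not installed; fall back to the sample
    for requirement in requirements_for_extra(requires, "snowflake"):
        print(requirement)
```

Requirements without a marker, such as `PyYAML`, are installed regardless of which extra you choose, so the helper always includes them.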

## Docker

We automatically push a [Docker image](https://hub.docker.com/r/metaphordata/connectors) to Docker Hub as part of our CI/CD pipeline. See [this page](./docs/docker.md) for more details.

## GitHub Action

You can also run the connectors in your CI/CD pipeline using the [Metaphor Connectors](https://github.com/marketplace/actions/metaphor-connectors-github-action) GitHub Action.

## Connectors

Each connector lives in its own directory under [metaphor](./metaphor) and extends the `metaphor.common.BaseExtractor` class.

| Connector Name                                                    | Metadata                                 |
|-------------------------------------------------------------------|------------------------------------------|
| [azure_data_factory](metaphor/azure_data_factory/)                | Lineage, Pipeline                        |
| [bigquery](metaphor/bigquery/)                                    | Schema, description, statistics, queries |
| [bigquery.lineage](metaphor/bigquery/lineage/)                    | Lineage                                  |
| [bigquery.profile](metaphor/bigquery/profile/)                    | Data profile                             |
| [confluence](metaphor/confluence/)                                | Document embeddings                      |
| [custom.data_quality](metaphor/custom/data_quality/)              | Data quality                             |
| [custom.governance](metaphor/custom/governance/)                  | Ownership, tags, description             |
| [custom.lineage](metaphor/custom/lineage/)                        | Lineage                                  |
| [custom.metadata](metaphor/custom/metadata/)                      | Custom metadata                          |
| [custom.query_attributions](metaphor/custom/query_attributions/)  | Query attributions                       |
| [datahub](metaphor/datahub/)                                      | Description, tag, ownership              |
| [dbt](metaphor/dbt/)                                              | dbt model, test, lineage                 |
| [dbt.cloud](metaphor/dbt/cloud/)                                  | dbt model, test, lineage                 |
| [fivetran](metaphor/fivetran/)                                    | Lineage, Pipeline                        |
| [glue](metaphor/glue/)                                            | Schema, description                      |
| [kafka](metaphor/kafka/)                                          | Schema, description                      |
| [looker](metaphor/looker/)                                        | Looker view, explore, dashboard, lineage |
| [metabase](metaphor/metabase/)                                    | Dashboard, lineage                       |
| [monte_carlo](metaphor/monte_carlo/)                              | Data monitor                             |
| [mssql](metaphor/mssql/)                                          | Schema                                   |
| [mysql](metaphor/mysql/)                                          | Schema, description                      |
| [notion](metaphor/notion/)                                        | Document embeddings                      |
| [postgresql](metaphor/postgresql/)                                | Schema, description, statistics          |
| [postgresql.profile](metaphor/postgresql/profile/)                | Data profile                             |
| [postgresql.usage](metaphor/postgresql/usage/)                    | Usage                                    |
| [power_bi](metaphor/power_bi/)                                    | Dashboard, lineage                       |
| [redshift](metaphor/redshift/)                                    | Schema, description, statistics, queries |
| [redshift.profile](metaphor/redshift/profile/)                    | Data profile                             |
| [snowflake](metaphor/snowflake/)                                  | Schema, description, statistics, queries |
| [snowflake.lineage](metaphor/snowflake/lineage/)                  | Lineage                                  |
| [snowflake.profile](metaphor/snowflake/profile/)                  | Data profile                             |
| [static_web](metaphor/static_web/)                                | Document embeddings                      |
| [synapse](metaphor/synapse/)                                      | Schema, queries                          |
| [tableau](metaphor/tableau/)                                      | Dashboard, lineage                       |
| [thought_spot](metaphor/thought_spot/)                            | Dashboard, lineage                       |
| [trino](metaphor/trino/)                                          | Schema, description, queries             |
| [unity_catalog](metaphor/unity_catalog/)                          | Schema, description                      |
| [unity_catalog.profile](metaphor/unity_catalog/profile/)          | Data profile, statistics                 |

## Development

See [Development Environment](docs/develop.md) for more instructions on how to set up your local development environment.

## Custom Connectors

See [Adding a Custom Connector](docs/custom.md) for instructions and a full example of creating your own connector.

