Metadata-Version: 2.1
Name: python-dlt
Version: 0.1.0rc1
Summary: DLT is an open-source python-native scalable data loading framework that does not require any devops efforts to run.
Home-page: https://github.com/scale-vector
License: Apache-2.0
Keywords: etl
Author: ScaleVector
Author-email: services@scalevector.ai
Maintainer: Marcin Rudolf
Maintainer-email: marcin@scalevector.ai
Requires-Python: >=3.8,<3.11
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Topic :: Software Development :: Libraries
Provides-Extra: dbt
Provides-Extra: gcp
Provides-Extra: postgres
Provides-Extra: redshift
Requires-Dist: GitPython (>=3.1.26,<4.0.0); extra == "dbt"
Requires-Dist: PyYAML (>=5.4.1,<6.0.0)
Requires-Dist: cachetools (>=5.2.0,<6.0.0)
Requires-Dist: dbt-bigquery (==1.0.0); extra == "dbt"
Requires-Dist: dbt-core (==1.0.6); extra == "dbt"
Requires-Dist: dbt-redshift (==1.0.1); extra == "dbt"
Requires-Dist: google-cloud-bigquery (>=2.26.0,<3.0.0); extra == "gcp"
Requires-Dist: grpcio (==1.43.0); extra == "gcp"
Requires-Dist: hexbytes (>=0.2.2,<0.3.0)
Requires-Dist: json-logging (==1.4.1rc0)
Requires-Dist: jsonlines (>=2.0.0,<3.0.0)
Requires-Dist: pendulum (>=2.1.2,<3.0.0)
Requires-Dist: prometheus-client (>=0.11.0,<0.12.0)
Requires-Dist: psycopg2-binary (>=2.9.1,<3.0.0); extra == "postgres" or extra == "redshift"
Requires-Dist: requests (>=2.26.0,<3.0.0)
Requires-Dist: semver (>=2.13.0,<3.0.0)
Requires-Dist: sentry-sdk (>=1.4.3,<2.0.0)
Requires-Dist: simplejson (>=3.17.5,<4.0.0)
Project-URL: Repository, https://github.com/scale-vector/dlt
Description-Content-Type: text/markdown

Follow this quick guide to implement DLT in your project.

## Simple loading of one row:

### Install DLT
DLT is available on PyPI and can be installed with `pip install python-dlt`. Support for target warehouses is provided in extra packages:

`pip install python-dlt[redshift]` for Redshift

`pip install python-dlt[gcp]` for BigQuery

### Create a target credential
```
credential = {'type': 'redshift',
              'host': '123.456.789.101',
              'port': '5439',
              'user': 'loader',
              'password': 'dolphins'
              }
```

### Initialise the loader with your credentials and load one json row
```
import dlt

loader = dlt(credential)

json_row = '{"name":"Gabe", "age":30}'

table_name = 'users'

loader.load(table_name, json_row)

```
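
Hand-written JSON strings are easy to break with mismatched quotes. As a minimal sketch, assuming only the Python standard library (nothing here is part of DLT's API), the row can be built as a dict and serialized with `json.dumps` before it is passed to the loader:

```python
import json

# Build the row as a regular Python dict, then serialize it to a JSON string.
user = {"name": "Gabe", "age": 30}
json_row = json.dumps(user)

# Round-trip check: the string parses back to the same dict.
assert json.loads(json_row) == user
```

The resulting `json_row` string can then be used in place of the literal in the example above.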

## Loading a nested json object

```
import dlt

loader = dlt(credential)

json_row = '''{"name":"Gabe", "age":30, "id":456, "children":[{"name": "Bill", "id": 625},
                                                            {"name": "Cill", "id": 666},
                                                            {"name": "Dill", "id": 777}
                                                            ]
            }'''


table_name = 'users'


# Unpack the nested json. To be able to re-pack it, we create the parent-child join keys via row / parent row hashes.

rows = loader.utils.unpack(table_name, json_row)

# rows is a generator that yields the parent or child table name and the data row, such as:

# ("users", '{"name":"Gabe", "age":30, "id":456, "row_hash":"parent_row_md5"}')
# ("users__children", '{"name":"Bill", "id":625, "parent_row_hash":"parent_row_md5", "row_hash":"child1_row_md5"}')
# ("users__children", '{"name":"Cill", "id":666, "parent_row_hash":"parent_row_md5", "row_hash":"child2_row_md5"}')
# ("users__children", '{"name":"Dill", "id":777, "parent_row_hash":"parent_row_md5", "row_hash":"child3_row_md5"}')


# Load each row into the table it belongs to (users or users__children)
for table, row in rows:
    loader.load(table, row)


# To recreate the original structure, join the tables in SQL:
# select users.*, users__children.*
# from users
# left join users__children
#     on users.row_hash = users__children.parent_row_hash
```
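
To make the row hash / parent row hash idea more concrete, here is a minimal standalone sketch of such an unpacking step. It is not DLT's implementation: the names `unpack_sketch`, `row_md5` and `_unpack_dict` are hypothetical, and it assumes that the row hash is an MD5 of the key-sorted JSON of the flattened row and that scalar list items are wrapped in a `value` column.

```python
import hashlib
import json
from typing import Any, Dict, Iterator, Optional, Tuple


def row_md5(row: Dict[str, Any]) -> str:
    """Hash a row deterministically: MD5 over its key-sorted JSON form."""
    payload = json.dumps(row, sort_keys=True).encode("utf-8")
    return hashlib.md5(payload).hexdigest()


def _unpack_dict(
    table: str, data: Dict[str, Any], parent_hash: Optional[str]
) -> Iterator[Tuple[str, str]]:
    # Keep every non-list column in the current row.
    flat = {k: v for k, v in data.items() if not isinstance(v, list)}
    if parent_hash is not None:
        flat["parent_row_hash"] = parent_hash
    # The hash covers parent_row_hash too, so equal children under
    # different parents still get different row hashes.
    flat["row_hash"] = row_md5(flat)
    yield table, json.dumps(flat)

    # Each list column becomes a child table named <table>__<column>.
    for key, value in data.items():
        if isinstance(value, list):
            for child in value:
                if not isinstance(child, dict):
                    child = {"value": child}  # assumption: wrap scalars
                yield from _unpack_dict(f"{table}__{key}", child, flat["row_hash"])


def unpack_sketch(table_name: str, json_row: str) -> Iterator[Tuple[str, str]]:
    """Yield (table name, JSON row) pairs for a parent row and its children."""
    yield from _unpack_dict(table_name, json.loads(json_row), None)


nested = '{"name": "Gabe", "id": 456, "children": [{"name": "Bill", "id": 625}]}'
for table, row in unpack_sketch("users", nested):
    print(table, row)
```

Because every child row carries the `row_hash` of its parent as `parent_row_hash`, the two tables can be joined back together on those columns.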

