
Python Library



Overview

The DBRepo Python library builds on some of the most popular and well-maintained Python packages for data scientists: requests to interact with the HTTP API endpoints, pandas for data operations, and pydantic for representing information sent to and received from the HTTP API.

Installing

1.4.2

$ python -m pip install dbrepo

To use DBRepo in your Jupyter notebook, install the dbrepo library directly in a code cell:

!pip install dbrepo

This package supports Python 3.11+.

Quickstart

Get public data from a table as a pandas DataFrame:

from dbrepo.RestClient import RestClient

client = RestClient(endpoint="https://dbrepo1.ec.tuwien.ac.at")
# Get a small data slice of just three rows
df = client.get_table_data(database_id=7, table_id=13, page=0, size=3, df=True)
print(df)
#     x_coord         component   unit  ... value stationid meantype
# 0  16.52617  Feinstaub (PM10)  µg/m³  ...  21.0   01:0001      HMW
# 1  16.52617  Feinstaub (PM10)  µg/m³  ...  23.0   01:0001      HMW
# 2  16.52617  Feinstaub (PM10)  µg/m³  ...  26.0   01:0001      HMW
#
# [3 rows x 12 columns]
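
The page and size parameters above suggest that larger tables can be read slice by slice. The following is a minimal sketch of such a loop, not part of the library: fetch_all_pages and its fetch_page callback are hypothetical names, and the sketch assumes that requesting a page past the end of the table returns an empty result.

```python
def fetch_all_pages(fetch_page, size=100, max_pages=1000):
    """Collect result slices page by page (hypothetical helper, not a dbrepo API).

    fetch_page is any callable taking (page, size) and returning an object
    that supports len(), e.g. a list or a pandas DataFrame.
    """
    chunks = []
    for page in range(max_pages):
        chunk = fetch_page(page, size)
        # Assumption: a page past the end of the table comes back empty
        if len(chunk) == 0:
            break
        chunks.append(chunk)
        # A short page means there is nothing left to fetch
        if len(chunk) < size:
            break
    return chunks
```

With the client from the example above, fetch_page could be lambda page, size: client.get_table_data(database_id=7, table_id=13, page=page, size=size, df=True), and the resulting slices can be combined with pd.concat(chunks, ignore_index=True).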

Import data into a table:

import pandas as pd
from dbrepo.RestClient import RestClient

client = RestClient(endpoint="https://dbrepo1.ec.tuwien.ac.at", username="foo",
                    password="bar")
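# Each column maps to a list of values; a dict of bare scalars would raise a ValueError
df = pd.DataFrame(data={'x_coord': [16.52617], 'component': ['Feinstaub (PM10)'],
                        'unit': ['µg/m³'],
                        # ... remaining columns of the table
                        })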
client.import_table_data(database_id=7, table_id=13, file_name_or_data_frame=df)

Supported Features & Best-Practices

  • Manage user account (docs)
  • Manage databases (docs)
  • Manage database access & visibility (docs)
  • Import dataset (docs)
  • Create persistent identifiers (docs)
  • Execute queries (docs)
  • Get data from tables/views/subsets

Configure

All credentials can optionally be set or overridden with environment variables. This is especially useful when sharing Jupyter notebooks: keep the credentials in a hidden .env file and load it instead of hard-coding them:

.env
REST_API_ENDPOINT="https://dbrepo1.ec.tuwien.ac.at"
REST_API_USERNAME="foo"
REST_API_PASSWORD="bar"
REST_API_SECURE="True"
AMQP_API_HOST="https://dbrepo1.ec.tuwien.ac.at"
AMQP_API_PORT="5672"
AMQP_API_USERNAME="foo"
AMQP_API_PASSWORD="bar"
AMQP_API_VIRTUAL_HOST="dbrepo"
REST_UPLOAD_ENDPOINT="https://dbrepo1.ec.tuwien.ac.at/api/upload/files"

You can reduce the log output by raising the log level, e.g. to INFO, which suppresses debug messages:

from dbrepo.RestClient import RestClient
import logging
logging.getLogger().setLevel(logging.INFO)
...
client = RestClient(...)
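
As a sketch of how log levels filter output: a logger only emits records at or above its level, so INFO drops DEBUG records but keeps INFO, WARNING and ERROR. The logger name "dbrepo" below is an assumption about how the library names its logger and may differ.

```python
import logging

# Root logger at INFO: DEBUG records are dropped, INFO and above still pass
logging.getLogger().setLevel(logging.INFO)

# Assumption: the library logs under a logger named "dbrepo"; adjust if it differs
library_logger = logging.getLogger("dbrepo")
# Quieten only this logger, leaving the rest of the application at INFO
library_logger.setLevel(logging.WARNING)
```

Setting the level on a named logger instead of the root logger limits the change to that package, which can be helpful in notebooks where other libraries log as well.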

Future

  • Searching

This information is also mirrored on PyPI.