Python Client

Version: 1.1.1
Updated: 14th September 2025

Overview

The DataPress Python client provides a simple interface to the DataPress API. It handles authentication and chunked uploads, and performs client-side validation for common operations.

Installation

pip install datapress

Authentication

Set your API credentials as environment variables:

export DATAPRESS_API_KEY="your-api-key"
export DATAPRESS_URL="https://your-datapress-instance.com"
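
A missing variable is easier to diagnose before the client is constructed. The following is a minimal sketch using only the standard library; the helper name load_datapress_config is hypothetical and not part of the datapress package.

```python
import os


def load_datapress_config(env=None):
    """Return the DataPress settings from the environment, or raise if any are missing.

    Hypothetical helper, not part of the datapress package.
    """
    env = os.environ if env is None else env
    required = ("DATAPRESS_API_KEY", "DATAPRESS_URL")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Strip a trailing slash so later URL joins do not produce "//".
    return {
        "api_key": env["DATAPRESS_API_KEY"],
        "base_url": env["DATAPRESS_URL"].rstrip("/"),
    }
```

The returned keys match the api_key and base_url keyword arguments of DataPressClient, so the result can be passed along as DataPressClient(**load_datapress_config()).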

Basic Usage

Initialize the Client

from datapress import DataPressClient

# Initialize using environment variables
client = DataPressClient()

# Or pass credentials explicitly
client = DataPressClient(
    api_key="your-api-key",
    base_url="https://your-datapress-instance.com"
)

# Verify authentication
user_info = client.whoami()
print(f"Logged in as: {user_info['title']}")

Get Dataset Information

# Retrieve a dataset by ID
dataset = client.get_dataset("ab12x")
print(f"Dataset: {dataset['title']}")
print(f"Resources: {len(dataset['resources'])}")

Common Operations

Renaming a Dataset

Use JSON Patch operations to modify dataset metadata:

patch = [{"op": "replace", "path": "/title", "value": "New Dataset Name"}]
result = client.patch_dataset("ab12x", patch)
print(f"Dataset renamed to: {result['dataset']['title']}")
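
To make the patch semantics concrete, here is a rough local sketch of how "replace", "add" and "remove" operations act on a nested dictionary, following JSON Patch (RFC 6902) for object members only. It is not the DataPress server's implementation, the function apply_patch is hypothetical, and the sample document shape in the usage note is made up for illustration.

```python
import copy


def apply_patch(document, operations):
    """Apply 'add', 'replace' and 'remove' operations to nested dicts (no array support)."""
    result = copy.deepcopy(document)  # leave the caller's document untouched
    for op in operations:
        # Split the JSON Pointer into keys and undo the RFC 6901 escapes.
        keys = [k.replace("~1", "/").replace("~0", "~") for k in op["path"].split("/")[1:]]
        parent = result
        for key in keys[:-1]:
            parent = parent[key]
        last = keys[-1]
        if op["op"] == "replace":
            # 'replace' requires the target member to exist already.
            if last not in parent:
                raise KeyError(f"cannot replace missing member: {op['path']}")
            parent[last] = op["value"]
        elif op["op"] == "add":
            parent[last] = op["value"]
        elif op["op"] == "remove":
            del parent[last]
        else:
            raise ValueError(f"unsupported op: {op['op']}")
    return result
```

For example, apply_patch({"title": "Old"}, [{"op": "replace", "path": "/title", "value": "New Dataset Name"}]) returns a new dictionary with the updated title and leaves the input unchanged.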

Adding a File

Upload a new file. The client uses chunked uploads, so large files are handled automatically:

result = client.upload_file(
    dataset_id="ab12x",
    file_path="data/sales.csv",
    # Optional fields:
    title="Sales Data",
    description="Monthly sales figures",
    timeframe={ "from": "2020-01", "to": "2024-11" }
)
print(f"File uploaded with ID: {result['resource_id']}")
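
The idea behind a chunked upload is to read the file in fixed-size pieces rather than loading it into memory at once. The generator below is a sketch of that reading step only; it is not the client's internal code, the name iter_chunks is hypothetical, and the 5 MB default is an assumption borrowed from the S3 transfer described later in this document.

```python
def iter_chunks(path, chunk_size=5 * 1024 * 1024):
    """Yield (offset, data) pairs that cover the file in order.

    chunk_size defaults to 5 MB, which is an assumption for local files.
    """
    offset = 0
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk_size)
            if not data:
                break  # end of file
            yield offset, data
            offset += len(data)
```

Each pair carries the byte offset of the chunk, which is the kind of information an upload endpoint typically needs to reassemble the file.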

Replacing a File

Replace an existing file by providing its resource ID:

result = client.upload_file(
    dataset_id="ab12x",
    file_path="data/updated_sales.csv",
    resource_id="xyz",  # ID of the existing file to replace
    # Optional fields:
    title="Updated Sales Data"
)
print(f"File replaced: {result['resource_id']}")

Downloading Files

Download a file's contents as bytes. The client authenticates the request with your API credentials:

file_data = client.download_file(
    dataset_id="ab12x",
    resource_id="xyz"
)

# Save to disk
with open("downloaded_file.csv", "wb") as f:
    f.write(file_data)
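
If an interrupted write must never leave a half-written file behind, the bytes can be written to a temporary file first and renamed into place. This is a standard-library sketch; save_bytes_atomically is a hypothetical helper, not part of the datapress package.

```python
import os
import tempfile


def save_bytes_atomically(data, destination):
    """Write data to a temporary file, then rename it over destination."""
    directory = os.path.dirname(os.path.abspath(destination))
    # Create the temporary file in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp_path, destination)
    except BaseException:
        # Clean up the partial temporary file before re-raising.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return destination
```

With the example above, this would be called as save_bytes_atomically(file_data, "downloaded_file.csv").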

Advanced Usage

Batch Operations with Patch

Perform multiple updates in a single request:

patch_operations = [
    {"op": "replace", "path": "/title", "value": "Updated Title"},
    {"op": "replace", "path": "/description", "value": "Updated description"},
    {"op": "add", "path": "/links/new_link", "value": {
        "url": "https://example.com",
        "title": "Related Data"
    }}
]

result = client.patch_dataset("ab12x", patch_operations)
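
When several top-level fields change at once, the operation list can be generated from a plain dictionary. The sketch below uses hypothetical helper names (escape_pointer_token, build_replace_patch) and escapes field names according to JSON Pointer (RFC 6901), so a name containing "/" or "~" still yields a valid path.

```python
def escape_pointer_token(token):
    """Escape one JSON Pointer reference token (RFC 6901): '~' -> '~0', '/' -> '~1'."""
    # '~' must be escaped first, otherwise the '~' in '~1' would be escaped again.
    return token.replace("~", "~0").replace("/", "~1")


def build_replace_patch(fields):
    """Turn {"title": "..."} into a list of JSON Patch 'replace' operations."""
    return [
        {"op": "replace", "path": "/" + escape_pointer_token(name), "value": value}
        for name, value in fields.items()
    ]
```

The resulting list has the same shape as patch_operations above and could be passed to client.patch_dataset in the same way.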

Remove Resources

Remove uploaded files using patch operations:

patch = [{"op": "remove", "path": "/resources/xyz89"}]
client.patch_dataset("ab12x", patch)

Uploading a File from S3

You can transfer a file from S3 to a dataset by passing a boto3 S3 client to the upload_file_from_s3 method. The object is downloaded in 5 MB chunks, and each chunk is uploaded to the dataset incrementally.

import os

import boto3

session = boto3.session.Session()
s3_client = session.client(
    "s3",
    # ... credentials, endpoints, regions etc
    aws_access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
    aws_secret_access_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
)

# Upload a file from S3 to a dataset
result = client.upload_file_from_s3(
    s3_client=s3_client,
    bucket="your-bucket-name",
    key="your-object-key",
    dataset_id="ab12x",
    # Optional parameters:
    resource_id="xyz",  # ID of existing file to replace
    title="Updated Spending Data",
    description="Monthly spending figures",
    order=1,
    timeframe={"from": "2024-01", "to": "2025-04"}
)
print(f"File uploaded from S3: {result['resource_id']}")
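
The timeframe values in these examples are year-month strings. A check along the following lines can catch malformed values before a large transfer starts. It is a sketch under the assumption that "YYYY-MM" is the expected format, which is inferred only from the examples in this document; DataPress may accept other granularities, and validate_timeframe is a hypothetical helper.

```python
import re

# Four-digit year, a dash, and a month from 01 to 12 (format assumed from the examples).
_MONTH = re.compile(r"\d{4}-(0[1-9]|1[0-2])")


def validate_timeframe(timeframe):
    """Raise ValueError unless both bounds are 'YYYY-MM' strings and from <= to."""
    for key in ("from", "to"):
        value = timeframe.get(key)
        if not isinstance(value, str) or not _MONTH.fullmatch(value):
            raise ValueError(f"timeframe[{key!r}] must look like 'YYYY-MM', got {value!r}")
    # Zero-padded YYYY-MM strings sort chronologically, so plain comparison works.
    if timeframe["from"] > timeframe["to"]:
        raise ValueError("timeframe 'from' must not be later than 'to'")
    return timeframe
```

Because the function returns its argument unchanged, it can wrap the value inline, for example timeframe=validate_timeframe({"from": "2024-01", "to": "2025-04"}).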

Error Handling

The client provides specific exception types for common errors:

# Note: importing PermissionError here shadows Python's built-in PermissionError in this module
from datapress.client import AuthenticationError, NotFoundError, PermissionError

try:
    dataset = client.get_dataset("ab12x")
except AuthenticationError:
    print("Invalid API key")
except NotFoundError:
    print("Dataset not found")
except PermissionError:
    print("No access to this dataset")
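
A common pattern is to treat "not found" as an expected outcome while letting authentication and permission errors propagate. The sketch below shows one way to do that. To stay runnable on its own it defines a stand-in NotFoundError instead of importing the real one, and get_or_none is a hypothetical helper.

```python
class NotFoundError(Exception):
    """Stand-in for the client's NotFoundError so this sketch runs on its own."""


def get_or_none(fetch, identifier, not_found=NotFoundError):
    """Call fetch(identifier); return None if it raises the not-found exception.

    Any other exception, such as an authentication failure, is left to propagate.
    """
    try:
        return fetch(identifier)
    except not_found:
        return None
```

With the real client, the call would look like get_or_none(client.get_dataset, "ab12x", not_found=NotFoundError), with NotFoundError imported from datapress.client as shown above.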

Source Code

The Python client is open source and available on GitHub. For development setup, testing, and a detailed API reference, see the DEVELOPMENT.md file.