API Reference

Uploading Large Files

Version:  1.0
Updated:  26th August 2025

This API uses chunked uploads to handle files over 100MB.

  • This avoids server timeouts and limits on the request body size.
  • The client uploads the file in 5MB chunks, across multiple requests.
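
The "parts" value sent in step 1 below is the file size divided by the 5MB chunk size, rounded up. A minimal sketch (the helper name count_parts is illustrative):

count_parts.py
import os

CHUNK_SIZE = 1024 * 1024 * 5  # 5MB, the chunk size used by this API

def count_parts(file_path):
    """Number of 5MB chunks needed to upload the file at file_path."""
    file_size = os.path.getsize(file_path)
    return (file_size + CHUNK_SIZE - 1) // CHUNK_SIZE  # ceiling division

# e.g. a 23MB file yields 5 parts: four full 5MB chunks plus a 3MB remainder.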

1. Create a new upload

POST
https://data.website/api/v3/dataset/:id/upload
Path Parameters
  • :id (string, required): 5-character ID of the dataset, e.g. "ab12x".
HTTP Headers
  • Authorization (string, required): API key with editor permissions.
  • Content-Type (string, required): Must be "application/json". Most clients will set this automatically.
HTTP Return Codes
  • 200: Returns the callback URLs below.
  • 400: Returns a JSON object explaining the error.
  • 403: API key does not have editor permissions.
  • 404: Dataset does not exist, or resource_id is invalid.
upload_request_body.json
// Example POST body to the API:
{
  "filename": "example.csv", // Filename to upload
  "parts": 4,                // Number of chunks to upload;
                             // e.g. a 23MB file needs 5 parts.

  // Optional fields:
  "resource_id": "rz1",      // If replacing an existing file.
  "order": 8,                // Position in the dataset list. Default: 0
  "title": "Spending Data",  // Title of the resource on the page.
  "description": "..."       // Optional description.
}

Response JSON: Upload has started

The response contains two URLs: one for uploading the file chunks (step 2) and one for completing the upload (step 3).

upload_response_body.json
{
  "urlChunk": "https://data.website/api/v3/upload/:uuid/chunk",
  "urlComplete": "https://data.website/api/v3/upload/:uuid/complete"
}
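
As a minimal sketch, step 1 with the requests library (a complete client appears at the end of this page); the dataset ID and filename here are illustrative:

create_upload.py
import os
import requests

API_KEY = os.getenv("DATAPRESS_API_KEY")

response = requests.post(
    "https://data.website/api/v3/dataset/ab12x/upload",
    headers={"Authorization": API_KEY},
    json={"filename": "example.csv", "parts": 5},
)
response.raise_for_status()
urls = response.json()
url_chunk = urls["urlChunk"]        # used in step 2
url_complete = urls["urlComplete"]  # used in step 3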

2. Send the file chunks

POST
https://data.website/api/v3/upload/:uuid/chunk
Path Parameters
  • :uuid (string, required): The upload ID, given in the above response.
HTTP Headers
  • Authorization (string, required): API key with editor permissions.
  • Content-Type (string, required): Must be "multipart/form-data". Most clients will set this automatically.
Form Fields
  • chunk (number, required): Chunk number, starting at 0.
  • file (file, required): 5MB chunk of the file being uploaded; the final chunk may be smaller.
HTTP Return Codes
  • 200: Returns the ETag of the chunk for confirmation.
  • 400: Returns a JSON object explaining the error.
  • 403: API key does not have editor permissions.
  • 404: Upload does not exist.
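
A minimal sketch of the chunk loop; the placeholder URL stands in for the urlChunk value returned in step 1, and the file path is illustrative:

upload_chunks.py
import os
import requests

API_KEY = os.getenv("DATAPRESS_API_KEY")
CHUNK_SIZE = 1024 * 1024 * 5  # 5MB

url_chunk = "https://data.website/api/v3/upload/:uuid/chunk"  # from step 1

file_path = "example.csv"
file_size = os.path.getsize(file_path)
num_chunks = (file_size + CHUNK_SIZE - 1) // CHUNK_SIZE

with open(file_path, "rb") as f:
    for i in range(num_chunks):
        chunk = f.read(CHUNK_SIZE)  # sequential reads walk the file in order
        response = requests.post(
            url_chunk,
            headers={"Authorization": API_KEY},
            files={"file": chunk},
            data={"chunk": i},  # chunk number, starting at 0
        )
        response.raise_for_status()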

Once all the chunks are uploaded, you are ready to complete the upload.

3. Complete the upload

POST
https://data.website/api/v3/upload/:uuid/complete
Path Parameters
  • :uuid (string, required): The upload ID, given in the above response.
HTTP Headers
  • Authorization (string, required): API key with editor permissions.
HTTP Return Codes
  • 200: Returns a JSON object summarising the upload.
  • 400: Returns a JSON object explaining the error, e.g. not all chunks have been uploaded.
  • 403: API key does not have editor permissions.
  • 404: Upload does not exist.
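
A minimal sketch of the completion call; the placeholder URL stands in for the urlComplete value returned in step 1:

complete_upload.py
import os
import requests

API_KEY = os.getenv("DATAPRESS_API_KEY")

url_complete = "https://data.website/api/v3/upload/:uuid/complete"  # from step 1

response = requests.post(url_complete, headers={"Authorization": API_KEY})
response.raise_for_status()
summary = response.json()
print(summary["resource_id"], summary["new_key"])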

Response JSON: Upload is complete

The response contains information about the uploaded file.

upload_complete_body.json
{
  // The full dataset JSON, with the new file added:
  "dataset": {
    "id": "20abc",
    "resources": {
      "rz1": {
        "title": "Spending Data",
        "description": "...",
        "origin": "site/dataset/20abc/.../example.csv",
        // [other resource fields...]
      }
      // [other resources...]
    }
    // [other dataset fields...]
  },
  // The effective JSON patch applied to the dataset:
  "applied": [ ... ],
  // The uploaded file's S3 storage location:
  "new_key": "site/dataset/20abc/.../example.csv",
  // The resource_id that was created or overwritten:
  "resource_id": "rz1"
}

Example: Python client

upload_large_file.py
# Note: There is a Python client library that is simpler than this approach.
import os
import requests

API_KEY = os.getenv("DATAPRESS_API_KEY")
SITE_URL = "https://data.website"

def do_chunked_upload(
    dataset_id,
    file_path,
    resource_id=None,
    title=None,
    description=None,
):
    # Get file size
    file_size = os.path.getsize(file_path)
    print(f"File size: {file_size}")
    CHUNK_SIZE = 1024 * 1024 * 5  # 5MB
    num_chunks = (file_size + CHUNK_SIZE - 1) // CHUNK_SIZE
    print(f"Number of chunks: {num_chunks}")

    # Create upload session
    create_url = f"{SITE_URL}/api/v3/dataset/{dataset_id}/upload"
    create_body = {
        "filename": os.path.basename(file_path),
        "parts": num_chunks,
    }
    if resource_id:
        create_body["resource_id"] = resource_id
    if title:
        create_body["title"] = title
    if description:
        create_body["description"] = description

    create_response = requests.post(
        create_url,
        headers={"Authorization": API_KEY},
        json=create_body,
    )
    create_response.raise_for_status()
    create_response_json = create_response.json()
    url_chunk = create_response_json["urlChunk"]
    url_complete = create_response_json["urlComplete"]

    # Upload chunks
    with open(file_path, "rb") as f:
        for i in range(num_chunks):
            print(f"Uploading chunk {i + 1} of {num_chunks}")
            chunk_start = i * CHUNK_SIZE
            chunk_end = min(chunk_start + CHUNK_SIZE, file_size)
            chunk_response = requests.post(
                url_chunk,
                headers={"Authorization": API_KEY},
                files={"file": f.read(chunk_end - chunk_start)},
                data={"part": i},
            )
            chunk_response.raise_for_status()

    # Complete upload
    complete_response = requests.post(
        url_complete,
        headers={"Authorization": API_KEY},
    )
    complete_response.raise_for_status()
    complete_response_json = complete_response.json()
    return complete_response_json
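
For example, to upload a local file to a dataset (the dataset ID, file path, and title here are illustrative):

result = do_chunked_upload(
    "ab12x",
    "spending.csv",
    title="Spending Data",
)
print(result["resource_id"])  # the resource that was created or overwritten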