API Reference

Uploading Large Files

Version:  1.0
Updated:  26th August 2025

This API uses chunked uploads to handle files over 100MB.

  • This avoids server timeouts and limits on the request body size.
  • The client uploads the file in 5MB chunks, across multiple requests.
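
The "parts" value sent in step 1 below is the file size divided by the 5MB chunk size, rounded up. A minimal sketch (the helper name count_parts is illustrative):

count_parts.py
import os

CHUNK_SIZE = 1024 * 1024 * 5  # 5MB, the chunk size used by this API

def count_parts(file_path):
    """Number of 5MB chunks needed to upload the file at file_path."""
    file_size = os.path.getsize(file_path)
    return (file_size + CHUNK_SIZE - 1) // CHUNK_SIZE  # ceiling division

# e.g. a 23MB file yields 5 parts: four full 5MB chunks plus a 3MB remainder.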

1. Create a new upload

POST
https://data.website/api/v3/dataset/:id/upload
Path Parameters
  • :id (string, required): 5-character ID of the dataset, e.g. "ab12x".
HTTP Headers
  • Authorization (string, required): API key with editor permissions.
  • Content-Type (string, required): Must be "application/json". Most clients will set this automatically.
HTTP Return Codes
  • 200: Returns the callback URLs below.
  • 400: Returns a JSON object explaining the error.
  • 403: API key does not have editor permissions.
  • 404: Dataset does not exist, or resource_id is invalid.
upload_request_body.json
// Example POST body to the API:
{
  "filename": "example.csv", // Filename to upload
  "parts": 4,                // Number of chunks to upload;
                             // e.g. a 23MB file needs 5 parts.

  // Optional fields:
  "resource_id": "rz1",      // If replacing an existing file.
  "order": 8,                // Position in the dataset list. Default: 0
  "title": "Spending Data",  // Title of the resource on the page.
  "description": "..."       // Optional description.
}

Response JSON: Upload has started

The response contains two URLs: one for uploading the file chunks (step 2) and one for completing the upload (step 3).

upload_response_body.json
{
  "urlChunk": "https://data.website/api/v3/upload/:uuid/chunk",
  "urlComplete": "https://data.website/api/v3/upload/:uuid/complete"
}
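
As a minimal sketch, step 1 with the requests library (a complete client appears at the end of this page); the dataset ID and filename here are illustrative:

create_upload.py
import os
import requests

API_KEY = os.getenv("DATAPRESS_API_KEY")

response = requests.post(
    "https://data.website/api/v3/dataset/ab12x/upload",
    headers={"Authorization": API_KEY},
    json={"filename": "example.csv", "parts": 5},
)
response.raise_for_status()
urls = response.json()
url_chunk = urls["urlChunk"]        # used in step 2
url_complete = urls["urlComplete"]  # used in step 3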

2. Send the file chunks

POST
https://data.website/api/v3/upload/:uuid/chunk
Path Parameters
  • :uuid (string, required): The upload ID, given in the above response.
HTTP Headers
  • Authorization (string, required): API key with editor permissions.
  • Content-Type (string, required): Must be "multipart/form-data". Most clients will set this automatically.
Form Fields
  • chunk (number, required): Chunk number, starting at 0.
  • file (file, required): 5MB chunk of the file being uploaded; the final chunk may be smaller.
HTTP Return Codes
  • 200: Returns the ETag of the chunk for confirmation.
  • 400: Returns a JSON object explaining the error.
  • 403: API key does not have editor permissions.
  • 404: Upload does not exist.
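
A minimal sketch of the chunk loop; the placeholder URL stands in for the urlChunk value returned in step 1, and the file path is illustrative:

upload_chunks.py
import os
import requests

API_KEY = os.getenv("DATAPRESS_API_KEY")
CHUNK_SIZE = 1024 * 1024 * 5  # 5MB

url_chunk = "https://data.website/api/v3/upload/:uuid/chunk"  # from step 1

file_path = "example.csv"
file_size = os.path.getsize(file_path)
num_chunks = (file_size + CHUNK_SIZE - 1) // CHUNK_SIZE

with open(file_path, "rb") as f:
    for i in range(num_chunks):
        chunk = f.read(CHUNK_SIZE)  # sequential reads walk the file in order
        response = requests.post(
            url_chunk,
            headers={"Authorization": API_KEY},
            files={"file": chunk},
            data={"chunk": i},  # chunk number, starting at 0
        )
        response.raise_for_status()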

Once all the chunks are uploaded, you are ready to complete the upload.

3. Complete the upload

POST
https://data.website/api/v3/upload/:uuid/complete
Path Parameters
  • :uuid (string, required): The upload ID, given in the above response.
HTTP Headers
  • Authorization (string, required): API key with editor permissions.
HTTP Return Codes
  • 200: Returns a JSON object summarising the upload.
  • 400: Returns a JSON object explaining the error, e.g. not all chunks have been uploaded.
  • 403: API key does not have editor permissions.
  • 404: Upload does not exist.
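
A minimal sketch of the completion call; the placeholder URL stands in for the urlComplete value returned in step 1:

complete_upload.py
import os
import requests

API_KEY = os.getenv("DATAPRESS_API_KEY")

url_complete = "https://data.website/api/v3/upload/:uuid/complete"  # from step 1

response = requests.post(url_complete, headers={"Authorization": API_KEY})
response.raise_for_status()
summary = response.json()
print(summary["resource_id"], summary["new_key"])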

Response JSON: Upload is complete

The response contains information about the uploaded file.

upload_complete_body.json
{
  // The full dataset JSON, with the new file added:
  "dataset": {
    "id": "20abc",
    "resources": {
      "rz1": {
        "title": "Spending Data",
        "description": "...",
        "origin": "site/dataset/20abc/.../example.csv",
        // [other resource fields...]
      }
      // [other resources...]
    }
    // [other dataset fields...]
  },
  // The effective JSON patch applied to the dataset:
  "applied": [ ... ],
  // The uploaded file's S3 storage location:
  "new_key": "site/dataset/20abc/.../example.csv",
  // The resource_id that was created or overwritten:
  "resource_id": "rz1"
}

Example: Python client

upload_large_file.py
# Note: There is a Python client library that is simpler than this approach.
import os
import requests

API_KEY = os.getenv("DATAPRESS_API_KEY")
SITE_URL = "https://data.website"

def do_chunked_upload(
    dataset_id,
    file_path,
    resource_id=None,
    title=None,
    description=None,
):
    # Get file size
    file_size = os.path.getsize(file_path)
    print(f"File size: {file_size}")
    CHUNK_SIZE = 1024 * 1024 * 5  # 5MB
    num_chunks = (file_size + CHUNK_SIZE - 1) // CHUNK_SIZE
    print(f"Number of chunks: {num_chunks}")

    # Create upload session
    create_url = f"{SITE_URL}/api/v3/dataset/{dataset_id}/upload"
    create_body = {
        "filename": os.path.basename(file_path),
        "parts": num_chunks,
    }
    if resource_id:
        create_body["resource_id"] = resource_id
    if title:
        create_body["title"] = title
    if description:
        create_body["description"] = description

    create_response = requests.post(
        create_url,
        headers={"Authorization": API_KEY},
        json=create_body,
    )
    create_response.raise_for_status()
    create_response_json = create_response.json()
    url_chunk = create_response_json["urlChunk"]
    url_complete = create_response_json["urlComplete"]

    # Upload chunks
    with open(file_path, "rb") as f:
        for i in range(num_chunks):
            print(f"Uploading chunk {i + 1} of {num_chunks}")
            chunk_start = i * CHUNK_SIZE
            chunk_end = min(chunk_start + CHUNK_SIZE, file_size)
            chunk_response = requests.post(
                url_chunk,
                headers={"Authorization": API_KEY},
                files={"file": f.read(chunk_end - chunk_start)},
                data={"part": i},
            )
            chunk_response.raise_for_status()

    # Complete upload
    complete_response = requests.post(
        url_complete,
        headers={"Authorization": API_KEY},
    )
    complete_response.raise_for_status()
    complete_response_json = complete_response.json()
    return complete_response_json
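
For example, to upload a local file to a dataset (the dataset ID, file path, and title here are illustrative):

result = do_chunked_upload(
    "ab12x",
    "spending.csv",
    title="Spending Data",
)
print(result["resource_id"])  # the resource that was created or overwritten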