

API Reference
Uploading Large Files
Version: 1.0
Updated: 26th August 2025
This API uses chunked uploads to handle files over 100MB.
- This avoids problems with server timeouts and limits on the size of the request body.
- It makes multiple requests to the server, uploading the file in 5MB chunks, one request per chunk (see the sketch below for how the chunk count is worked out).
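For example, the number of parts to declare when creating an upload is the file size divided by 5MB, rounded up. A minimal Python sketch (the file name here is just a placeholder):
chunk_count_sketch.py
import math
import os

CHUNK_SIZE = 5 * 1024 * 1024  # the 5MB chunk size used by this API

file_size = os.path.getsize("example.csv")  # hypothetical local file
parts = math.ceil(file_size / CHUNK_SIZE)   # e.g. a 23MB file gives 5 parts
print(parts)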
1. Create a new upload
POST
https://data.website/api/v3/dataset/:id/upload
Path Parameters
:id | 5-character ID of the dataset, e.g. "ab12x" | string | Required |
HTTP Headers
Authorization | API key with editor permissions. | string | Required |
Content-Type | Must be "application/json". Most clients will set this automatically. | string | Required |
HTTP Return Codes
200 | Returns the callback URLs below. |
400 | Returns a JSON object explaining the error. |
403 | API key does not have editor permissions. |
404 | Dataset does not exist, or resource_id is invalid. |
upload_request_body.json
// Example POST body to the API:
{
  "filename": "example.csv",  // Filename to upload
  "parts": 4,                 // Number of chunks to upload;
                              // e.g. a 23MB file will be 5 parts.
  // Optional fields:
  "resource_id": "rz1",       // If replacing an existing file.
  "order": 8,                 // Position in the dataset list. Default: 0
  "title": "Spending Data",   // Title of the resource on the page.
  "description": "..."        // Optional description.
}
Response JSON: Upload has started
The response contains two URLs: one for uploading the file chunks, and one for completing the upload.
upload_response_body.json
{
"urlChunk": "https://data.website/api/v3/upload/:uuid/chunk",
"urlComplete": "https://data.website/api/v3/upload/:uuid/complete"
}
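For reference, a minimal Python sketch of this step using the requests library (the dataset ID and filename are illustrative; a full end-to-end client appears at the end of this page):
create_upload_sketch.py
import os

import requests

API_KEY = os.getenv("DATAPRESS_API_KEY")

# Start a 4-part upload into dataset "ab12x" (illustrative values)
resp = requests.post(
    "https://data.website/api/v3/dataset/ab12x/upload",
    headers={"Authorization": API_KEY},
    json={"filename": "example.csv", "parts": 4},
)
resp.raise_for_status()
upload = resp.json()
url_chunk = upload["urlChunk"]        # POST each 5MB chunk here
url_complete = upload["urlComplete"]  # POST here once all chunks are sent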
2. Send the file chunks
POST
https://data.website/api/v3/upload/:uuid/chunk
Path Parameters
:uuid | The upload ID, given in the above response. | string | Required |
HTTP Headers
Authorization | API key with editor permissions. | string | Required |
Content-Type | Must be "multipart/form-data". Most clients will set this automatically. | string | Required |
Form Fields
chunk | Chunk number, starting at 0. | number | Required |
file | 5MB chunk of the file being uploaded. | file | Required |
HTTP Return Codes
200 | Returns the ETag of the chunk for confirmation. |
400 | Returns a JSON object explaining the error. |
403 | API key does not have editor permissions. |
404 | Upload does not exist. |
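A minimal Python sketch of sending a single chunk as multipart/form-data (again using the requests library; the full client at the end of this page sends every chunk in sequence):
send_chunk_sketch.py
import os

import requests

API_KEY = os.getenv("DATAPRESS_API_KEY")
# The urlChunk value returned by the create-upload response in step 1:
url_chunk = "https://data.website/api/v3/upload/:uuid/chunk"

# Send the first 5MB of the file as chunk 0
with open("example.csv", "rb") as f:
    resp = requests.post(
        url_chunk,
        headers={"Authorization": API_KEY},
        data={"chunk": 0},
        files={"file": f.read(5 * 1024 * 1024)},
    )
resp.raise_for_status()  # a 200 response confirms the chunk's ETag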
Once all the chunks are uploaded, you are ready to complete the upload.
3. Complete the upload
POST
https://data.website/api/v3/upload/:uuid/complete
Path Parameters
:uuid | The upload ID, given in the step 1 response. | string | Required |
HTTP Headers
Authorization | API key with editor permissions. | string | Required |
HTTP Return Codes
200 | Returns JSON object summarising the upload. |
400 | Returns a JSON object explaining the error, e.g. not all chunks have been uploaded. |
403 | API key does not have editor permissions. |
404 | Upload does not exist. |
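A minimal Python sketch of this final request (assuming the urlComplete value returned in step 1):
complete_upload_sketch.py
import os

import requests

API_KEY = os.getenv("DATAPRESS_API_KEY")
# The urlComplete value returned by the create-upload response in step 1:
url_complete = "https://data.website/api/v3/upload/:uuid/complete"

resp = requests.post(url_complete, headers={"Authorization": API_KEY})
resp.raise_for_status()
summary = resp.json()  # described under "Response JSON: Upload is complete" below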
Response JSON: Upload is complete
The response contains information about the uploaded file.
upload_response_body.json
{
  // The full dataset JSON, with the new file added:
  "dataset": {
    "id": "20abc",
    "resources": {
      "rz1": {
        "title": "Spending Data",
        "description": "...",
        "origin": "site/dataset/20abc/.../example.csv",
        // [other resource fields...]
      }
      // [other resources...]
    }
    // [other dataset fields...]
  },
  // The effective JSON patch applied to the dataset:
  "applied": [ ... ],
  // The uploaded file's S3 storage location:
  "new_key": "site/dataset/20abc/.../example.csv",
  // The resource_id that was created or overwritten:
  "resource_id": "rz1"
}
Example: Python client
upload_large_file.py
# Note: There is a Python client library that is simpler than this approach.
import os

import requests

API_KEY = os.getenv("DATAPRESS_API_KEY")
SITE_URL = "https://data.website"


def do_chunked_upload(
    dataset_id,
    file_path,
    resource_id=None,
    title=None,
    description=None,
):
    # Work out how many 5MB chunks the file needs
    file_size = os.path.getsize(file_path)
    print(f"File size: {file_size}")
    CHUNK_SIZE = 1024 * 1024 * 5  # 5MB
    num_chunks = (file_size + CHUNK_SIZE - 1) // CHUNK_SIZE
    print(f"Number of chunks: {num_chunks}")

    # 1. Create the upload session
    create_url = f"{SITE_URL}/api/v3/dataset/{dataset_id}/upload"
    create_body = {
        "filename": os.path.basename(file_path),
        "parts": num_chunks,
    }
    if resource_id:
        create_body["resource_id"] = resource_id
    if title:
        create_body["title"] = title
    if description:
        create_body["description"] = description
    create_response = requests.post(
        create_url,
        headers={"Authorization": API_KEY},
        json=create_body,
    )
    create_response.raise_for_status()
    create_response_json = create_response.json()
    url_chunk = create_response_json["urlChunk"]
    url_complete = create_response_json["urlComplete"]

    # 2. Upload the 5MB chunks in order
    with open(file_path, "rb") as f:
        for i in range(num_chunks):
            print(f"Uploading chunk {i + 1} of {num_chunks}")
            chunk_start = i * CHUNK_SIZE
            chunk_end = min(chunk_start + CHUNK_SIZE, file_size)
            chunk_response = requests.post(
                url_chunk,
                headers={"Authorization": API_KEY},
                files={"file": f.read(chunk_end - chunk_start)},
                data={"chunk": i},  # chunk number, starting at 0
            )
            chunk_response.raise_for_status()

    # 3. Complete the upload
    complete_response = requests.post(
        url_complete,
        headers={"Authorization": API_KEY},
    )
    complete_response.raise_for_status()
    complete_response_json = complete_response.json()
    return complete_response_json