Batch enrichment guide

Enrich many transactions at once

Batch enrichment lets you process large volumes of transactions efficiently. Instead of making an individual API call for each transaction, you can submit up to 50,000 transactions in a single request and retrieve the enriched results once processing is complete.

There are two batch enrichment endpoints:

  • /batches/transactions/enrich: under the hood, behaves like the /transactions/enrich endpoint
  • /batches/transactions/enrich/parse: under the hood, behaves like the /transactions/enrich/parse endpoint
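
If you have more than 50,000 transactions, one option is to split them into several batches before submitting. The following is a minimal sketch rather than part of the API: the helper name chunk_transactions and the constant MAX_BATCH_SIZE are illustrative, and the only fact taken from this guide is the 50,000-transaction limit per request.

```python
from typing import Dict, Iterator, List

# Per-request limit stated in this guide.
MAX_BATCH_SIZE = 50_000


def chunk_transactions(
    transactions: List[Dict], size: int = MAX_BATCH_SIZE
) -> Iterator[List[Dict]]:
    """Yield consecutive slices of at most `size` transactions (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(transactions), size):
        yield transactions[start:start + size]
```

Each chunk can then be submitted as its own batch. Because transactionId only has to be unique within a batch, splitting does not by itself introduce conflicts, but keeping IDs unique across chunks makes it easier to merge the results afterwards.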

Getting Started

Let's walk through the process of enriching a batch of transactions:

  1. First, submit your batch of transactions:
import requests
import time

# Prepare your batch of transactions
transactions = [
    {
        "userId": "user_123",
        "merchantName": "Walmart",
        "amount": 50.23,
        "currencyCode": "USD",
        "occurredAt": "2024-03-21T15:30:00Z",
        "categoryType": "MCC",
        "categoryCode": "5411",
        "transactionId": "tx_123",  # Must be unique within the batch
        "location": {
            "city": "Seattle",
            "region": "WA",
            "country": "USA"
        }
    },
    # ... more transactions ...
]

# Submit the batch
response = requests.post(
    "https://east.sandbox.spade.com/batches/transactions/enrich",
    json={"transactions": transactions},
    headers={"X-Api-Key": "<Your API Key Here>"}
)

batch_id = response.json()["batchId"]
print(f"Batch submitted with ID: {batch_id}")

Notice that we hold on to the batchId returned from the submission request.
This ID is required to check the status and retrieve the results of the batch enrichment job.

  2. Check the batch status until it's complete:
def wait_for_batch_completion(batch_id: str, delay_seconds: int = 10, max_attempts: int = 60):
    """Wait for a batch job to complete, checking status periodically"""
    # max_attempts bounds the loop so the timeout below is reachable;
    # the default of 60 is an arbitrary choice, tune it to your batch sizes
    for _ in range(max_attempts):
        response = requests.get(
            f"https://east.sandbox.spade.com/batches/{batch_id}",
            headers={"X-Api-Key": "<Your API Key Here>"}
        )
        # Surface HTTP errors (e.g. a bad API key) instead of failing on a missing key
        response.raise_for_status()
        status = response.json()["status"]

        print(f"Batch status: {status}")

        if status == "completed":
            return True
        elif status == "failed":
            raise Exception("Batch processing failed")

        time.sleep(delay_seconds)

    raise Exception("Timeout waiting for batch completion")

# Wait for the batch to complete
wait_for_batch_completion(batch_id)
  3. Once the batch is complete, retrieve the results:
# Get the enriched results
response = requests.get(
    f"https://east.sandbox.spade.com/batches/{batch_id}/results",
    headers={"X-Api-Key": "<Your API Key Here>"}
)

results = response.json()["results"]

# Process your enriched transactions
for enriched_transaction in results:
    print(f"Enriched transaction: {enriched_transaction['enrichmentId']}")
    # ... process the enrichment ...

Understanding Batch Status

A batch job can be in one of four states:

  • pending: The batch has been accepted but processing hasn't started
  • enriching: The batch is currently being processed
  • completed: All transactions have been enriched and results are ready
  • failed: The batch encountered an error and couldn't be processed

The /batches/{batchId}/results endpoint will only return results when the batch status is completed. If you request results before completion, you'll receive a 202 status code indicating that the results aren't yet available.
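
As an illustration of the 202 behavior, the results request can treat 202 as "not ready yet" and retry. This is a sketch under assumptions: the function name fetch_results_when_ready and its parameters are hypothetical, and get is expected to be requests.get or any callable with the same shape, which keeps the example testable without network access.

```python
import time


def fetch_results_when_ready(get, url, headers, poll_seconds=10,
                             max_attempts=30, sleep=time.sleep):
    """Return the results list once the endpoint stops answering 202.

    `get` is a requests.get-style callable (hypothetical injection point).
    """
    for _ in range(max_attempts):
        response = get(url, headers=headers)
        if response.status_code == 202:
            # Results are not available yet; wait and ask again.
            sleep(poll_seconds)
            continue
        # Any other non-success status is raised as an HTTP error.
        response.raise_for_status()
        return response.json()["results"]
    raise TimeoutError("Results were not ready after max_attempts polls")


# Example call (requires the requests package and a real batch ID):
# results = fetch_results_when_ready(
#     requests.get,
#     f"https://east.sandbox.spade.com/batches/{batch_id}/results",
#     {"X-Api-Key": "<Your API Key Here>"},
# )
```

In practice, polling the status endpoint first is usually cheaper, since the results payload can be large. The 202 check is a safety net for cases where results are requested too early.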

Best Practices

  1. Unique Transaction IDs: Ensure each transaction in your batch has a unique transactionId. Results are not guaranteed to be returned in the order in which they were submitted, so you can use the transactionId to correlate transactions in the results with the original requests. Duplicate IDs will cause the batch enrichment request to fail.

  2. Status Polling: Implement exponential backoff when checking batch status to avoid unnecessary API calls:

def wait_for_batch_completion_with_backoff(batch_id, max_attempts=10, initial_delay=5, max_delay=60):
    """Wait for batch completion with exponential backoff"""
    # initial_delay must be smaller than max_delay for the backoff to have any effect;
    # tune these defaults to your typical batch size
    delay = initial_delay

    for _ in range(max_attempts):
        response = requests.get(
            f"https://east.sandbox.spade.com/batches/{batch_id}",
            headers={"X-Api-Key": "<Your API Key Here>"}
        )
        status = response.json()["status"]

        if status == "completed":
            return True
        elif status == "failed":
            raise Exception("Batch processing failed")

        print(f"Sleeping for {delay} seconds before checking status again")
        time.sleep(delay)
        # Double the delay after each attempt, capped at max_delay
        delay = min(delay * 2, max_delay)

    raise Exception("Timeout waiting for batch completion")
  3. Timeouts: Uploading large batches can take a while, so we recommend increasing your client's request timeout. We support a timeout window of up to 120 seconds.

  4. Error Handling: There are two classes of errors to handle. The first covers errors returned when you submit the batch via the /batches/transactions/enrich endpoint. The second covers errors returned as individual payloads in the results array of the /batches/{batchId}/results endpoint. For example, you may receive a 400 error for a single request in the batch if the request body was missing a required field, while other requests in the batch return a 200 status code. See our enrichment guide for more details on handling errors from our API.
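
The practices above can be combined into a small set of helpers. This is a rough sketch, not the official client. All function names are hypothetical. It assumes that each entry in results echoes the submitted transactionId at the top level, and that a successful entry carries an enrichmentId (as in the Getting Started example) while a failed entry does not. Check the actual result payloads before relying on either assumption.

```python
from collections import Counter


def find_duplicate_ids(transactions):
    """Return transactionId values that appear more than once in the batch."""
    counts = Counter(tx["transactionId"] for tx in transactions)
    return sorted(tid for tid, n in counts.items() if n > 1)


def submit_batch(transactions, api_key, post=None, timeout_seconds=120):
    """Submit a batch with an explicit client timeout and return its batchId."""
    duplicates = find_duplicate_ids(transactions)
    if duplicates:
        # Fail locally instead of letting the whole batch request be rejected.
        raise ValueError(f"Duplicate transactionId values: {duplicates}")
    if post is None:
        import requests  # imported lazily so a stub can be injected instead
        post = requests.post
    response = post(
        "https://east.sandbox.spade.com/batches/transactions/enrich",
        json={"transactions": transactions},
        headers={"X-Api-Key": api_key},
        timeout=timeout_seconds,  # 120 seconds is the maximum window noted above
    )
    response.raise_for_status()  # first error class: the submission itself failed
    return response.json()["batchId"]


def correlate_results(transactions, results):
    """Pair each submitted transaction with its result, matched by transactionId.

    Assumes results echo transactionId at the top level (unverified assumption).
    """
    by_id = {r["transactionId"]: r for r in results if "transactionId" in r}
    return [(tx, by_id.get(tx["transactionId"])) for tx in transactions]


def split_by_outcome(pairs):
    """Separate enriched pairs from ones that are missing or look like errors.

    Treats the presence of enrichmentId as success (unverified assumption).
    """
    enriched, problems = [], []
    for tx, result in pairs:
        if result is not None and "enrichmentId" in result:
            enriched.append((tx, result))
        else:
            problems.append((tx, result))
    return enriched, problems
```

Matching on transactionId rather than list position matters because results are not guaranteed to come back in submission order. The problems list holds the second class of errors, which can be logged or resubmitted in a later batch.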

Using Batch Enrichment with DE43 Data

If you have unparsed DE43 data, you can use the /batches/transactions/enrich/parse endpoint instead. It follows the same submit, poll, and retrieve workflow, but lets you omit some fields that the /batches/transactions/enrich endpoint requires and include DE43 data that is parsed before enrichment. Under the hood, this endpoint behaves like the /transactions/enrich/parse endpoint.

transactions = [
    {
        "userId": "user_123",
        "de43": "WALMART              SEATTLE     WAUS",
        "amount": 50.23,
        "currencyCode": "USD",
        "occurredAt": "2024-03-21T15:30:00Z",
        "categoryType": "MCC",
        "categoryCode": "5411",
        "transactionId": "tx_123",
    }
]

response = requests.post(
    "https://east.sandbox.spade.com/batches/transactions/enrich/parse",
    json={"transactions": transactions},
    headers={"X-Api-Key": "<Your API Key Here>"}
)

Additional Resources

Getting Started Guide