Enrich many transactions at once
Batch Enrichment Guide
Batch enrichment allows you to process large volumes of transactions efficiently. Instead of making individual API calls for each transaction, you can submit up to 50,000 transactions in a single request and retrieve the enriched results once processing is complete. We have two endpoints for batch enrichment: /batches/transactions/enrich and /batches/transactions/enrich/parse. Under the hood, the /batches/transactions/enrich endpoint behaves like the /transactions/enrich endpoint and the /batches/transactions/enrich/parse endpoint behaves like the /transactions/enrich/parse endpoint.
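If you have more than 50,000 transactions, one way to stay within the per-request limit is to split them into several batches before submitting. The following is a minimal sketch of that idea; chunk_transactions and MAX_BATCH_SIZE are hypothetical names used for illustration and are not part of the API.

```python
from typing import Iterator, List

# Per-request limit stated in this guide
MAX_BATCH_SIZE = 50_000


def chunk_transactions(
    transactions: List[dict], size: int = MAX_BATCH_SIZE
) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` transactions."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(transactions), size):
        yield transactions[start:start + size]
```

Each chunk would then be submitted as its own request and tracked by its own batchId.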
Getting Started
Let's walk through the process of enriching a batch of transactions:
- First, submit your batch of transactions:
import requests
import time

# Prepare your batch of transactions
transactions = [
    {
        "userId": "user_123",
        "merchantName": "Walmart",
        "amount": 50.23,
        "currencyCode": "USD",
        "occurredAt": "2024-03-21T15:30:00Z",
        "categoryType": "MCC",
        "categoryCode": "5411",
        "transactionId": "tx_123",  # Must be unique within the batch
        "location": {
            "city": "Seattle",
            "region": "WA",
            "country": "USA"
        }
    },
    # ... more transactions ...
]

# Submit the batch
response = requests.post(
    "https://east.sandbox.spade.com/batches/transactions/enrich",
    json={"transactions": transactions},
    headers={"X-Api-Key": "<Your API Key Here>"}
)
batch_id = response.json()["batchId"]
print(f"Batch submitted with ID: {batch_id}")
Notice that we hold on to the batchId returned from the submission request. This ID is required to check the status and retrieve the results of the batch enrichment job.
- Check the batch status until it's complete:
def wait_for_batch_completion(batch_id: str, delay_seconds: int = 10, max_attempts: int = 60):
    """Wait for a batch job to complete, checking status periodically"""
    for _ in range(max_attempts):
        response = requests.get(
            f"https://east.sandbox.spade.com/batches/{batch_id}",
            headers={"X-Api-Key": "<Your API Key Here>"}
        )
        status = response.json()["status"]
        print(f"Batch status: {status}")
        if status == "completed":
            return True
        elif status == "failed":
            raise Exception("Batch processing failed")
        time.sleep(delay_seconds)
    # Give up once max_attempts status checks have been made
    raise Exception("Timeout waiting for batch completion")

# Wait for the batch to complete
wait_for_batch_completion(batch_id)
- Once the batch is complete, retrieve the results:
# Get the enriched results
response = requests.get(
    f"https://east.sandbox.spade.com/batches/{batch_id}/results",
    headers={"X-Api-Key": "<Your API Key Here>"}
)
results = response.json()["results"]

# Process your enriched transactions
for enriched_transaction in results:
    print(f"Enriched transaction: {enriched_transaction['enrichmentId']}")
    # ... process the enrichment ...
Understanding Batch Status
A batch job can be in one of four states:
- pending: The batch has been accepted but processing hasn't started
- enriching: The batch is currently being processed
- completed: All transactions have been enriched and results are ready
- failed: The batch encountered an error and couldn't be processed
The /batches/{batchId}/results endpoint will only return results when the batch status is completed. If you request results before completion, you'll receive a 202 status code indicating that the results aren't yet available.
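The 202 behavior can be handled explicitly rather than assuming the body always contains results. The sketch below assumes a completed batch answers with a 200 status code and a body holding a results array, as in the example above; results_if_ready is a hypothetical helper name, not part of the API.

```python
from typing import Optional


def results_if_ready(status_code: int, body: Optional[dict]) -> Optional[list]:
    """Return the results list for a 200 response, or None for a 202."""
    if status_code == 202:
        # Batch is not completed yet; check again later
        return None
    if status_code == 200:
        # Assumption: the body has the same "results" key used earlier in this guide
        return body["results"]
    raise RuntimeError(f"Unexpected status code from results endpoint: {status_code}")
```

With the requests library, the inputs would come from response.status_code and, for a 200, response.json().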
Best Practices
- Unique Transaction IDs: Ensure each transaction in your batch has a unique transactionId. Results are not guaranteed to be returned in the order in which they were submitted, so you can use the transactionId to correlate transactions in the results with the original requests. Duplicate IDs will cause the batch enrichment request to fail.
- Status Polling: Implement exponential backoff when checking batch status to avoid unnecessary API calls:
def wait_for_batch_completion_with_backoff(batch_id, max_attempts=10, initial_delay=5, max_delay=60):
    """Wait for batch completion with exponential backoff"""
    delay = initial_delay
    for _ in range(max_attempts):
        response = requests.get(
            f"https://east.sandbox.spade.com/batches/{batch_id}",
            headers={"X-Api-Key": "<Your API Key Here>"}
        )
        status = response.json()["status"]
        if status == "completed":
            return True
        elif status == "failed":
            raise Exception("Batch processing failed")
        print(f"Sleeping for {delay} seconds before checking status again")
        time.sleep(delay)
        # Double the delay after each check, capped at max_delay seconds
        delay = min(delay * 2, max_delay)
    # Give up once max_attempts status checks have been made
    raise Exception("Timeout waiting for batch completion")
- Timeouts: Uploading large batches can take a while, so we recommend increasing the timeout of your client. We support up to a 120-second timeout window.
- Error Handling: There are two classes of errors you will want to handle. The first is errors you may encounter when submitting the batch via the /batches/transactions/enrich endpoint. The second is errors returned as individual payloads in the results array of the /batches/{batchId}/results endpoint. For example, you may receive a 400 error for a single request in the batch if the request body was missing a required field, while other requests in the batch return a 200 status code. See our enrichment guide for more details on handling errors from our API.
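For the Unique Transaction IDs practice above, duplicates can be caught before submission and results can be indexed by ID afterwards. This is a minimal sketch: find_duplicate_ids and index_results_by_id are hypothetical helper names, and the code assumes each entry in the results array carries the transactionId of the transaction it was produced from.

```python
from collections import Counter
from typing import Dict, List


def find_duplicate_ids(transactions: List[dict]) -> List[str]:
    """Return the transactionId values that appear more than once, sorted."""
    counts = Counter(tx["transactionId"] for tx in transactions)
    return sorted(tx_id for tx_id, n in counts.items() if n > 1)


def index_results_by_id(results: List[dict]) -> Dict[str, dict]:
    """Map each transactionId to its result so lookups ignore result order."""
    # Assumption: every result includes the originating transactionId
    return {result["transactionId"]: result for result in results}
```

Checking find_duplicate_ids(transactions) before submitting avoids a failed request, and index_results_by_id(results) lets you look up the result for any submitted transaction regardless of the order in which results are returned.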
Using Batch Enrichment with DE43 Data
If you have unparsed DE43 data, you can use the /batches/transactions/enrich/parse endpoint instead. This endpoint works exactly the same way, but allows you to omit some fields which are required in the /batches/transactions/enrich endpoint and include DE43 data that will be parsed before enrichment. Under the hood, this endpoint behaves like the /transactions/enrich/parse endpoint.
transactions = [
    {
        "userId": "user_123",
        "de43": "WALMART SEATTLE WAUS",
        "amount": 50.23,
        "currencyCode": "USD",
        "occurredAt": "2024-03-21T15:30:00Z",
        "categoryType": "MCC",
        "categoryCode": "5411",
        "transactionId": "tx_123",
    }
]

response = requests.post(
    "https://east.sandbox.spade.com/batches/transactions/enrich/parse",
    json={"transactions": transactions},
    headers={"X-Api-Key": "<Your API Key Here>"}
)