Rate limits
Every request to partner/v1 is metered per project. This page documents the published quotas, the RateLimit-* headers we return on every response, the 429 rate_limited error, and the back-off strategy we expect well-behaved clients to use.
How limiting works
Credicorp meters traffic with a token-bucket algorithm scoped to your project — not to an individual key or token. Sandbox and live projects have independent buckets, so load-testing against the sandbox never eats into your production allowance. Each bucket has a sustained rate (tokens refilled per second) and a burst capacity (the maximum number of tokens the bucket can hold). A request costs one token; reads and writes are weighted equally.
Because the bucket refills continuously rather than resetting on a fixed wall-clock boundary, you can spend your burst capacity in a short spike and then settle back to the sustained rate without being locked out for a full minute. Limits are evaluated at the edge before a request reaches application code, so a throttled request never touches decisioning, the rails, or your account data.
Limits are applied per project. If you run several integrations from one project, they share a bucket — provision a separate project per workload when you need isolated headroom.
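To make the continuous-refill behaviour concrete, here is a minimal sketch of a token bucket. It illustrates the general algorithm only and is not Credicorp's edge implementation; the `TokenBucket` class and its `allow` method are hypothetical names, and the clock is passed in explicitly so the arithmetic is easy to follow.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Illustrative token bucket; not the actual edge implementation."""

    rate: float             # sustained rate: tokens refilled per second
    capacity: float         # burst capacity: maximum tokens the bucket holds
    tokens: float = 0.0
    updated_at: float = 0.0

    def __post_init__(self) -> None:
        # A fresh bucket starts full.
        self.tokens = self.capacity

    def allow(self, now: float) -> bool:
        """Refill for the time elapsed, then try to spend one token."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Build-tier figures from this page: 10 req/s sustained, burst of 50.
bucket = TokenBucket(rate=10, capacity=50)

# A spike of 60 requests at t=0: the burst absorbs 50, the rest are throttled.
spike = [bucket.allow(now=0.0) for _ in range(60)]
print(spike.count(True), spike.count(False))

# Half a second later, 0.5 s x 10 tokens/s = 5 tokens have trickled back in.
later = [bucket.allow(now=0.5) for _ in range(10)]
print(later.count(True))
```

Because refill is proportional to elapsed time, a client that has been throttled regains capacity gradually rather than all at once on a wall-clock boundary.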
Default quotas by tier
Quotas scale with your partner tier. The figures below are the default ceilings on the live ring; sandbox is fixed at the Build tier regardless of your live tier. Some endpoints carry their own tighter sub-limit, listed further down.
| Tier | Sustained | Burst | Concurrent |
|---|---|---|---|
| build (sandbox) | 10 req/s | 50 | 10 |
| launch | 25 req/s | 100 | 20 |
| scale | 100 req/s | 400 | 50 |
| platform | Negotiated | Negotiated | Negotiated |
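As a rough guide to what these figures mean in practice, the arithmetic below estimates how long a full bucket survives a spike and how long an empty one takes to refill. It assumes an idealised, continuously refilling token bucket; `spike_seconds` and `refill_seconds` are hypothetical helper names, and real behaviour at the edge may differ slightly.

```python
# Default quotas from the table above: (sustained req/s, burst).
QUOTAS = {"build": (10, 50), "launch": (25, 100), "scale": (100, 400)}


def spike_seconds(sustained: float, burst: float, send_rate: float) -> float:
    """Seconds a full bucket lasts when sending at a steady send_rate."""
    if send_rate <= sustained:
        return float("inf")  # refill keeps pace, so the bucket never empties
    return burst / (send_rate - sustained)


def refill_seconds(sustained: float, burst: float) -> float:
    """Seconds for an empty bucket to refill completely if you stop sending."""
    return burst / sustained


for tier, (sustained, burst) in QUOTAS.items():
    spike = spike_seconds(sustained, burst, send_rate=2 * sustained)
    refill = refill_seconds(sustained, burst)
    print(f"{tier}: 2x spike lasts {spike:.1f}s, full refill takes {refill:.1f}s")
```

On the scale tier, for example, sending at 200 req/s drains the 400-token burst in about 4 seconds, after which requests are admitted at the sustained 100 req/s.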
Per-endpoint sub-limits
A handful of routes touch external rails or expensive models and are throttled independently of your project bucket. These protect both you and the upstream provider, so you can hit them even when your project bucket has headroom to spare.
| Endpoint | Method | Sub-limit | Why |
|---|---|---|---|
| /applications | POST | 5 req/s | Each opens a decisioning case. |
| /decisions/{id}/refresh | POST | 1 req/s | Re-runs the model; costly. |
| /payments/links | POST | 10 req/s | Provisions a PISP payment. |
| /identity/checks | POST | 5 req/s | Calls the KYC/AML provider. |
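One way to stay under these sub-limits is to pace calls on the client. The sketch below spaces requests to each listed route at least 1/limit seconds apart. `EndpointPacer` is a hypothetical helper, not part of any SDK; it is single-threaded, and its strict spacing is more conservative than the edge needs to be.

```python
import time
from typing import Callable, Dict, Tuple

# Sub-limits from the table above, in requests per second.
SUB_LIMITS: Dict[Tuple[str, str], float] = {
    ("POST", "/applications"): 5,
    ("POST", "/decisions/{id}/refresh"): 1,
    ("POST", "/payments/links"): 10,
    ("POST", "/identity/checks"): 5,
}


class EndpointPacer:
    """Spaces calls to each throttled route at least 1/limit seconds apart."""

    def __init__(
        self,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._clock = clock
        self._sleep = sleep
        self._next_slot: Dict[Tuple[str, str], float] = {}

    def wait(self, method: str, route: str) -> float:
        """Block until the route's next free slot; return the seconds slept."""
        key = (method.upper(), route)
        limit = SUB_LIMITS.get(key)
        if limit is None:
            return 0.0  # no sub-limit listed for this route
        now = self._clock()
        slot = max(now, self._next_slot.get(key, now))
        self._next_slot[key] = slot + 1.0 / limit
        delay = slot - now
        if delay > 0:
            self._sleep(delay)
        return delay
```

Call `pacer.wait("POST", "/identity/checks")` immediately before each request to that route. Routes are keyed by their template (for example `/decisions/{id}/refresh`), not by the concrete URL.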
Rate-limit headers
Every response — success or failure — carries the current state of your bucket. Read these instead of guessing; they let you slow down before you hit the wall. Header names follow the IETF RateLimit draft.
| Header | Example | Meaning |
|---|---|---|
| RateLimit-Limit | 100 | Burst capacity of the bucket for this project. |
| RateLimit-Remaining | 87 | Tokens left right now. Throttle as this nears 0. |
| RateLimit-Reset | 3 | Seconds until the bucket is full again. |
| RateLimit-Policy | 100;w=1;burst=100 | The active policy: rate, window and burst. |
| Retry-After | 2 | Only on 429. Seconds to wait before retrying. |
A throttled response
HTTP/1.1 429 Too Many Requests
RateLimit-Limit: 100
RateLimit-Remaining: 0
RateLimit-Reset: 2
RateLimit-Policy: "100;w=1;burst=100"
Retry-After: 2
Content-Type: application/json
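A client can read these headers into a small structure and slow down before the bucket empties. The sketch below is one possible approach: `RateLimitState`, `parse_rate_limit` and the 10% threshold are hypothetical, and it assumes `Retry-After` is always an integer number of seconds, as this page describes.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RateLimitState:
    limit: int
    remaining: int
    reset_seconds: int
    retry_after: Optional[int]  # only present on 429

    @property
    def headroom(self) -> float:
        """Fraction of the bucket still available, between 0.0 and 1.0."""
        return self.remaining / self.limit if self.limit else 0.0


def parse_rate_limit(headers: Mapping[str, str]) -> RateLimitState:
    """Read the RateLimit-* headers; HTTP header names are case-insensitive."""
    h = {name.lower(): value.strip() for name, value in headers.items()}
    retry_after = h.get("retry-after")
    return RateLimitState(
        limit=int(h["ratelimit-limit"]),
        remaining=int(h["ratelimit-remaining"]),
        reset_seconds=int(h["ratelimit-reset"]),
        retry_after=int(retry_after) if retry_after is not None else None,
    )


# Header values taken from the throttled response above.
state = parse_rate_limit({
    "RateLimit-Limit": "100",
    "RateLimit-Remaining": "0",
    "RateLimit-Reset": "2",
    "Retry-After": "2",
})
if state.headroom < 0.1:  # hypothetical threshold: slow down below 10% headroom
    print(f"back off for {state.retry_after or state.reset_seconds}s")
```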
The 429 error
When the bucket is empty, the edge returns 429 Too Many Requests with the standard error envelope. The body tells you which scope tripped — the project bucket or a per-endpoint sub-limit — and always mirrors the wait in retry_after.
{
  "error": {
    "type": "rate_limited",
    "message": "Project request quota exceeded. Retry after 2s.",
    "scope": "project",
    "retry_after": 2,
    "docs_url": "https://dev.credicorp.co.uk/api-reference/rate-limits.html",
    "request_id": "req_3fa7c9Qm0e"
  }
}

A 429 is safe to retry. The request never reached application logic, so nothing was created or charged. Replaying it, ideally with the same Idempotency-Key, cannot double-apply a write. See Idempotency.
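The envelope can be unpacked with a few lines of code. This sketch assumes the body has the shape shown in the example; `throttle_info` is a hypothetical helper, and the 2-second default is an arbitrary fallback. Only the `project` scope value appears on this page, so the function returns whatever string the API sends rather than guessing at other values.

```python
import json
from typing import Tuple


def throttle_info(body: str, default_wait: int = 2) -> Tuple[str, int]:
    """Return (scope, seconds_to_wait) from a rate_limited error body."""
    error = json.loads(body).get("error", {})
    if error.get("type") != "rate_limited":
        raise ValueError("not a rate_limited error envelope")
    scope = error.get("scope", "unknown")
    wait = int(error.get("retry_after", default_wait))
    return scope, wait


sample = """
{
  "error": {
    "type": "rate_limited",
    "message": "Project request quota exceeded. Retry after 2s.",
    "scope": "project",
    "retry_after": 2,
    "request_id": "req_3fa7c9Qm0e"
  }
}
"""
scope, wait = throttle_info(sample)
print(f"throttled at scope={scope}; retry in {wait}s")
```

Logging the scope alongside `request_id` makes it easier to tell whether you need more project headroom or are hitting a per-endpoint sub-limit.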
Recommended back-off strategy
Honour Retry-After first; it is authoritative. When it is absent (for example on a transient 503), fall back to exponential back-off with full jitter so a fleet of clients does not retry in lockstep and stampede the edge the instant tokens become available again.
- On 429, sleep for Retry-After seconds, then retry the same request.
- On 5xx with no Retry-After, sleep random(0, base × 2^attempt), capped at 30 seconds.
- Cap retries at 5 attempts; after that, surface the error and queue the work.
- Never retry a 4xx other than 429; the request is malformed and will fail again.
# Retry a GET up to 5 times, honouring Retry-After
for i in 1 2 3 4 5; do
  resp=$(curl -s -w "%{http_code}" -o /tmp/body.json \
    https://api.credicorp.co.uk/partner/v1/applications \
    -H "Authorization: Bearer $TOKEN")
  if [ "$resp" != "429" ]; then cat /tmp/body.json; break; fi
  wait=$(jq -r '.error.retry_after // 2' /tmp/body.json)
  sleep "$wait"
done

// The official SDK retries 429/5xx for you, honouring Retry-After.
use Credicorp\Credicorp;

$cc = new Credicorp([
    'secret_key'  => getenv('CC_SECRET_KEY'),
    'max_retries' => 5, // full-jitter back-off, capped at 30s
]);

$app = $cc->applications->create([
    'business'     => ['company_number' => '16093826'],
    'amount_pence' => 2500000,
    'term_months'  => 12,
]);
The official SDKs implement this exact strategy out of the box — bounded retries, full jitter, and Retry-After compliance. If you use them, you do not need to write any of this by hand.
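If you are not using an SDK, the rules in this section can be sketched as a single delay function. This is an illustration rather than the SDKs' actual code: `retry_delay` is a hypothetical name, and the 0.5-second base is an assumption, since this page does not specify one.

```python
import random
from typing import Callable, Optional

MAX_ATTEMPTS = 5     # retry cap from the list above
CAP_SECONDS = 30.0   # back-off ceiling from the list above
BASE_SECONDS = 0.5   # assumption: this page does not specify a base delay


def retry_delay(
    status: int,
    attempt: int,
    retry_after: Optional[float] = None,
    base: float = BASE_SECONDS,
    uniform: Callable[[float, float], float] = random.uniform,
) -> Optional[float]:
    """Seconds to sleep before retry number `attempt` (0-based), or None to stop."""
    if attempt >= MAX_ATTEMPTS:
        return None  # out of attempts: surface the error and queue the work
    if status != 429 and not 500 <= status <= 599:
        return None  # other statuses are not retried
    if retry_after is not None:
        return float(retry_after)  # the server's instruction is authoritative
    # Full jitter: uniform over [0, min(cap, base * 2^attempt)].
    return uniform(0.0, min(CAP_SECONDS, base * 2 ** attempt))


print(retry_delay(429, attempt=0, retry_after=2))  # waits exactly as instructed
print(retry_delay(404, attempt=0))                 # None: do not retry
print(retry_delay(503, attempt=3))                 # random value in [0, 4] seconds
```

Drawing the delay uniformly from the whole interval, rather than adding a small random offset to a fixed delay, is what spreads a fleet of clients out instead of letting them retry together.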
Designing to stay under the limit
- Prefer webhooks to polling. Subscribe to application.funded, decision.completed and payment.paid rather than re-fetching on a timer. See Webhooks.
- Page efficiently. Request the largest sensible limit and follow the cursor instead of many small pages. See Pagination.
- Cache token responses. Reuse the OAuth access token until it expires rather than minting one per request; the SDKs do this automatically.
- Smooth bulk jobs. When backfilling, run a single worker at the sustained rate rather than many parallel workers fighting over the burst.
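For the token-caching advice in the list above, a minimal sketch might look like this. `TokenCache` and the `fetch` callable are hypothetical. The sketch assumes the token endpoint reports a lifetime in seconds, as the standard OAuth 2.0 `expires_in` field does, and the 30-second safety margin is an arbitrary choice.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Reuses an access token until shortly before it expires."""

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],  # returns (access_token, expires_in)
        clock: Callable[[], float] = time.monotonic,
        skew: float = 30.0,                      # refresh this many seconds early
    ) -> None:
        self._fetch = fetch
        self._clock = clock
        self._skew = skew
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return the cached token, minting a new one only when needed."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._skew:
            token, expires_in = self._fetch()
            self._token = token
            self._expires_at = now + expires_in
        return self._token
```

Each request then calls `cache.get()` for its bearer token, so token requests happen roughly once per token lifetime instead of once per API call.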
Need a higher limit?
Quotas grow as you move up partner tiers, and platform limits are set per contract. If you have a launch event, a migration, or a seasonal spike, email [email protected] with your project ID and the rate you expect — we can raise sustained, burst and concurrency independently. Live platform health is on the status page.
