API Design Best Practices: Building for Scale and Developer Experience

A well-designed API is the difference between a platform developers love and one they work around. We have built APIs that now serve hundreds of millions of requests per day, and every hard lesson learned — about versioning, pagination, error formats, and authentication — maps directly to concrete decisions you make in the first week of API design. This guide distills those lessons into actionable patterns with real code examples.

Whether you are designing a new greenfield API or refactoring an existing one, the principles here apply equally. The goal is always the same: an API that is predictable, fault-tolerant, easy to consume, and safe to evolve.

RESTful Design Principles That Actually Matter

REST is widely misunderstood. Most APIs claiming to be "RESTful" are really HTTP APIs with JSON bodies. True RESTful design follows specific constraints that make your API self-describing and evolvable. Start with the fundamentals.

Resource Naming and URI Structure

URIs should identify resources, not actions. Use nouns, not verbs. Use plural collection names consistently. Nest resources only when the relationship is strict ownership — beyond two levels of nesting, flatten the structure.

# Good — resource-oriented, plural, lowercase, hyphenated
GET    /api/v1/users
GET    /api/v1/users/{userId}
GET    /api/v1/users/{userId}/orders
POST   /api/v1/users/{userId}/orders
DELETE /api/v1/users/{userId}/orders/{orderId}

# Avoid — verb-in-path, mixed case, inconsistent plurality
GET  /api/getUser/123
POST /api/createOrder
GET  /api/User/123/Order

For actions that do not map cleanly to CRUD operations — like sending an email, triggering a payment, or archiving a batch — use a sub-resource noun that describes the result of the action:

POST /api/v1/invoices/{invoiceId}/payments      # trigger payment
POST /api/v1/users/{userId}/password-resets     # initiate reset
POST /api/v1/reports/{reportId}/exports         # trigger export job

HTTP Methods and Status Code Semantics

Using the correct HTTP method is not just style — it communicates caching behaviour, safety, and idempotency to every intermediary between your client and server.

GET: Safe and idempotent. Must never modify state. Cacheable by default, subject to Cache-Control and related response headers.
POST: Creates a resource or triggers a non-idempotent action. Returns 201 Created with a Location header pointing to the new resource.
PUT: Full replacement of a resource. Idempotent — calling it multiple times produces the same result.
PATCH: Partial update. Apply only the fields provided; leave others unchanged. Use JSON Merge Patch (RFC 7396) for simple objects.
DELETE: Removes a resource. Returns 204 No Content on success, 404 if resource does not exist (or idempotently 204 if you prefer not to leak existence).

Status Code Cheat Sheet:
200 OK: GET/PUT/PATCH success
201 Created: POST success
204 No Content: DELETE success
400 Bad Request: validation error
401 Unauthorized: missing/invalid credentials
403 Forbidden: authenticated but not permitted
404 Not Found
409 Conflict: duplicate or state conflict
422 Unprocessable Entity: semantic validation failure
429 Too Many Requests: rate limit hit
500 Internal Server Error: never leak stack traces

API Versioning Strategies

Versioning is the contract you make with your consumers. Breaking that contract silently is one of the worst things you can do as an API provider. Choose a strategy early and be consistent.

URI Versioning (Recommended for Public APIs)

The version lives in the URL path. Consumers can see which version they are calling at a glance, load balancers and proxies can route by version, and API gateway logs are immediately readable.

GET https://api.completedigi.com/v1/users
GET https://api.completedigi.com/v2/users

This is the most discoverable approach and the easiest to test with a browser or curl. Its downside is that it encourages treating versions as completely separate APIs rather than incremental evolutions.

Header Versioning (Recommended for Internal/Partner APIs)

The version is specified in a custom request header. URIs stay clean and stable, and you can default to the latest version when the header is absent.

GET /users HTTP/1.1
Host: api.completedigi.com
API-Version: 2026-02-01
Accept: application/json

Stripe uses date-based header versioning to great effect: every API change is tied to a calendar date, and each account is pinned by default to the version that was current when it first called the API, with a request header available to override the version per call.
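As a rough illustration of date-based versioning, the sketch below resolves which release should serve a request. The release dates, the `resolve_version` name, and the choice to fall back to a per-consumer pinned date when the header is absent are all assumptions made for this example, not a description of any particular provider's implementation.

```python
from bisect import bisect_right
from datetime import date
from typing import Optional

# Hypothetical release dates; a real API would load these from its changelog.
RELEASED_VERSIONS = [date(2025, 6, 1), date(2025, 10, 15), date(2026, 2, 1)]


def resolve_version(header_value: Optional[str], pinned: date) -> date:
    """Pick the newest release dated on or before the requested date.

    header_value is the raw API-Version header (ISO date) or None if absent;
    pinned is the version stored for the consumer (an assumption of this sketch).
    """
    requested = date.fromisoformat(header_value) if header_value else pinned
    # bisect_right returns the index of the first release after `requested`.
    idx = bisect_right(RELEASED_VERSIONS, requested)
    if idx == 0:
        raise ValueError(f"no API version exists on or before {requested}")
    return RELEASED_VERSIONS[idx - 1]
```

A request carrying a date that falls between two releases is served by the earlier one, so a consumer never silently receives behaviour newer than what it asked for.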

Query Parameter Versioning (Avoid for Production)

GET /users?api_version=2

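Simple to implement but pollutes query strings, complicates caching (some caches and CDNs are configured to ignore or normalize query strings, so different versions can collide on one cache entry), and is easily forgotten by consumers. Reserve this only for experimentation or internal tooling.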

Backward Compatibility Rules

A change is backward compatible if existing clients continue to work without modification. Safe changes include: adding new optional fields to responses, adding new optional request parameters, adding new endpoints, and adding new enum values (if clients handle unknown values gracefully). Breaking changes include: renaming or removing fields, changing field types, making optional fields required, and changing the meaning of existing status codes.
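The caveat about enum values depends on client behaviour, so here is a minimal client-side sketch of "handling unknown values gracefully". The `OrderStatus` enum and its members are hypothetical and exist only for illustration.

```python
from enum import Enum


class OrderStatus(Enum):
    PENDING = "pending"
    PAID = "paid"
    UNKNOWN = "unknown"  # fallback bucket for values this client predates


def parse_status(raw: str) -> OrderStatus:
    """Map a wire value to the enum without failing on values added later."""
    try:
        return OrderStatus(raw)
    except ValueError:
        return OrderStatus.UNKNOWN
```

A client written this way keeps working when the server later introduces, say, a "refunded" status; a client that raises on unknown values turns the same additive change into a breaking one.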

Pagination Patterns

Returning unbounded result sets is a reliability and security risk. Always paginate. The question is which pattern to use.

Offset Pagination

Classic and simple. The client specifies how many records to skip and how many to return. Best for small datasets where users need random page access (e.g., "jump to page 50").

GET /api/v1/products?offset=40&limit=20

{
  "data": [...],
  "pagination": {
    "total": 1842,
    "offset": 40,
    "limit": 20,
    "has_next": true,
    "has_prev": true
  }
}

Drawbacks: results become inconsistent if records are inserted or deleted between pages (the "page drift" problem), and database performance degrades at high offsets, because an OFFSET 100000 LIMIT 20 query must scan and discard 100,000 rows before returning the 20 it needs.

Cursor-Based Pagination (Recommended for Scale)

The server returns an opaque cursor encoding the position of the last item seen. The client passes this cursor to retrieve the next page. This approach is stable, performant at any scale, and avoids page drift entirely.

GET /api/v1/events?limit=50

{
  "data": [...],
  "pagination": {
    "next_cursor": "eyJpZCI6Ijg3NjUiLCJ0cyI6MTcwNjc4OTEwMH0",
    "has_next": true
  }
}

# Fetch next page
GET /api/v1/events?limit=50&cursor=eyJpZCI6Ijg3NjUiLCJ0cyI6MTcwNjc4OTEwMH0

The cursor itself is a base64-encoded JSON object containing the sort key and ID of the last record, allowing the database to use an indexed range query instead of a full scan. Never expose raw database IDs as cursors — always encode and treat them as opaque tokens.
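The encoding described above can be sketched in a few lines. The field names `id` and `ts` are inferred from the sample cursor shown earlier; the function names are assumptions for this example, and a production cursor would usually also be signed or encrypted so clients cannot forge positions.

```python
import base64
import json


def encode_cursor(last_id: str, last_ts: int) -> str:
    """Serialize the last-seen sort key and ID as unpadded URL-safe base64."""
    payload = json.dumps({"id": last_id, "ts": last_ts}, separators=(",", ":"))
    return base64.urlsafe_b64encode(payload.encode()).decode().rstrip("=")


def decode_cursor(cursor: str) -> dict:
    """Restore the padding stripped by encode_cursor, then parse the JSON."""
    padded = cursor + "=" * (-len(cursor) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


# The decoded values would feed an indexed range predicate, for example
# (SQL shown as a comment only, assuming an index on (ts, id)):
#   WHERE (ts, id) > (:ts, :id) ORDER BY ts, id LIMIT :limit
```

Decoding the sample cursor from the response above yields an `id` of "8765" and a `ts` of 1706789100, which is all the server needs to resume from that position.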

Error Handling: RFC 7807 Problem Details

Inconsistent error formats are one of the most common API complaints from developers. Adopt RFC 7807 (Problem Details for HTTP APIs, since superseded by the compatible RFC 9457) as your standard. It is widely understood, machine-readable, and extensible.

HTTP/1.1 422 Unprocessable Entity
Content-Type: application/problem+json

{
  "type": "https://api.completedigi.com/errors/validation-error",
  "title": "Validation Failed",
  "status": 422,
  "detail": "The request body contains invalid field values.",
  "instance": "/api/v1/users",
  "request_id": "req_2f8b3c4d5e6f",
  "errors": [
    {
      "field": "email",
      "code": "invalid_format",
      "message": "Must be a valid email address."
    },
    {
      "field": "date_of_birth",
      "code": "future_date",
      "message": "Date of birth cannot be in the future."
    }
  ]
}

Always include a request_id in every response — success and error alike. When a developer reports a bug with a request ID, you can find the exact trace in your logging system within seconds instead of hunting through logs for hours.

The type field is a URI that serves as a stable, unique identifier for the error class. It does not need to resolve to a real page, but it should be documented. The instance field identifies the specific occurrence — useful when the same endpoint is mounted at multiple paths.
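One way to keep error bodies consistent is to build every one of them through a single helper. The sketch below is framework-agnostic; the `problem_response` name and its return shape (status, headers, body) are assumptions for illustration, while the field names mirror the example above.

```python
import json
from typing import Optional


def problem_response(status: int, type_uri: str, title: str, detail: str,
                     instance: str, request_id: str,
                     errors: Optional[list] = None):
    """Return (status, headers, body) for a problem-details error."""
    body = {
        "type": type_uri,
        "title": title,
        "status": status,
        "detail": detail,
        "instance": instance,
        "request_id": request_id,  # extension member, as in the example above
    }
    if errors:
        body["errors"] = errors  # per-field validation details
    headers = {"Content-Type": "application/problem+json"}
    return status, headers, json.dumps(body)
```

Routing every error path through one function like this makes it hard for an individual endpoint to drift into its own ad hoc format.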

Rate Limiting Strategies

Rate limiting protects your infrastructure from abuse, ensures fair usage across all consumers, and gives your API predictable performance characteristics. Surface limits clearly so consumers can build compliant clients.

Choosing a Rate Limit Algorithm

Fixed Window: Simple counter reset every N seconds. Easy to implement but allows burst traffic at window boundaries — a client can exhaust the limit in the last second of one window and the first second of the next, doubling effective throughput.
Sliding Window: Tracks requests using a rolling time window. Smoother than fixed window but more memory-intensive (requires a sorted set per client in Redis).
Token Bucket: Clients accumulate tokens at a steady rate up to a maximum. Best for allowing short bursts while enforcing a long-term average — ideal for most API use cases.
Leaky Bucket: Requests are processed at a fixed output rate regardless of input rate. Guarantees smooth throughput at the cost of variable queuing latency.
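To make the token bucket concrete, here is a minimal single-process sketch. The class name, the injectable `clock` parameter, and the in-memory state are assumptions for illustration; a multi-node deployment would keep this state in a shared store and update it atomically.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity` while enforcing `refill_per_sec` on average."""

    def __init__(self, capacity: float, refill_per_sec: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        self.tokens = capacity  # start full so a new client can burst
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Refill based on elapsed time, then try to spend `cost` tokens."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

The `cost` argument is one way to charge expensive endpoints more than cheap ones against the same bucket.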

Rate Limit Response Headers

HTTP/1.1 200 OK
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 743
X-RateLimit-Reset: 1738368000
X-RateLimit-Policy: 1000;w=3600

# When limit is exceeded:
HTTP/1.1 429 Too Many Requests
Retry-After: 47
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1738368000

Always return Retry-After on 429 responses. Without it, clients have to guess how long to back off, and many retry immediately, compounding the problem. Set different limits by resource type (reads vs writes, expensive vs cheap endpoints) rather than one global limit per API key.
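On the consumer side, a compliant client can derive its wait time from these headers. The sketch below is one possible approach; the function name and the one-second fallback are assumptions, it expects header names exactly as written in the examples above, and it handles only the delta-seconds form of Retry-After (the header may also carry an HTTP date).

```python
import time
from typing import Mapping, Optional


def seconds_to_wait(status: int, headers: Mapping[str, str],
                    now: Optional[float] = None) -> float:
    """Return how long to pause before retrying; 0.0 if not rate limited."""
    if status != 429:
        return 0.0
    now = time.time() if now is None else now
    retry_after = headers.get("Retry-After")
    if retry_after is not None and retry_after.isdigit():
        return float(retry_after)
    # Fall back to the reset timestamp (Unix seconds) if it is present.
    reset = headers.get("X-RateLimit-Reset")
    if reset is not None and reset.isdigit():
        return max(0.0, float(reset) - now)
    return 1.0  # conservative default when the server gives no hint
```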

Authentication: JWT vs API Keys vs OAuth 2.0

There is no universal winner. Each mechanism fits a different trust model and threat landscape.

API Keys

Best for: server-to-server integrations, public data APIs, webhook verification. Generate each key as a random 32-byte value encoded as hex or base58, and store only its hash (bcrypt or SHA-256) in your database, never the plaintext. Prefix the key with a service identifier to help developers identify which service issued it:

# Key format: {prefix}_{environment}_{random}
cdg_live_k8j2mPqR9nXv4wZtYhFsLcDaUeObNiGm
cdg_test_a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4

Authorization: Bearer cdg_live_k8j2mPqR9nXv4wZtYhFsLcDaUeObNiGm

Issue keys with scopes, expiry dates, and usage quotas. Log every API key usage with IP, timestamp, and endpoint. Never log the key itself.
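A minimal sketch of issuing and checking such a key, using SHA-256 for the stored hash. The function names are hypothetical, and the prefix and environment defaults simply reuse the format shown above.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def generate_api_key(prefix: str = "cdg", env: str = "live") -> Tuple[str, str]:
    """Return (plaintext_key, stored_hash). Show the plaintext to the user once."""
    key = f"{prefix}_{env}_{secrets.token_hex(32)}"  # 32 random bytes as hex
    return key, hashlib.sha256(key.encode()).hexdigest()


def verify_api_key(presented: str, stored_hash: str) -> bool:
    """Hash the presented key and compare in constant time."""
    candidate = hashlib.sha256(presented.encode()).hexdigest()
    return hmac.compare_digest(candidate, stored_hash)
```

A fast hash is a reasonable choice here only because the key is long and random; a human-chosen password would need a slow, salted hash instead.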

JWT (JSON Web Tokens)

Best for: short-lived user sessions, microservice-to-microservice calls where each service needs to verify identity without a central auth lookup. A JWT is self-contained — the payload carries claims that any service can verify with the public key.

# JWT Header.Payload.Signature
eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9
.
eyJzdWIiOiJ1c3JfMTIzNDU2IiwiaWF0IjoxNzM4MzY4MDAwLCJleHAiOjE3MzgzNzE2MDAsInNjb3BlcyI6WyJyZWFkOm9yZGVycyIsIndyaXRlOm9yZGVycyJdLCJvcmdfaWQiOiJvcmdfYWJjZGVmIn0
.
[RS256 signature]

# Decoded payload:
{
  "sub": "usr_123456",
  "iat": 1738368000,
  "exp": 1738371600,
  "scopes": ["read:orders", "write:orders"],
  "org_id": "org_abcdef"
}

Use short expiry times (15–60 minutes) and implement a refresh token rotation pattern. Always use RS256 or ES256 — never HS256 for production systems where multiple services verify tokens, because sharing the HMAC secret is a security anti-pattern.
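Signature verification belongs in a maintained JWT library and is deliberately not shown here. Once the signature has been verified, each service still has to check the claims itself; the sketch below assumes an already-verified payload shaped like the decoded example above, and the `validate_claims` name is hypothetical.

```python
def validate_claims(claims: dict, required_scope: str, now: float) -> None:
    """Raise PermissionError unless the token is unexpired and carries the scope.

    Assumes `claims` came from a token whose signature was already verified.
    """
    if "exp" not in claims or now >= claims["exp"]:
        raise PermissionError("token expired or missing exp claim")
    if required_scope not in claims.get("scopes", []):
        raise PermissionError(f"missing scope: {required_scope}")
```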

OAuth 2.0

Best for: third-party integrations where your users authorize external applications to act on their behalf. Use the authorization code flow with PKCE for all clients. Do not implement the implicit flow or the resource owner password credentials flow; current OAuth security guidance advises against both. Scopes should be granular and follow the principle of least privilege: read:profile, write:billing, not a blanket admin scope.

GraphQL Trade-offs

GraphQL solves real problems — over-fetching, under-fetching, and the need for multiple round trips to assemble a view — but it introduces its own complexity. Use it deliberately, not by default.

When GraphQL Wins

Multiple client types (mobile, web, TV app) that need different shapes of the same data
Rapid product iteration where frontend teams should not wait for backend endpoint changes
Complex, highly interconnected domain models (e.g., social graphs, e-commerce catalogues)

GraphQL Pitfalls to Plan For

N+1 queries: A naive resolver fetches the parent, then executes one query per child. Use DataLoader (batching + caching) to collapse N database calls into one.
Query complexity attacks: A deeply nested query can overwhelm your database. Implement query depth limiting and complexity scoring before going to production.
Caching: GraphQL's POST-based queries bypass HTTP caching. Use persisted queries (hash the query, cache by hash) for CDN caching of common operations.
Schema versioning: Deprecate fields gracefully with the @deprecated(reason: "...") directive rather than removing them. Expose schema changes through a changelog.
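Depth limiting, mentioned under query complexity attacks, reduces to a small recursive check. The sketch below assumes the query's selection set has already been converted to nested dicts (leaf fields map to an empty dict); a real server would walk the parsed AST provided by its GraphQL library instead, and both function names are hypothetical.

```python
def selection_depth(selection: dict) -> int:
    """Depth of a selection set given as nested dicts; {} means a leaf field."""
    if not selection:
        return 0
    return 1 + max(selection_depth(child) for child in selection.values())


def enforce_depth(selection: dict, max_depth: int) -> int:
    """Reject the query before execution if it nests deeper than allowed."""
    depth = selection_depth(selection)
    if depth > max_depth:
        raise ValueError(f"query depth {depth} exceeds limit {max_depth}")
    return depth
```

Running the check before any resolver executes means an abusive query costs a tree walk rather than a database round trip.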



OpenAPI / Swagger Documentation

An API without accurate documentation is an API that nobody can use correctly. OpenAPI 3.1 is the industry standard. Write your spec first (design-first, not code-first) and generate server stubs and client SDKs from it.

openapi: 3.1.0
info:
  title: CompleteDigi Platform API
  version: 1.0.0
  contact:
    name: CompleteDigi Engineering
    url: https://completedigi.com/support

paths:
  /v1/users/{userId}:
    get:
      summary: Retrieve a user by ID
      operationId: getUserById
      tags: [Users]
      parameters:
        - name: userId
          in: path
          required: true
          schema:
            type: string
            pattern: '^usr_[a-zA-Z0-9]{16}$'
      responses:
        '200':
          description: User retrieved successfully
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/User'
        '404':
          $ref: '#/components/responses/NotFound'

Host your docs at a stable URL (/docs or /api-docs) and update them atomically with each release, enforced by a CI gate that rejects any change that leaves the spec out of sync with the implementation. Provide a sandbox environment with realistic test data so developers can try calls without affecting production.

Idempotency for Financial APIs

Network failures are inevitable. When a client sends a payment request and the connection drops before receiving a response, it cannot know whether the server processed the request. Without idempotency, retrying creates duplicate charges. This is one of the most critical design decisions for any API handling money or side effects.

POST /api/v1/payments HTTP/1.1
Idempotency-Key: 550e8400-e29b-41d4-a716-446655440000
Content-Type: application/json

{
  "amount": 4999,
  "currency": "INR",
  "source": "card_xyz",
  "description": "Subscription renewal - Feb 2026"
}

The server stores the Idempotency-Key alongside the request hash and response. If the same key arrives again within a TTL (typically 24 hours), the server returns the cached response immediately without re-executing the operation. The stored request hash is what lets the server detect misuse: if a key is reused with a different request body, the hashes will not match and the request should be rejected rather than replayed. Keys must be unique per client and per operation, so never reuse a key for a different intent. Return the original response body and status code, including the original request_id, so clients can distinguish a cached replay from a fresh response.
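The replay logic can be sketched with an in-memory store. This is an illustration under stated assumptions rather than a production design: the class and method names are hypothetical, the dictionary stands in for a durable store, and a real implementation would need an atomic insert so two concurrent requests with the same key cannot both execute.

```python
import hashlib
import json
import time


class IdempotencyStore:
    def __init__(self, ttl_seconds: float = 24 * 3600, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (request_hash, stored_at, response)

    def execute(self, key: str, request_body: dict, handler):
        """Run handler once per key; replay the stored response on retries."""
        request_hash = hashlib.sha256(
            json.dumps(request_body, sort_keys=True).encode()
        ).hexdigest()
        entry = self._entries.get(key)
        if entry is not None and self.clock() - entry[1] < self.ttl:
            if entry[0] != request_hash:
                # Same key, different intent: refuse instead of replaying.
                raise ValueError("Idempotency-Key reused with a different request body")
            return entry[2]
        response = handler(request_body)
        self._entries[key] = (request_hash, self.clock(), response)
        return response
```

A retry with the same key and body returns the stored response, including its original request_id, without invoking the handler a second time.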

Webhook Design

Webhooks invert the request-response model: your API calls the consumer instead. This is far more efficient than polling for event-driven integrations, but requires careful design to be reliable.

Payload Structure

{
  "id": "evt_01HQ8MXYZ3N4P5Q6R7S8T9U0V",
  "type": "payment.succeeded",
  "created_at": "2026-02-01T10:23:45.678Z",
  "api_version": "2026-02-01",
  "data": {
    "object": {
      "id": "pay_abc123",
      "amount": 4999,
      "currency": "INR",
      "status": "succeeded"
    }
  }
}

Security and Reliability

Sign every payload with HMAC-SHA256 using a per-endpoint secret. Include the signature in an X-CompleteDigi-Signature header. Consumers verify the signature before processing, which prevents spoofed webhook calls.
Retry with exponential backoff on non-2xx responses: attempt at 0s, 30s, 2m, 10m, 30m, 2h, 8h, 24h. After 72 hours of failures, mark the endpoint as disabled and notify the owner.
Include the event ID in every payload and instruct consumers to use it for deduplication. Your retries and their processing must both be idempotent.
Deliver events in order within a stream using sequence numbers or event timestamps, but design consumers to handle out-of-order delivery gracefully.
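The signing and verification step from the first point above can be sketched with the standard library. The function names and the hex encoding of the signature are assumptions for this example; the important details are that both sides sign the raw request bytes and that the comparison runs in constant time.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Producer side: compute the value for the signature header."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, header_value: str) -> bool:
    """Consumer side: recompute over the raw bytes and compare in constant time."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected, header_value)
```

Consumers should verify against the bytes exactly as received, before any JSON parsing, because re-serializing the payload can change whitespace or key order and invalidate the signature.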
                

Hypermedia and HATEOAS

HATEOAS (Hypermedia As The Engine Of Application State) is the most-discussed and least-implemented REST constraint. When done correctly, it makes your API self-documenting and allows clients to navigate state transitions without hardcoding URLs. Even a lightweight version of HATEOAS dramatically improves developer experience.

{
  "id": "ord_789xyz",
  "status": "pending_payment",
  "total": 12499,
  "currency": "INR",
  "_links": {
    "self":    { "href": "/api/v1/orders/ord_789xyz",          "method": "GET" },
    "pay":     { "href": "/api/v1/orders/ord_789xyz/payments", "method": "POST" },
    "cancel":  { "href": "/api/v1/orders/ord_789xyz",          "method": "DELETE" },
    "invoice": { "href": "/api/v1/orders/ord_789xyz/invoice",  "method": "GET" }
  }
}

The _links block tells the client exactly which state transitions are valid right now. After the order is paid, pay disappears and shipment appears. Clients that respect _links never need to encode business-state logic, because the server drives the workflow. This is especially powerful for complex multi-step processes like checkout flows, KYC workflows, or approval chains.
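A client that respects _links can be sketched as two small helpers. The function names are hypothetical, and the resource shape follows the order example above.

```python
from typing import Optional, Set, Tuple


def available_actions(resource: dict) -> Set[str]:
    """Link relations the server currently offers, excluding `self`."""
    return set(resource.get("_links", {})) - {"self"}


def link_for(resource: dict, rel: str) -> Optional[Tuple[str, str]]:
    """Return (method, href) for a relation, or None if it is not offered."""
    link = resource.get("_links", {}).get(rel)
    if link is None:
        return None
    return link["method"], link["href"]
```

A UI built on these helpers shows a "Pay" button only while `link_for(order, "pay")` returns a value, so it never has to know which order statuses permit payment.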

API Gateway Patterns

An API gateway is the single entry point for all client traffic into your backend. It handles cross-cutting concerns (authentication, rate limiting, request routing, SSL termination, logging, and response transformation) so individual services do not have to.

Gateway Responsibilities

Authentication delegation: Validate tokens at the gateway and forward verified identity headers (X-User-Id, X-Org-Id, X-Scopes) to upstream services. Services trust the gateway; they never re-validate tokens.
Request aggregation (BFF pattern): A Backend for Frontend (BFF) gateway composes responses from multiple services into a single payload tailored to the client. Mobile gets a compact response; web gets a richer one.
Circuit breaking: If an upstream service starts returning errors or timing out, the gateway opens a circuit and returns cached responses or a graceful fallback instead of letting failures cascade.
Request/response transformation: Translate between API versions at the gateway layer, allowing upstream services to run the latest version while legacy clients see the old contract.
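The circuit-breaking behaviour described above can be sketched as a small state machine. The class name, thresholds, and injectable `clock` are assumptions for illustration; gateway products typically ship their own implementations with richer policies.

```python
import time


class CircuitBreaker:
    """Open after repeated failures, serve a fallback, retry after a cool-down."""

    def __init__(self, failure_threshold: int = 3, reset_after: float = 30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()  # open: do not touch the upstream at all
            # Cool-down elapsed: half-open, let one trial call through.
        try:
            result = fn()
        except Exception:
            self.failures += 1
            # Trip on reaching the threshold, or re-open after a failed trial.
            if self.failures >= self.failure_threshold or self.opened_at is not None:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0
        self.opened_at = None
        return result
```

While the circuit is open the upstream receives no traffic, which gives a struggling service room to recover instead of being hit with retries.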
                

                
Operational insight: Instrument your gateway to emit a request_id on every incoming request and propagate it as an X-Request-Id header to all upstream calls. When this ID flows through your distributed tracing system (Jaeger, Tempo, or AWS X-Ray), you can reconstruct the full call graph of any request across 20 microservices from a single ID in your support ticket.
                

Key Takeaways

Design-first, code-second: Write your OpenAPI spec before writing any implementation code. Review it with consumers and iterate on it cheaply before any code exists to change.
Version from day one: Add /v1/ to every path from the first commit. Retrofitting versioning into an unversioned API is painful and risky.
Standardize errors early: Adopt RFC 7807 problem details across your entire API surface. Inconsistent error formats are impossible to fix once consumers have built on them.
Cursor pagination for anything that scales: Offset pagination is a convenience feature, not a scaling strategy. Switch to cursor-based pagination before you hit performance problems, not after.
Idempotency for all write operations: Require an Idempotency-Key header on every POST and PUT that produces a side effect. This makes retries safe and your API far more reliable over unreliable networks.
Rate limit headers are part of your contract: Tell your consumers exactly what the limits are and when they reset. Do not make them guess through trial and error.
Webhooks need signatures and retries: Unsigned webhooks are a security liability. No retry logic means events are silently lost on any transient failure.
                

                