Every privacy regulation enacted in the last decade — GDPR, CCPA, DPDP Act — grants individuals the right to ask an organization: what data do you hold on me, and what are you doing with it? The right to access, correct, delete, port, and restrict processing. These are Data Subject Requests.

Most organizations treat DSR handling as a manual process. An email arrives, someone in legal opens a spreadsheet, a few engineers are asked to query a few databases, and a response is assembled over several weeks. This works at fifty requests a month. It collapses at five hundred. It is a liability at five thousand.

This series covers how to build a DSR system that operates at scale — automated intake, identity verification, cross-system orchestration, deletion architecture, and audit. This first part addresses the foundation: the system architecture and the request lifecycle from intake to closure.

The request lifecycle

A DSR moves through seven stages. Every stage must be explicitly modeled in the system, because every stage has a different owner, a different SLA, and a different failure mode.

Intake. The request enters the system. This could be a web form, an email to a designated privacy address, an API call from a consent management platform, or a request forwarded from customer support. Regardless of channel, the system must normalize the request into a canonical format immediately. A DSR is a DSR whether it arrives as a polite email or a legal demand letter. The intake layer strips the channel, extracts the core fields — requestor identity, request type, scope, jurisdiction — and creates a request record with a unique identifier and a timestamp. The SLA clock starts here.
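A minimal sketch of the canonical intake record, with illustrative field names (nothing here is a prescribed schema):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from uuid import uuid4

@dataclass
class DSRRequest:
    """Canonical request record, channel-independent."""
    requestor_identity: str   # as claimed at intake; verified in the next stage
    request_type: str         # access, deletion, correction, ...
    scope: str                # scope hint from the requestor
    jurisdiction: str         # e.g. "GDPR", "CCPA", "DPDP"
    channel: str              # web_form, email, api, support
    request_id: str = field(default_factory=lambda: uuid4().hex)
    received_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))  # SLA clock starts here

def normalize(channel: str, raw: dict) -> DSRRequest:
    """Strip the channel and extract the core fields into the canonical record."""
    return DSRRequest(
        requestor_identity=raw["email"],
        request_type=raw["type"],
        scope=raw.get("scope", "all"),
        jurisdiction=raw["jurisdiction"],
        channel=channel,
    )

req = normalize("email", {"email": "a@example.com", "type": "deletion",
                          "jurisdiction": "GDPR"})
```

Whatever the channel, the output is the same record type, so every downstream stage sees one shape.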

Identity verification. Before any data is accessed, the requestor must be verified. This is not optional and it is not trivial. A deletion request from someone who is not the data subject is a data breach, not compliance. Verification methods vary by jurisdiction and risk profile. For authenticated users — those who submit the request while logged in — session-based verification may suffice. For unauthenticated requests, the system must challenge: government ID upload, email verification to the address on file, knowledge-based authentication, or a combination. The verification step must be modeled as a gate. The request does not advance until the gate passes. If verification fails or times out, the request enters a rejection flow with its own audit trail.
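The method-selection policy might look like the following sketch. The thresholds are illustrative assumptions, not legal guidance; a real policy would be set with counsel:

```python
def verification_methods(authenticated: bool, request_type: str) -> list[str]:
    """Illustrative policy: session suffices for logged-in requestors;
    unauthenticated, high-risk requests (deletion) get stronger challenges."""
    if authenticated:
        return ["session"]
    methods = ["email_verification"]     # to the address on file
    if request_type == "deletion":
        methods.append("government_id")  # escalate for destructive requests
    return methods
```

The point of encoding this as a function rather than ad-hoc judgment is that the chosen methods become part of the verification record and the audit trail.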

Classification. Not all DSRs are equal. The system must classify the request into a type: access (provide a copy of my data), deletion (erase my data), correction (fix inaccurate data), portability (export my data in a machine-readable format), restriction (stop processing my data for a specific purpose), or opt-out (stop selling or sharing my data). Classification determines the downstream workflow. An access request triggers a data collection pipeline. A deletion request triggers a deletion orchestration pipeline. A correction request triggers a targeted update flow. These are fundamentally different operations with different system dependencies, different risk profiles, and different response formats. Misclassification means the wrong pipeline executes, and the response is non-compliant.
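The type-to-pipeline mapping can be made explicit in code, which keeps misclassification from silently executing the wrong workflow. Pipeline names here are illustrative:

```python
from enum import Enum

class RequestType(Enum):
    ACCESS = "access"
    DELETION = "deletion"
    CORRECTION = "correction"
    PORTABILITY = "portability"
    RESTRICTION = "restriction"
    OPT_OUT = "opt_out"

# Hypothetical pipeline names; the mapping, not the names, is the point.
PIPELINES = {
    RequestType.ACCESS: "data_collection",
    RequestType.PORTABILITY: "data_collection",  # same collection, different export format
    RequestType.DELETION: "deletion_orchestration",
    RequestType.CORRECTION: "targeted_update",
    RequestType.RESTRICTION: "processing_flag_update",
    RequestType.OPT_OUT: "processing_flag_update",
}

def dispatch(request_type: RequestType) -> str:
    """An unknown type raises rather than falling through to a default pipeline."""
    return PIPELINES[request_type]
```

An enum plus an exhaustive mapping means adding a new request type forces a decision about its pipeline at build time, not at the first misrouted request.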

Scoping. Once classified, the request must be scoped. Scope means: which systems hold data for this individual, and which categories of data are in play? Scoping depends entirely on the quality of the organization's data inventory. If the inventory is comprehensive and current, scoping is a lookup. If the inventory is incomplete — which is the common case — scoping becomes the bottleneck. The scoping step must also evaluate jurisdictional rules. A GDPR request from an EU resident has different scope boundaries than a CCPA request from a California consumer. A DPDP Act request from an Indian data principal has different retention exemptions than either. The scoping layer encodes these rules.
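In code, scoping reduces to an inventory lookup plus a jurisdiction rules table. The exemption categories below are placeholder examples, not a statement of what any law actually exempts:

```python
# Illustrative rules table; real retention exemptions require legal review.
JURISDICTION_RULES = {
    "GDPR": {"retention_exempt": {"tax_records", "litigation_hold"}},
    "CCPA": {"retention_exempt": {"tax_records", "litigation_hold",
                                  "fraud_prevention"}},
    "DPDP": {"retention_exempt": {"tax_records", "litigation_hold"}},
}

def scope_request(inventory: dict, subject_id: str, jurisdiction: str):
    """Look up which systems hold data for this subject, plus the rules
    that bound the request. Assumes a comprehensive, current inventory."""
    systems = inventory.get(subject_id, [])
    return systems, JURISDICTION_RULES[jurisdiction]

inventory = {"user-42": ["crm", "billing", "analytics"]}
systems, rules = scope_request(inventory, "user-42", "GDPR")
```

When the inventory is incomplete, no amount of code here helps; the lookup is only as good as the inventory behind it, which is why Part 2 is about discovery.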

Execution. The classified, scoped request is dispatched to the appropriate systems for fulfillment. This is the most technically complex stage and it is covered in detail in Part 2 (data discovery and orchestration) and Part 3 (deletion architecture) of this series. The key architectural decision at this stage is whether execution is synchronous or asynchronous. For organizations with fewer than five data systems, synchronous fan-out — call each system, wait for all responses, assemble the result — may work. For organizations with dozens or hundreds of systems, asynchronous orchestration with a message queue, status tracking per system, partial failure handling, and retry logic is the only viable approach.
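A minimal sketch of the asynchronous fan-out, using `queue.Queue` as a stand-in for a real message broker; all names are illustrative:

```python
import queue
from uuid import uuid4

task_queue = queue.Queue()  # stand-in for a broker such as a queue service

def dispatch_tasks(request_id: str, systems: list[str]) -> dict:
    """Fan a request out as one task per system, tracking status per task.
    The orchestrator does not wait; downstream systems report back."""
    tasks = {}
    for system in systems:
        task_id = uuid4().hex
        tasks[task_id] = {"system": system, "status": "dispatched"}
        task_queue.put({"task_id": task_id,
                        "request_id": request_id,
                        "system": system})
    return tasks

tasks = dispatch_tasks("req-1", ["crm", "billing", "analytics"])
```

Each queued message carries the task and request identifiers, so a completion report from any system can be correlated back without shared state.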

Review. Before the response is delivered to the requestor, it must be reviewed. Automated review checks for completeness: did all systems respond, are there gaps, did any system return an error that requires manual intervention? In regulated environments or for high-sensitivity requests, a human review step may be required — legal review for access requests that touch litigation-hold data, compliance review for deletion requests that conflict with regulatory retention obligations. The review step is where exemptions are applied. Not all data can be deleted. Financial transaction records may be subject to seven-year retention under tax law. Data under active litigation hold cannot be erased. AML records have their own retention mandates. The review layer must surface these conflicts and document the decision.

Response and closure. The final stage delivers the response to the requestor and closes the record. For access requests, the response is a data package — typically a structured export in JSON, CSV, or PDF, delivered through a secure download link with an expiration window. For deletion requests, the response is a confirmation of what was deleted, what was retained and why, and the legal basis for any retained data. For correction requests, the response confirms what was changed. Every response must be logged. The closure record includes the original request, the verification outcome, the classification, the scope, the execution results per system, any exemptions applied, and the final response delivered. This record is the audit trail. When a regulator asks "show me how you handled this request," this record is the answer.

The data model

The request lifecycle implies a data model. Getting this right at the start prevents painful migrations later.

The core entity is the dsr_request. It holds the request identifier, the requestor's verified identity reference, the request type, the jurisdiction, the intake channel, the intake timestamp, the current status, and the SLA deadline. Status is an enum that maps directly to the lifecycle stages: intake, verification_pending, verification_failed, classified, scoped, executing, review_pending, review_complete, responded, closed.
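The status enum and its legal transitions can be sketched as a small state machine. The transition table below is an illustrative reading of the lifecycle, not a normative workflow:

```python
from enum import Enum

class DSRStatus(Enum):
    INTAKE = "intake"
    VERIFICATION_PENDING = "verification_pending"
    VERIFICATION_FAILED = "verification_failed"
    CLASSIFIED = "classified"
    SCOPED = "scoped"
    EXECUTING = "executing"
    REVIEW_PENDING = "review_pending"
    REVIEW_COMPLETE = "review_complete"
    RESPONDED = "responded"
    CLOSED = "closed"

# Legal transitions; anything else is a bug, not a state change.
TRANSITIONS = {
    DSRStatus.INTAKE: {DSRStatus.VERIFICATION_PENDING},
    DSRStatus.VERIFICATION_PENDING: {DSRStatus.CLASSIFIED,
                                     DSRStatus.VERIFICATION_FAILED},
    DSRStatus.VERIFICATION_FAILED: {DSRStatus.CLOSED},  # rejection flow
    DSRStatus.CLASSIFIED: {DSRStatus.SCOPED},
    DSRStatus.SCOPED: {DSRStatus.EXECUTING},
    DSRStatus.EXECUTING: {DSRStatus.REVIEW_PENDING},
    DSRStatus.REVIEW_PENDING: {DSRStatus.REVIEW_COMPLETE},
    DSRStatus.REVIEW_COMPLETE: {DSRStatus.RESPONDED},
    DSRStatus.RESPONDED: {DSRStatus.CLOSED},
}

def advance(current: DSRStatus, target: DSRStatus) -> DSRStatus:
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target
```

Enforcing transitions in one place means a request can never silently skip verification or review, whichever service requests the change.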

Each request has one or more dsr_task records. A task represents a unit of work dispatched to a specific system. If a deletion request is scoped to eight systems, there are eight tasks. Each task has its own status, its own timestamps, and its own result payload. This per-task granularity is essential. When one system out of eight fails, the request is not failed — seven tasks succeeded and one needs retry or manual intervention. Without per-task tracking, the entire request is opaque.
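The per-task aggregation logic is simple once the granularity exists. A sketch, with illustrative status names:

```python
def request_outcome(tasks: dict) -> str:
    """Aggregate per-task status; one failed task does not fail the request."""
    statuses = [t["status"] for t in tasks.values()]
    if all(s == "succeeded" for s in statuses):
        return "complete"
    if any(s == "failed" for s in statuses):
        return "needs_retry_or_manual_intervention"
    return "in_progress"

tasks = {
    "t1": {"system": "crm", "status": "succeeded"},
    "t2": {"system": "billing", "status": "failed"},
}
```

Without this per-task view, the only honest answer to "what is the state of this request" would be "unknown".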

Each request also has a dsr_verification record — the method used, the outcome, the timestamp, and any supporting evidence reference. This is auditable separately from the request itself.

Each request has a dsr_response record — the response type, the delivery channel, the delivery timestamp, and a reference to the response artifact (the data package for access requests, the confirmation document for deletions).

And each request has a dsr_audit_log — an append-only log of every state transition, every decision, every human intervention, every exemption applied. This log is immutable. It is the compliance record.
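One common way to make an append-only log tamper-evident, shown here as a sketch (an in-memory list standing in for an INSERT-only table or WORM storage), is to chain each entry to the hash of the previous one:

```python
import hashlib
import json

audit_log = []  # stand-in for an INSERT-only table or WORM store

def log_event(request_id: str, event: str, actor: str, detail: str = "") -> None:
    """Append an entry chained to the previous entry's hash, so any
    later modification of earlier entries is detectable."""
    prev = audit_log[-1]["hash"] if audit_log else ""
    entry = {"request_id": request_id, "event": event,
             "actor": actor, "detail": detail, "prev_hash": prev}
    entry["hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()).hexdigest()
    audit_log.append(entry)

log_event("req-1", "status_change", "system", "intake -> verification_pending")
log_event("req-1", "exemption_applied", "legal_reviewer", "tax retention")
```

Hash chaining does not prevent tampering on its own; it makes tampering visible, which is usually what a compliance record needs.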

SLA management

Every jurisdiction defines a response deadline. GDPR mandates a response within one month, extendable by a further two months for complex requests with notification to the requestor. CCPA allows 45 days, extendable by an additional 45. The DPDP Rules do not yet specify DSR timelines explicitly, but the Act requires response without unreasonable delay.

The SLA clock starts at intake, not at verification, not at classification. This is a common implementation error. Organizations that start the clock at classification have already consumed days on verification and intake processing, leaving less time for the actual work.
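Computed from the intake timestamp, the deadline logic is a few lines. The day counts below are calendar-day approximations (GDPR's one month is rendered as 30 days) and are illustrative:

```python
from datetime import datetime, timedelta, timezone

SLA_DAYS = {"GDPR": 30, "CCPA": 45}        # approximations, not legal advice
EXTENSION_DAYS = {"GDPR": 60, "CCPA": 45}  # two further months / additional 45 days

def sla_deadline(received_at: datetime, jurisdiction: str,
                 extended: bool = False) -> datetime:
    """Deadline anchored to intake time, never to verification or classification."""
    days = SLA_DAYS[jurisdiction]
    if extended:
        days += EXTENSION_DAYS[jurisdiction]
    return received_at + timedelta(days=days)
```

Because the function takes only `received_at`, there is no way for a later stage to restart the clock by accident.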

The system must track SLA at the request level and surface approaching deadlines proactively. A daily SLA report showing requests at 50%, 75%, and 90% of their deadline window is the minimum. Automated escalation — alerting the privacy team lead when a request crosses the 75% threshold — prevents regulatory breaches.

For organizations processing hundreds of requests monthly, SLA management is not a dashboard. It is a scheduling system. Requests must be prioritized by deadline proximity, not by arrival order. A request received twenty days ago with ten days remaining is more urgent than a request received yesterday with twenty-nine days remaining.
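Deadline-first ordering is a priority queue keyed on the SLA deadline. A minimal sketch with hypothetical request records:

```python
import heapq
from datetime import datetime, timezone

def prioritize(requests: list[dict]) -> list[str]:
    """Return request IDs ordered by SLA deadline, not arrival order."""
    heap = [(r["deadline"], r["request_id"]) for r in requests]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]

reqs = [
    # arrived yesterday, twenty-nine days remaining
    {"request_id": "new", "deadline": datetime(2024, 3, 30, tzinfo=timezone.utc)},
    # arrived twenty days ago, ten days remaining
    {"request_id": "old", "deadline": datetime(2024, 3, 11, tzinfo=timezone.utc)},
]
```

The older request with the nearer deadline sorts first regardless of when it arrived.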

Architecture patterns

Two patterns dominate DSR system architecture.

The centralized orchestrator pattern places all lifecycle logic in a single service. The orchestrator receives the request, manages verification, classifies, scopes, dispatches tasks to downstream systems via API calls or message queues, collects responses, runs the review logic, and generates the response. This pattern is simpler to build, simpler to debug, and provides a single point of observability. It works well for organizations with fewer than twenty data systems. Its weakness is that it creates a single point of failure and can become a bottleneck at high volumes.

The event-driven choreography pattern distributes the lifecycle across services that communicate through events. The intake service publishes a request.created event. The verification service consumes it, performs verification, and publishes request.verified. The classification service consumes that, and so on. Each service owns its stage. This pattern scales better, tolerates partial failures more gracefully, and allows individual stages to evolve independently. Its weakness is operational complexity — tracing a request across six services and a message broker requires distributed tracing, correlation IDs, and a centralized status view that aggregates state from all services.

In practice, most production DSR systems use a hybrid. The orchestrator manages the lifecycle state machine and dispatches tasks, but the tasks themselves execute asynchronously. The orchestrator does not wait for each system to complete. It publishes tasks to a queue, and each downstream system reports completion back. The orchestrator aggregates status and advances the lifecycle when all tasks for a stage are complete.
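The completion-report side of the hybrid can be sketched as a callback: each downstream system reports back, and the orchestrator advances the lifecycle only when every task is terminal. Status names are illustrative:

```python
def on_task_completed(request: dict, task_id: str, result: str) -> None:
    """Record a downstream system's completion report. The lifecycle
    advances only when all tasks for the stage are terminal."""
    request["tasks"][task_id]["status"] = result
    if all(t["status"] in ("succeeded", "failed")
           for t in request["tasks"].values()):
        request["status"] = "review_pending"

request = {
    "status": "executing",
    "tasks": {"t1": {"status": "dispatched"}, "t2": {"status": "dispatched"}},
}
on_task_completed(request, "t1", "succeeded")  # one of two done: still executing
on_task_completed(request, "t2", "succeeded")  # all terminal: advance to review
```

A failed task also counts as terminal here; the review stage, not the orchestrator, decides whether it needs retry or manual intervention.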

What this does not cover

This part has addressed the lifecycle, the data model, SLA management, and the architectural skeleton. It has not addressed the two hardest problems in a DSR system: finding the data and deleting it. Data discovery and cross-system orchestration — how you locate a person's data across an enterprise that was never designed to answer that question — is covered in Part 2. Deletion architecture — the engineering of actual data erasure across live systems, backups, caches, logs, and downstream consumers — is covered in Part 3. Audit, compliance reporting, and scaling are covered in Part 4.

The lifecycle is the skeleton. The next three parts are the muscle, the nervous system, and the skin.