You know the drill. You scan a QR code at a chai tapri, enter your PIN, and within two seconds, a little speaker box announces, "Received twenty rupees on PhonePe." It feels like magic. But behind those two seconds is a beautifully orchestrated dance of distributed systems.
If you are a fresher getting into backend engineering, understanding how a payment system works is a rite of passage. Today, we are going to look at the system design of a basic UPI payment orchestrator. We'll keep it theoretical, neat, and clean.
Let's dive in.
The Core Problem
Moving money digitally isn't just a matter of running A = A - 20 and B = B + 20 inside one database. The two balances live in two completely different databases (the Payer's Bank and the Payee's Bank), so no single database transaction can update both.
Because these are separate systems communicating over the internet, things will fail. Networks drop, banks go down for maintenance, and timeouts happen. The core challenge of payment system design is ensuring Atomicity: either the money successfully moves from A to B, or it stays with A. Money should never just "disappear" in the middle.
To solve this, we need a middleman. In the real world, this role is played by NPCI (National Payments Corporation of India), which operates UPI. In our system design, we call this the Payment Orchestrator.
The Architecture: Meet the Actors
Our system consists of a few key components:
- The Validator: The bouncer at the club. It checks if the payment request is valid before doing any heavy lifting.
- The Idempotency Store: The memory bank. It remembers every transaction we've started so we never accidentally charge a user twice for the same click.
- The Bank Clients: The adapters that talk to the actual banking systems (HDFC, SBI, ICICI, etc.) to perform Debits and Credits.
- The Orchestrator: The brain. It coordinates the entire flow between the payer's bank and the payee's bank.
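To make the actors a little more concrete, here is a minimal sketch of the Bank Client boundary in Python. Everything in it is an assumption made for illustration: the names BankResult, BankClient, InMemoryBank and the method signatures are hypothetical, not any real bank or NPCI API.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional, Protocol


@dataclass(frozen=True)
class BankResult:
    """Outcome of one call to a bank (hypothetical shape)."""
    ok: bool
    rrn: Optional[str] = None    # reference number returned on success
    error: Optional[str] = None  # failure reason, if any


class BankClient(Protocol):
    """What the Orchestrator expects from any bank adapter."""

    def debit(self, vpa: str, amount: Decimal, txn_id: str) -> BankResult: ...
    def credit(self, vpa: str, amount: Decimal, txn_id: str) -> BankResult: ...


class InMemoryBank:
    """Toy adapter for illustration only; balances live in a dict."""

    def __init__(self, balances: dict[str, Decimal]) -> None:
        self.balances = dict(balances)

    def debit(self, vpa: str, amount: Decimal, txn_id: str) -> BankResult:
        if self.balances.get(vpa, Decimal(0)) < amount:
            return BankResult(ok=False, error="insufficient funds")
        self.balances[vpa] -= amount
        return BankResult(ok=True, rrn=f"RRN-{txn_id}")

    def credit(self, vpa: str, amount: Decimal, txn_id: str) -> BankResult:
        self.balances[vpa] = self.balances.get(vpa, Decimal(0)) + amount
        return BankResult(ok=True, rrn=f"RRN-{txn_id}")
```

Because the Orchestrator would only depend on the BankClient shape, swapping the toy adapter for a real one should not change the orchestration logic.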
The Happy Path: When Everything Goes Right
Let's walk through the exact sequence of events when you send ₹500 to your friend.
Step 1: Validation
Before we touch any money, the Validator checks the basics:
- Is the amount greater than zero? (You can't send ₹0).
- Are the sender and receiver VPAs (UPI IDs) valid?
- Are the sender and receiver different people? (You can't pay yourself).
- Is the timestamp recent? (Rejecting requests that are suspiciously old or from the future).
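The four checks above could be sketched roughly like this. The PaymentRequest fields, the simplified VPA pattern and the five-minute freshness window are all assumptions chosen for the example; real UPI validation rules are stricter.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from decimal import Decimal

# Simplified "name@handle" shape (an assumption, not the official rule).
VPA_PATTERN = re.compile(r"[A-Za-z0-9._-]+@[A-Za-z0-9]+")
MAX_CLOCK_SKEW = timedelta(minutes=5)  # assumed freshness window


@dataclass(frozen=True)
class PaymentRequest:
    txn_id: str
    payer_vpa: str
    payee_vpa: str
    amount: Decimal
    created_at: datetime  # timezone-aware


def validate(req: PaymentRequest, now: datetime) -> list[str]:
    """Return a list of problems; an empty list means the request is valid."""
    errors = []
    if req.amount <= 0:
        errors.append("amount must be greater than zero")
    for label, vpa in (("payer", req.payer_vpa), ("payee", req.payee_vpa)):
        if not VPA_PATTERN.fullmatch(vpa):
            errors.append(f"{label} VPA is malformed")
    if req.payer_vpa == req.payee_vpa:
        errors.append("payer and payee must be different")
    # abs() rejects both stale requests and requests from the future.
    if abs(now - req.created_at) > MAX_CLOCK_SKEW:
        errors.append("timestamp is too old or in the future")
    return errors
```

Returning a list of problems instead of stopping at the first one is a design choice for the sketch: it lets the caller report everything that is wrong in a single response.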
Step 2: Idempotency Check
Network glitches happen. Sometimes your phone sends the "Pay" request twice because of a laggy 4G connection. The Idempotency Store checks the unique Transaction ID. If it has already seen that ID, whether the payment is still processing or has already finished, it rejects the duplicate instead of moving money again. This guarantees that one click = one payment.
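A minimal sketch of that check, assuming an in-memory dictionary guarded by a lock. The class and method names are hypothetical, and a real system would need a durable store shared by every orchestrator instance.

```python
import threading
from typing import Optional


class IdempotencyStore:
    """In-memory stand-in for a durable, shared store (illustration only)."""

    def __init__(self) -> None:
        self._status: dict[str, str] = {}
        self._lock = threading.Lock()

    def try_begin(self, txn_id: str) -> bool:
        """Claim txn_id. True for the first caller, False for any duplicate."""
        with self._lock:  # check-and-insert must be one atomic step
            if txn_id in self._status:
                return False
            self._status[txn_id] = "PROCESSING"
            return True

    def finish(self, txn_id: str, status: str) -> None:
        """Record the final status so later duplicates can still be recognised."""
        with self._lock:
            self._status[txn_id] = status

    def status_of(self, txn_id: str) -> Optional[str]:
        with self._lock:
            return self._status.get(txn_id)
```

The important detail is that the "have I seen this ID?" check and the "remember this ID" write happen under the same lock. If they were two separate steps, two copies of the same request could both pass the check.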
Step 3: The Debit (Taking the money)
The Orchestrator tells the Payer's Bank Client: "Hey, deduct ₹500 from Alice's account." The bank checks Alice's balance, deducts the money, and returns a success signal along with a Retrieval Reference Number (RRN).
Step 4: The Credit (Giving the money)
Now that we have the money, the Orchestrator tells the Payee's Bank Client: "Hey, add ₹500 to Bob's account." The bank adds the money and returns a success signal.
The Orchestrator marks the transaction as SETTLED. The speaker box announces the payment. Everyone is happy.
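Steps 3 and 4 can be sketched as a single function. This is a happy-path illustration only: ToyBank, settle and the reference format are made-up names, and the sketch deliberately raises instead of compensating if the credit fails.

```python
from decimal import Decimal


class ToyBank:
    """Hypothetical in-memory bank adapter, used only for this illustration."""

    def __init__(self, balances):
        self.balances = dict(balances)

    def debit(self, vpa, amount, txn_id):
        # Returns (ok, reference). Fails if the balance is too low.
        if self.balances.get(vpa, Decimal(0)) < amount:
            return False, None
        self.balances[vpa] -= amount
        return True, f"RRN-{txn_id}"

    def credit(self, vpa, amount, txn_id):
        self.balances[vpa] = self.balances.get(vpa, Decimal(0)) + amount
        return True, f"RRN-{txn_id}"


def settle(payer_bank, payee_bank, payer, payee, amount, txn_id):
    """Debit the payer first, then credit the payee."""
    debited, rrn = payer_bank.debit(payer, amount, txn_id)
    if not debited:
        return {"status": "FAILED", "rrn": None}
    credited, _ = payee_bank.credit(payee, amount, txn_id)
    if not credited:
        # Reversing the debit is out of scope for this happy-path sketch.
        raise RuntimeError("credit failed after debit; reversal required")
    return {"status": "SETTLED", "rrn": rrn}
```

For example, with Alice holding ₹1000 at one ToyBank and Bob holding ₹0 at another, calling settle for ₹500 should leave both with ₹500 and return a SETTLED status.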
The Unhappy Path: Dealing with Failures
The happy path is easy. A good backend engineer designs for the unhappy path. What happens when things break?
Scenario A: Debit Fails
The Orchestrator tries to debit Alice, but she has insufficient funds, or her bank is down.
The Fix: This is an easy one. The Orchestrator immediately marks the transaction as FAILED. No money moved, so no harm done.
Scenario B: Debit Succeeds, but Credit Fails
This is the nightmare scenario. Alice's account was debited ₹500, but when the Orchestrator tries to credit Bob, Bob's bank is offline and returns an error. Now Alice is minus ₹500, and Bob hasn't received it.
The Fix: The Orchestrator must immediately trigger a Reverse Debit (a refund). It tells Alice's bank: "The credit failed, put the ₹500 back into Alice's account." Once the reversal is successful, the transaction is marked as FAILED. This is why you sometimes see "Payment failed, money will be refunded in 3-5 days" (though modern systems often reverse it instantly).
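Scenarios A and B together might look something like the sketch below. The ToyBank class, the credit_works switch and the REVERSAL_PENDING status are assumptions added for the example. The post itself only names FAILED, but a reversal is also a network call and can fail, so the sketch leaves a marker for that case.

```python
from decimal import Decimal


class ToyBank:
    """Hypothetical in-memory bank; credit_works=False simulates an outage."""

    def __init__(self, balances, credit_works=True):
        self.balances = dict(balances)
        self.credit_works = credit_works

    def debit(self, vpa, amount):
        if self.balances.get(vpa, Decimal(0)) < amount:
            return False
        self.balances[vpa] -= amount
        return True

    def credit(self, vpa, amount):
        if not self.credit_works:
            return False
        self.balances[vpa] = self.balances.get(vpa, Decimal(0)) + amount
        return True

    def reverse_debit(self, vpa, amount):
        # Modelled as simply putting the debited amount back.
        self.balances[vpa] = self.balances.get(vpa, Decimal(0)) + amount
        return True


def pay(payer_bank, payee_bank, payer, payee, amount):
    if not payer_bank.debit(payer, amount):
        return "FAILED"  # Scenario A: no money moved, nothing to undo
    if payee_bank.credit(payee, amount):
        return "SETTLED"
    # Scenario B: the debit succeeded but the credit did not, so compensate.
    if payer_bank.reverse_debit(payer, amount):
        return "FAILED"
    return "REVERSAL_PENDING"  # assumed extra state for a failed reversal
```

This "undo the previous step" approach is a compensating action: instead of one atomic transaction across two banks, each step has a matching step that cancels it.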
Scenario C: The Timeout
Banks are notoriously slow. What if the Orchestrator sends the Credit request to Bob's bank, and the bank just... doesn't respond? We can't wait forever.
The Fix: The Orchestrator enforces a Hard Timeout (e.g., 30 seconds). If the entire debit-and-credit process takes longer than 30 seconds, the Orchestrator cuts the cord and marks the transaction as TIMED_OUT. Background workers (reconciliation jobs) will later figure out what exactly happened and process refunds if necessary.
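One possible way to enforce such a deadline in Python is to run the flow in a worker thread and stop waiting after a fixed time. This is a sketch under assumptions: run_with_deadline and the two example flows are hypothetical names, and the delays are shortened from 30 seconds to fractions of a second so the example runs quickly.

```python
import concurrent.futures
import time


def run_with_deadline(flow, timeout_seconds):
    """Run flow() in a worker thread; stop waiting after timeout_seconds.

    The worker is not killed. The bank call may still complete later, which
    is why a TIMED_OUT transaction has to be reconciled afterwards.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(flow)
    try:
        return future.result(timeout=timeout_seconds)
    except concurrent.futures.TimeoutError:
        return "TIMED_OUT"
    finally:
        executor.shutdown(wait=False)  # do not block on the stuck call


def slow_bank_flow():
    time.sleep(0.2)  # stands in for a bank that responds too late
    return "SETTLED"


def fast_bank_flow():
    return "SETTLED"
```

Note what the sketch does not know when it returns TIMED_OUT: whether the bank eventually processed the request. That uncertainty is exactly what the reconciliation jobs exist to resolve.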
Why This Design is Beautiful
If you look closely at this architecture, you'll notice a few brilliant engineering principles:
- Separation of Concerns: The Orchestrator doesn't know how to talk to HDFC or SBI. It just says debit() and credit(). The BankClient handles the messy banking APIs.
- Fail-Safe by Default: The system is designed to reverse actions if the next step fails. It guarantees that money isn't created or destroyed, only moved safely.
- State Machines: A transaction isn't just "Done" or "Not Done". It moves through clear states: PROCESSING -> SETTLED (or FAILED / TIMED_OUT).
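The state machine idea can be written down as an explicit table of allowed moves. In this sketch the four state names come from the flow described above, while the transition function and the rule that TIMED_OUT may later resolve to SETTLED or FAILED (modelling reconciliation) are assumptions.

```python
from enum import Enum


class TxnState(Enum):
    PROCESSING = "PROCESSING"
    SETTLED = "SETTLED"
    FAILED = "FAILED"
    TIMED_OUT = "TIMED_OUT"


# Allowed moves. Letting TIMED_OUT resolve later is an assumption that
# models a reconciliation job discovering what actually happened.
ALLOWED = {
    TxnState.PROCESSING: {TxnState.SETTLED, TxnState.FAILED, TxnState.TIMED_OUT},
    TxnState.TIMED_OUT: {TxnState.SETTLED, TxnState.FAILED},
    TxnState.SETTLED: set(),  # terminal
    TxnState.FAILED: set(),   # terminal
}


def transition(current: TxnState, target: TxnState) -> TxnState:
    """Return target if the move is legal, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target
```

Making illegal moves raise an error means a bug elsewhere cannot quietly flip a SETTLED payment back to FAILED.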
Wrapping Up
Building a payment system isn't about writing complex algorithms; it's about handling state and failures predictably. You have to assume that every network call will eventually fail, and design a safety net (like reverse debits and idempotency) to catch it.
Next time you scan a QR code and hear that little confirmation beep, take a second to appreciate the silent, split-second orchestration happening in the cloud to keep your money safe.
If you found this helpful, feel free to share it with other folks getting into backend engineering!
