Ever wondered what actually happens when you hit Pay in your banking app and your friend receives the money almost instantly? To the user, it feels like magic. But as engineers, we know there is no magic, only careful state management, reliable infrastructure, and good system design.
If you're a fresher or junior engineer trying to understand how real-world fintech systems are built, this is one of the best problems to study. Let's look under the hood of a modern bank payment pipeline.
The Problem: It's Not Just `balance = balance - amount`
When you first learn to code, a bank transfer looks deceptively simple:
UPDATE accounts SET balance = balance - 500 WHERE id = sender_id;
UPDATE accounts SET balance = balance + 500 WHERE id = receiver_id;
But what happens if the database crashes exactly between step 1 and step 2? The sender loses money, the receiver gets nothing, and your system has just created a financial disaster.
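To make the failure concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, the account ids, and the crash_midway flag are assumptions made up for this illustration. It shows that wrapping both updates in one transaction keeps a single database from ever storing the half-finished state:

```python
import sqlite3

def make_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)",
        [("sender", 1000), ("receiver", 0)],
    )
    conn.commit()
    return conn

def transfer(conn, sender_id, receiver_id, amount, crash_midway=False):
    """Apply both updates atomically; an exception rolls both back."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, sender_id),
            )
            if crash_midway:
                # Stand-in for a failure between step 1 and step 2.
                raise RuntimeError("simulated crash between the two updates")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, receiver_id),
            )
        return True
    except RuntimeError:
        return False

def balances(conn):
    """Return a dict of account id -> balance."""
    return dict(conn.execute("SELECT id, balance FROM accounts"))
```

A transaction like this only protects one database. Once the receiver's account lives in another bank's system, no single transaction can span both sides.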
In the real world, distributed systems fail all the time. Networks drop, APIs time out, and databases get locked. So banks do not model transfers as two casual updates. They build an event-driven, state-machine-backed pipeline that treats money movement as a controlled process, not a single query.
1. The API Gateway
Every transfer starts at the front door: the API Gateway, usually something like Nginx, Kong, or a managed gateway sitting in front of the core backend services.
Its first major job is idempotency.
Imagine a user tapping the Pay button three times because the network feels slow. We cannot process the transfer three times. To prevent that, the mobile app generates a unique idempotency-key and sends it in the request headers. The API layer stores or checks that key in a fast datastore like Redis.
If the same key shows up again within a safe window, the gateway returns the original result instead of starting a new transfer.
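The duplicate check can be sketched roughly as follows. This is not a real Redis client: IdempotencyStore is a hypothetical in-memory stand-in, and the 24-hour default window, the handle_transfer name, and the process callback are all assumptions chosen for illustration.

```python
import time

class IdempotencyStore:
    """In-memory stand-in for a Redis-like key store with expiry."""

    def __init__(self, ttl_seconds=24 * 60 * 60, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, stored_response)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, response = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # window elapsed, treat the key as unseen
            return None
        return response

    def put(self, key, response):
        self._entries[key] = (self._clock() + self._ttl, response)


def handle_transfer(store, idempotency_key, payload, process):
    """Return the stored response for a repeated key, else process once."""
    cached = store.get(idempotency_key)
    if cached is not None:
        return cached  # duplicate tap: replay the original result
    response = process(payload)
    store.put(idempotency_key, response)
    return response
```

In a real deployment the check-and-store step would need to be atomic, otherwise two simultaneous duplicates could both pass the check. Redis offers SET with the NX and EX options for that purpose.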
2. The Core Ledger
Strong financial systems do not just mutate a balance column. They use double-entry accounting, where every transaction has a debit side and a credit side.
They also avoid moving money immediately. Instead, they often use a reservation pattern:
- The system reserves the amount first
- The available balance drops
- The money is not fully settled yet
- Final settlement happens only after the receiving side confirms success
This ledger usually lives in a strongly consistent relational database like PostgreSQL or CockroachDB. If the downstream transfer fails, the reservation is released and the money becomes available again.
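The reserve, settle, and release steps can be sketched with a small in-memory ledger. Everything here is hypothetical: the Ledger class, its method names, and the choice to store amounts as plain integers are assumptions for illustration, and a production ledger would keep this state in database tables.

```python
from dataclasses import dataclass, field

class InsufficientFunds(Exception):
    """Raised when the available balance cannot cover a reservation."""

@dataclass
class Ledger:
    balances: dict                                  # account id -> settled balance
    reserved: dict = field(default_factory=dict)    # transfer id -> (account, amount)
    entries: list = field(default_factory=list)     # (transfer id, account, signed amount)

    def available(self, account):
        """Settled balance minus everything currently on hold."""
        held = sum(amt for acct, amt in self.reserved.values() if acct == account)
        return self.balances[account] - held

    def reserve(self, transfer_id, account, amount):
        """Put a hold on the amount; the settled balance is untouched."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        if self.available(account) < amount:
            raise InsufficientFunds(account)
        self.reserved[transfer_id] = (account, amount)

    def settle(self, transfer_id, receiver):
        """Turn the hold into a debit and a matching credit."""
        account, amount = self.reserved.pop(transfer_id)
        # Double entry: the two signed amounts always sum to zero.
        self.entries.append((transfer_id, account, -amount))
        self.entries.append((transfer_id, receiver, amount))
        self.balances[account] -= amount
        self.balances[receiver] += amount

    def release(self, transfer_id):
        """Drop the hold so the money becomes available again."""
        self.reserved.pop(transfer_id)
```

Because every settlement writes a debit and a credit of equal size, summing all entries should always give zero, which is a cheap invariant to check in audits.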
3. The Smart Router
Banks do not send every payment through the same rail. The payment core works like a strategy engine and selects the right transfer path based on the payload.
- UPI / IMPS for fast, low-to-medium value transfers
- NEFT for scheduled or batch-friendly transfers
- RTGS for high-value transfers with stricter rules
The router evaluates factors like amount, transfer type, timing, and regulatory rules, then picks the correct rail adapter. That adapter knows how to speak to the external payment network.
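A router like this might look roughly like the sketch below. The class names, the scheduled flag, and especially the two amount thresholds are illustrative assumptions, not actual regulatory or bank limits, and the adapters only return a string instead of calling a real network.

```python
from dataclasses import dataclass

# Illustrative thresholds only; real limits depend on the bank and regulator.
UPI_MAX_INR = 100_000
RTGS_MIN_INR = 200_000

@dataclass(frozen=True)
class TransferRequest:
    amount_inr: int
    scheduled: bool = False

class RailAdapter:
    """Base adapter; a real one would talk to the external payment network."""
    name = "BASE"

    def send(self, request):
        return f"{self.name}:{request.amount_inr}"

class UpiAdapter(RailAdapter):
    name = "UPI"

class ImpsAdapter(RailAdapter):
    name = "IMPS"

class NeftAdapter(RailAdapter):
    name = "NEFT"

class RtgsAdapter(RailAdapter):
    name = "RTGS"

def select_rail(request):
    """Pick an adapter from the request's amount and timing."""
    if request.amount_inr >= RTGS_MIN_INR:
        return RtgsAdapter()      # high-value transfers
    if request.scheduled:
        return NeftAdapter()      # scheduled or batch-friendly transfers
    if request.amount_inr <= UPI_MAX_INR:
        return UpiAdapter()       # fast, low-value transfers
    return ImpsAdapter()          # fast, medium-value transfers
```

Keeping the selection logic in one function and the network details in adapters means a new rail can be added without touching the rest of the payment core.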
4. Handling Network Chaos
When the system calls an external payment rail, it usually expects one of three states: SETTLED, FAILED, or PENDING.
PENDING is the interesting one.
If an external bank or payment network is slow, we cannot keep the user's HTTP request open forever. That would waste resources and eventually crush the service under high load. So the backend returns 202 Accepted, stores the transfer as PENDING, and lets asynchronous systems take over.
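One way to picture this is a tiny state machine plus a function that maps the rail's answer to an HTTP response. The RESERVED starting state, the transition table, and the 200 and 422 status codes are assumptions for this sketch; only the 202 for PENDING comes from the flow described above.

```python
from enum import Enum

class TransferState(Enum):
    RESERVED = "RESERVED"   # hypothetical state before the rail is called
    PENDING = "PENDING"
    SETTLED = "SETTLED"
    FAILED = "FAILED"

# Which states each state may move to; SETTLED and FAILED are terminal.
ALLOWED = {
    TransferState.RESERVED: {
        TransferState.PENDING,
        TransferState.SETTLED,
        TransferState.FAILED,
    },
    TransferState.PENDING: {TransferState.SETTLED, TransferState.FAILED},
    TransferState.SETTLED: set(),
    TransferState.FAILED: set(),
}

def transition(current, new):
    """Return the new state, or raise if the move is not allowed."""
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {new.name}")
    return new

def respond_to_rail(rail_status):
    """Map the rail's answer to (HTTP status code, state to store)."""
    state = transition(TransferState.RESERVED, TransferState(rail_status))
    if state is TransferState.PENDING:
        return 202, state   # accepted, outcome not known yet
    if state is TransferState.SETTLED:
        return 200, state   # assumed code for an immediate success
    return 422, state       # assumed code for an immediate rejection
```

Making illegal transitions raise an error means a bug cannot quietly move a settled transfer back to pending.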
5. Background Workers and Reconciliation
This is where background workers earn their salary.
Workers continuously scan or consume pending transfers and ask the external system for the final result: did this transfer settle or not? This process is called reconciliation.
Once the worker gets a definitive outcome, it updates the ledger:
- SETTLED if the transfer succeeded
- FAILED if the transfer was rejected
- Reservation released if the money needs to be returned
These workers are often driven by queues like RabbitMQ, Kafka, SQS, or even scheduled jobs depending on the system's scale and reliability needs.
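A single reconciliation pass could be sketched as below. The function name, the dict of transfer states, and the three callbacks are hypothetical; in a real system fetch_status would call the external rail and the two handlers would update the ledger.

```python
def reconcile_once(transfers, fetch_status, on_settled, on_failed):
    """Run one reconciliation pass and return the ids that were resolved.

    transfers:    dict of transfer id -> state string, updated in place
    fetch_status: callable asking the external rail for a transfer's outcome
    on_settled:   callable that finalizes the ledger for a settled transfer
    on_failed:    callable that releases the reservation for a failed one
    """
    resolved = []
    for transfer_id, state in list(transfers.items()):
        if state != "PENDING":
            continue  # already final, nothing to reconcile
        outcome = fetch_status(transfer_id)
        if outcome == "SETTLED":
            on_settled(transfer_id)
        elif outcome == "FAILED":
            on_failed(transfer_id)
        else:
            continue  # still pending, try again on the next pass
        transfers[transfer_id] = outcome
        resolved.append(transfer_id)
    return resolved
```

Whether this loop is triggered by a queue message or a scheduled job, the shape stays the same: ask, record the definitive answer, and leave anything unresolved for the next pass.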
6. Real-Time Notifications
After reconciliation marks the transfer as settled, the user still needs to know.
Polling the API every two seconds would be terrible. It would drain mobile batteries and create unnecessary load on the backend. So modern systems use an event-driven notification layer.
- The payment core or worker publishes a TransferSettled event to Kafka or SQS
- A separate Notification Service consumes that event
- That service pushes a real-time update through WebSockets, FCM, or APNs
- The user's screen flips to Payment Successful almost instantly
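The publish and consume steps above can be sketched with Python's standard queue module standing in for the broker. The event fields, both function names, and the push callback are assumptions for illustration; a real service would use a Kafka or SQS client and a WebSocket, FCM, or APNs sender instead.

```python
import json
import queue

def publish_transfer_settled(broker, transfer_id, user_id, amount):
    """Serialize a TransferSettled event and hand it to the broker."""
    event = {
        "type": "TransferSettled",
        "transfer_id": transfer_id,
        "user_id": user_id,
        "amount": amount,
    }
    broker.put(json.dumps(event))

def run_notification_service(broker, push):
    """Drain the broker and push one message per TransferSettled event.

    push is a callable (user_id, message) standing in for the real
    delivery channel. Returns the number of notifications pushed.
    """
    delivered = 0
    while True:
        try:
            raw = broker.get_nowait()
        except queue.Empty:
            return delivered  # nothing left to consume
        event = json.loads(raw)
        if event["type"] == "TransferSettled":
            push(event["user_id"], f"Payment Successful: {event['amount']}")
            delivered += 1
```

The publisher never calls the notification code directly, so either side can be scaled or replaced without the other noticing.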
The Big Picture: The Journey of ₹500
- Initiate: Mobile app sends POST /transfer with an idempotency-key.
- Check: Redis verifies the request is not a duplicate.
- Reserve: PostgreSQL creates ledger entries and moves ₹500 into a reserved state.
- Route: The payment core selects the correct rail, maybe UPI for a small transfer.
- Execute: The external rail responds with PENDING because the network is slow.
- Return Early: The HTTP request ends and the user sees Processing....
- Reconcile: A worker later checks the external network and receives SETTLED.
- Commit: The ledger is finalized and the reservation becomes a settled transfer.
- Publish: A success event is dropped into Kafka or another broker.
- Notify: The notification service pushes the result to the user's device.
Why This Design Works
- It is fault-tolerant. Failures are expected, so the system has explicit pending and recovery paths.
- It is scalable. The API layer, ledger, workers, and notification systems can all scale independently.
- It protects money. Reservations and reconciliation reduce the risk of money disappearing mid-transfer.
- It is understandable. The transfer moves through clear states instead of hiding complexity in a single function.
Conclusion
Building a payment system is not about clever algorithms. It is about managing distributed state safely.
Once you separate the API layer, the ledger, the async reconciliation workers, and the notification system, the design becomes far more reliable. Each part has one job. Together, they ensure money is either moved correctly or not moved at all.
Next time you tap Pay and see money move in seconds, remember: behind that smooth experience is a carefully engineered pipeline working very hard to make the transfer feel simple.
