System design: Bottlenecks by third-party API
- Authors
- Name
- Dinh Nguyen Truong
Years ago I had the chance to work on a logistics project for a famous company.
The business logic was very complicated, but the core flow for the end customer, the one that generates revenue, was simple: booking a parcel and tracking its status.
One problem came up in the design: bottlenecks caused by a third-party API service.
In this post, I will discuss the problem and the solution in detail.
1. The overall project design
This is a typical microservice design with a load balancer, multiple app services, and one database. Why one database? It keeps data consistent across all services.
Plus, Azure Cosmos DB scales well under high load, as long as we choose a partition key that spreads the data evenly: the key determines the logical partitions, which Cosmos DB then distributes across physical partitions.
So the overall design is good for both consistency and scalability.
2. Detailed design
Booking a new parcel logic
How it works:
The flow has three main steps:
Validate the parcel data
Send it to the third-party delivery service and wait for the response
Store it in the database
Then the response is returned to the user.
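To show where the waiting happens, here is a minimal sketch of this synchronous flow. It is not the project's real code: `Parcel`, `FakeDeliveryApi`, `create_shipment`, and the in-memory `db` dict are all hypothetical stand-ins, and the third-party latency is simulated with a sleep.

```python
import time
from dataclasses import dataclass


@dataclass
class Parcel:
    sender: str
    recipient: str
    weight_kg: float


class FakeDeliveryApi:
    """Hypothetical stand-in for the third-party delivery service."""

    def __init__(self, latency_s: float) -> None:
        self.latency_s = latency_s

    def create_shipment(self, parcel: Parcel) -> str:
        # Simulates network and provider processing time.
        time.sleep(self.latency_s)
        return f"TRK-{parcel.recipient.upper()}"


def validate(parcel: Parcel) -> None:
    if not parcel.sender or not parcel.recipient:
        raise ValueError("sender and recipient are required")
    if parcel.weight_kg <= 0:
        raise ValueError("weight must be positive")


def book_parcel_sync(parcel: Parcel, api: FakeDeliveryApi, db: dict) -> dict:
    validate(parcel)                           # step 1: validate
    tracking_id = api.create_shipment(parcel)  # step 2: blocks until the provider answers
    db[tracking_id] = parcel                   # step 3: store
    return {"tracking_id": tracking_id, "status": "BOOKED"}
```

The request cannot return before `create_shipment` does, so every slowdown on the provider's side shows up directly in our own response time.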
Pros/Cons:
One of the biggest problems in this design is how it handles the third-party API. Rate limiting, timeouts, budget constraints, or sometimes unknown errors can occur. Any of these can slow down processing and eventually limit how many booking requests we can handle, which in turn limits our revenue.
Another problem is that the microservice design isn't pure. It doesn't make sense to handle the third-party delivery service inside the Finance and Admin apps. This logic should be separated into a new service.
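To put a rough number on the first problem, here is a back-of-the-envelope model of the throughput ceiling in the synchronous flow. The function name and all the figures in the usage note are made up for illustration; they are not measurements from the project.

```python
def max_booking_throughput(workers: int,
                           third_party_latency_s: float,
                           rate_limit_per_s: float) -> float:
    """Rough upper bound on bookings per second for the synchronous flow."""
    # Each worker is blocked for the whole third-party call, so it can
    # finish at most 1 / latency bookings per second.
    concurrency_bound = workers / third_party_latency_s
    # The provider's rate limit caps us no matter how many workers we add.
    return min(concurrency_bound, rate_limit_per_s)
```

For example, with 20 workers, a 0.5 s third-party call, and a limit of 100 requests per second, the bound is 40 bookings per second. If the provider slows down to 2 s per call, the same 20 workers drop to 10 bookings per second. Adding workers only helps up to the rate limit: 400 workers at 0.5 s are still capped at 100.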
Tracking parcel status logic
How it works:
The user goes to the Tracking UI to view the status of each parcel.
The parcel status is updated in three ways:
Listening for updates from the third-party delivery service
Running a background job that queries the third-party delivery service
Manual updates from the Admin team
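Since three sources write the same status, one way to keep them consistent is to route all of them through a single update function. The sketch below assumes a "newest event wins" rule based on the event timestamp; that rule, the `StatusUpdate` type, and the source labels are my assumptions for illustration, not something the original system is confirmed to do.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class StatusUpdate:
    parcel_id: str
    status: str
    occurred_at: datetime
    source: str  # hypothetical labels: "webhook", "poller" or "admin"


def apply_status_update(store: dict, update: StatusUpdate) -> bool:
    """Apply the update unless a newer one is already stored.

    Returns True if the update was applied, False if it was ignored.
    """
    current = store.get(update.parcel_id)
    if current is not None and current.occurred_at >= update.occurred_at:
        # Stale or duplicate, e.g. the poller re-reading an old state.
        return False
    store[update.parcel_id] = update
    return True
```

With this shape, the listener, the background job, and the Admin tool only differ in how they obtain the update, not in how it is written.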
Pros/Cons:
There are no problems with this design.
3. How I improved it
Overall, the problems in this design are the handling of the third-party API and the need to separate the delivery logic from the other services. In my opinion, introducing an event queue into the system solves both nicely.
How it works:
Booking a new parcel now works like this:
Validate the parcel data
Store it in the database
Publish a new-parcel event to the event queue (Kafka)
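The three steps above can be sketched as follows. To keep the example runnable without a broker, a standard-library `queue.Queue` stands in for the Kafka topic; the event shape, the `PENDING` status, and the function name are assumptions for illustration.

```python
import queue
import uuid


def book_parcel_async(parcel: dict, db: dict, events: queue.Queue) -> dict:
    # Step 1: validate the parcel data.
    if not parcel.get("recipient"):
        raise ValueError("recipient is required")

    # Step 2: store it with a pending status; no third-party call happens here.
    parcel_id = str(uuid.uuid4())
    db[parcel_id] = {**parcel, "status": "PENDING"}

    # Step 3: publish an event for the delivery service to pick up later.
    events.put({"type": "parcel.created", "parcel_id": parcel_id})

    return {"parcel_id": parcel_id, "status": "PENDING"}
```

The request now returns as soon as the parcel is stored and the event is queued, so its latency no longer depends on the third-party API.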
Once the event is in Kafka:
The delivery service consumes the event
It calls the third-party API and handles the rate limit
On success, it stores the additional data in the database
On failure, it reports the failure back to Kafka so the event can be retried later
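Here is a minimal sketch of that consumer loop, again with a `queue.Queue` standing in for Kafka. The `RateLimited` exception, the `attempt` counter on the event, and the limit of three attempts are all hypothetical; a real consumer would also wait between retries instead of re-queuing immediately.

```python
import queue
from typing import Callable


class RateLimited(Exception):
    """Hypothetical error for a rate-limit response from the provider."""


def process_event(event: dict, db: dict, call_api: Callable[[dict], str],
                  events: queue.Queue, max_attempts: int = 3) -> None:
    parcel_id = event["parcel_id"]
    attempt = event.get("attempt", 1)
    try:
        tracking_id = call_api(db[parcel_id])
    except RateLimited:
        if attempt >= max_attempts:
            # Give up and record the failure so the UI can show it.
            db[parcel_id]["status"] = "FAILED"
        else:
            # Put the event back on the queue to be retried later.
            events.put({**event, "attempt": attempt + 1})
        return
    # Success: store the additional data returned by the provider.
    db[parcel_id]["tracking_id"] = tracking_id
    db[parcel_id]["status"] = "BOOKED"


def drain(events: queue.Queue, db: dict, call_api: Callable[[dict], str]) -> None:
    """Process events until the queue is empty (single-threaded demo loop)."""
    while not events.empty():
        process_event(events.get(), db, call_api, events)
```

Because retries happen inside the delivery service, a rate-limited provider delays individual parcels but never blocks new booking requests.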
Pros/Cons:
From now on, booking requests are handled entirely inside our own system, which makes it much easier for us to increase throughput.
One downside is that this setup is a bit more complicated and needs to be tested carefully.
On the UI side, we also need to show users an additional status once their parcel has been successfully booked by the delivery service.
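As a small illustration of that extra status, here is one way to model the booking states and the message shown for each. The status names, the allowed transitions, and the message texts are all assumptions; the real UI may use different ones.

```python
from enum import Enum


class BookingStatus(Enum):
    PENDING = "pending"  # stored by us, not yet confirmed by the delivery service
    BOOKED = "booked"    # confirmed by the delivery service
    FAILED = "failed"    # retries exhausted


# Hypothetical user-facing texts for the Tracking UI.
USER_MESSAGES = {
    BookingStatus.PENDING: "We received your booking and are confirming it with the carrier.",
    BookingStatus.BOOKED: "Your parcel is booked.",
    BookingStatus.FAILED: "We could not book your parcel. Please contact support.",
}

# A pending booking can only end up booked or failed; both are final.
_ALLOWED = {
    BookingStatus.PENDING: {BookingStatus.BOOKED, BookingStatus.FAILED},
}


def advance(current: BookingStatus, new: BookingStatus) -> BookingStatus:
    """Return the new status, or raise if the transition is not allowed."""
    if new not in _ALLOWED.get(current, set()):
        raise ValueError(f"cannot move from {current.value} to {new.value}")
    return new
```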
Recap
Bottlenecks caused by third-party APIs are very common. For critical operations, we should fix them as soon as possible. In this case, I solved it by decoupling the third-party call from our internal system using an event queue. In another case, this approach might not be suitable at all.
I hope you find this insight valuable!