System design: Bottlenecks caused by third-party APIs

Author: Dinh Nguyen Truong

Years ago I had a chance to work on a logistics project for a famous company.

The business logic was very complicated, but the core flow for the end customer, the one that generates revenue, was simple: booking a parcel and tracking its status.

One problem showed up in the design: bottlenecks caused by the third-party API service.

In this post, I will discuss the problem and the solution in detail.

1. The overall project design

This is a typical microservice system design with a load balancer, multiple app services, and one database. Why one database? It keeps data consistent across all services.

Plus, Azure Cosmos DB does a good job of scaling under high load. We just have to choose a proper partitionKey so that data spreads evenly across logical and physical partitions.

So the overall design is good for both consistency and scalability.
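To make the partitionKey point concrete, here is a minimal sketch that compares two candidate keys by how evenly they spread documents. It does not call Cosmos DB at all; the parcel documents and the field names `customerId` and `country` are made up for illustration. The only fact it relies on is that every distinct partition key value forms one logical partition.

```python
from collections import Counter

# Hypothetical parcel documents; the field names are illustrative only.
parcels = [
    {"id": f"p{i}", "customerId": f"c{i % 50}", "country": "SG" if i % 10 == 0 else "VN"}
    for i in range(1000)
]

def largest_partition_share(docs, key):
    """Fraction of documents that end up in the biggest logical partition."""
    counts = Counter(doc[key] for doc in docs)
    return max(counts.values()) / len(docs)

# "country" puts most documents in a single logical partition (a hot partition).
country_share = largest_partition_share(parcels, "country")
# "customerId" spreads the same documents over 50 logical partitions.
customer_share = largest_partition_share(parcels, "customerId")
print(country_share, customer_share)
```

A key with a small largest share is usually the safer choice, as long as the common queries can still filter by it.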

2. Detailed design

Booking a new parcel logic

How it works:

The flow has 3 main steps:

  • Validate the parcel data

  • Send it to the third-party delivery service and wait for the response

  • Store the result in the database

Then return the response to the user.
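The three steps above can be sketched roughly like this. Everything here is a stand-in: `FakeDeliveryApi`, `create_shipment`, the `trackingCode` field, and the dict used as a database are hypothetical names, not the real project code. The point is only to show that the user's request is blocked for as long as the third party takes to answer.

```python
import time

class FakeDeliveryApi:
    """Hypothetical stand-in for the third-party delivery service."""
    def __init__(self, latency_s):
        self.latency_s = latency_s

    def create_shipment(self, parcel):
        time.sleep(self.latency_s)  # simulates network and third-party processing time
        return {"trackingCode": "TRK-" + parcel["id"]}

def validate(parcel):
    if not parcel.get("address"):
        raise ValueError("address is required")

def book_parcel_sync(parcel, delivery_api, db):
    validate(parcel)                                 # step 1: validate
    shipment = delivery_api.create_shipment(parcel)  # step 2: blocks on the third party
    db[parcel["id"]] = {**parcel, **shipment}        # step 3: store
    return db[parcel["id"]]                          # response to the user

db = {}
start = time.perf_counter()
result = book_parcel_sync({"id": "p1", "address": "1 Main St"}, FakeDeliveryApi(0.05), db)
elapsed = time.perf_counter() - start
print(result, f"{elapsed:.3f}s")
```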

Pros/Cons:

One of the biggest problems in this design is its dependence on the third-party API. Rate limiting, timeouts, budget constraints, or sometimes unknown errors can occur. Any of those problems can slow down processing and eventually limit how many booking requests we can handle, which in turn limits our revenue.
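A rough back-of-the-envelope calculation shows why. Assuming each worker is blocked for the whole third-party call, booking throughput is capped both by the number of workers divided by the call latency and by the third party's rate limit. The numbers below are invented for illustration, not measurements from the project.

```python
def max_bookings_per_second(workers, third_party_latency_s, rate_limit_per_s):
    """Upper bound on bookings/s when every booking waits on the third-party call."""
    # Each blocked worker can finish at most 1 / latency bookings per second.
    concurrency_bound = workers / third_party_latency_s
    return min(concurrency_bound, rate_limit_per_s)

normal = max_bookings_per_second(workers=20, third_party_latency_s=0.5, rate_limit_per_s=100)
slow = max_bookings_per_second(workers=20, third_party_latency_s=2.0, rate_limit_per_s=100)
capped = max_bookings_per_second(workers=200, third_party_latency_s=0.5, rate_limit_per_s=100)
print(normal, slow, capped)
```

When the third party slows from 0.5 s to 2 s per call, the ceiling drops to a quarter, and adding more workers stops helping once the rate limit is reached.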

Another problem is that the microservice design isn't clean. It doesn't make sense to handle the third-party delivery service in the Finance and Admin apps. This logic should be separated into a new service.

Tracking parcel status logic

How it works:

The user goes to the Tracking UI to view the status of each parcel.
The parcel status is updated in 3 ways:

  • Listening to events from the third-party delivery service

  • Running a background job that queries the third-party delivery service

  • Updating it internally by the Admin team
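The three update paths above can all go through one shared function, as in this minimal sketch. The function names, the event shape, the status values, and the dict used as a database are assumptions made for illustration.

```python
def update_status(db, parcel_id, status, source):
    """Single place where a parcel status changes, whatever the source."""
    db[parcel_id] = {"status": status, "source": source}

def on_delivery_event(db, event):
    # Path 1: an update sent to us by the third-party delivery service.
    update_status(db, event["parcelId"], event["status"], "third-party-event")

def poll_delivery_service(db, fetch_status):
    # Path 2: a background job asks the third party about parcels still in progress.
    for parcel_id, record in list(db.items()):
        if record["status"] != "DELIVERED":
            update_status(db, parcel_id, fetch_status(parcel_id), "background-job")

def admin_update(db, parcel_id, status):
    # Path 3: a manual change made by the Admin team.
    update_status(db, parcel_id, status, "admin")

db = {"p1": {"status": "BOOKED", "source": "booking"},
      "p2": {"status": "BOOKED", "source": "booking"}}
on_delivery_event(db, {"parcelId": "p1", "status": "DELIVERED"})
poll_delivery_service(db, fetch_status=lambda parcel_id: "IN_TRANSIT")
admin_update(db, "p2", "RETURNED")
print(db)
```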

Pros/Cons:

There are no problems with this design.

3. How I improved it

Overall, the only problems in this design are how it handles the third-party API and that the delivery logic is mixed into the other services. In my opinion, introducing an event queue to the system solves both nicely.

How it works:

Now booking a new parcel works like this:

  • Validate the parcel data

  • Store it in the database

  • Fire a new parcel event to the event queue (Kafka)
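Here is a minimal sketch of that new booking path. To keep it runnable without a broker, an in-memory `queue.Queue` stands in for the Kafka topic; the event shape, the `PENDING_DELIVERY_BOOKING` status, and the dict used as a database are assumptions for illustration.

```python
import queue

event_queue = queue.Queue()  # in-memory stand-in for a Kafka topic
db = {}                      # stand-in for the database

def book_parcel_async(parcel):
    # Step 1: validate the parcel data.
    if not parcel.get("address"):
        raise ValueError("address is required")
    # Step 2: store it right away, marked as not yet booked with the third party.
    db[parcel["id"]] = {**parcel, "status": "PENDING_DELIVERY_BOOKING"}
    # Step 3: fire the event; no third-party call happens on the request path.
    event_queue.put({"type": "parcel.created", "parcelId": parcel["id"]})
    return db[parcel["id"]]

response = book_parcel_async({"id": "p1", "address": "1 Main St"})
print(response, event_queue.qsize())
```

The request returns as soon as the database write and the publish are done, so its latency no longer depends on the third party.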

And once Kafka receives this event:

  • The event is sent to the Delivery service

  • The Delivery service calls the third-party API and handles rate limiting

  • If the call succeeds, it stores the additional data in the DB

  • If the call fails, it reports that status back to Kafka and the event is retried later
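The consumer side could look roughly like this. Again an in-memory queue replaces Kafka, and `FlakyDeliveryApi`, `RateLimited`, the `attempts` counter, the `max_attempts` cap, and the status names are hypothetical. A failed event is simply put back on the queue; a real consumer would normally also wait before retrying.

```python
import queue

class RateLimited(Exception):
    """Raised by the fake API when the third party rejects the call."""

class FlakyDeliveryApi:
    """Hypothetical third-party client that fails a fixed number of times first."""
    def __init__(self, failures):
        self.failures = failures

    def create_shipment(self, parcel_id):
        if self.failures > 0:
            self.failures -= 1
            raise RateLimited()
        return {"trackingCode": "TRK-" + parcel_id}

def handle_event(event, api, db, retry_queue, max_attempts=5):
    parcel_id = event["parcelId"]
    try:
        shipment = api.create_shipment(parcel_id)
    except RateLimited:
        attempts = event.get("attempts", 0) + 1
        if attempts >= max_attempts:
            db[parcel_id]["status"] = "BOOKING_FAILED"       # give up after too many tries
        else:
            retry_queue.put({**event, "attempts": attempts})  # retry later
        return
    db[parcel_id].update(shipment, status="BOOKED")           # store the additional data

db = {"p1": {"status": "PENDING_DELIVERY_BOOKING"}}
events = queue.Queue()
events.put({"parcelId": "p1"})
api = FlakyDeliveryApi(failures=2)
while not events.empty():
    handle_event(events.get(), api, db, retry_queue=events)
print(db)
```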

Pros/Cons:

From now on, booking requests are handled entirely inside our own system, which makes it much easier for us to increase throughput.

One downside is that this setup is a bit more complicated and needs careful testing.

And in the UI, we need to show the user an additional status once their parcel has been successfully booked by the Delivery service.

Recap

Bottlenecks caused by third-party APIs are very common. For critical operations, we should fix them as soon as possible. In this case, I solved it by decoupling the third-party call from our internal system using an event queue. In another case, that approach might not be suitable at all.

Hope you guys find my insight valuable!