How to handle OpenAI's API rate limits in Xano?


Hi,

OpenAI's API has rate limits.

For chat completions on the free trial, the API rate limits currently are:

1. Requests per minute: 3
2. Tokens per minute: 40,000

Each API response reports the number of tokens you consumed in its `usage` field.
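To make the token accounting concrete, here is a minimal sketch of reading the consumed tokens from a chat completion response. The dictionary below is an illustrative fragment with made-up numbers, not a real API response; the `usage` object shape follows OpenAI's chat completions API.

```python
# Illustrative chat-completion response fragment (values are made up);
# real responses come from the OpenAI API and include a "usage" object.
response = {
    "choices": [{"message": {"role": "assistant", "content": "..."}}],
    "usage": {
        "prompt_tokens": 120,
        "completion_tokens": 80,
        "total_tokens": 200,
    },
}

# The total token cost of this call, to subtract from your per-minute budget.
tokens_used = response["usage"]["total_tokens"]
print(tokens_used)  # 200
```

In Xano, the same value is available on the External API Request result and can be written to a database record after every call.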

My use case is the following: I am looping through a list of products with a For Each loop in Xano, sending each product to the OpenAI API one by one to ask GPT to describe it.

For each product I make 4 different requests to OpenAI (so 4 requests per loop iteration), which is causing errors because of the OpenAI rate limits mentioned above.

What ideas do you have for making Xano respect both of these rate limits (requests per minute and tokens per minute)?

Any help would be appreciated.

Thanks!


Answers

  • pachocastillosr

Note: the OpenAI API might also be called by other Xano endpoints of mine, consuming requests and tokens from the same rate limits.

  • Pawel Magdanski (Member ✭✭✭)

    Super quick solution: I would store your remaining token count somewhere in the database and update it after every call. You'll probably want a safety margin so you don't run out during the last call you can make: if a call can consume at most, say, 50 tokens, set a condition requiring at least 60 tokens at your disposal before making the call. Hope that makes sense.
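    Pawel's idea can be sketched like this. The in-memory `budget` dict is a hypothetical stand-in for a Xano database record (updated via Edit Record); the per-minute refill and the 50/60-token numbers are assumptions taken from the figures in this thread.

    ```python
    import time

    TOKENS_PER_MINUTE = 40_000   # free-trial token limit from the question
    MAX_CALL_COST = 50           # assumed worst-case tokens for one call
    SAFETY_MARGIN = 60           # require this much headroom before calling

    # Hypothetical stand-in for a one-row Xano table holding the budget.
    budget = {"tokens_left": TOKENS_PER_MINUTE, "window_start": time.time()}

    def can_call(budget):
        """Refill the budget when the minute window rolls over, then
        check that the safety margin is available."""
        now = time.time()
        if now - budget["window_start"] >= 60:
            budget["tokens_left"] = TOKENS_PER_MINUTE
            budget["window_start"] = now
        return budget["tokens_left"] >= SAFETY_MARGIN

    def record_usage(budget, tokens_used):
        """Subtract the tokens reported in the response's usage field."""
        budget["tokens_left"] -= tokens_used

    if can_call(budget):
        # ... make the OpenAI request here, then record its real cost:
        record_usage(budget, 200)  # 200 = usage.total_tokens from the response

    print(budget["tokens_left"])  # 39800
    ```

    The same check-then-subtract logic maps onto a Xano conditional plus two database steps; the key design point is subtracting the *actual* `usage.total_tokens` from each response rather than a guess.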

  • Ray Deck (Trusted Xano Expert ✭✭✭)

    Setting up locks can help too. Locks let you make sure only one client is talking to OpenAI at a time: while the lock is "on", your other endpoints wait for it to be released. If you are on a Scale plan, the Redis functions are fantastic for storing the lock, but you can use a database table for the job - a little less efficient, though pre-scale that probably won't matter.

    Basically, you set up a while loop that waits for the lock to clear. Inside the loop, Get Record from your lock table to see if the lock is set. If so, sleep; if not, update the table to set the lock, then do your OpenAI work. When the work finishes, update the table again to release the lock. Within that flow you can add extra Sleep steps to slow the pace of API usage inside the endpoint.
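    A sketch of that while-loop pattern, with a hypothetical in-memory `lock_table` standing in for the one-row Xano lock table (Get Record / Edit Record become dict reads and writes here):

    ```python
    import time

    # Hypothetical stand-in for a one-row Xano lock table.
    lock_table = {"locked": False}

    def acquire_lock(lock_table, poll_seconds=1, timeout=60):
        """Spin until the lock is free, then take it."""
        waited = 0
        while lock_table["locked"]:       # Get Record: is the lock on?
            time.sleep(poll_seconds)      # Sleep, then check again
            waited += poll_seconds
            if waited >= timeout:
                raise TimeoutError("lock was never released")
        lock_table["locked"] = True       # Edit Record: take the lock

    def release_lock(lock_table):
        lock_table["locked"] = False      # Edit Record: release the lock

    acquire_lock(lock_table)
    try:
        pass  # ... do the OpenAI work here, with optional extra sleeps ...
    finally:
        release_lock(lock_table)

    print(lock_table["locked"])  # False
    ```

    Note that a plain check-then-set like this is not atomic - two endpoints could both see "unlocked" and proceed. That is why Redis (an atomic SET with NX) is the safer store on a Scale plan; with a database table, a conditional update narrows the race but does not fully eliminate it.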

    It's an interesting problem! Definitely part of the hardest 5% we work on in the daily office hours and forum at State Change Pro in a group setting.

    If you're looking for 1-1 consulting to work the problem, Pawel does that and he's a SCP member too!

  • pachocastillosr

    Thank you both! @Pawel Magdanski @Ray Deck. I'll try to solve this with your suggestions and let you know if I need more personalized consulting