DB locking for parallel processing?

Options
I have a queue db and I have a process that needs to pick a limited number of records from it and process them. Sometimes, when the queue is large I run multiple processes in parallel. Doing it by either hitting an internal API endpoint with a timeout for request OR with a lambda OR with just multiple tasks - tried all of those.

My question is — what will be the best way to avoid multiple parallel processes to pick the same entries?

Here is what I tried myself:
1. I tried immediately deleting the records that the processes picked so that another process doesn't pick those up. Ended up having this error:[Image]
2. I tried adding a "status" column to the db and instead of deleting records. Once the process runs it picks up entries that are not in "processing" status and immidiatly loop-edits them to mark as "processing" Only after it it actually gets to processing. It seemed to work fine when I had some gap in time between running multiple processes. I think it allowed time for the db records to get updated. When I removed this time gap, things went nuts I ended up spamming users with alerts. 

I've read that some types of databases have a locking mechanism that prevents access when something important is happening. Is there something alike in Xano? Maybe it can help somehow. Or are there any better ways to achieve what I want?

Thanks.

I am still trying to find the best way to run multiple processes in parallel.

Comments

  • Valentin R
    Valentin R Member
    Options
    While writing this post I thought that maybe querying records one by one will be more fail-proof. They will get marked for processing or deleted (which is faster, by the way?) quicker which may lead to less overlaps. Still not a 100% solution probably.
  • Ray Deck
    Ray Deck Trusted Xano Expert ✭✭✭
    Options
    Getting this right is a deep and common problem in highly-scaled systems, so you're in good company.

    In other queuing systems, best practice is to make the queued functionality idempotent, which means it would have the same effect even if run multiple times. That way firing redundantly is just a matter of performance, rather than misbehavior.

    Serializing - going one-at-a-time - is much easier to handle than a pool of multiple actions. Obviously slower, but far less chance of running over yourself.
  • Valentin R
    Valentin R Member
    Options
     my next try would be to get a big list of entries, split it in multiple and feed to an internal endpoint. This way multiple processes should be independant.
  • pachocastillosr
    Options

    @Valentin R Hi, I hope you are doing well. Did you find a solution for this? Please let me know, thanks!

  • Valentin R
    Options

    @pachocastillosr I've tried many things, but the latest I am using is smth like this
    1. There is a task that picks entries with a blank "status" from a table.
    2. Every entry of the table is updated with status "processing"
    3. The task pushes a set of leads to Xano's endpoint via lambda function. A lambda is needed to allow paralles processing (and not wait until the endpoint finished it's job). That endpoint runs a function stack that processes entries and at the end of it changes the status in the initial table (queue table) to "done". This function set also utilized the new try-catch function, so that in a case of processing error the entry will get the status "failed" and won't remain in "processing" status.
    4. The tasks runs again, checks how many entries are currently in "processing" status. If there are too many, it does nothing. If not, it picks another batch of entries with blank status.

    I have no idea if this is an optimal setup, but nothing esle worked better for me. I did try the post-processing but it has an issue - all post-processing is done sequentially for a particular function. It doesn't allow parallel processing, so I had to go with a call to an endpoint with a lambda like this:

    axios({ method: 'post', url: 'https://xanoendpoint.com/api:12345/process', data: { unprocessed: $var.unprocessed }});
    return "OK";


  • pachocastillosr
    Options

    @Valentin R thanks for the info, very helpful!