Re: Multithreaded queue in PgSQL

From: "Jeff Peck" <peck(dot)jeff(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Multithreaded queue in PgSQL
Date: 2008-07-15 09:59:47
Message-ID: e3d3805d0807150259u25493acdt4825a89ceb16a2d@mail.gmail.com
Lists: pgsql-general

> We are doing the same (a newsletter), and there is no problem locking
> the whole table for a short time with an advisory lock. The Java
> id-fetching worker takes the advisory lock (which does not block the
> table for reading or writing; it only blocks its fellow Java workers
> that use the same advisory lock), fetches, say, 50 ids of records
> marked CREATED, and changes their status to PROCESSING. Several
> workers then take the ids, fetch the needed data from the table
> independently, and process and update the rows in parallel. We have 3
> Java machines fetching ids for 10 parallel workers, and everything
> works just fine.
>
> Fetching the ids is usually much, much faster than the real processing.
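
The batch-claim step described in the quote might be sketched in SQL
roughly like this (a hedged sketch, not the poster's actual code; the
table/column names follow the jobs table below, and the lock key 42 is
an arbitrary assumption):

```
-- Serialize only the id-fetching step against sibling workers;
-- this does not block ordinary readers or writers of the table.
SELECT pg_advisory_lock(42);

-- Claim a batch of 50 ids and flip their status in one statement.
UPDATE jobs
   SET status = 'PROCESSING'
 WHERE id IN (SELECT id
                FROM jobs
               WHERE status = 'CREATED'
               LIMIT 50)
RETURNING id;

SELECT pg_advisory_unlock(42);
```

The RETURNING clause hands the claimed ids back to the fetching worker
in the same round trip that marks them.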

Weird that this thread came back up today. I actually implemented a
small example this morning in Python, using Klint Gore's suggestion,
and was very happy with the results. I have attached 4 files for anyone
interested in looking at a really simple example. You'll need Python
and the psycopg2 driver for this, but anyone should be able to follow
it easily:
db.py - Edit this file with your db settings.
client.py - This would be your worker process
insert.py - This is used to enter some jobs into the jobs table.
reset.py - This is used to reset each job back to 'REQUESTED' status.

This example assumes that you have a jobs table like below:
create table jobs(id serial primary key, status text not null)
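
The attachments are not reproduced here, but the one-row claim a
client.py-style worker performs could look something like this (a guess
at the pattern, not the attached code):

```
-- Lock one REQUESTED row and mark it taken, in a single statement.
UPDATE jobs
   SET status = 'PROCESSING'
 WHERE id = (SELECT id
               FROM jobs
              WHERE status = 'REQUESTED'
              LIMIT 1
                FOR UPDATE)
RETURNING id;
```

Note that two workers can momentarily block on the same row with this
pattern; the second one should simply retry if it gets zero rows back.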

First, edit db.py with your own db settings.
Fire up as many copies of client.py as you'd like.
Now enter some jobs into the jobs table. Running insert.py will enter
100 jobs for you.
Then watch as your clients process the jobs.

Once all of the jobs have finished processing, you can run reset.py to
mark all of the jobs back to 'REQUESTED' status so that the clients
start processing all over again.
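
In place of the insert.py and reset.py attachments, the equivalent SQL
is roughly the following (an assumption based on their descriptions,
using the jobs table above):

```
-- insert.py equivalent: queue 100 jobs.
INSERT INTO jobs (status)
SELECT 'REQUESTED' FROM generate_series(1, 100);

-- reset.py equivalent: put every job back in the queue.
UPDATE jobs SET status = 'REQUESTED';
```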

I hope it is OK to attach examples! Just seems like this question
comes up often.

Jeff Peck

Attachment Content-Type Size
db.py text/plain 456 bytes
client.py text/plain 3.0 KB
insert.py text/plain 563 bytes
reset.py text/plain 424 bytes
