Batch processing select

From: Steve Crawford <scrawford(at)pinpointresearch(dot)com>
To: postgresql General <pgsql-general(at)postgresql(dot)org>
Subject: Batch processing select
Date: 2004-11-02 23:30:10
Message-ID: 200411021530.10027.scrawford@pinpointresearch.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I'm still mulling the best way to handle this situation.

I have a table that describes work to be processed. This table
includes a description of the work as well as priority and scheduling
information (certain records can only be handled by certain client
processes or at particular times of the day or week). I have several
hundred client processes to handle the work, most, but not all, of
which can handle any of the items in the database.

When a process is free, it needs to return the results to the table
(not an issue) but also needs to get new work assigned for processing
(problem).

I need to select one record from the table so the client program can
process it. This record should be the highest priority item that the
requesting client is able to process at that particular time. Of
course, it can't be a record that has been completed or which is
already being handled by another process.

Performance is an issue. Each piece of work takes ~20-300 seconds to
handle and the overall processing rate is ~10 items/second. The
"to-do" table often exceeds 500,000 records.

In my earlier attempt I tried "select ... for update where {record
needs processing} limit 1; set status flag to in-progress;".

Unfortunately for this purpose the second process hitting the DB will
block and then return 0 records when the first process completes
since the status-flag has changed to in-progress.

I've considered "select ... for update where {record needs processing
and tuple not locked} limit 1..." but don't know of a function that
returns the lock status of a tuple.

Any ideas of how I can attack this problem?

Cheers,
Steve

Browse pgsql-general by date

  From Date Subject
Next Message Martin Foster 2004-11-02 23:52:12 Restricting Postgres
Previous Message Steve Crawford 2004-11-02 22:48:21 Re: Daylight Savings Time handling on persistent connections