From: Chris Browne <cbbrowne(at)acm(dot)org>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Delete/update with limit
Date: 2007-07-23 18:52:13
Message-ID: 60644bymua.fsf@dba2.int.libertyrms.com
Lists: pgsql-general
nagy(at)ecircle-ag(dot)com (Csaba Nagy) writes:
> In postgres we're currently not chunking, due to the fact that the code
> to do it is simply overly contorted and inefficient compared to the
> other DBs we use. At least all the solutions we could think of to do the
> chunking in a safe way while the inserts are running in parallel,
> without disturbing them, have invariably resulted in overly complicated
> code compared to the simple delete with limit + delete trigger solution
> we have for the other DBs.
>
> Now I don't put too much hope that I can convince anybody that a limit
> on the delete/update commands has valid usage scenarios, but can
> anybody help me find a good solution to chunk-wise process such a
> buffer table, where insert speed is the highest priority (thus no
> indexes and a minimum of fields), batch processing should still work
> fine with a big table, the inserts must not be impacted at all, and
> each batch should finish quickly to avoid long-running transactions?
> Because I can't really think of one... other than our scheme with the
> delete with limit + trigger + private temp table thing.
All that comes to mind is to put a SERIAL primary key on the table,
which shouldn't add *too* much overhead, assuming there is reasonably
complex processing going on; you would then do something like:
- select ID from the incoming table, order by ID, limit 500, to grab a
list of IDs;
- delete from the table for that set of IDs.
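In SQL terms, a rough sketch of that (the table and column names here
are just made up for illustration) might be:

  -- hypothetical buffer table; the SERIAL column is the only index
  CREATE TABLE incoming_table (
      id      serial PRIMARY KEY,
      payload text
  );

  -- one pass of the batch processor
  BEGIN;
  CREATE TEMP TABLE batch AS
      SELECT id FROM incoming_table ORDER BY id LIMIT 500;
  -- ... process the rows whose ids are in "batch" ...
  DELETE FROM incoming_table WHERE id IN (SELECT id FROM batch);
  DROP TABLE batch;
  COMMIT;

Inserts that arrive while this runs get higher ids, so they simply fall
into a later chunk and aren't touched by the DELETE.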
Actually, is there any particular reason why you couldn't simply have
your "batch processing" loop look like:
  Loop Forever
    DELETE FROM incoming_table;
    VACUUM incoming_table;
  End Loop;
???
The alternative that I suggested amounts to:
  Loop Forever
    DELETE FROM incoming_table
     WHERE id IN (SELECT id FROM incoming_table ORDER BY id LIMIT 500);
    VACUUM incoming_table;
  End Loop;
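If the batch job needs the rows themselves as they are removed, I
believe that on 8.2 and later you could fold the select and the delete
into one statement with a RETURNING clause (the payload column is,
again, just an illustration):

  DELETE FROM incoming_table
   WHERE id IN (SELECT id FROM incoming_table ORDER BY id LIMIT 500)
  RETURNING id, payload;

followed by the same VACUUM as above.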
I realize you're concerned that maintaining the index will be too
costly; I don't think it is obvious without actual benchmarking that
this is *in fact* too costly.
I'm pretty sure of one countervailing consideration: VACUUMing the
table carries a cost of its own, and it is possible that the cost of
maintaining the index would be noise against that.
--
"cbbrowne","@","acm.org"
http://cbbrowne.com/info/lisp.html
When a man talks dirty to a woman, it's sexual harassment. When a woman
talks dirty to a man, it's 3.95 per minute.