Skip site navigation (1) Skip section navigation (2)

Re: understanding the interaction with delete/select/vacuum

From: Alan Stange <stange(at)rentec(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: understanding the interaction with delete/select/vacuum
Date: 2005-08-29 20:09:51
Message-ID: 43136B8F.2020202@rentec.com (view raw or flat)
Thread:
Lists: pgsql-novice
Tom Lane wrote:
> Alan Stange <stange(at)rentec(dot)com> writes:
>   
>> I have a long running process which does a 'SELECT ID FROM T'.  The 
>> results are being streamed to the client using a fetch size limit.  This 
>> process with take 26 hours to run.    It turns out that all the "C" and 
>> "P" are going to be deleted when the SELECT gets to them.
>>     
>
>   
>> Several hours into this process, after the "C" rows have been deleted in 
>> a separate transaction but we haven't yet gotten to the "P" rows, a 
>> vacuum is begun on table T.
>>     
>
>   
>> What happens?
>>     
>
> VACUUM can't remove any rows that are still potentially visible to any
> open transaction ... so those rows will stay.  It's best to avoid having
> single transactions that take 26 hours to run --- there are a lot of
> other inefficiencies that will show up in such a situation.  
Thanks.

Is there a variation of the isolation rules that would achieve my 
desired goal:  have the deleted rows be vacuumed even though the select 
still has them in visibility?   Or is this just a the wrong direction to 
go in?

> Can you
> break the long-running process into shorter transactions?
>   
That's what I'm working on now.   I've reworked the sql command so that 
the deletes involved don't take hours to run but instead happen in 10K 
row chunks.  Now I was going to rework the select to work in O(100K) row 
chunks.


Is there a reason why the open() calls for a vacuum don't use O_DIRECT, 
thus possibly preventing the IO from flushing lots of data from 
memory?   I was going to hack something up for the WAL files for 8.1, 
but I found that O_DIRECT is now used when using open_sync for the WAL 
files.   Finally, why O_RDWR for the wal files and not O_WRONLY?   I was 
under the impression that the files were only written to by the usual 
postgresql server processes.

Thanks much!

-- Alan

In response to

pgsql-novice by date

Next:From: AndreasDate: 2005-08-29 20:30:48
Subject: message dupes anyone ?
Previous:From: Tom LaneDate: 2005-08-29 19:52:13
Subject: Re: understanding the interaction with delete/select/vacuum

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group