Re: understanding the interaction with delete/select/vacuum

From: Alan Stange <stange(at)rentec(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: understanding the interaction with delete/select/vacuum
Date: 2005-08-29 20:09:51
Message-ID: 43136B8F.2020202@rentec.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

Tom Lane wrote:
> Alan Stange <stange(at)rentec(dot)com> writes:
>
>> I have a long running process which does a 'SELECT ID FROM T'. The
>> results are being streamed to the client using a fetch size limit. This
>> process with take 26 hours to run. It turns out that all the "C" and
>> "P" are going to be deleted when the SELECT gets to them.
>>
>
>
>> Several hours into this process, after the "C" rows have been deleted in
>> a separate transaction but we haven't yet gotten to the "P" rows, a
>> vacuum is begun on table T.
>>
>
>
>> What happens?
>>
>
> VACUUM can't remove any rows that are still potentially visible to any
> open transaction ... so those rows will stay. It's best to avoid having
> single transactions that take 26 hours to run --- there are a lot of
> other inefficiencies that will show up in such a situation.
Thanks.

Is there a variation of the isolation rules that would achieve my
desired goal: have the deleted rows be vacuumed even though the select
still has them in visibility? Or is this just a the wrong direction to
go in?

> Can you
> break the long-running process into shorter transactions?
>
That's what I'm working on now. I've reworked the sql command so that
the deletes involved don't take hours to run but instead happen in 10K
row chunks. Now I was going to rework the select to work in O(100K) row
chunks.

Is there a reason why the open() calls for a vacuum don't use O_DIRECT,
thus possibly preventing the IO from flushing lots of data from
memory? I was going to hack something up for the WAL files for 8.1,
but I found that O_DIRECT is now used when using open_sync for the WAL
files. Finally, why O_RDWR for the wal files and not O_WRONLY? I was
under the impression that the files were only written to by the usual
postgresql server processes.

Thanks much!

-- Alan

In response to

Browse pgsql-novice by date

  From Date Subject
Next Message Andreas 2005-08-29 20:30:48 message dupes anyone ?
Previous Message Tom Lane 2005-08-29 19:52:13 Re: understanding the interaction with delete/select/vacuum