Re: [Patch] Optimize dropping of relation buffers using dlist

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: amit(dot)kapila16(at)gmail(dot)com
Cc: k(dot)jamison(at)fujitsu(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, andres(at)anarazel(dot)de, robertmhaas(at)gmail(dot)com, tomas(dot)vondra(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [Patch] Optimize dropping of relation buffers using dlist
Date: 2020-09-16 02:16:35
Message-ID: 20200916.111635.2166233126272752935.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Wed, 2 Sep 2020 08:18:06 +0530, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote in
> On Wed, Sep 2, 2020 at 7:01 AM Kyotaro Horiguchi
> <horikyota(dot)ntt(at)gmail(dot)com> wrote:
> > Isn't a relation always locked access-exclusively, at truncation
> > time? If so, isn't even the result of lseek reliable enough?
> >
>
> Even if the relation is locked, background processes like checkpointer
> can still touch the relation which might cause problems. Consider a
> case where we extend the relation but didn't flush the newly added
> pages. Now during truncate operation, checkpointer can still flush
> those pages which can cause trouble for truncate. But, I think in the
> recovery path such cases won't cause a problem.

I reconsidered this and still have a doubt.

Does this mean that lseek(SEEK_END) doesn't count blocks that have
been write(2)'ed (by smgrextend) but not yet flushed? (To be clear, I
don't think so.) The nblocks cache was added just to reduce the number
of lseek() calls, and it is expected to always hold the same value
that lseek() would return. The reason it is reliable only during
recovery is that the cache is not shared, but the startup process is
the only process that changes the relation size during recovery.
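
To illustrate the first point, here is a minimal standalone sketch (not
PostgreSQL code). It assumes a local POSIX filesystem, where the size
seen by lseek(SEEK_END) reflects completed write(2) calls whether or
not fsync(2) has been issued. The helper names (nblocks_via_lseek,
extend_one_block, unflushed_blocks_visible) and the scratch file path
are made up for this example; BLCKSZ is set to the default 8192.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BLCKSZ 8192				/* assumed default block size */

/* File length in blocks as seen through lseek(SEEK_END), or -1 on error. */
static long
nblocks_via_lseek(int fd)
{
	off_t		len = lseek(fd, 0, SEEK_END);

	if (len < 0)
		return -1;
	return (long) (len / BLCKSZ);
}

/*
 * Append one zero-filled block with write(2), deliberately without
 * calling fsync(2).  Returns 0 on success, -1 on failure.
 */
static int
extend_one_block(int fd)
{
	char		block[BLCKSZ];

	memset(block, 0, sizeof(block));
	if (lseek(fd, 0, SEEK_END) < 0)
		return -1;
	if (write(fd, block, sizeof(block)) != (ssize_t) sizeof(block))
		return -1;
	return 0;
}

/*
 * Hypothetical demo: create a scratch file, extend it by n blocks
 * without flushing, and return the block count that lseek() reports.
 * Returns -1 if any system call fails.
 */
static long
unflushed_blocks_visible(int n)
{
	char		path[] = "/tmp/nblocks_demo_XXXXXX";
	int			fd;
	long		result;

	assert(n >= 0);				/* caller error otherwise */

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);				/* file disappears once fd is closed */

	for (int i = 0; i < n; i++)
	{
		if (extend_one_block(fd) != 0)
		{
			close(fd);
			return -1;
		}
	}

	result = nblocks_via_lseek(fd);
	close(fd);
	return result;
}
```

Under that assumption, calling unflushed_blocks_visible(3) reports
three blocks even though nothing was fsync'ed, which is why I expect
the cached nblocks and the lseek() result to agree.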

If any other process could extend the relation while smgrtruncate is
running, the current DropRelFileNodeBuffers would run the risk that a
new buffer for the extended area is allocated at a buffer location
that the function has already passed, which would be a disaster.
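
The hazard can be shown with a toy model. This is only a sketch under
loud assumptions: it is not the real buffer manager, there is no
locking or hashing, and every name in it (ToyPool, toy_drop_buffers,
toy_count_stale, toy_scenario) is invented for illustration. A single
pass walks the pool and invalidates buffers at or beyond the truncation
point; a simulated concurrent extension then places a new block into a
slot the pass has already visited.

```c
#include <assert.h>
#include <stdbool.h>

#define NBUFFERS		8
#define INVALID_BLOCK	(-1L)

/* Toy pool: slot i caches block[i] of one relation, or INVALID_BLOCK. */
typedef struct ToyPool
{
	long		block[NBUFFERS];
} ToyPool;

/*
 * One-pass scan that invalidates every buffer holding a block number
 * >= first_del_block.  Just before the scan visits slot inject_at_slot,
 * a simulated concurrent extension caches new_block in victim_slot.
 * Pass inject_at_slot = -1 to simulate no concurrent activity.
 */
static void
toy_drop_buffers(ToyPool *pool, long first_del_block,
				 int inject_at_slot, int victim_slot, long new_block)
{
	assert(victim_slot >= 0 && victim_slot < NBUFFERS);

	for (int i = 0; i < NBUFFERS; i++)
	{
		if (i == inject_at_slot)
			pool->block[victim_slot] = new_block;

		if (pool->block[i] != INVALID_BLOCK &&
			pool->block[i] >= first_del_block)
			pool->block[i] = INVALID_BLOCK;
	}
}

/* Count buffers that still hold a block beyond the truncation point. */
static int
toy_count_stale(const ToyPool *pool, long first_del_block)
{
	int			stale = 0;

	for (int i = 0; i < NBUFFERS; i++)
	{
		if (pool->block[i] != INVALID_BLOCK &&
			pool->block[i] >= first_del_block)
			stale++;
	}
	return stale;
}

/*
 * Pool initially caches blocks 0..7; the relation is truncated to 4
 * blocks.  With concurrent_extension, block 8 lands in slot 1 while the
 * scan is at slot 6, i.e. in a slot the scan has already passed.
 */
static int
toy_scenario(bool concurrent_extension)
{
	ToyPool		pool;

	for (int i = 0; i < NBUFFERS; i++)
		pool.block[i] = i;

	toy_drop_buffers(&pool, 4, concurrent_extension ? 6 : -1, 1, 8);
	return toy_count_stale(&pool, 4);
}
```

In this model toy_scenario(false) leaves no stale buffer, while
toy_scenario(true) leaves one behind the scan position. A single
forward pass can only be correct if nothing else extends the relation
while it runs.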

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
