Re: Way to check whether a particular block is on the shared_buffer?

From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Subject: Re: Way to check whether a particular block is on the shared_buffer?
Date: 2016-02-05 03:50:36
Message-ID: 9A28C8860F777E439AA12E8AEA7694F8011A7573@BPXM15GP.gisp.nec.co.jp
Lists: pgsql-hackers

> On 2/4/16 12:30 AM, Kouhei Kaigai wrote:
> >> 2. A feature to suspend i/o write-out towards particular blocks
> >> that are registered by other concurrent backends, until they are
> >> unregistered (usually, at the end of P2P DMA).
> >> ==> to be discussed.
>
> I think there's still a race condition here though...
>
> A
> finds buffer not in shared buffers
>
> B
> reads buffer in
> modifies buffer
> starts writing buffer to OS
>
> A
> Makes call to block write, but write is already in process; thinks
> writes are now blocked
> Reads corrupted block
> Much hilarity ensues
>
> Or maybe you were just glossing over that part for brevity.
>
Thanks, this part was not clear from my previous description.

When B starts writing the buffer to the OS, the extension catches this
i/o request using a hook around smgrwrite, and the mechanism registers
the block so that P2P DMA requests on it are blocked during B's write
operation. (Of course, it unregisters the block at the end of smgrwrite.)
So, even if A wants to issue P2P DMA concurrently, it cannot register
the block until B's write operation has completed.
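
To make the idea concrete, here is a rough sketch in C of the write-side hook. It is only an illustration under several assumptions: the f_smgr struct is a mock with a simplified smgr_write signature (not the real one in smgr.c), GetStorageMgrHandlers()/SetStorageMgrHandlers() are mock versions of the accessors proposed in the draft quoted below, and block_try_register(), block_unregister(), ext_smgr_write(), ext_install_hook() and the fixed-size block_busy[] array are hypothetical names. A real implementation would keep the registry in shared memory and sleep instead of spinning.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NBLOCKS 128    /* hypothetical: demo covers blocks 0..127 only */

/* One flag per block: true while a write or a P2P DMA owns the block. */
static atomic_bool block_busy[DEMO_NBLOCKS];

/* Non-blocking registration; returns true if the caller now owns the block. */
static bool block_try_register(uint32_t blkno)
{
    bool expected = false;

    assert(blkno < DEMO_NBLOCKS);
    return atomic_compare_exchange_strong(&block_busy[blkno], &expected, true);
}

static void block_unregister(uint32_t blkno)
{
    assert(blkno < DEMO_NBLOCKS);
    atomic_store(&block_busy[blkno], false);
}

/* Mock of the storage manager table; only the write callback is modeled. */
typedef struct f_smgr
{
    void (*smgr_write)(uint32_t blkno, const char *buffer);
} f_smgr;

static int  md_write_calls = 0;
static bool busy_during_write = false;

/* Stand-in for mdwrite: records whether the block was registered. */
static void mock_mdwrite(uint32_t blkno, const char *buffer)
{
    (void) buffer;
    busy_during_write = atomic_load(&block_busy[blkno]);
    md_write_calls++;
}

static f_smgr smgr_table[1] = { { mock_mdwrite } };

/* Mock versions of the accessors proposed in the draft. */
f_smgr GetStorageMgrHandlers(int smgr_which)
{
    return smgr_table[smgr_which];
}

void SetStorageMgrHandlers(int smgr_which, f_smgr smgr_handlers)
{
    smgr_table[smgr_which] = smgr_handlers;
}

static f_smgr orig_handlers;    /* saved so the wrapper can chain to it */

/* Extension's own write function: register, call the original, unregister. */
static void ext_smgr_write(uint32_t blkno, const char *buffer)
{
    while (!block_try_register(blkno))
        ;                       /* a concurrent P2P DMA holds the block */
    orig_handlers.smgr_write(blkno, buffer);
    block_unregister(blkno);
}

/* Replace smgr_write while keeping whatever handler was installed before. */
void ext_install_hook(void)
{
    orig_handlers = GetStorageMgrHandlers(0);
    f_smgr mine = orig_handlers;
    mine.smgr_write = ext_smgr_write;
    SetStorageMgrHandlers(0, mine);
}
```

Because the wrapper saves the handlers it found at install time, it chains to whatever was configured before, even if another extension had already replaced smgr_write.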

In practice, this operation shall be a "try lock", because B's write
operation implies that the buffer exists in main memory, so A does
not need to wait for B's write operation if A switches the DMA source
from SSD to main memory.
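
A minimal sketch of that "try lock" from A's side might look like the following. All names here (DmaSource, p2p_dma_begin(), p2p_dma_end(), block_try_register(), the block_busy[] array) are hypothetical, and the registry is a plain process-local array only for illustration; the real one would have to live in shared memory.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NBLOCKS 128    /* hypothetical: demo covers blocks 0..127 only */

typedef enum
{
    DMA_SRC_SSD,            /* block registered; P2P DMA from SSD is safe */
    DMA_SRC_MAIN_MEMORY     /* a write is in flight; read from main memory */
} DmaSource;

/* One flag per block: true while a write or a P2P DMA owns the block. */
static atomic_bool block_busy[DEMO_NBLOCKS];

/* Non-blocking: returns true only if the caller acquired the block. */
static bool block_try_register(uint32_t blkno)
{
    bool expected = false;

    assert(blkno < DEMO_NBLOCKS);
    return atomic_compare_exchange_strong(&block_busy[blkno], &expected, true);
}

static void block_unregister(uint32_t blkno)
{
    assert(blkno < DEMO_NBLOCKS);
    atomic_store(&block_busy[blkno], false);
}

/*
 * Backend A: never waits. If the block is already registered (for
 * example by a writer inside smgrwrite), fall back to main memory.
 */
DmaSource p2p_dma_begin(uint32_t blkno)
{
    if (block_try_register(blkno))
        return DMA_SRC_SSD;     /* caller must call p2p_dma_end() later */
    return DMA_SRC_MAIN_MEMORY; /* nothing was registered by this call */
}

/* Call only after p2p_dma_begin() returned DMA_SRC_SSD. */
void p2p_dma_end(uint32_t blkno)
{
    block_unregister(blkno);
}
```

The point of the non-blocking form is that A never sleeps behind B's write; it only decides where to read the block from.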

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

> ...
>
> > I tried to design a draft of an enhancement to realize the above i/o
> > write-out suspend/resume, in as non-invasive a way as possible.
> >
> > ASSUMPTION: I intend to implement this feature as a part of an extension,
> > because these i/o suspend/resume checks are pure overhead for the
> > core features unless an extension utilizes them.
> >
> > Three functions shall be added:
> >
> > extern int GetStorageMgrNumbers(void);
> > extern f_smgr GetStorageMgrHandlers(int smgr_which);
> > extern void SetStorageMgrHandlers(int smgr_which, f_smgr smgr_handlers);
> >
> > As the name says, GetStorageMgrNumbers() returns the number of storage
> > managers currently installed. It always returns 1 right now.
> > GetStorageMgrHandlers() returns the currently configured f_smgr table for
> > the supplied smgr_which. It allows extensions to know the current
> > configuration of the storage manager, even if another extension has
> > already modified it.
> > SetStorageMgrHandlers() assigns the supplied 'smgr_handlers' in place of
> > the current one.
> > If an extension wants to intercept 'smgr_write', it replaces 'smgr_write'
> > with its own function, then calls the original function, likely mdwrite,
> > from the replacement function.
> >
> > In this case, call chain shall be:
> >
> > FlushBuffer, and others...
> >  +-- smgrwrite(...)
> >       +-- (extension's own function)
> >            +-- mdwrite
>
> ISTR someone (Robert Haas?) complaining that this method of hooks is
> cumbersome to use and can be fragile if multiple hooks are being
> installed. So maybe we don't want to extend its usage...
>
> I'm also not sure whether this is better done with an smgr hook or a
> hook into shared buffer handling...
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com
