Re: Cache relation sizes?

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: k(dot)jamison(at)jp(dot)fujitsu(dot)com
Cc: andres(at)anarazel(dot)de, tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com, thomas(dot)munro(at)enterprisedb(dot)com, ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com, david(dot)rowley(at)2ndquadrant(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Cache relation sizes?
Date: 2019-02-14 11:41:04
Message-ID: 20190214.204104.41690378.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

At Wed, 13 Feb 2019 05:48:28 +0000, "Jamison, Kirk" <k(dot)jamison(at)jp(dot)fujitsu(dot)com> wrote in <D09B13F772D2274BB348A310EE3027C643880D(at)g01jpexmbkw24>
> On February 6, 2019, 8:57 AM +0000, Andres Freund wrote:
> > Maybe I'm missing something here, but why is it actually necessary to
> > have the sizes in shared memory, if we're just talking about caching
> > sizes? It's pretty darn cheap to determine the filesize of a file that
> > has been recently stat()/lseek()/ed, and introducing per-file shared
> > data adds *substantial* complexity, because the amount of shared memory
> > needs to be pre-determined. The reason I want to put per-relation data
> > into shared memory is different, it's about putting the buffer mapping
> > into shared memory, and that, as a prerequisite, also needs per-relation
> > data. And there's a limit of the number of relations that need to be
> > open (one per cached page at max), and space can be freed by evicting
> > pages.
>
> Ahh.. You are right about the logic of putting it in the shared memory.
> As for Thomas' toy patch, multiple files share one counter in shmem.
> Although it currently works, it is likely to miss.
> His eventual plan, though, is to use an array of N counters
> and map relation OIDs onto them.
> But as your point about complexity says, in shared memory we cannot
> share the same area with multiple files, so that needs an area to
> allocate depending on the number of files.
>
> Regarding the allocation of per-relation data in shared memory, I
> thought it can be a separated component at first so I asked for
> validity of the idea. But now I consider the point raised.

I still believe that one shared memory element for every
non-mapped relation is not only too complex but also too much, as
Andres (and, implicitly, I) wrote. I feel that just one flag for
all relations works fine, but partitioned flags (that is,
relations or files that correspond to the same hash value share
one flag) can reduce the shared memory elements to a fixed, small
number.
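
To make the partitioned-flags idea concrete, here is a minimal
sketch in C. It is not taken from any patch in this thread. The
names (NUM_SIZE_FLAGS, RelFileKey, CachedSize and the functions)
are hypothetical, a plain static array stands in for shared
memory, each "flag" is modeled as a change counter, and the
atomics or locking that a real implementation would need are
omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: fixed number of shared flags; must be a power of two. */
#define NUM_SIZE_FLAGS 64

/* Hypothetical key identifying a relation file. */
typedef struct RelFileKey
{
    uint32_t db_oid;
    uint32_t rel_oid;
} RelFileKey;

/* Stand-in for the fixed-size shared memory area (no atomics here). */
static uint32_t size_change_counter[NUM_SIZE_FLAGS];

/* Simple integer mixing hash; any reasonable hash would do. */
static uint32_t
hash_key(RelFileKey key)
{
    uint32_t h = key.db_oid * 0x9E3779B1u ^ key.rel_oid;

    h ^= h >> 16;
    h *= 0x85EBCA6Bu;
    h ^= h >> 13;
    return h;
}

/* All relations hashing to the same value share one flag. */
static unsigned
flag_slot(RelFileKey key)
{
    unsigned slot = hash_key(key) & (NUM_SIZE_FLAGS - 1);

    assert(slot < NUM_SIZE_FLAGS);
    return slot;
}

/* Called by whoever extends or truncates the relation. */
static void
note_size_change(RelFileKey key)
{
    size_change_counter[flag_slot(key)]++;
}

/* Backend-local cache entry for one relation's size. */
typedef struct CachedSize
{
    RelFileKey key;
    uint32_t   seen_counter;   /* counter value when the size was cached */
    uint64_t   nblocks;
    bool       valid;
} CachedSize;

/* Remember a size together with the current value of the shared flag. */
static void
cache_size(CachedSize *c, RelFileKey key, uint64_t nblocks)
{
    c->key = key;
    c->nblocks = nblocks;
    c->seen_counter = size_change_counter[flag_slot(key)];
    c->valid = true;
}

/* False means the caller must fall back to lseek() and re-cache. */
static bool
cached_size_is_fresh(const CachedSize *c)
{
    return c->valid &&
        c->seen_counter == size_change_counter[flag_slot(c->key)];
}
```

With NUM_SIZE_FLAGS set to 1 this degenerates into the single
flag for all relations. A larger value only reduces how often an
unrelated relation's change invalidates a cached size, while the
shared memory footprint stays fixed.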

Note: I'm still not sure how much lseek impacts performance.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
