Re: pg_stat_*_columns?

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Joel Jacobson <joel(at)trustly(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_*_columns?
Date: 2015-06-21 15:43:34
Message-ID: CABUevEwV=_zc8Zfrn9UQR13VDPtVNSHaA3sGNmvJR7MhE1MqYA@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jun 20, 2015 at 11:55 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Sat, Jun 20, 2015 at 7:01 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> But if the structure
> >> got too big to map (on a 32-bit system), then you'd be sort of hosed,
> >> because there's no way to attach just part of it. That might not be
> >> worth worrying about, but it depends on how big it's likely to get - a
> >> 32-bit system is very likely to choke on a 1GB mapping, and maybe even
> >> on a much smaller one.
> >
> > Yeah, I'm quite worried about assuming that we can map a data structure
> > that might be of very significant size into shared memory on 32-bit
> > machines. The address space just isn't there.
>
> Considering the advantages of avoiding message queues, I think we
> should think a little bit harder about whether we can't find some way
> to skin this cat. As I think about this a little more, I'm not sure
> there's really a problem with one stats DSM per database. Sure, the
> system might have 100,000 databases in some crazy pathological case,
> but the maximum number of those that can be in use is bounded by
> max_connections, which means the maximum number of stats file DSMs we
> could ever need at one time is also bounded by max_connections. There
> are a few corner cases to think about, like if the user writes a
> client that connects to all 100,000 databases in very quick
> succession, we've got to jettison the old DSMs fast enough to make
> room for the new DSMs before we run out of slots, but that doesn't
> seem like a particularly tough nut to crack. If the stats collector
> ensures that it never attaches to more than MaxBackends stats DSMs at
> a time, and each backend ensures that it never attaches to more than
> one stats DSM at a time, then 2 * MaxBackends stats DSMs is always
> enough. And that's just a matter of bumping
> PG_DYNSHMEM_SLOTS_PER_BACKEND from 2 to 4.
>
> In more realistic cases, it will probably be normal for many or all
> backends to be connected to the same database, and the number of stats
> DSMs required will be far smaller.
>
>

What about a combination along the lines of this: the stats
collector keeps the statistics in local memory as before. But when a
backend needs to get a snapshot of its data, it uses a shared memory queue
to request it. The stats collector then allocates a new
DSM, copies the data into that DSM, and hands the DSM over to the backend. At
this point the stats collector can forget about it, and it's up to the
backend to get rid of it when it's done with it.

That means the address space thing should not be any worse than today,
because each backend will still only see "its own data". And we only need
to copy the data for databases that are actually used.
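
Roughly, the handoff could look like the toy model below. This is only a
sketch in Python, not PostgreSQL code: StatsCollector, Backend, dsm_create
and dsm_destroy are names made up for illustration, a dict stands in for
the DSM segments, a deque stands in for the shared memory queue, and the
collector is run inline instead of as a separate process.

```python
import copy
from collections import deque

# Stand-in for the system-wide set of DSM segments: handle -> contents.
# Real DSM segments hold raw bytes, not Python dicts (assumption for brevity).
SEGMENTS = {}
_next_handle = 0


def dsm_create(contents):
    """Allocate a new segment holding a private copy of contents."""
    global _next_handle
    _next_handle += 1
    SEGMENTS[_next_handle] = copy.deepcopy(contents)
    return _next_handle


def dsm_destroy(handle):
    """Release a segment; after the handoff this is the backend's job."""
    del SEGMENTS[handle]


class StatsCollector:
    def __init__(self):
        self.stats = {}          # local memory: database -> table -> counters
        self.requests = deque()  # stand-in for the shared memory queue

    def count(self, database, table, counter, n=1):
        counters = self.stats.setdefault(database, {}).setdefault(table, {})
        counters[counter] = counters.get(counter, 0) + n

    def serve_one_request(self):
        database, reply = self.requests.popleft()
        # Copy only the requested database, hand over the handle, and keep
        # no reference to the segment afterwards.
        handle = dsm_create(self.stats.get(database, {}))
        reply.append(handle)


class Backend:
    def __init__(self, database, collector):
        self.database = database
        self.collector = collector
        self.replies = deque()
        self.handle = None

    def take_snapshot(self):
        self.collector.requests.append((self.database, self.replies))
        # A real collector would pick this up on its own; here we call it.
        self.collector.serve_one_request()
        self.handle = self.replies.popleft()
        return SEGMENTS[self.handle]

    def release_snapshot(self):
        dsm_destroy(self.handle)
        self.handle = None


collector = StatsCollector()
collector.count("app", "orders", "seq_scan", 3)
collector.count("other", "users", "seq_scan", 7)

backend = Backend("app", collector)
snapshot = backend.take_snapshot()
collector.count("app", "orders", "seq_scan")  # activity after the snapshot
print(snapshot)       # only the "app" database, frozen at snapshot time
backend.release_snapshot()
print(len(SEGMENTS))  # nothing is left allocated once the backend is done
```

The point of the model is the ownership transfer: the collector never holds
on to a segment it has handed out, so the number of live segments is driven
by how many backends currently hold a snapshot, not by how many databases
exist.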

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
