Re: pg_stat_*_columns?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Joel Jacobson <joel(at)trustly(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_*_columns?
Date: 2015-06-20 15:32:55
Message-ID: 45012.1434814375@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Magnus Hagander <magnus(at)hagander(dot)net> writes:
> On Sat, Jun 20, 2015 at 10:55 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I dunno that tweaking the format would accomplish much. Where I'd love
>> to get to is to not have to write the data to disk at all (except at
>> shutdown). But that seems to require an adjustable-size shared memory
>> block, and I'm not sure how to do that. One idea, if the DSM stuff
>> could be used, is to allow the stats collector to allocate multiple
>> DSM blocks as needed --- but how well would that work on 32-bit
>> machines? I'd be worried about running out of address space.

> I've considered both that and to perhaps use a shared memory message queue
> to communicate. Basically, have a backend send a request when it needs a
> snapshot of the stats data and get a copy back through that method instead
> of disk. It would be much easier if we didn't actually take a snapshot of
> the data per transaction, but we really don't want to give that up (if we
> didn't care about that, we could just have a protocol asking for individual
> values).

Yeah, that might work quite nicely, and it would not require nearly as
much surgery on the existing code as mapping the stuff into
constrained-size shmem blocks would do. The point about needing a data
snapshot is a good one as well; I'm not sure how we'd preserve that
behavior if backends are accessing the collector's data structures
directly through shmem.

I wonder if we should think about replacing the IP-socket-based data
transmission protocol with a shared memory queue, as well.

> We'd need a way to actually transfer the whole hashtables over, without
> rebuilding them on the other end I think. Just the cost of looping over it
> to dump and then rehashing everything on the other end seems quite wasteful
> and unnecessary.

Meh. All of a sudden you've made it complicated and invasive again,
to get rid of a bottleneck that's not been shown to be a problem.
Let's do the simple thing first, else maybe nothing will happen at all.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2015-06-20 15:36:11 Re: pretty bad n_distinct estimate, causing HashAgg OOM on TPC-H
Previous Message Feng Tian 2015-06-20 15:29:33 Re: pretty bad n_distinct estimate, causing HashAgg OOM on TPC-H