Re: pg_stat_*_columns?

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Joel Jacobson <joel(at)trustly(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_*_columns?
Date: 2015-06-20 15:15:11
Message-ID: CABUevEz6sjx254hPZV_AX=sxodtoeVZHmE4CjuCMunTFgcrv=g@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jun 20, 2015 at 10:55 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > On Sat, Jun 20, 2015 at 10:12 AM, Joel Jacobson <joel(at)trustly(dot)com>
> wrote:
> >> I guess it
> >> primarily depends on how much of the new code would need to be
> >> rewritten, if the collector is optimized/rewritten in the future?
>
> > I don't think that's really the issue. It's more that I think the
> > extra data would be likely to cause real pain for users.
>
> Yes. The stats data already causes real pain.
>
> > FWIW, I tend to think that the solution here has less to do with
> > splitting the data up into more files and more to do with rethinking
> > the format.
>
> I dunno that tweaking the format would accomplish much. Where I'd love
> to get to is to not have to write the data to disk at all (except at
> shutdown). But that seems to require an adjustable-size shared memory
> block, and I'm not sure how to do that. One idea, if the DSM stuff
> could be used, is to allow the stats collector to allocate multiple
> DSM blocks as needed --- but how well would that work on 32-bit
> machines? I'd be worried about running out of address space.
>
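
For concreteness, something along these lines is what I imagine the
multi-segment approach would look like with the DSM API we have today.
This is only a sketch; the function and variable names are made up, as is
the fall-back behaviour:

#include "postgres.h"
#include "storage/dsm.h"

/* Everything named here is made up for illustration. */
static dsm_segment *stats_extra_seg = NULL;

static void *
stats_collector_alloc_segment(Size nbytes)
{
    /*
     * DSM_CREATE_NULL_IF_MAXSEGMENTS makes dsm_create() return NULL
     * instead of erroring out when no more segment slots are available.
     */
    stats_extra_seg = dsm_create(nbytes, DSM_CREATE_NULL_IF_MAXSEGMENTS);
    if (stats_extra_seg == NULL)
        return NULL;            /* caller would fall back to spilling */

    /* keep the mapping for the lifetime of the process */
    dsm_pin_mapping(stats_extra_seg);

    /*
     * Note that every mapped segment costs process address space, which
     * is exactly the 32-bit worry above.
     */
    return dsm_segment_address(stats_extra_seg);
}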

I've considered both that and perhaps using a shared memory message queue
to communicate. Basically, have a backend send a request when it needs a
snapshot of the stats data and get a copy back through that queue instead
of from disk. It would be much easier if we didn't actually take a snapshot
of the data per transaction, but we really don't want to give that up (if
we didn't care about that, we could just have a protocol for asking for
individual values).
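
Roughly what I have in mind, as a sketch only: pgstat_send_snapshot_request()
and load_stats_snapshot() don't exist, they just stand in for however the
request and the deserialization would actually be done:

#include "postgres.h"
#include "storage/dsm.h"
#include "storage/proc.h"
#include "storage/shm_mq.h"

#define STATS_MQ_SIZE   (64 * 1024)     /* made-up queue size */

static void
request_stats_snapshot(Oid databaseid)
{
    dsm_segment *seg;
    shm_mq     *mq;
    shm_mq_handle *mqh;
    Size        len;
    void       *data;

    /* one segment holding a single collector -> backend response queue */
    seg = dsm_create(STATS_MQ_SIZE, 0);
    mq = shm_mq_create(dsm_segment_address(seg), STATS_MQ_SIZE);
    shm_mq_set_receiver(mq, MyProc);
    mqh = shm_mq_attach(mq, seg, NULL);

    /* tell the collector what we want and where to send it (made up) */
    pgstat_send_snapshot_request(databaseid, dsm_segment_handle(seg));

    /* block until the collector has pushed the whole snapshot */
    if (shm_mq_receive(mqh, &len, &data, false) == SHM_MQ_SUCCESS)
        load_stats_snapshot(data, len);     /* made-up deserializer */

    dsm_detach(seg);
}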

We'd need a way to actually transfer the whole hashtables over without
rebuilding them on the other end, I think. Just the cost of looping over
them to dump and then rehashing everything on the other end seems quite
wasteful and unnecessary.
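
To illustrate the double work: the dump side alone would look something
like this (a sketch, assuming the per-table hash with PgStat_StatTabEntry
entries and a destination array sized by the caller), and the receiving
backend would then have to hash_search(HASH_ENTER) every one of those
entries again to rebuild its own table:

#include "postgres.h"
#include "pgstat.h"
#include "utils/hsearch.h"

static Size
dump_table_stats(HTAB *tabhash, PgStat_StatTabEntry *dest)
{
    HASH_SEQ_STATUS hstat;
    PgStat_StatTabEntry *entry;
    Size        n = 0;

    hash_seq_init(&hstat, tabhash);
    while ((entry = (PgStat_StatTabEntry *) hash_seq_search(&hstat)) != NULL)
        dest[n++] = *entry;     /* copy each entry out of the hash table */

    return n;       /* caller sized dest using hash_get_num_entries() */
}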

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
