Re: Millions of tables

From: Tom Lane <tgl@sss.pgh.pa.us>
To: Jeff Janes <jeff.janes@gmail.com>
Cc: Greg Spiegelberg <gspiegelberg@gmail.com>, Álvaro Hernández Tortosa <aht@8kdata.com>, pgsql-performance <pgsql-performance@postgresql.org>
Subject: Re: Millions of tables
Date: 2016-09-26 16:52:39
Message-ID: 22569.1474908759@sss.pgh.pa.us
Lists: pgsql-performance

Jeff Janes <jeff.janes@gmail.com> writes:
> A problem is that those statistics are stored in one file (per database; it
> used to be one file per cluster). With 8 million tables, that is going to
> be a pretty big file. But the code pretty much assumes the file is going
> to be pretty small, and so it has no compunction about commanding that it
> be read and written, in its entirety, quite often.

I don't know that anyone ever believed it would be small. But at the
time the pgstats code was written, there was no good alternative to
passing the data through files. (And I'm not sure we envisioned
applications that would be demanding fresh data constantly, anyway.)

Now that the DSM stuff exists and has been more or less shaken out,
I wonder how practical it'd be to use a DSM segment to make the stats
collector's data available to backends. You'd need a workaround for
the fact that not all the DSM implementations support resize (although
given the lack of callers of dsm_resize, one could be forgiven for
wondering whether any of that code has been tested at all). But you
could imagine abandoning one DSM segment and creating a new one of
double the size anytime the hash tables got too big.
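
For illustration, a minimal sketch of that abandon-and-recreate pattern
against the 9.6-era dsm.c API (StatsSharedState, stats_copy_hash, and
the publication/locking protocol here are hypothetical, not actual
stats collector code):

/*
 * Hypothetical sketch: grow the shared stats hash by creating a DSM
 * segment of double the size, rebuilding into it, and abandoning the
 * old one -- sidestepping dsm_resize entirely.
 */
#include "postgres.h"
#include "storage/dsm.h"

typedef struct StatsSharedState
{
	dsm_handle	stats_handle;	/* handle backends attach to */
	Size		stats_size;		/* current segment size */
} StatsSharedState;

/* hypothetical helper: rebuild hash contents into the new segment */
static void stats_copy_hash(void *old_base, void *new_base, Size old_size);

static dsm_segment *
stats_dsm_grow(StatsSharedState *state, dsm_segment *old_seg)
{
	Size		new_size = state->stats_size * 2;
	dsm_segment *new_seg;

	/* create a fresh segment of double the size ... */
	new_seg = dsm_create(new_size, 0);

	/* ... rebuild the hash table contents into it ... */
	stats_copy_hash(dsm_segment_address(old_seg),
					dsm_segment_address(new_seg),
					state->stats_size);

	/* ... publish the new handle so backends can re-attach lazily ... */
	state->stats_handle = dsm_segment_handle(new_seg);
	state->stats_size = new_size;

	/* ... then abandon our mapping of the old segment; it goes away
	 * once the last backend still attached to it detaches. */
	dsm_detach(old_seg);

	return new_seg;
}

Backends noticing a changed handle would just dsm_attach() the new
segment and dsm_detach() the old one at their leisure.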

regards, tom lane
