Re: plan time of MASSIVE partitioning ...

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Hans-Jürgen Schönig <postgres(at)cybertec(dot)at>, pgsql-hackers Hackers <pgsql-hackers(at)postgresql(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject:	Re: plan time of MASSIVE partitioning ...
Date:	2010-10-29 17:44:14
Message-ID:	4411.1288374254@sss.pgh.pa.us
Lists:	pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Fri, Oct 29, 2010 at 12:53 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> However, if the hot spot does stay in SearchCatCache, I can't help
>> noticing that those bucket chains are looking a bit overloaded ---
>> sixty-plus entries per bucket ain't good. Maybe it's time to teach
>> catcache.c how to reorganize its hashtables once the load factor
>> exceeds a certain level. Or more drastically, maybe it should lose
>> its private hashtable logic and use dynahash.c; I'm not sure at the
>> moment if the private implementation has any important characteristics
>> dynahash hasn't got.
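
As an illustration of the "reorganize once the load factor exceeds a certain level" idea quoted above, here is a minimal standalone sketch of a chained hash table that doubles its bucket array when the average chain length passes a threshold. Everything in it is hypothetical: the names (Table, Entry, table_grow, MAX_LOAD_FACTOR), the integer key, the mixing function, and the threshold of 2 are assumptions made for the example, and none of it is taken from catcache.c or dynahash.c.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sketch; not the catcache.c or dynahash.c implementation. */
typedef struct Entry {
    struct Entry *next;   /* next entry in the same bucket chain */
    uint32_t      hash;   /* cached hash, so growing never rehashes keys */
    int           key;
    int           value;
} Entry;

typedef struct {
    Entry **buckets;
    size_t  nbuckets;     /* must be a power of two */
    size_t  nentries;
} Table;

/* Assumed threshold: grow when average entries per bucket would exceed this. */
#define MAX_LOAD_FACTOR 2

/* Simple integer mixing function, chosen only for the example. */
static uint32_t hash_int(int key)
{
    uint32_t h = (uint32_t) key;
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

static Table *table_create(size_t nbuckets)
{
    Table *t = malloc(sizeof(Table));
    assert(t != NULL);
    t->buckets = calloc(nbuckets, sizeof(Entry *));
    assert(t->buckets != NULL);
    t->nbuckets = nbuckets;
    t->nentries = 0;
    return t;
}

/* Double the bucket array and relink every entry using its cached hash. */
static void table_grow(Table *t)
{
    size_t  newn = t->nbuckets * 2;
    Entry **newb = calloc(newn, sizeof(Entry *));
    assert(newb != NULL);
    for (size_t i = 0; i < t->nbuckets; i++) {
        Entry *e = t->buckets[i];
        while (e != NULL) {
            Entry *next = e->next;
            size_t idx = e->hash & (newn - 1);
            e->next = newb[idx];
            newb[idx] = e;
            e = next;
        }
    }
    free(t->buckets);
    t->buckets = newb;
    t->nbuckets = newn;
}

static Entry *table_lookup(const Table *t, int key)
{
    uint32_t h = hash_int(key);
    for (Entry *e = t->buckets[h & (t->nbuckets - 1)]; e != NULL; e = e->next)
        if (e->hash == h && e->key == key)
            return e;
    return NULL;
}

static void table_insert(Table *t, int key, int value)
{
    Entry *e = table_lookup(t, key);
    if (e != NULL) {                 /* existing key: just update it */
        e->value = value;
        return;
    }
    /* Reorganize before the load factor would exceed the threshold. */
    if (t->nentries + 1 > t->nbuckets * MAX_LOAD_FACTOR)
        table_grow(t);
    e = malloc(sizeof(Entry));
    assert(e != NULL);
    e->hash = hash_int(key);
    e->key = key;
    e->value = value;
    size_t idx = e->hash & (t->nbuckets - 1);
    e->next = t->buckets[idx];
    t->buckets[idx] = e;
    t->nentries++;
}
```

Caching the hash in each entry keeps the grow step to a pointer relink per entry. Whether a real cache could afford the one-time cost of that relink, or would want to spread it across several operations, is a separate design question this sketch does not address.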

> I'm not sure what's happening in this particular case, but I seem to
> remember poking at a case a while back where we were doing a lot of
> repeated statistics lookups for the same columns. If that's also the
> case here and if there is some way to avoid it (hang a pointer to
> the stats off the node tree somewhere?) we might be able to cut down
> on the number of hash probes, as an alternative to or in addition to
> making them faster.

I think there are already layers of caching in the planner to avoid
fetching the same stats entries more than once per query. The problem
here is that there are so many child tables that even fetching stats
once per table per query starts to add up. (Also, as I said, I'm
worried that we're being misled by the fact that there are no stats to
fetch --- so we're not seeing the costs of actually doing something with
the stats if they existed.)

regards, tom lane
