Re: inheritance: planning time vs children number vs column number

From: Marc Cousin <cousinmarc(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: inheritance: planning time vs children number vs column number
Date: 2011-03-01 07:42:42
Message-ID: 201103010842.42128.cousinmarc@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Le mardi 01 mars 2011 07:20:19, Tom Lane a écrit :
> Marc Cousin <cousinmarc(at)gmail(dot)com> writes:
> > The Monday 28 February 2011 16:35:37, Tom Lane wrote :
> >> Could we see a concrete example demonstrating that? I agree with Heikki
> >> that it's not obvious what you are testing that would have such
> >> behavior. I can think of places that would have O(N^2) behavior in the
> >> length of the targetlist, but it seems unlikely that they'd come to
> >> dominate runtime at a mere 1000 columns.
> >
> > I feel a little silly not having provided a test case from the startق€�
> >
> > A script doing a complete test is attached to this email.
>
> I did some oprofile analysis of this test case. It's spending
> essentially all its time in SearchCatCache, on failed searches of
> pg_statistic. The cache accumulates negative entries for each probed
> column, and then the searches take time proportional to the number of
> entries, so indeed there is an O(N^2) behavior --- but N is the number
> of columns times number of tables in your test case, not just the number
> of columns.
>
> The cache is a hash table, so ideally the search time would be more or
> less constant as the table grows, but to make that happen we'd need to
> reallocate with more buckets as the table grows, and catcache.c doesn't
> do that yet. We've seen a few cases that make that look worth doing,
> but they tend to be pretty extreme, like this one.
>
> It's worth pointing out that the only reason this effect is dominating
> the runtime is that you don't have any statistics for these toy test
> tables. If you did, the cycles spent using those entries would dwarf
> the lookup costs, I think. So it's hard to get excited about doing
> anything based on this test case --- it's likely the bottleneck would be
> somewhere else entirely if you'd bothered to load up some data.
>
> regards, tom lane

Yes, for the same test case, with a bit of data in every partition and
statistics up to date, planning time goes from 20 seconds to 125ms for the 600
children/1000 columns case. Which is of course more than acceptable.

Now I've got to check it's the same problem on the real environment. I think
it has quite a few empty partitions, so no statistics for them…

Thanks a lot.

Marc

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Joby Joba 2011-03-01 08:46:44 Two different execution plans for similar requests
Previous Message Tom Lane 2011-03-01 06:20:19 Re: inheritance: planning time vs children number vs column number