Re: inheritance: planning time vs children number vs column number

From: Marc Cousin <cousinmarc(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: inheritance: planning time vs children number vs column number
Date: 2011-02-28 18:47:46
Message-ID: 201102281947.46590.cousinmarc@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

The Monday 28 February 2011 16:35:37, Tom Lane wrote :
> Marc Cousin <cousinmarc(at)gmail(dot)com> writes:
> > The Monday 28 February 2011 13:57:45, Heikki Linnakangas wrote :
> >> Testing here with a table with 1000 columns and 100 partitions, about
> >> 80% of the planning time is looking up the statistics on attribute
> >> width, to calculate average tuple width. I don't see O(n^2) behavior,
> >> though, it seems linear.
> >
> > It is only based on experimentation, for my part, of course

> >
> > If you measure the planning time, modifying either the columns or the
> > partitions number, the square root of the planning time is almost
> > perfectly proportional with the parameter you're playing with.
>
> Could we see a concrete example demonstrating that? I agree with Heikki
> that it's not obvious what you are testing that would have such behavior.
> I can think of places that would have O(N^2) behavior in the length of
> the targetlist, but it seems unlikely that they'd come to dominate
> runtime at a mere 1000 columns.
>
> regards, tom lane

I feel a little silly not having provided a test case from the start…

A script doing a complete test is attached to this email.

It's doing a simple

CREATE TABLE test_father (col0 int,col1 int,col2 int,col3 int,col4 int,col5
int,col6 int,col7 int,col8 int,col9 int,col10 in
t,col11 int,col12 int,col13 int,col14 int,col15 int,col16 int,col17 int,col18
int,col19 int,col20 int,col21 int,col22 int,co
l23 int,…)

Followed by 600
CREATE TABLE test_child_0 (CHECK (col0=0)) INHERITS (test_father);

And a single

SELECT col0 FROM test_father WHERE col0=0;

Here are my results (from the same machine). I've done it with 600 partitions,
to have big planning times. If you need a smaller one (this one takes nearly
ten minutes to run) tell me.

COLS:100 PARTITIONS:600
Time : 513,764 ms (sqrt : 22.6)
COLS:200 PARTITIONS:600
Time : 906,214 ms (sqrt : 30.1)
COLS:300 PARTITIONS:600
Time : 2255,390 ms (sqrt : 47.48)
COLS:400 PARTITIONS:600
Time : 4816,820 ms (sqrt : 69.4)
COLS:500 PARTITIONS:600
Time : 5736,602 ms (sqrt : 75.73)
COLS:600 PARTITIONS:600
Time : 7659,617 ms (sqrt : 87.51)
COLS:700 PARTITIONS:600
Time : 9313,260 ms (sqrt : 96.5)
COLS:800 PARTITIONS:600
Time : 13700,353 ms (sqrt : 117.04)
COLS:900 PARTITIONS:600
Time : 13914,765 ms (sqrt : 117.95)
COLS:1000 PARTITIONS:600
Time : 20335,750 ms (sqrt : 142.6)
COLS:1100 PARTITIONS:600
Time : 21048,958 ms (sqrt : 145.08)
COLS:1200 PARTITIONS:600
Time : 27619,559 ms (sqrt : 166.18)
COLS:1300 PARTITIONS:600
Time : 31357,353 ms (sqrt : 177.08)
COLS:1400 PARTITIONS:600
Time : 34435,711 ms (sqrt : 185.57)
COLS:1500 PARTITIONS:600
Time : 38954,676 ms (sqrt : 197.37)

As for my previous results, these ones are on a machine doing a bit of other
work, so some values may be a bit offset, and it's only one measure each time
anyway.

The CSV file I sent from the first email is obtained running the exact same
commands, but playing on both columns and partitions, and averaged over 3
measures.

Regards.

Attachment Content-Type Size
script.sql.bz2 application/x-bzip 10.0 KB

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2011-02-28 18:57:36 Re: Query on view radically slower than query on underlying table
Previous Message Josh Berkus 2011-02-28 18:35:42 Re: Load and Stress on PostgreSQL 9.0