Re: Declarative partitioning - another take

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Declarative partitioning - another take
Date: 2016-11-30 15:54:56
Message-ID: CA+TgmoYuPJUpnO21XLyYCbomk+6NtJrTWEqHYmbToe9aAwgBww@mail.gmail.com
Lists: pgsql-hackers

On Tue, Nov 29, 2016 at 6:24 AM, Amit Langote
<Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> # All times in seconds (on my modestly-powerful development VM)
> #
> # nrows = 10,000,000 generated using:
> #
> #   INSERT INTO $tab
> #   SELECT '$last'::date - ((s.id % $maxsecs + 1)::bigint || 's')::interval,
> #          (random() * 5000)::int % 4999 + 1,
> #          case s.id % 10
> #            when 0 then 'a'
> #            when 1 then 'b'
> #            when 2 then 'c'
> #            ...
> #            when 9 then 'j'
> #          end
> #   FROM generate_series(1, $nrows) s(id)
> #   ORDER BY random();
> #
> # The first item in the select list is basically a date that won't fall
> # outside the defined partitions.
>
> Time for a plain table = 98.1 sec
>
> #part   parted   tg-direct-map   tg-if-else
> =====   ======   =============   ==========
>    10    114.3          1483.3        742.4
>    50    112.5          1476.6       2016.8
>   100    117.1          1498.4       5386.1
>   500    125.3          1475.5           --
>  1000    129.9          1474.4           --
>  5000    137.5          1491.4           --
> 10000    154.7          1480.9           --

Very nice!

Obviously, it would be nice if the overhead were even lower, but it's
clearly a vast improvement over what we have today.
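
To put a rough number on that improvement, here is a minimal sketch that
divides each trigger-based load time by the corresponding "parted" time from
the quoted table. The names (LOAD_TIMES, speedup, format_speedup) are made up
for illustration; only the timings come from the benchmark above.

```python
# Load times in seconds for 10,000,000 rows, copied from the quoted table.
# Each entry: number of partitions -> (parted, tg-direct-map, tg-if-else);
# None marks a run that was not reported ("--" in the table).
LOAD_TIMES = {
    10: (114.3, 1483.3, 742.4),
    50: (112.5, 1476.6, 2016.8),
    100: (117.1, 1498.4, 5386.1),
    500: (125.3, 1475.5, None),
    1000: (129.9, 1474.4, None),
    5000: (137.5, 1491.4, None),
    10000: (154.7, 1480.9, None),
}


def speedup(parted_secs, trigger_secs):
    """Return trigger time divided by parted time, or None if not reported."""
    if trigger_secs is None:
        return None
    return trigger_secs / parted_secs


def format_speedup(value):
    """Render a speedup factor, keeping "--" for runs that were not reported."""
    return "--" if value is None else f"{value:.1f}x"


if __name__ == "__main__":
    print(f"{'#part':>6} {'vs direct-map':>14} {'vs if-else':>11}")
    for nparts, (parted, direct_map, if_else) in LOAD_TIMES.items():
        print(
            f"{nparts:>6} "
            f"{format_speedup(speedup(parted, direct_map)):>14} "
            f"{format_speedup(speedup(parted, if_else)):>11}"
        )
```

The missing tg-if-else runs stay as "--" rather than being extrapolated.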

> Regarding tuple-mapping-required vs no-tuple-mapping-required, all cases
> currently require tuple-mapping, because the decision is based on the
> result of comparing parent and partition TupleDesc using
> equalTupleDescs(), which fails right away because the TupleDesc.tdtypeid values are
> not the same. Anyway, I simply commented out the tuple-mapping statement
> in ExecInsert() to observe just slightly improved numbers as follows
> (comparing with numbers in the table just above):
>
> #part   (sub-)parted
> =====   =================
>    10   113.9 (vs. 127.0)
>   100   135.7 (vs. 156.6)
>   500   182.1 (vs. 191.8)

I think you should definitely try to get that additional speedup when
you can. It doesn't seem like a lot when you think of how much is
already being saved, but a healthy number of users are going to
compare it to the performance on an unpartitioned table rather than to
our historical performance. 127/98.1 = 1.29, but 113.9/98.1 = 1.16
-- and obviously a 16% overhead from partitioning is way better than a
29% overhead, even if the old overhead was a million percent.
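
The arithmetic above can be sketched as follows, assuming the 98.1 second
plain-table time is the baseline. The helper name overhead_pct is
hypothetical and exists only for this illustration.

```python
PLAIN_SECS = 98.1  # plain-table load time from the quoted benchmark


def overhead_pct(partitioned_secs, plain_secs=PLAIN_SECS):
    """Return the extra load time relative to the plain table, in percent."""
    return (partitioned_secs / plain_secs - 1.0) * 100.0


if __name__ == "__main__":
    # 10 partitions: with tuple-mapping (127.0 s) and with it commented out (113.9 s)
    for label, secs in (("with tuple-mapping", 127.0), ("without tuple-mapping", 113.9)):
        print(f"{label}: {overhead_pct(secs):.0f}% overhead")
```

Measuring against the unpartitioned table, rather than against the old
trigger-based numbers, is the comparison users are likely to make.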

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
