Re: Prepared statements and generic plans

From: "'bruce(at)momjian(dot)us'" <bruce(at)momjian(dot)us>
To: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Prepared statements and generic plans
Date: 2016-06-10 23:46:40
Message-ID: 20160610234640.GA27837@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jun 7, 2016 at 06:52:15AM +0000, Albe Laurenz wrote:
> Bruce Momjian wrote:
> >> ! distinct column values, a generic plan assumes a column equality
> >> ! comparison will match 33% of processed rows. Column statistics
> >>
> >> ... assumes *that* a column equality comparison will match 33% of *the* processed rows.
> >
> > Uh, that seems overly wordy. I think the rule is that if the sentence
> > makes sense without the words, you should not use them, but it is
> > clearly a judgement call in this case. Do you agree?
>
> My gut feeling is that at least the "the" should be retained, but mine
> are the guts of a German speaker.
> It is clearly a judgement call, so follow your instincts.

I think "that/the" would make sense if this sentence was referencing a
specific result. The sentence is referencing a hypothetical, so I don't
think "that/the" is needed.

> > One more thing --- there was talk of moving some of this into chapter
> > 66, but as someone already mentioned, there are no subsections there
> > because it is a dedicated topic:
> >
> > 66. How the Planner Uses Statistics.
> >
> > I am not inclined to add a prepare-only section to that chapter. On the
> > other hand, the issues described apply to PREPARE and to protocol-level
> > prepare, so having it in PREPARE also seems illogical. However, I am
> > inclined to leave it in PREPARE until we are ready to move all of this
> > to chapter 66.
>
> I think it would be ok to leave it where it is in your patch; while the
> paragraph goes into technical detail, it is still alright in the general
> documentation (but only just).

I researched moving some of this text into chapter 66, but found that
only some of it related to the optimizer. I also realized that the text
applies to the libpq/wire protocol prepare cases too, so rather than
bounce readers to the PREPARE manual page, and then to chapter 66, I
just kept it all in PREPARE, with a reference from the wire protocol
prepare section.

Also, is it possible to do an EXPLAIN prepare() with the libpq/wire
protocol? I can't do PREPARE EXPLAIN, but I can do EXPLAIN EXECUTE.
However, I don't see any way to inject EXPLAIN into the libpq/wire
prepare case. Can you specify prepare(EXPLAIN SELECT)? (PREPARE
EXPLAIN SELECT throws a syntax error.)

Looking at how the code behaves, it seems custom plans that are _more_
expensive (plus planning cost) than the generic plan switch to the
generic plan after five executions, as now documented. Custom plans
that are significantly _cheaper_ than the generic plan _never_ use the
generic plan.

Here is an example --- first load this SQL:

DROP TABLE IF EXISTS test;
CREATE TABLE test (c1 INT, c2 INT);
INSERT INTO test SELECT c1, abs(floor(random() * 4)-1) FROM generate_series(1, 10000) AS a(c1);
INSERT INTO test SELECT c1, abs(floor(random() * 4)-1) FROM generate_series(10001, 15000) AS a(c1);
INSERT INTO test SELECT c1, abs(floor(random() * 4)-1) FROM generate_series(15001, 20000) AS a(c1);
-- add non-uniformly-distributed values to 'c2'
INSERT INTO test SELECT 20001, 3;
INSERT INTO test SELECT 20002, 4;
CREATE INDEX i_test_c1 ON test (c1);
CREATE INDEX i_test_c2 ON test (c2);
ANALYZE test;
PREPARE prep_c1 AS SELECT * FROM test WHERE c1 = $1;
PREPARE prep_c2 AS SELECT * FROM test WHERE c2 = $1;

prep_c1 references 'c1', which is a unique column. Any value used in
the EXECUTE, e.g. EXPLAIN EXECUTE prep_c1(1), existent or non-existent,
generates an index scan, and after five executions a generic index
scan is used.

For prep_c2, if you use the 50% common value '1', the first five
executions use a sequential scan, then the sixth is a generic Bitmap
Heap Scan. For the 25% value of '0' or '2', the first five runs
generate a Bitmap Heap Scan, and a generic Bitmap Heap Scan on the sixth
and after.

For a prep_c2 value of 3 or any non-existent value, an Index Scan is
used, and a generic plan is never chosen, because the Index Scan is
significantly cheaper than the generic plan.

Updated patch attached.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +

Attachment Content-Type Size
prepare.diff text/x-diff 5.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thom Brown 2016-06-11 00:44:23 Confusing recovery message when target not hit
Previous Message Andres Freund 2016-06-10 22:34:24 Re: Perf Benchmarking and regression.