Re: Trouble with plan statistics for behaviour for query.

From: Craig James <cjames(at)emolecules(dot)com>
To: Trevor Campbell <tcampbell(at)atlassian(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Trouble with plan statistics for behaviour for query.
Date: 2012-06-01 00:08:49
Message-ID: CAFwQ8rdiCgdqjjgLNKirHTuZP_bWX_+aXrkqXyWPqg1+-0BNgQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-performance

On Thu, May 31, 2012 at 4:34 PM, Trevor Campbell <tcampbell(at)atlassian(dot)com>wrote:

> Thanks Craig, that certainly leads down the right path.
>
> The following is all done in pgAdmin3:
>
> Using an actual value we I get the plan I expect
> explain analyze select CG.ID, CG.ISSUEID, CG.AUTHOR, CG.CREATED, CI.ID,
> CI.FIELDTYPE, CI.FIELD, CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, CI.NEWSTRING
> from PUBLIC.CHANGEGROUP CG inner join PUBLIC.CHANGEITEM CI on CG.ID =
> CI.GROUPID where CG.ISSUEID=10006 order by CG.CREATED asc, CI.ID asc
>
> "Sort (cost=106.18..106.22 rows=13 width=434) (actual time=0.115..0.115
> rows=12 loops=1)"
> " Sort Key: cg.created, ci.id"
> " Sort Method: quicksort Memory: 29kB"
> " -> Nested Loop (cost=0.00..105.94 rows=13 width=434) (actual
> time=0.019..0.067 rows=12 loops=1)"
> " -> Index Scan using chggroup_issue on changegroup cg
> (cost=0.00..19.73 rows=10 width=29) (actual time=0.009..0.013 rows=10
> loops=1)"
> " Index Cond: (issueid = 10006::numeric)"
> " -> Index Scan using chgitem_chggrp on changeitem ci
> (cost=0.00..8.58 rows=3 width=411) (actual time=0.004..0.005 rows=1
> loops=10)"
> " Index Cond: (groupid = cg.id)"
> "Total runtime: 0.153 ms"
>
> Using a prepared statement with a variable , I get a poor plan requiring a
> sequential scan
> prepare t2(real) as
> select CG.ID, CG.ISSUEID, CG.AUTHOR, CG.CREATED, CI.ID, CI.FIELDTYPE,
> CI.FIELD, CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, CI.NEWSTRING
> from PUBLIC.CHANGEGROUP CG inner join PUBLIC.CHANGEITEM CI on CG.ID =
> CI.GROUPID where CG.ISSUEID=$1 order by CG.CREATED asc, CI.ID asc;
>
> explain analyze execute t2 (10006);
>
> "Sort (cost=126448.89..126481.10 rows=12886 width=434) (actual
> time=1335.615..1335.616 rows=12 loops=1)"
> " Sort Key: cg.created, ci.id"
> " Sort Method: quicksort Memory: 29kB"
> " -> Nested Loop (cost=0.00..125569.19 rows=12886 width=434) (actual
> time=0.046..1335.556 rows=12 loops=1)"
> " -> Seq Scan on changegroup cg (cost=0.00..44709.26 rows=10001
> width=29) (actual time=0.026..1335.460 rows=10 loops=1)"
> " Filter: ((issueid)::double precision = $1)"
> " -> Index Scan using chgitem_chggrp on changeitem ci
> (cost=0.00..8.05 rows=3 width=411) (actual time=0.007..0.008 rows=1
> loops=10)"
> " Index Cond: (groupid = cg.id)"
> "Total runtime: 1335.669 ms"
>
> Using a prepared statement with a cast of the variable to the right type,
> I get the good plan back
> prepare t2(real) as
> select CG.ID, CG.ISSUEID, CG.AUTHOR, CG.CREATED, CI.ID, CI.FIELDTYPE,
> CI.FIELD, CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, CI.NEWSTRING
> from PUBLIC.CHANGEGROUP CG inner join PUBLIC.CHANGEITEM CI on CG.ID =
> CI.GROUPID where CG.ISSUEID=cast($1 as numeric) order by CG.CREATED asc,
> CI.ID asc;
>
> explain analyze execute t2 (10006);
>
> "Sort (cost=106.19..106.22 rows=13 width=434) (actual time=0.155..0.156
> rows=12 loops=1)"
> " Sort Key: cg.created, ci.id"
> " Sort Method: quicksort Memory: 29kB"
> " -> Nested Loop (cost=0.00..105.95 rows=13 width=434) (actual
> time=0.048..0.111 rows=12 loops=1)"
> " -> Index Scan using chggroup_issue on changegroup cg
> (cost=0.00..19.73 rows=10 width=29) (actual time=0.031..0.042 rows=10
> loops=1)"
> " Index Cond: (issueid = ($1)::numeric)"
> " -> Index Scan using chgitem_chggrp on changeitem ci
> (cost=0.00..8.58 rows=3 width=411) (actual time=0.006..0.006 rows=1
> loops=10)"
> " Index Cond: (groupid = cg.id)"
> "Total runtime: 0.203 ms"
>
> Now the challenge is to get java/jdbc to get this done right. We make a
> big effort to ensure we always use prepared statements and variable
> bindings to help protect from SQL injection vulnerabilities.
>

JDBC has some features that are supposed to be convenient (automatic
preparing based on a number-of-executions threshold) that strike me as
misguided. It's one thing to hide irrelevant details from the app, and
another thing entirely to cause a huge change in the exact SQL that's sent
to the server ... which is what JDBC seems to do.

I think the trick is that if you use JDBC prepared statements, you have to
understand how it's trying to be trickly and circumvent it so that you're
always in full control of what it's doing.

Craig

>
>
>
> On 01/06/12 09:08, Craig James wrote:
>
>> I use Perl, not JDBC, but this thread may be relevant to your problem.
>>
>> http://postgresql.1045698.n5.**nabble.com/Slow-statement-**
>> when-using-JDBC-td3368379.html<http://postgresql.1045698.n5.nabble.com/Slow-statement-when-using-JDBC-td3368379.html>
>>
>>
>>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Joshua Tolley 2012-06-01 01:30:47 Re: Procedural Languages
Previous Message Trevor Campbell 2012-05-31 23:34:24 Re: Trouble with plan statistics for behaviour for query.

Browse pgsql-performance by date

  From Date Subject
Next Message Vitalii Tymchyshyn 2012-06-01 09:06:51 Re: Trouble with plan statistics for behaviour for query.
Previous Message Trevor Campbell 2012-05-31 23:34:24 Re: Trouble with plan statistics for behaviour for query.