Re: Any better plan for this query?..

From: Dimitri <dimitrik(dot)fr(at)gmail(dot)com>
To: Aidan Van Dyk <aidan(at)highrise(dot)ca>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, PostgreSQL Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Any better plan for this query?..
Date: 2009-05-11 22:46:53
Message-ID: 5482c80a0905111546q6587eecaoceb86e1ed4ba44c8@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi Aidan,

thanks a lot for this detailed summary!

So, why I don't use prepare here: let's say I'm testing the worst
stress case :-) Imagine you have thousands of such kind of queries -
you cannot prepare all of them! :-) or you'll maybe prepare it once,
but as I showed previously in this thread prepare statement itself
takes 16ms, so for a single shot there is no gain! :-) Stressing with
such kind of short and simple queries (and again, they have joins, it
may be even more simple :-)) will give me a result to show with
guarantee my worst case - I know then if I have to deploy a bombarding
OLTP-like application my database engine will be able to keep such
workload, and if I have performance problems they are inside of
application! :-) (well, it's very simplistic, but it's not far from
the truth :-))

Now, as you see from your explanation, the Part #2 is the most
dominant - so why instead to blame this query not to implement a QUERY
PLANNER CACHE??? - in way if any *similar* query is recognized by
parser we simply *reuse* the same plan?..

Rgds,
-Dimitri

On 5/11/09, Aidan Van Dyk <aidan(at)highrise(dot)ca> wrote:
> * Dimitri <dimitrik(dot)fr(at)gmail(dot)com> [090511 11:18]:
>> Folks, it's completely crazy, but here is what I found:
>>
>> - if HISTORY table is analyzed with target 1000 my SELECT response
>> time is jumping to 3ms, and the max throughput is limited to 6.000 TPS
>> (it's what happenned to 8.3.7)
>>
>> -if HISTORY table is analyzed with target 5 - my SELECT response time
>> is decreasing to 1.2ms (!) and then my max TPS level is ~12.000 !
>> and CPU is used up to 95% even by 8.3.7 :-) and 8.4 performed better
>> just because I left its analyze target to default 100 value.
>>
>> Anyone may explain me why analyze target may have so huge negative
>> secondary effect?..
>
> It's actually pretty straight forward.
>
> The PostgreSQL query planner is a "smart planner". It takes into
> consideration all the statistics available on the columns/tables,
> expected outputs based on inputs, etc, to choose what it thinks will be
> the best plan. The more data you have in statistics (the larger
> statistics target you have), the more CPU time and longer it's going to
> take to "plan" your queries. The tradeoff is hopefully better plans.
>
> But, in your scenario, where you are hitting the database with the
> absolute worst possible way to use PostgreSQL, with small, repeated,
> simple queries, you're not getting the advantage of "better" plans. In
> your case, you're throwing absolutely simple queries at PG as fast as
> you can, and for each query, PostgreSQL has to:
>
> 1) Parse the given "query string"
> 2) Given the statistics available, plan the query and pick the best one
> 3) Actually run the query.
>
> Part 2 is going to dominate the CPU time in your tests, more so the more
> statistics it has to evaluate, and unless the data has to come from the
> disks (i.e. not in shared buffers or cache) is thus going to dominate the
> time before you get your results. More statistics means more time
> needed to do the planning/picking of the query.
>
> If you were to use prepared statements, the cost of #1 and #2 is done
> once, and then every time you throw a new execution of the query to
> PostgreSQL, you get to just do #3, the easy quick part, especially for
> small simple queries where all the data is in shared buffers or the cache.
>
> a.
>
> --
> Aidan Van Dyk Create like a god,
> aidan(at)highrise(dot)ca command like a king,
> http://www.highrise.ca/ work like a slave.
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Alvaro Herrera 2009-05-11 22:54:29 Re: Any better plan for this query?..
Previous Message Dimitri 2009-05-11 22:30:43 Re: Any better plan for this query?..