From: | "ktm(at)rice(dot)edu" <ktm(at)rice(dot)edu> |
---|---|
To: | Böckler Andreas <andy(at)boeckler(dot)org> |
Cc: | Kevin Grittner <kgrittn(at)mail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Query-Planer from 6seconds TO DAYS |
Date: | 2012-10-26 14:55:13 |
Message-ID: | 20121026145513.GB2872@aart.rice.edu |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On Fri, Oct 26, 2012 at 04:37:33PM +0200, Böckler Andreas wrote:
> Hi,
>
>
> Am 25.10.2012 um 20:22 schrieb Kevin Grittner:
>
> >
> > The idea is to model actual costs on your system. You don't show
> > your configuration or describe your hardware, but you show an
> > estimate of retrieving over 4000 rows through an index and describe a
> > response time of 4 seconds, so you must have some significant part of
> > the data cached.
> Sure my effective_cache_size 10 GB
> But my right Table has the size of 1.2 TB (yeah Terra) at the moment (partitioned a 40GB slices) and has 3 * 10^9 records
>
> My left table has only the size of 227MB and 1million records. Peanuts.
> > I would see how the workload behaves with the following settings:
> >
> > effective_cache_size = <your shared_buffers setting plus what the OS
> > shows as cached pages>
> > seq_page_cost = 1
> > random_page_cost = 2
> > cpu_tuple_cost = 0.05
> >
> > You can set these in a session and check the plan with EXPLAIN. Try
> > various other important important queries with these settings and
> > variations on them. Once you hit the right factors to model your
> > actual costs, the optimizaer will make better choices without needing
> > to tinker with it each time.
>
> i've played with that already ….
>
> NESTED LOOP -> GOOD
> SEQSCAN -> VERY BAD
>
> SET random_page_cost = 4;
> 2012-08-14' AND '2012-08-30' -> NESTED LOOP
> 2012-08-13' AND '2012-08-30' -> SEQSCAN
> SET random_page_cost = 2;
> 2012-08-14' AND '2012-08-30' -> NESTED LOOP
> 2012-08-07' AND '2012-08-30' -> NESTED LOOP
> 2012-08-06' AND '2012-08-30' -> SEQSCAN
> SET random_page_cost = 1;
> 2012-08-14' AND '2012-08-30' -> NESTED LOOP
> 2012-08-07' AND '2012-08-30' -> NESTED LOOP
> 2012-07-07' AND '2012-08-30' -> NESTED LOOP
> 2012-07-06' AND '2012-08-30' -> SEQSCAN
>
> The thing is ..
> - You can alter what you want. The planner will switch at a certain time range.
> - There is not one case, where the SEQSCAN-Method will be better .. It's not possible.
>
> So the only way to tell the planner that he's doomed is
> SET enable_seqscan=0
> which is not very elegant. (Query Hints would be BTW jehovah!)
>
> You would be forced to write something like this:
> var lastValueEnable_seqscan = "SHOw enable_seqscan"
> SET enable_seqscan=0;
> SELECT ...
> SET enable_seqscan=lastValueEnable_seqscan;
>
> Kind regards
>
> Andy
>
Hi Andy,
You have the sequential_page_cost = 1 which is better than or equal to
the random_page_cost in all of your examples. It sounds like you need
a sequential_page_cost of 5, 10, 20 or more.
Regards,
Ken
From | Date | Subject | |
---|---|---|---|
Next Message | Kevin Grittner | 2012-10-26 14:59:28 | Re: Query-Planer from 6seconds TO DAYS |
Previous Message | Böckler Andreas | 2012-10-26 14:37:33 | Re: Query-Planer from 6seconds TO DAYS |