From: | Fabrício dos Anjos Silva <fabricio(dot)silva(at)linkcom(dot)com(dot)br> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: How does PG know if data is in memory? |
Date: | 2010-10-01 14:00:44 |
Message-ID: | AANLkTinrYVcG-K7wZ+mgGDM6x179eak24dFL8Pc3j34z@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Craig,
I agree with you. Not completely, but I do.
I'm just stuck in a situation where I can't figure out what values to use
for the parameters. I can't even think of a way on how to test and discover
that.
I followed Josh Berkus' GUC spreadsheet and some tutorials on PG wiki,
but how do I test if my configuration is good or bad? I see in PG log that
some queries have bad plans, but should I do in order to tell PG to make
better decisions? I tried different values with no success.
I understand that parameters have no "work everywhere" values. Each
database has its characteristics and each server has its HW specifications.
Is there any automated test tool? A can compile a list of real-world
queries, and provide an exact copy of my db server just for testing. But how
do I do it? Write a bunch of scripts? Is there any serious tool that try
different parameters, run a load test, process results and generate reports?
Again, thanks all of you for the replies.
Cheers,
Fabrício dos Anjos Silva
LinkCom Soluções em T.I.
2010/10/1 Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
> Craig Ringer <craig(at)postnewspapers(dot)com(dot)au> wrote:
>
> > Because the query often only wants a small subset of the data, and
> > whole relations are rarely fully cached, it's not enough to know
> > that "some of relation X is cached", it has to know if the cached
> > parts are the parts that'll be required, or at least an
> > approximation of that. It sounds horrendously complicated to keep
> > track of to me, and in the end it won't make query execution any
> > faster, it'll just potentially help the planner pick a better
> > plan. I wonder if that'd be worth the extra CPU time spent
> > managing the cache and cache content stats, and using those cache
> > stats when planning? It'd be an interesting experiment, but the
> > outcome is hardly obvious.
>
> I agree with that, but I think there's an even more insidious issue
> here. Biasing plans heavily toward using what is already in cache
> could have a destabilizing effect on performance. Let's say that
> some query or maintenance skews the cache toward some plan which is
> much slower when cached than another plan would be if cached. Let's
> also postulate that this query runs very frequently. It will always
> settle for what's fastest *this* time, not what would make for
> fastest performance if consistently used. If it never chooses the
> plan which would run better if cached, the data used for that plan
> may never make it into cache, and you will limp along with the
> inferior plan forever.
>
> If you set the overall level of caching you expect, the optimizer
> will tend to wind up with data cached to support the optimal plans
> for that level of caching for the frequently run queries.
>
> -Kevin
>
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2010-10-01 14:40:11 | Re: How does PG know if data is in memory? |
Previous Message | Kevin Grittner | 2010-10-01 13:46:07 | Re: How does PG know if data is in memory? |