Re: How does PG know if data is in memory?

From: Fabrício dos Anjos Silva <fabricio(dot)silva(at)linkcom(dot)com(dot)br>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: How does PG know if data is in memory?
Date: 2010-10-01 14:00:44
Message-ID: AANLkTinrYVcG-K7wZ+mgGDM6x179eak24dFL8Pc3j34z@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Craig,

I agree with you. Not completely, but I do.

I'm just stuck in a situation where I can't figure out what values to use
for the parameters. I can't even think of a way on how to test and discover
that.
I followed Josh Berkus' GUC spreadsheet and some tutorials on PG wiki,
but how do I test if my configuration is good or bad? I see in PG log that
some queries have bad plans, but should I do in order to tell PG to make
better decisions? I tried different values with no success.

I understand that parameters have no "work everywhere" values. Each
database has its characteristics and each server has its HW specifications.

Is there any automated test tool? A can compile a list of real-world
queries, and provide an exact copy of my db server just for testing. But how
do I do it? Write a bunch of scripts? Is there any serious tool that try
different parameters, run a load test, process results and generate reports?

Again, thanks all of you for the replies.

Cheers,

Fabrício dos Anjos Silva
LinkCom Soluções em T.I.

2010/10/1 Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>

> Craig Ringer <craig(at)postnewspapers(dot)com(dot)au> wrote:
>
> > Because the query often only wants a small subset of the data, and
> > whole relations are rarely fully cached, it's not enough to know
> > that "some of relation X is cached", it has to know if the cached
> > parts are the parts that'll be required, or at least an
> > approximation of that. It sounds horrendously complicated to keep
> > track of to me, and in the end it won't make query execution any
> > faster, it'll just potentially help the planner pick a better
> > plan. I wonder if that'd be worth the extra CPU time spent
> > managing the cache and cache content stats, and using those cache
> > stats when planning? It'd be an interesting experiment, but the
> > outcome is hardly obvious.
>
> I agree with that, but I think there's an even more insidious issue
> here. Biasing plans heavily toward using what is already in cache
> could have a destabilizing effect on performance. Let's say that
> some query or maintenance skews the cache toward some plan which is
> much slower when cached than another plan would be if cached. Let's
> also postulate that this query runs very frequently. It will always
> settle for what's fastest *this* time, not what would make for
> fastest performance if consistently used. If it never chooses the
> plan which would run better if cached, the data used for that plan
> may never make it into cache, and you will limp along with the
> inferior plan forever.
>
> If you set the overall level of caching you expect, the optimizer
> will tend to wind up with data cached to support the optimal plans
> for that level of caching for the frequently run queries.
>
> -Kevin
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2010-10-01 14:40:11 Re: How does PG know if data is in memory?
Previous Message Kevin Grittner 2010-10-01 13:46:07 Re: How does PG know if data is in memory?