> 4. Even if we could accurately estimate the percentage of the table
> that is cached, what then? For example, suppose that a user issues a
> query which retrieves 1% of a table, and we know that 1% of that table
> is cached. How much of the data that the user asked for is cache?
> Hard to say, right? It could be none of it or all of it. The second
> scenario is easy to imagine - just suppose the query's been executed
> twice. The first scenario isn't hard to imagine either.
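To put rough numbers on that point (mine, purely illustrative): on a
1,000,000-page table, a 1% query touches 10,000 pages and a 1% cache
holds 10,000 pages. The overlap can be anywhere from 0 to 10,000
pages; only if the cache contents were independent of the query
pattern would you expect about 1% of 10,000 = 100 of them to be
cached, and the repeated-query case drives it all the way to 10,000.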
I have a set of slow disks, and the gap between those and the fast
disks can impact performance nearly as much as the gap between data
cached in memory and the fast disks.
How practical would it be for analyze to keep a record of response
times for given sections of a table as it randomly accesses them, and
to generate some kind of map of expected response times for the pieces
of data it is analyzing?
It may well discover, on its own, that recent data (1 month old or
less) has a random read response time of N, older data (1 year old) in
a different section of the relation tends to have a response time of
1000N, and really old data (5 years old) tends to have a response time
of 3000N.
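Something like the standalone sketch below is what I have in mind (not
ANALYZE itself - the file name, block size, segment count, and sample
count are arbitrary placeholders): probe random 8 kB blocks of a
relation file, bucket them by position in the file, and report the
mean read latency per bucket.

    /*
     * Rough sketch, not PostgreSQL code: sample random 8 kB reads
     * from a data file, bucket them into equal-sized segments, and
     * report the mean read latency per segment.
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <time.h>

    #define BLKSZ    8192   /* PostgreSQL's default block size */
    #define SEGMENTS 16     /* arbitrary granularity of the "map" */
    #define SAMPLES  1024   /* random probes per run */

    static double now_us(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec * 1e6 + ts.tv_nsec / 1e3;
    }

    int main(int argc, char **argv)
    {
        const char *path = argc > 1 ? argv[1] : "relation.dat";
        int fd = open(path, O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        off_t size = lseek(fd, 0, SEEK_END);
        long nblocks = size / BLKSZ;
        if (nblocks < SEGMENTS)
        { fprintf(stderr, "file too small\n"); return 1; }

        double sum[SEGMENTS] = {0};
        long   cnt[SEGMENTS] = {0};
        char   buf[BLKSZ];

        srandom(42);
        for (int i = 0; i < SAMPLES; i++)
        {
            long blk = random() % nblocks;      /* random probe */
            double t0 = now_us();
            if (pread(fd, buf, BLKSZ, (off_t) blk * BLKSZ) != BLKSZ)
            { perror("pread"); return 1; }
            int seg = (int) (blk * SEGMENTS / nblocks);
            sum[seg] += now_us() - t0;          /* latency by segment */
            cnt[seg]++;
        }

        for (int s = 0; s < SEGMENTS; s++)
            printf("segment %2d: %6ld probes, mean %8.1f us\n",
                   s, cnt[s], cnt[s] ? sum[s] / cnt[s] : 0.0);

        close(fd);
        return 0;
    }

Run against a relation file with a cold cache, segments that live on
the slow disks (or hold the old data) should show means well above the
rest - the N / 1000N / 3000N pattern above.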