[WIP] cache estimates, cache access cost

From: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: [WIP] cache estimates, cache access cost
Date: 2011-06-14 14:29:36
Message-ID: BANLkTi=WU9YuVS6s4oV_8gM3ggcFiKP-rA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2011/5/16 Greg Smith <greg(at)2ndquadrant(dot)com>:
> Cédric Villemain wrote:
>>
>>
>> http://git.postgresql.org/gitweb?p=users/c2main/postgres.git;a=shortlog;h=refs/heads/analyze_cache
>>
>
> This rebases easily to make Cedric's changes move to the end; I just pushed
> a version with that change to
> https://github.com/greg2ndQuadrant/postgres/tree/analyze_cache if anyone
> wants a cleaner one to browse.  I've attached a patch too if that's more
> your thing.
>
> I'd recommend not getting too stuck on the particular hook Cédric has added
> here to compute the cache estimate, which uses mmap and mincore to figure it
> out.  It's possible to compute similar numbers, albeit less accurate, using
> an approach similar to how pg_buffercache inspects things.  And I even once
> wrote a background writer extension that collected this sort of data as it
> was running the LRU scan anyway.  Discussions of this idea seem to focus on
> how the "what's in the cache?" data is collected, which as far as I'm
> concerned is the least important part.  There are multiple options, some
> work better than others, and there's no reason that can't be swapped out
> later.  The more important question is how to store the data collected and
> then use it for optimizing queries.

Attached are updated patches without the plugin itself. I've also
added the cache_page_cost GUC, this one is not per tablespace, like
others page_cost.

There are 6 patches:

0001-Add-reloscache-column-to-pg_class.patch
0002-Add-a-function-to-update-the-new-pg_class-cols.patch
0003-Add-ANALYZE-OSCACHE-VERBOSE-relation.patch
0004-Add-a-Hook-to-handle-OSCache-stats.patch
0005-Add-reloscache-to-Index-Rel-OptInfo.patch
0006-Add-cache_page_cost-GUC.patch

I have some comments on my own code:

* I am not sure of the best datatype to use for 'reloscache'
* I didn't include the catalog number change in the patch itself.
* oscache_update_relstats() is very similar to vac_update_relstats(),
maybe better to merge them but reloscache should not be updated at the
same time than other stats.
* There is probably too much work done in do_oscache_analyze_rel()
because I kept vac_open_indexes() (not a big drama atm)
* I don't know so much how gram.y works, so I am not sure my changes
cover all cases.
* No tests; similar columns and GUC does not have test either, but it
lacks a test for ANALYZE OSCACHE

--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

Attachment Content-Type Size
0001-Add-reloscache-column-to-pg_class.patch text/x-patch 7.3 KB
0002-Add-a-function-to-update-the-new-pg_class-cols.patch text/x-patch 2.5 KB
0003-Add-ANALYZE-OSCACHE-VERBOSE-relation.patch text/x-patch 11.9 KB
0004-Add-a-Hook-to-handle-OSCache-stats.patch text/x-patch 2.4 KB
0005-Add-reloscache-to-Index-Rel-OptInfo.patch text/x-patch 3.0 KB
0006-Add-cache_page_cost-GUC.patch text/x-patch 4.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2011-06-14 14:30:28 Re: PATCH: CreateComments: use explicit indexing for ``values''
Previous Message Alexander Korotkov 2011-06-14 14:25:09 Re: WIP: collect frequency statistics for arrays