Re: Question about caching on full table scans

From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Markus Innerebner <markus(dot)innerebner(at)inf(dot)unibz(dot)it>
Cc: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Question about caching on full table scans
Date: 2012-08-30 18:13:45
Message-ID: CAOR=d=2hdVwmm_M808GyhQqu95OfTTKf=UEtPqn6SadVg51SEg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Thu, Aug 30, 2012 at 11:34 AM, Markus Innerebner
<markus(dot)innerebner(at)inf(dot)unibz(dot)it> wrote:
> Hi Laurenz,
>
>
> In your approach 1 to 3, what do you mean with "load into main memory"?
>
>
>
> I forgot to say: I use Java and connect with JDBC.
>
> in approach 1 I do an initial loading of the entire relation, by executing 1
> SQL query to load all edges in main memory, where I create my main memory
> structure
> as an adjacency list.
>
> Do you
> a) make sure that the data you talk about are in the PostgreSQL buffer
> cache
> or
>
>
> b) retrieve the data from PostgreSQL and store it somewhere in your
> application?
>
>
> In approach 1 I do that, as described before.
>
> But after each experiment I restart a new java process.
>
>
>
> To clear PostgreSQL's cache, restart the server.
> That should be a fast operation.
> Since version 8.3, PostgreSQL is smart enough not to evict the
> whole cache for a large sequential scan.
>
>
>
>
> To flush the filesystem cache (from Linux 2.6.16 on), use
> sync; echo 3 > /proc/sys/vm/drop_caches
>
>
> I started to do that , and
> yes, this solves my problem!!
>
> I assume that deleting file system cache implies that also postgres cache is
> deleted, isn't it ?

NO. PostgreSQL maintains its own cache. To flush it you need to
restart postgresql. However, in a previous post you stated this:

> I know that postgres (and OS) is caching that dataset. But is there a way to force the database
> to remove that values from the cache?

It is NOT guaranteed that postgresql will be caching your data in a
full table scan. To keep from blowing out the shared buffers
postgresql uses as a cache, it uses a ring buffer for sequential scans
so it is quite likely that on a sequential scan postgresql is not
caching your data.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Bruce Momjian 2012-08-30 20:44:33 Re: pg_dump and thousands of schemas
Previous Message Jeff Janes 2012-08-30 18:00:32 Re: Question about caching on full table scans