Re: Caching of Queries

From: "Aaron Werman" <awerman2(at)hotmail(dot)com>
To: "Iain" <iain(at)mst(dot)co(dot)jp>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Jim C(dot) Nasby" <decibel(at)decibel(dot)org>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Caching of Queries
Date: 2004-09-28 13:04:34
Message-ID: BAY18-DAV126nx8dyDd000801d8@hotmail.com
Lists: pgsql-performance


----- Original Message -----
From: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Iain" <iain(at)mst(dot)co(dot)jp>
Cc: "Jim C. Nasby" <decibel(at)decibel(dot)org>; <pgsql-performance(at)postgresql(dot)org>
Sent: Monday, September 27, 2004 11:17 PM
Subject: Re: [PERFORM] Caching of Queries

> "Iain" <iain(at)mst(dot)co(dot)jp> writes:
> > I can only tell you (roughly) how it works with Oracle,
>
> Which unfortunately has little to do with how it works with Postgres.
> This "latches" stuff is irrelevant to us.

Latches are Oracle's term for its lightweight locks. Both Oracle and pg use
semaphores and spin locks to serialize activity in critical sections. I
believe the point that blocking/queuing reduces scalability is valid.
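
For illustration, a minimal C sketch of the test-and-set spinlock pattern
such critical sections are built on (the names here are hypothetical, not
actual PG or Oracle internals):

#include <stdatomic.h>

/* Hypothetical latch guarding some shared structure. */
static atomic_flag cache_latch = ATOMIC_FLAG_INIT;

void update_shared_state(void)
{
    /* Spin until the latch is acquired. Under contention every
     * waiter burns CPU right here, which is the scalability concern. */
    while (atomic_flag_test_and_set_explicit(&cache_latch,
                                             memory_order_acquire))
        ; /* busy-wait */

    /* ... critical section: touch the shared structure ... */

    atomic_flag_clear_explicit(&cache_latch, memory_order_release);
}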

>
> In practice, any repetitive planning in PG is going to be consulting
> catalog rows that it draws from the backend's local catalog caches.
> After the first read of a given catalog row, the backend won't need
> to re-read it unless the associated table has a schema update. (There
> are some other cases, like a VACUUM FULL of the catalog the rows came
> from, but in practice catalog cache entries don't change often in most
> scenarios.) We need place only one lock per table referenced in order
> to interlock against schema updates; not one per catalog row used.
>
> The upshot of all this is that any sort of shared plan cache is going to
> create substantially more contention than exists now --- and that's not
> even counting the costs of managing the cache, ie deciding when to throw
> away entries.

I imagine a design where a shared plan cache would consist of the plans,
indexed both by a statement hash and by the objects they depend on. A
statement to be planned would be hashed and matched against the cache. DDL
would need to synchronously destroy all dependent plans. If each plan
maintains a validity flag, changing the cache wouldn't have to block, so I
don't see where there would be contention.
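
To make that concrete, here is a rough C sketch of the structure I have in
mind (all names and layouts are hypothetical, not proposed PG code): a
bucket table keyed by statement hash, with a per-plan dependency list and a
validity flag that DDL clears.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cached plan: looked up by statement hash, carrying
 * the objects it depends on and a flag DDL can clear without
 * blocking readers of unrelated entries. */
typedef struct CachedPlan
{
    uint64_t    stmt_hash;      /* hash of the statement text */
    bool        valid;          /* cleared by DDL on a dependency */
    uint32_t    dep_oids[8];    /* objects (tables etc.) the plan uses */
    int         ndeps;
    void       *plan;           /* the planned statement itself */
    struct CachedPlan *next;    /* hash-bucket chain */
} CachedPlan;

#define NBUCKETS 1024
static CachedPlan *buckets[NBUCKETS];

/* Look up a statement; a stale (invalid) entry counts as a miss,
 * so the caller re-plans and replaces it. */
CachedPlan *plan_cache_lookup(uint64_t stmt_hash)
{
    for (CachedPlan *p = buckets[stmt_hash % NBUCKETS]; p; p = p->next)
        if (p->stmt_hash == stmt_hash && p->valid)
            return p;
    return NULL;
}

/* DDL on an object synchronously invalidates every dependent plan.
 * Flipping a flag per plan avoids locking the whole cache. */
void plan_cache_invalidate(uint32_t changed_oid)
{
    for (int i = 0; i < NBUCKETS; i++)
        for (CachedPlan *p = buckets[i]; p; p = p->next)
            for (int d = 0; d < p->ndeps; d++)
                if (p->dep_oids[d] == changed_oid)
                    p->valid = false;
}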

>
> A backend-local plan cache would avoid the contention issues, but would
> of course not allow amortizing planning costs across multiple backends.
>
> I'm personally dubious that sharing planning costs is a big deal.
> Simple queries generally don't take that long to plan. Complicated
> queries do, but I think the reusability odds go down with increasing
> query complexity.
>

I think parsing and planning are both major costs when the transaction rate
is high. Simple queries can easily take longer to plan than to execute, so
this is a scalability concern. Caching complicated queries is valuable too -
applications tend to issue many similar queries because the queries are
intimately tied to the data model.
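
Within a single backend, prepared statements are the usual way to amortize
that parse/plan cost today; a minimal libpq example (connection string,
table, and statement name are placeholders):

#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    /* Placeholder connection string. */
    PGconn *conn = PQconnectdb("dbname=test");
    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "%s", PQerrorMessage(conn));
        return 1;
    }

    /* Parse and plan once... */
    PGresult *res = PQprepare(conn, "get_item",
                              "SELECT * FROM items WHERE id = $1",
                              1, NULL);
    PQclear(res);

    /* ...then execute many times, skipping parse/plan each round. */
    const char *id = "42";
    res = PQexecPrepared(conn, "get_item", 1, &id, NULL, NULL, 0);
    printf("%d row(s)\n", PQntuples(res));
    PQclear(res);

    PQfinish(conn);
    return 0;
}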

> regards, tom lane
