Re: Postgresql Caching

From: mark(at)mark(dot)mielke(dot)cc
To: Lexington Luthor <Lexington(dot)Luthor(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Postgresql Caching
Date: 2006-10-15 18:40:08
Message-ID: 20061015184008.GA20873@mark.mielke.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Oct 15, 2006 at 06:43:43PM +0100, Lexington Luthor wrote:
> mark(at)mark(dot)mielke(dot)cc wrote:
> >Using memcache, I've had problems with consistency brought right to
> >the front. Both of these have failed me:
> > 1) When updating a PostgreSQL record, I invalidate the memcache record.
> > If another process comes along in parallel before I commit, notices
> > that the memcache record is invalidated, it queries the data from
> > SQL, and updates the memcache record back to the old value. :-(
> How can this fail? The PostgreSQL MVCC will hold the second transaction
> back until the effect on the tuple is known (i.e. after the first
> transaction is over). Have you not been using SERIALIZABLE transactions?

I don't *want* to use SERIALIZABLE transactions. That would grind my
application to a halt.

Consistency isn't the same as serializable. Memcache offers *NEITHER*.

> With a bit of careful planning (and a few SELECT FOR UPDATE queries to
> prevent deadlock), having perfect consistency and correct caching is
> possible.

Your conclusion is false for all cases except data that will never change.

You can never have perfect consistency across different systems (memcache /
postgresql) and especially not when their visibility rules differ. What is
visible to something via memcache is always latest uncommitted. What is
visible in PostgreSQL is something less than that. Consistency is not
possible. Correct caching is therefore also not possible unless you define
correct as 'latest', and even then, you have problems if memcache expires
the record, before the real record has been commited into PostgreSQL.

Under a significant enough load, it becomes visible.

> > 2) When updating a PostgreSQL record, I updated the memcache record
> > to the new value. If another process comes along in parallel before
> > I commit, that is still looking at an older view, cross-referencing
> > may not work as expected.
> This breaks integrity, and all bets are off.

Both break integrity.

> >I'm currently settled on 2), but setting a short timeout (5 seconds) on
> >the data. Still an imperfect compromise between speed and accuracy, but
> >it isn't causing me problems... yet.
> What exactly does your application do about the possibility of incorrect
> data?

Right now? I've limited it to display problems. Wrong counts. I think
I tracked down all the significant problems. For me, "latest" is often
equally good to "consistent" where memcache is giving "latest" and
PostgreSQL is giving "consistent".

> >Consistency is very valuable to me. If it wasn't for memcache being
> >hundreds or more times faster, I wouldn't use it in the cases I do.
> >It can be dangerous.
> Consistency and caching are not mutually exclusive, and there are many
> frameworks that handle the burden of maintaining both for you.

Consistency and memcached, *are* mutually exclusive.

memcached provides no mechanisms for consistency.

Close may good enough for many. Close is the best that it can do.

Cheers,
mark

--
mark(at)mielke(dot)cc / markm(at)ncf(dot)ca / markm(at)nortel(dot)com __________________________
. . _ ._ . . .__ . . ._. .__ . . . .__ | Neighbourhood Coder
|\/| |_| |_| |/ |_ |\/| | |_ | |/ |_ |
| | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario, Canada

One ring to rule them all, one ring to find them, one ring to bring them all
and in the darkness bind them...

http://mark.mielke.cc/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message mark 2006-10-15 18:41:46 Re: Postgresql Caching
Previous Message Neil Conway 2006-10-15 18:26:12 Re: Asynchronous I/O Support