Re: Are we accepting cancel interrupts too often?

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Are we accepting cancel interrupts too often?
Date: 2001-12-31 16:41:56
Message-ID: 200112311641.fBVGfuP28721@candle.pha.pa.us
Lists: pgsql-hackers

Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > I started to look at when this nice code was added to determine if this
> > was part of the original design or added later and found you wrote it
> > yourself, so I guess we don't have to ask anyone to make sure there
> > isn't something we are missing.
>
> As far as I can recall my thinking at the time, it went like so:
> "We *should* be able to accept a cancel interrupt anywhere we are not
> actually in the midst of modifying shared-memory data structures,
> because after all the database system is supposed to be robust against
> crashes, and those could happen anyplace".
>
> But the fallacy in equating a cancel to a crash is that we have rather
> extensive logic for coping with a crash (including reinitializing shared
> memory from scratch). A cancel will only provoke elog cleanup, which is
> not nearly as thorough. For example, it's not obvious that shared
> memory structures that are protected by different locks couldn't get out
> of sync.
>

Yes, I saw the RESUME_INTERRUPTS in SpinLockRelease(). It seems very
aggressive to allow a query cancel there.
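
To make the concern concrete, here is a minimal sketch of a holdoff-counter
scheme of the kind being discussed. It is a simplified model, not the actual
backend source: every name below (hold_count, cancel_pending,
hold_interrupts, resume_interrupts, request_cancel, check_for_interrupts) is
a hypothetical stand-in, and the real macros may differ in detail. The point
it illustrates is that once the resume step drops the count to zero, the very
next check point is free to accept a cancel, even if other shared state is
still mid-update.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration only; not the PostgreSQL source. */
static volatile int  hold_count = 0;         /* > 0 means cancels are deferred */
static volatile bool cancel_pending = false; /* set when a cancel request arrives */
static int           cancels_serviced = 0;   /* how many cancels were accepted */

/* Record a cancel request; it is only noted here, never acted on. */
static void request_cancel(void)
{
    cancel_pending = true;
}

/* Enter a section in which a cancel must not be accepted. */
static void hold_interrupts(void)
{
    hold_count++;
}

/* Leave such a section; calls must pair with hold_interrupts(). */
static void resume_interrupts(void)
{
    assert(hold_count > 0);
    hold_count--;
}

/* A check point: accept a pending cancel only if nothing is holding it off. */
static void check_for_interrupts(void)
{
    if (cancel_pending && hold_count == 0)
    {
        cancel_pending = false;
        cancels_serviced++;   /* stands in for the abort/cleanup path */
    }
}
```

In this model, calling hold_interrupts(), then request_cancel(), then
check_for_interrupts() leaves the cancel pending; after resume_interrupts(),
the next check_for_interrupts() accepts it. If the resume is tied to
releasing a single lock, that window opens as soon as that one lock is
dropped, which is the behavior questioned above.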

>
> BTW, I spent some time yesterday trying to use this worry to explain my
> latest favorite bugaboo, the duplicate-rows complaints we've gotten from
> a few people. It is easy to see that a cancel being accepted at the
> right place (exit from the first WriteBuffer in heap_update) could leave
> an updated tuple created and its buffer marked dirty, while the old
> tuple's buffer is not yet marked dirty and might therefore be discarded
> unwritten. (The WAL entry is correct but will never be consulted unless
> there's a crash.) However, this scenario doesn't seem to explain the
> failures because the cancel would lead to transaction abort, so the
> updated tuple should never be considered good anyway. Back to the
> drawing board...

I thought we were seeing duplicates in 7.1, which didn't have this code.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
