> But: the question at this point is why we've never seen such a report
> before 8.4. If this theory is correct, it's been broken for a *long*
> time. I can think of a couple of possible explanations:
> A: the problem can only manifest if this loop has work to do for
> a relcache entry that is not the last one in its bucket chain.
> 8.4 might have added more preloaded relcache entries than were there
> before. Or the 8.4 changes in the hash functions might have shuffled
> the entries' bucket placement around so that the problem can happen
> when it couldn't before.
The latter theory appears to be the correct one: in 8.4, pg_database
is at risk (since it has a trigger) and it shares a hash bucket with
pg_ts_dict. In versions 8.0-8.3 there is, by pure luck, no hash
collision for vulnerable catalogs. I checked with variants of

    select relname, hashoid(oid)%512 as bucket
    from pg_class
    where (relhasrules or relhastriggers)
      and relkind in ('r','i')
      and relnamespace = 11
      and hashoid(oid)%512 in
          (select hashoid(oid)%512 from pg_class
           where relkind in ('r','i') and relnamespace = 11
           group by 1 having count(*) > 1);
which is conservative since it looks at all system catalogs/indexes
whether or not they are part of the preloaded set.
7.4 does show a collision, but since we've not heard reports of this
before, I speculate that it might have some other behavior that
protects it. The relevant code was certainly a lot different back then.
Interestingly, the bug can no longer be reproduced in CVS HEAD, because
pg_database no longer has a trigger. We had better fix it anyway, of
course, since future hash collisions are unpredictable. I'm wondering,
though, whether to bother back-patching further than 8.4. Thoughts?
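For anyone not following the failure mode in theory A: the danger is
walking a hash bucket chain while an entry in it can be freed and
rebuilt mid-scan. Here is a minimal, self-contained C sketch of the
pattern (all names are made up for illustration; this is not
PostgreSQL code). Note that if the freed entry were the *last* one in
its chain, the stale next pointer would just be NULL and the scan
would terminate harmlessly, which matches the observation that only
non-last entries are at risk:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical miniature of one hash bucket: a singly linked chain. */
typedef struct Entry {
    int oid;
    struct Entry *next;
} Entry;

/* Walk the chain, "rebuilding" (here: freeing and unlinking) the entry
 * with victim_oid.  The safe idiom is to save e->next BEFORE anything
 * that can free e; following e->next afterward would read freed memory
 * whenever the victim is not the last entry in the chain. */
static int walk_and_rebuild(Entry **bucket, int victim_oid)
{
    int visited = 0;
    Entry *e = *bucket;
    while (e != NULL) {
        Entry *next = e->next;      /* saved before e can go away */
        if (e->oid == victim_oid) {
            /* unlink e from the chain, then free it */
            if (*bucket == e)
                *bucket = e->next;
            else {
                Entry *p = *bucket;
                while (p->next != e)
                    p = p->next;
                p->next = e->next;
            }
            free(e);
        }
        visited++;
        e = next;                   /* safe: not read from freed e */
    }
    return visited;
}
```

The broken variant would write `e = e->next;` at the bottom of the
loop after the free, which works by accident until a hash collision
puts some other entry behind the victim in the same chain.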
> B: the 8.4 changes in the shared-cache-inval mechanism might have
> made it more likely that a freshly started backend could get hit with a
> relcache flush request. I should think that those changes would have
> made this *less* likely not more so, so maybe there is an additional
> bug lurking in that area.
I thought I'd better check this theory too. I double-checked the SI
code and can't find any evidence of a problem of that sort. The
nextMsgNum of a new backend is correctly initialized to maxMsgNum,
while holding the correct lock, so a freshly started backend should
never replay pre-existing messages. I think it's just that Michael's
system has sufficiently high load peaks to sometimes delay an incoming
backend long enough for it to get hit with a reset. There might be
kernel scheduling quirks contributing to the behavior too.
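To make the startup invariant concrete, here is a toy C sketch of why
initializing a new reader's counter to the writer's current counter
means it starts with zero unread messages. The struct and function
names are invented for illustration and deliberately simplified; the
real logic lives in the sinval code and does this under a lock:

```c
#include <assert.h>

/* Hypothetical miniature of shared-invalidation message counters. */
typedef struct {
    int maxMsgNum;      /* total messages written so far (shared) */
} SIShared;

typedef struct {
    int nextMsgNum;     /* next message this backend will read */
} SIBackend;

/* A new backend starts at the current maxMsgNum, so it sees none of
 * the messages sent before it joined.  In the real code the read of
 * maxMsgNum and the store into the backend's slot happen while
 * holding the SInval lock, which is what makes this race-free. */
static void backend_join(SIBackend *b, const SIShared *s)
{
    b->nextMsgNum = s->maxMsgNum;
}

static int unread_messages(const SIBackend *b, const SIShared *s)
{
    return s->maxMsgNum - b->nextMsgNum;
}
```

A backend that joins, then observes one new message written, has
exactly one message to read; it can never start out behind.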
regards, tom lane