Re: Fix mdsync never-ending loop problem

From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Fix mdsync never-ending loop problem
Date: 2007-04-05 16:57:04
Message-ID: 46152A60.1070404@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> Heikki Linnakangas <heikki(at)enterprisedb(dot)com> writes:
>> Tom Lane wrote:
>>> But I dislike copying the table entries anyway, see comment on -hackers.
>
>> Frankly the cycle id idea sounds more ugly and fragile to me. You'll
>> need to do multiple scans of the hash table that way, starting from top
>> every time you call AbsorbFsyncRequests (like we do know).
>
> How so? You just ignore entries whose cycleid is too large. You'd have
> to be careful about wraparound in the comparisons, but that's not hard
> to deal with. Also, AFAICS you still have the retry problem (and an
> even bigger memory leak problem) with this coding --- the "to-do list"
> doesn't eliminate the issue of correct handling of a failure.

You have to start the hash_seq_search from scratch after each call to
AbsorbFsyncRequests because it can remove entries, including the one the
scan is stopped on.

I think the failure handling is correct in the "to-do list" approach,
when an entry is read from the list, it's checked that the entry hasn't
been removed from the hash table. Actually there was a bug in the
original LDC patch in the failure handling: it replaced the per-entry
failures-counter with a local retry_counter variable, but it wasn't
cleared after a successful write which would lead to bogus ERRORs when
multiple relations are dropped during the mdsync. I kept the original
per-entry counter, though the local variable approach could be made to work.

The memory leak obviously needs to be fixed, but that's just a matter of
adding a pfree.

>> Ok, will do that. Or would you like to just take over from here?
>
> No, I'm up to my ears in varlena. You're the one in a position to test
> this, anyway.

Ok.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2007-04-05 17:08:32 Re: Fix mdsync never-ending loop problem
Previous Message Andrew Dunstan 2007-04-05 16:51:34 buildfarm minimum script versions

Browse pgsql-patches by date

  From Date Subject
Next Message Tom Lane 2007-04-05 17:08:32 Re: Fix mdsync never-ending loop problem
Previous Message Tom Lane 2007-04-05 16:34:03 Re: Fix mdsync never-ending loop problem