From: | Boszormenyi Zoltan <zb(at)cybertec(dot)at> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Marc Cousin <cousinmarc(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Hans-Juergen Schoenig <hs(at)cybertec(dot)at>, Ants Aasma <ants(at)cybertec(dot)at>, Robert Haas <robertmhaas(at)gmail(dot)com> |
Subject: | Re: [PATCH] lock_timeout and common SIGALRM framework |
Date: | 2012-09-24 21:30:41 |
Message-ID: | 5060D101.8020908@cybertec.at |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
2012-09-22 20:49 keltezéssel, Tom Lane írta:
> Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
>> new version with a lot more cleanup is attached.
> I looked at this patch, and frankly I'm rather dismayed. It's a mess.
I hope you won't find this one a mess. I tried to address all your complaints.
> [rather long diatribe on modifying PGSemaphoreLock improperly]
I have returned to the previous version that used PGSemaphoreTimedLock
and this time I save and restore errno around calling the timeout condition
checker.
> The whole lmgrtimeout module seems to me to be far more mechanism than is
> warranted, for too little added functionality. [...]
lmgrtimeout is no more, back to hardcoding things.
> The minimum thing I would want it to do is avoid calling
> timeout.c multiple times, which is what would happen right now (leading
> to four extra syscalls per lock acquisition, which is enough new overhead
> to constitute a strong objection to committing this patch at all).
I have added enable_multiple_timeouts() and disable_multiple_timeouts()
that minimize the number setitimer() calls.
> There's considerable lack of attention to updating comments, too.
> For instance in WaitOnLock you only bothered to update the comment
> immediately adjacent to the changed code, and not the two comment
> blocks above that, which both have specific references to deadlocks
> being the reason for failure.
I modified the comment in question. I hope the wording is right.
> Also, the "per statement" mode for lock timeout doesn't seem to be
> any such thing, because it's implemented like this:
>
> + case LOCK_TIMEOUT_PER_STMT:
> + enable_timeout_at(LOCK_TIMEOUT,
> + TimestampTzPlusMilliseconds(
> + GetCurrentStatementStartTimestamp(),
> + LockTimeout));
> + break;
>
> That doesn't provide anything like "you can spend at most N milliseconds
> waiting for locks during a statement". What it is is "if you happen to be
> waiting for a lock N milliseconds after the statement starts, or if you
> attempt to acquire any lock more than N milliseconds after the statement
> starts, you lose instantly". I don't think that definition actually adds
> any useful functionality compared to setting statement_timeout to N
> milliseconds, and it's certainly wrongly documented. To do what the
> documentation implies would require tracking and adding up the time spent
> waiting for locks during a statement. Which might be a good thing to do,
> especially if the required gettimeofday() calls could be shared with what
> timeout.c probably has to do anyway at start and stop of a lock wait.
> But this code doesn't do it.
The code now properly accumulates the time spent in waiting for
LOCK_TIMEOUT. This means that if STATEMENT_TIMEOUT and
LOCK_TIMEOUT are set to the same value, STATEMENT_TIMEOUT
will trigger because it considers the time as one contiguous unit,
LOCK_TIMEOUT only accounts the time spent in waiting, not the time
spent with useful work. This means that LOCK_TIMEOUT doesn't
need any special code in its handler function, it's a NOP. The
relation between timeouts is only handled by the timeout.c module.
> Lastly, I'm not sure where is the best place to be adding the control
> logic for this, but I'm pretty sure postinit.c is not it. It oughta be
> somewhere under storage/lmgr/, no?
The above change means that there is no control logic outside of
storage/lmgr now.
I temporarily abandoned the idea of detailed error reporting on
the object type and name/ID. WaitOnLock() reports "canceling statement
due to lock timeout" and LockAcquire() kept its previous semantics.
This can be quickly revived in case of demand, it would be another ~15K patch.
I hope you can find another time slot in this CF to review this one.
Best regards,
Zoltán Böszörményi
--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/
Attachment | Content-Type | Size |
---|---|---|
2-lock_timeout-v26.patch.gz | application/x-tar | 11.9 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Simon Riggs | 2012-09-24 21:52:46 | Re: [PoC] load balancing in libpq |
Previous Message | Alvaro Herrera | 2012-09-24 20:17:20 | Re: External Replication |