I attach a patch to fix some issues in the autovac process handling
code. Mainly, instead of rechecking periodically in the launcher
whether a worker was able to start or not, what we do is signal it from
the postmaster when it fails.
In order to do this I had to make the postmaster set a flag in shared
memory, much like the pmsignal.c code does. According to previous
discussions this is an acceptable thing for postmaster to do.
I must admit I did not try to fill shared memory with garbage and see if
it caused a postmaster crash. What I did test was creating a database,
making sure that the launcher was picking it up, and then delete the
database's directory. This correctly causes a worker to log errors when
trying to connect, and it is handled well -- i.e. other databases are
not starved. (One remaining problem I see is that a database in this
state, or similar, that fails to be vacuumed, *will* cause starvation if
in danger of Xid wraparound).
Another thing I noticed is that I can change the "autovacuum" parameter
in postgresql.conf and send SIGHUP to postmaster, which correctly
starts the launcher if it was previously disabled. To make the shutdown
case act accordingly, I made the launcher check the "start daemon" flag
after rereading the config file, and quit if it's off. I tried multiple
combinations of having it set and unset, changing
stats_collect_tuple_level and it works fine.
One problem with the patch is this (new code):
bn = (Backend *) malloc(sizeof(Backend));
! if (bn)
! bn->pid = StartAutoVacWorker();
! bn->is_autovacuum = true;
! /* we don't need a cancel key */
! if (bn->pid > 0)
! /* FIXME -- unchecked memory allocation here */
! DLAddHead(BackendList, DLNewElem(bn));
If the palloc() inside DLNewElem fails, we will fail to report a "fork
failure" to the launcher. I am not sure how serious this is. One idea
that came to mind was using a PG_TRY block, sending the signal in the
CATCH block, and then rethrowing the exception. Is this acceptable?
All in all, this patch increases the reliability of autovacuum, so I
intend to apply it shortly unless there are objections. Further
improvements might come later as I become aware of possible fixes.
Alvaro Herrera http://www.amazon.com/gp/registry/DXLWNGRJD34J
"Hay quien adquiere la mala costumbre de ser infeliz" (M. A. Evans)
pgsql-patches by date
|Next:||From: Tom Lane||Date: 2007-06-19 17:53:17|
|Subject: Re: [HACKERS] 'Waiting on lock' |
|Previous:||From: Michael Paesold||Date: 2007-06-19 17:37:40|
|Subject: Re: WIP: rewrite numeric division|