Re: PL/TCL Patch to prevent postgres from becoming multithreaded

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: "Marshall, Steve" <smarshall(at)wsi(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: PL/TCL Patch to prevent postgres from becoming multithreaded
Date: 2007-09-14 17:20:58
Message-ID: 200709141720.l8EHKwX13792@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches


This has been saved for the 8.4 release:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---------------------------------------------------------------------------

Marshall, Steve wrote:
> There is a problem in PL/TCL that can cause the postgres backend to
> become multithreaded. Postgres is not designed to be multithreaded, so
> this causes downstream errors in signal handling. We have seen this
> cause a number of "unexpected state" errors associated with notification
> handling; however, unpredictable signal handling would be likely to
> cause other errors as well.
>
> Some sample scripts are attached which will reproduce this problem when
> running against a multithreaded version of TCL, but will work without
> error with single-threaded TCL library. The scripts are a combination
> of Unix shell, perl DBI, and SQL commands. The postgres process can be
> seen to have multiple threads using the Linux command "ps -Lwfu
> postgres". In this command the NLWP columns will be 2 for multithreaded
> backend processes. The threaded/non-threaded state of the TCL library
> can be ascertained on Linux using ldd to determine if libpthread.so is
> linked to the TCL library (e.g. "ldd /usr/lib/libtcl8.4.so").
>
> The multithreaded behavior occurs the first time PL/TCL is used in a
> postgres backend, but only when postgres is linked against a
> multithread-enabled version of libtcl. Thus, this problem can be
> side-stepped by linking against the proper TCL library. However
> multithreaded TCL libraries are becoming the norm in Linux distributions
> and seems ubiquitous in the Windows world. Therefore a fix to the
> PL/TCL code is warrented.
>
> We determined that postgres became multithreaded during the creation of
> the TCL interpreter in a function called tcl_InitNotifier. This
> function is part of TCL's Notifier subsystem, which is used to monitor
> for events asynchronously from the TCL event loop. Although initialized
> when an interpreter is created, the Notifier subsystem is not used until
> a process enters the TCL event loop. This never happens within a
> postgres process, because postgres implements its own event loop.
> Therefore the initialization of the Notifier subsystem is not necessary
> within the context of PL/TCL.
>
> Our solution was to disable the Notifier subsystem by overriding the
> functions associated with it using the Tcl_SetNotifier function. This
> allows 8 functions related to the Notifier to overriden. Even though we
> found only two of the functions were ever called within postgres, we
> overrode 8 functions with no-op versions, just for completeness. A
> patch file containing the changes to pltcl.c from its 8.2.4 version is
> also attached.
>
> We tested this patch with PostgreSQL 8.2.4 on both RedHat Enterprise 4.0
> usingTCL 8.4 (single threaded) and RHE 5.0 using TCL 8.4.13
> (multithreaded). We expect this solution to work with Windows as well,
> although we have not tested it. There may be some problems using this
> solution with old versions of TCL that pre-date the Tcl_SetNotifier
> function. However this function has been around for quite a while; it
> was added in in the TCL 8.2 release, circa 2000.
>
> We hope this patch will be considered for a future PostgreSQL release.
>
> Steve Marshall
> Paul Bayer
> Doug Knight
> WSI Corporation
>
>

[ application/x-gzip is not supported, skipping... ]

> *** pltcl.c.orig 2007-09-10 12:58:34.000000000 -0400
> --- pltcl.c 2007-09-11 11:37:33.363222114 -0400
> ***************
> *** 163,168 ****
> --- 163,258 ----
> static void pltcl_build_tuple_argument(HeapTuple tuple, TupleDesc tupdesc,
> Tcl_DString *retval);
>
> + /**********************************************************************
> + * Declarations for functions overriden using Tcl_SetNotifier.
> + **********************************************************************/
> + static int fakeThreadKey; /* To give valid address for ClientData */
> +
> + static ClientData
> + pltcl_InitNotifier _ANSI_ARGS_((void));
> +
> + static void
> + pltcl_FinalizeNotifier _ANSI_ARGS_((ClientData clientData));
> +
> + static void
> + pltcl_SetTimer _ANSI_ARGS_((Tcl_Time *timePtr));
> +
> + static void
> + pltcl_AlertNotifier _ANSI_ARGS_((ClientData clientData));
> +
> + static void
> + pltcl_CreateFileHandler _ANSI_ARGS_((int fd, int mask, Tcl_FileProc *proc, ClientData clientData));
> +
> + static void
> + pltcl_DeleteFileHandler _ANSI_ARGS_((int fd));
> +
> + static void
> + pltcl_ServiceModeHook _ANSI_ARGS_((int mode));
> +
> + static int
> + pltcl_WaitForEvent _ANSI_ARGS_((Tcl_Time *timePtr));
> +
> + /**********************************************************************
> + * Definitions for functions overriden using Tcl_SetNotifier.
> + * These implementations effectively disable the TCL Notifier subsystem.
> + * This is okay because we never enter the TCL event loop from postgres,
> + * so the notifier capabilities are initialized, but never used.
> + *
> + * NOTE: Only InitNotifier and DeleteFileHandler ever seem to get called
> + * by postgres, but we implement all the functions for completeness.
> + **********************************************************************/
> +
> + ClientData
> + pltcl_InitNotifier()
> + {
> + return (ClientData) &(fakeThreadKey);
> + }
> +
> + void
> + pltcl_FinalizeNotifier(clientData)
> + ClientData clientData; /* Not used. */
> + {
> + }
> +
> + void
> + pltcl_SetTimer(timePtr)
> + Tcl_Time *timePtr;
> + {
> + }
> +
> + void
> + pltcl_AlertNotifier(clientData)
> + ClientData clientData;
> + {
> + }
> +
> + void
> + pltcl_CreateFileHandler(fd, mask, proc, clientData)
> + int fd;
> + int mask;
> + Tcl_FileProc *proc;
> + ClientData clientData;
> + {
> + }
> +
> + void
> + pltcl_DeleteFileHandler(fd)
> + int fd;
> + {
> + }
> +
> + void
> + pltcl_ServiceModeHook(mode)
> + int mode;
> + {
> + }
> +
> + int
> + pltcl_WaitForEvent(timePtr)
> + Tcl_Time *timePtr; /* Maximum block time, or NULL. */
> + {
> + return 0;
> + }
>
> /*
> * This routine is a crock, and so is everyplace that calls it. The problem
> ***************
> *** 189,194 ****
> --- 279,287 ----
> void
> _PG_init(void)
> {
> + /* Notifier structure used to override functions in Notifier subsystem*/
> + Tcl_NotifierProcs notifier;
> +
> /* Be sure we do initialization only once (should be redundant now) */
> if (pltcl_pm_init_done)
> return;
> ***************
> *** 199,204 ****
> --- 292,316 ----
> #endif
>
> /************************************************************
> + * Override the functions in the Notifier subsystem.
> + *
> + * We do this to prevent the postgres backend from becoming
> + * multithreaded, which happens in the default version of
> + * Tcl_InitNotifier if the TCL library has been compiled with
> + * multithreading support (i.e. when TCL_THREADS is defined
> + * under Unix, and in all cases under Windows).
> + ************************************************************/
> + notifier.setTimerProc = pltcl_SetTimer;
> + notifier.waitForEventProc = pltcl_WaitForEvent;
> + notifier.createFileHandlerProc = pltcl_CreateFileHandler;
> + notifier.deleteFileHandlerProc = pltcl_DeleteFileHandler;
> + notifier.initNotifierProc = pltcl_InitNotifier;
> + notifier.finalizeNotifierProc = pltcl_FinalizeNotifier;
> + notifier.alertNotifierProc = pltcl_AlertNotifier;
> + notifier.serviceModeHookProc = pltcl_ServiceModeHook;
> + Tcl_SetNotifier(&notifier);
> +
> + /************************************************************
> * Create the dummy hold interpreter to prevent close of
> * stdout and stderr on DeleteInterp
> ************************************************************/

>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2007-09-14 17:21:13 Re: Use of global and static variables in shared libraries
Previous Message Tom Lane 2007-09-14 16:53:10 Re: errcontext function

Browse pgsql-patches by date

  From Date Subject
Next Message Tom Lane 2007-09-14 17:22:45 Re: invalidly encoded strings
Previous Message Tom Lane 2007-09-14 17:04:34 Re: XML binary I/O (was Re: tsearch refactorings)