Re: [bug fix] Produce a crash dump before main() on Windows

From: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
To: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [bug fix] Produce a crash dump before main() on Windows
Date: 2018-11-06 04:53:37
Message-ID: CAJrrPGcxZi4Z_SttnuUvYOaw+SAdk7+cJgYpuf7ao43vuJLH2w@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jul 26, 2018 at 3:52 PM Tsunakawa, Takayuki <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com> wrote:

> From: Michael Paquier [mailto:michael(at)paquier(dot)xyz]
> > No, I really mean a library dependency failure. For example, imagine that
> > Postgres is compiled on Windows dynamically, and that it depends on
> > libxml2.dll, which is itself compiled dynamically. Then imagine, in a
> > custom build ecosystem, that someone comes in and adds lz support to
> > libxml2 on Windows. If Postgres still consumes libxml2 but does not add
> > in its PATH a version of lz, then a backend in need of libxml2 would fail
> > to load, causing Postgres to not start properly. True, painful, story.
>
> I see, that's surely painful. But the DLL in use cannot be overwritten on
> Windows. So, I assume the following:
>
> 1. postmaster loads libxml2.dll without LZ in folder A.
> 2. Someone adds libxml2.dll with LZ in folder B. Folder B is ahead of
> folder A in postgres's PATH.
> 3. Some user tries to connect to a database, creating a new child postgres
> process, which fails to load libxml2.dll in folder B.
>

I doubt that the above scenario is possible on Windows. Once a process has
started, it may not pick up environment variable changes that are made
afterwards. I am not sure, though.
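
To make the PATH part of that scenario concrete, here is a minimal sketch of a
left-to-right PATH lookup. It is only an illustration under stated assumptions:
first_dir_with_file(), has_file_fn and demo_has_file() are hypothetical names,
not PostgreSQL or Windows APIs, and the real Windows DLL search order consults
other locations (such as the application directory and the system directories)
before PATH. The PATH value passed in stands for whatever the process
inherited when it was created.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical callback: returns nonzero if directory "dir" contains "file". */
typedef int (*has_file_fn) (const char *dir, const char *file);

/*
 * Walk a ';'-separated PATH value from left to right and copy the first
 * directory for which has_file() reports a hit into "out".
 * Returns 1 on a hit; returns 0 and sets "out" to "" if nothing matches.
 * Entries that do not fit into "out" are skipped.
 */
static int
first_dir_with_file(const char *path, const char *file,
                    has_file_fn has_file, char *out, size_t outlen)
{
    const char *p = path;

    assert(out != NULL && outlen > 0);

    while (*p != '\0')
    {
        size_t      len = strcspn(p, ";");  /* length of this PATH entry */

        if (len > 0 && len < outlen)
        {
            memcpy(out, p, len);
            out[len] = '\0';
            if (has_file(out, file))
                return 1;
        }
        p += len;
        if (*p == ';')
            p++;                /* step over the separator */
    }

    out[0] = '\0';
    return 0;
}

/* Stand-in for a file system check: folders A and B both hold libxml2.dll. */
static int
demo_has_file(const char *dir, const char *file)
{
    return strcmp(file, "libxml2.dll") == 0 &&
        (strcmp(dir, "C:\\A") == 0 || strcmp(dir, "C:\\B") == 0);
}
```

With the PATH value "C:\\B;C:\\A" the function reports folder B, and with
"C:\\A;C:\\B" it reports folder A. So which copy a new child process would
try to load depends on the PATH it actually inherited, not on later changes
made elsewhere.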

> > What is ticking me is if the popup window stuff discussed on this thread
> > could be a problem in the detection of such dependency errors, as with the
> > product I am thinking about, Postgres is not running as a service, but kicked
> > by another thing which is a service, and monitors the postmaster.
>
> I understood you are talking about a case where some (server) application
> uses PostgreSQL internally. That application runs as a Windows service,
> but PostgreSQL itself doesn't on its own. The application starts
> PostgreSQL by running pg_ctl start.
>
> In that case, postgres runs under a service. I confirmed it with the
> following test program. This code is extracted from pgwin32_is_service()
> in PostgreSQL.
>
> --------------------------------------------------
> #include <windows.h>
> #include <stdio.h>
>
> int
> main(void)
> {
>     BOOL        IsMember;
>     PSID        ServiceSid;
>     SID_IDENTIFIER_AUTHORITY NtAuthority = {SECURITY_NT_AUTHORITY};
>     FILE       *fp;
>
>     /* Use the default error mode (system error dialogs enabled) */
>     SetErrorMode(0);
>
>     /* Check for service group membership */
>     if (!AllocateAndInitializeSid(&NtAuthority, 1,
>                                   SECURITY_SERVICE_RID, 0, 0, 0, 0, 0, 0, 0,
>                                   &ServiceSid))
>     {
>         fprintf(stderr, "could not get SID for service group: error code %lu\n",
>                 GetLastError());
>         return 1;
>     }
>
>     if (!CheckTokenMembership(NULL, ServiceSid, &IsMember))
>     {
>         fprintf(stderr, "could not check access token membership: error code %lu\n",
>                 GetLastError());
>         FreeSid(ServiceSid);
>         return 1;
>     }
>     FreeSid(ServiceSid);
>
>     /* Append the result to a file */
>     fp = fopen("d:\\a.txt", "a");
>     if (fp == NULL)
>         return 1;
>     if (IsMember)
>         fprintf(fp, "is under service\n");
>     else
>         fprintf(fp, "is not under service\n");
>     fclose(fp);
>
>     return 0;
> }
> --------------------------------------------------
>
> You can build the above program with:
> cl chksvc.c advapi32.lib
>

Thanks for confirming that PostgreSQL runs under a service in that case.

Whether this fix is required depends on the answers to the following:
1. Is starting the Postgres server with pg_ctl, without a service, something
that is done in production?
2. Without this fix, how difficult is it to find out that something is
preventing the server from starting? If it is easy to find out, it may be
better to provide a troubleshooting guide for Windows users.

I am in favor of a doc fix if the problem is easy to find, rather than
making assumptions about how users run the server.

Regards,
Haribabu Kommi
Fujitsu Australia
