Re: fork/exec patch: pre-CreateProcess finalization

From: Claudio Natoli <claudio(dot)natoli(at)memetrics(dot)com>
To: 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Claudio Natoli <claudio(dot)natoli(at)memetrics(dot)com>
Cc: "'pgsql-patches(at)postgresql(dot)org'" <pgsql-patches(at)postgresql(dot)org>
Subject: Re: fork/exec patch: pre-CreateProcess finalization
Date: 2003-12-27 06:16:14
Message-ID: A02DEC4D1073D611BAE8525405FCCE2B0280AB@harris.memetrics.local
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches

> > Here's the critical difference in our thinking:
>
> > ISTM that we'll have to go to a lot of work to get the fork/exec
> > case to re-load the context required to format up the argument list to
> > PostgresMain. Certainly, it is a lot more work than allowing the
postmaster
> > itself (not the forked child) to create the argument list, albeit moving
> > auth to PostgresMain.
>
> I don't follow your thinking here. The information that has to be
> reloaded *must* be passed across the fork/exec boundary in
> either case.
> For example, there is no alternative to telling the backend process
> where PGDATA is: it's got to be passed down some way or another.

Yes. But not all of it. Sure, PGDATA is a trivial example of something
that's gotta get across one way or another. I take no argument with that.

How about things like, for instance, canAcceptConnections(). To make this
call successfully in a fork/exec'd child would take a bit of work. Of
course, there is a work-around... pass the value into BackendFork (or, in
the fork/exec case, on the command line). But, ISTM that, do this a few
times, and you might as well have just had the postmaster format up the
argument list.

> The reason your patch is messy is that the PostgresMain command line
> string involves both information passed down from the postmaster and
> information acquired from the client's connection-request packet.

Hmm.. that's not the way I see it.

> > After fork/exec'ing, the child process can't recover the context needed
to
> > create the argument list which it'll need to build up the PostgresMain
arg list.
>
> If that's literally true, then we are screwed because things cannot
> work. Pardon me for doubting this statement. All that information
> *must* be available in the child, once it has analyzed the information
> it must be able to collect from the postmaster and performed
> the client authentication handshake.

Ok. That, in the context of where it was written, instead of "can't" should
have read "can no longer ... [in the absence of passing a great deal of
additional context]". Mea culpa. Anyway, moving along...

> In particular, this alternative is quite unworkable:
>
> > b) delay the fork() call to the end of BackendFork in both fork/exec and
> > normal cases, turning it into solely an argument formatter for
PostgresMain
>
> We can *not* postpone the fork() until after client authentication.
> That loses all of the work that's been done since circa 7.1
> to eliminate cases where the postmaster gets blocked waiting for a
> single client to respond. SSL, PAM, and I think Kerberos all create
> denial-of-service risks if we do that.

Ah. OK! Now I see where we are talking at cross-purposes. Forking after
client auth was NOT what I was trying to suggest at all.

Let's have a look at BackendFork. Here's what it needs to set up the arg
list for PostgresMain, or otherwise uses:

* PreAuthDelay
* canAcceptConnections()
* ExtraOptions
* debugbuf
* DataDir (in EXEC_BACKEND case)
BackendInit (ie. client authentication)
port->cmdline_options
port->proto
port->database_name
port->user_name

What I was suggesting with b) was to format up the command line for the
items prefixed by * in the postmaster,
do the fork (or fork/exec), and then run the authentication in, say
PostgresMain. (Note: this is essentially what the fork/exec case currently
does).

Now, what I think you are suggesting (correct me if I'm wrong), is that we
should pass along whatever we need to in order to be able to setup the
fork/exec'd process to run BackendFork more or less unchanged.

I prefer the former option, as there is less duplication. Should anything be
changed/added to the items required for BackendFork'ing, no changes would be
required to the fork/exec code. This will not necessarily be true if we need
to pass a bunch of (additional) context, as in the latter case.

ISTM this would let us make BackendFork *very* simple, and pretty much
identical in both the normal and fork/exec cases, for the cost of pushing
the authenication step across to PostgresMain (non stand-alone case), which
is what the fork/exec case does now anyway!

What do you think?

Claudio

---
Certain disclaimers and policies apply to all email sent from Memetrics.
For the full text of these disclaimers and policies see
<a
href="http://www.memetrics.com/emailpolicy.html">http://www.memetrics.com/em
ailpolicy.html</a>

Responses

Browse pgsql-patches by date

  From Date Subject
Next Message Claudio Natoli 2003-12-27 10:27:27 Re: fork/exec patch: pre-CreateProcess finalization
Previous Message Tom Lane 2003-12-27 05:45:00 Re: Quoting of psql \d output