From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org, tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com, robertmhaas(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, bruce(at)momjian(dot)us, GavinFlower(at)archidevsys(dot)co(dot)nz, ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com, alvherre(at)alvh(dot)no-ip(dot)org, michael(dot)paquier(at)gmail(dot)com, david(at)pgmasters(dot)net, craig(at)2ndquadrant(dot)com
Date: 2019-01-30 02:46:28
Lists: pgsql-hackers


* Andres Freund (andres(at)anarazel(dot)de) wrote:
> On 2019-01-29 21:09:22 -0500, Stephen Frost wrote:
> > * Andres Freund (andres(at)anarazel(dot)de) wrote:
> > > On 2019-01-29 20:52:08 -0500, Stephen Frost wrote:
> > > > * Andres Freund (andres(at)anarazel(dot)de) wrote:
> > > > > Leaving the desirability of the feature aside, isn't this racy as hell?
> > > > > I.e. it seems entirely possible that backends stop/start between
> > > > > determining the PID, and the ALTER SESSION creating the file, and it
> > > > > actually being processed. By the time that happens an entirely different
> > > > > session might be using that pid.
> > > >
> > > > That seems like something that could possibly be fixed, by adding in
> > > > other things to make it more likely to be the 'right' backend, but my
> > > > complaint here is that we are, again, using files to pass data between
> > > > backend processes and that seems like a pretty terrible direction to be
> > > > going in.
> > >
> > > I think pid would be wholly unsuitable for this, and if so we'd have to
> > > use something entirely independent.
> >
> > I would think you'd use pid + other stuff (user OID, backend proc entry
> > number, other things). Basically, if you see a file there with your pid
> > on it, then you look and see if the other things match- if so, act on
> > it, if not, discard the file. I still don't like this approach though,
> What do we gain by including the pid here? Seems much more reasonable to
> use a session id that's just unique over the life of a cluster.

Are you suggesting we have one of those already, or is the idea that
we'd add a cluster-lifetime session id for this?
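
To make the "pid + other stuff" check discussed above concrete, here is
a rough sketch in Python. It is purely illustrative: BackendIdentity,
RequestFile, should_act and all field names are made up for this
example and are not PostgreSQL code or an actual proposal from the
thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackendIdentity:
    # Hypothetical fields, standing in for "pid + other stuff".
    pid: int
    user_oid: int
    proc_slot: int  # stand-in for the backend proc entry number


@dataclass(frozen=True)
class RequestFile:
    # Identity of the backend the request was written for,
    # plus an opaque payload (for example, a setting to change).
    target: BackendIdentity
    payload: str


def should_act(me: BackendIdentity, req: RequestFile) -> bool:
    """Act only if every identifying field matches; otherwise the
    file was meant for some earlier backend and should be discarded."""
    return req.target == me


# A request written for a backend that has since exited.
old = BackendIdentity(pid=4242, user_oid=10, proc_slot=3)
req = RequestFile(target=old, payload="work_mem=64MB")

# A new backend that happens to reuse the same pid but differs
# in the other fields, so it discards the file.
reused = BackendIdentity(pid=4242, user_oid=16384, proc_slot=7)

print(should_act(old, req))
print(should_act(reused, req))
```

Note that this only narrows the window: a later backend that happens to
end up with the same pid, user and proc slot would still match, which is
the case a session id unique over the life of the cluster would avoid.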

> > I really don't think files are the right way to be going about this.
> Why? They persist and can be removed, they are introspectable, they
> automatically are removed from memory when there's no demand...

Well, we don't actually want these to persist, and it's because they do
that we have to deal with removing them, and I don't see a whole lot of
gain from them being introspectable; indeed, that seems like more of a
drawback than anything since it will invite people to whack those files
around and abuse them as if they were some externally documented

They also cost disk space, they require inodes, they have to be cleaned
up and managed on shutdown/restart, backup tools need to understand what
to do with them, potentially, we have to consider if we should have a
checksum for them, we have to handle out-of-disk space cases with them,
they could cause us to run out of disk space...

These same arguments could have been made for how we might have
implemented parallel query, too. I agree that the use-case there is
somewhat different, but there's also a lot of similarity between the
two when it comes to managing the passing of information between
backends.


