Re: Allow escape in application_name

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: masao(dot)fujii(at)oss(dot)nttdata(dot)com
Cc: kuroda(dot)hayato(at)fujitsu(dot)com, houzj(dot)fnst(at)fujitsu(dot)com, ikedamsh(at)oss(dot)nttdata(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, tgl(at)sss(dot)pgh(dot)pa(dot)us
Subject: Re: Allow escape in application_name
Date: 2021-11-05 03:17:31
Message-ID: 20211105.121731.1964382526981590180.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Fri, 5 Nov 2021 03:14:00 +0900, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote in
>
>
> On 2021/11/04 20:42, kuroda(dot)hayato(at)fujitsu(dot)com wrote:
> > Dear Fujii-san,
> >Thank you for giving comments! I attached new patches.
>
> Thanks for updating the patch!
>
> + <para>
> + Note that if embedded strings have Non-ASCII,
> + these characters will be replaced with the question marks
> (<literal>?</literal>).
> + This limitation is caused by <literal>application_name</literal>.
> + </para>
>
> Isn't this descripton confusing because postgres_fdw actually doesn't
> do this?
> postgres_fdw just passses the application_name as it is to the remote
> server.

I'm not sure that that distinction is so clear for users. So I feel we
want a notice something like this. But it doesn't seem correct as it
is. When the user name of the session consists of non-ascii
characters, %u is finally seen as a sequence of '?'. It is not a
embedded strings in pgfdw_application_name. I didn't notice this
behavior.

"pgfdw_application_name is set to application_name of a pgfdw
connection after placeholder conversion, thus the resulting string is
subject to the same restrictions to application_names.". Maybe
followed by "If session user name or database name contains non-ascii
characters, they are replaced by question marks "?"".

Sidetracking a bit, considering this restriction, we might need to
reconsider about %u and %d. session-id might be useful as it is
ascii-string that can identify a session exactly.

> >> On second thought, do we really need padding support for
> >> application_name
> >> in postgres_fdw? I just thought that when I read the description
> >> "Padding can be useful to aid human readability in log files." in the
> >> docs
> >> about log_line_prefix.
> >My feelings was that we don't have any reasons not to support,
> > but I cannot found some good use-cases.
> > Now I decided to keep supporting that,
> > but if we face another problem I will cut that.
>
> I'd like to hear more opinions about this from others.
> But *if* there is actually no use case, I'd like not to add
> the feature, to make the code simpler.

I think padding is useful because it alingns the significant content
of log lines by equating the length of the leading fixed
components. application_name doesn't make any other visible components
unaligned even when shown in the pg_stat_activity view. It is not
useful in that sense.

Padding make the string itself make look maybe nicer, but gives no
other advantage.

I doubt people complain to the lack of the padding feature. Addition
to that, since pgfdw_application_name and log_line_prefix work
different way in regard to multibyte characters, we don't need to
struggle so hard to consilidate them in their behavior.

# I noticed that the paddig feature doesn't consider multibyte
# characters. (I'm not suggesting them ought to be handled
# "prpoerly").

In short, I'm for to removing it by +0.7.

> >> + case 'a':
> >> + if (MyProcPort)
> >> + {
> >>
> >> I commented that the check of MyProcPort is necessary because
> >> background
> >> worker not having MyProcPort may access to the remote server. The
> >> example
> >> of such process is the resolver process that Sawada-san was proposing
> >> for
> >> atomic commit feature. But the proposal was withdrawn and for now
> >> there seems no such process. If this is true, we can safely remove the
> >> check
> >> of MyProcPort? If so, something like Assert(MyProcPort != NULL) may
> >> need
> >> to be added, instead.
> >My understating was that we don't have to assume committing the
> >Sawada-san's patch.
> > I think that FDW is only available from backend processes in the
> > current implementation,
> > and MyProcPort will be substituted when processes are initialized() -
> > in BackendInitialize().
> > Since the backend will execute BackendInitialize() after forking()
> > from the postmaster,
> > we can assume that everyone who operates FDW has a valid value for
> > MyProcPort.
> > I removed if statement and added assertion.

I think it is right.

> + case 'u':
> + Assert(MyProcPort != NULL);
>
> Isn't it enough to perform this assertion check only once
> at the top of parse_pgfdw_appname()?

Yeah, in either way, we should treat them in the same way.

> > We can force parse_pgfdw_appname() not to be called if MyProcPort does
> > not exist,
> > but I don't think it is needed now.
>
> Yes.

(I assume you said "it is needed now".) I'm not sure how to force
that but if it means a NULL MyProcPort cuases a crash, I think
crashing server is needlessly too aggressive as the penatly.

> >> If user name or database name is not set, the name is replaced with
> >> "[unknown]". log_line_prefix needs this because log message may be
> >> output when they have not been set yet, e.g., at early stage of
> >> backend
> >> startup. But I wonder if application_name in postgres_fdw actually
> >> need that.. Thought?
> >Hmm, I think all of backend processes have username and database, but
> > here has been followed from Horiguchi-san's suggestion:
> >```
> > I'm not sure but even if user_name doesn't seem to be NULL, don't we
> > want to do the same thing with %u of log_line_prefix for safety?
> > Namely, setting [unknown] if user_name is NULL or "". The same can be
> > said for %d.
> > ```
> >But actually I don't have strong opinions.
>
> Ok, we can wait for more opinions about this to come.
> But if user name and database name should NOT be NULL
> in postgres_fdw, I think that it's better to add the assertion
> check for those conditions and get rid of "[unknown]" part.

It seems to me that postgres-fdw asumes a valid user id, but doesn't
make no use of databsae, server port, and process id. What I thought
here is that making it an assertion is too much. So just ignoring the
replacement is also fine to me.

That being said, if we are eager not to have unused code paths, it is
reasonable enough. I don't object strongly to replace it with an
assertion.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Nancarrow 2021-11-05 03:20:15 Re: Data is copied twice when specifying both child and parent table in publication
Previous Message Tom Lane 2021-11-05 02:47:23 Re: inefficient loop in StandbyReleaseLockList()