Re: Why is pq_begintypsend so slow?

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jack Christensen <jack(at)jncsoftware(dot)com>, David Fetter <david(at)fetter(dot)org>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Why is pq_begintypsend so slow?
Date: 2020-07-31 15:50:18
Message-ID: 20200731155018.bvcoikaag4tgcoog@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2020-07-31 11:14:46 -0400, Robert Haas wrote:
> Here is some review for the first few patches in this series.

Thanks!

> I am generally in favor of applying 0001-0003 no matter what else we
> decide to do here. However, as might be expected, I have a few
> questions and comments.
>
> Regarding 0001:
>
> I dislike the name initStringInfoEx() because we do not use the "Ex"
> convention for function names anywhere in the code base. We do
> sometimes use "extended", so this could be initStringInfoExtended(),
> perhaps, or something more specific, like initStringInfoWithLength().

I dislike the length of the function name, but ...

> Regarding the FIXME in that function, I suggest that this should be
> the caller's job. Otherwise, there will probably be some caller which
> doesn't want the add-one behavior, and then said caller will subtract
> one to compensate, and that will be silly.

Fair point.

> I am not familiar with pg_restrict and don't entirely understand the
> motivation for it here. I suspect you have done some experiments and
> figured out that it produces better code, but it would be interesting
> to hear more about how you got there. Perhaps there could even be some
> brief comments about it. Locutions like this are particularly
> confusing to me:
>
> +static inline void
> +resetStringInfo(StringInfoData *pg_restrict str)
> +{
> + *(char *pg_restrict) (str->data) = '\0';
> + str->len = 0;
> + str->cursor = 0;
> +}

The restrict tells the compiler that 'str' and 'str->data' won't be
pointing to the same memory. Which can simpilify the code its
generating. E.g. it'll allow the compiler to keep str->data in a
register, instead of reloading it for the next
appendStringInfo*. Without the restrict it can't, because str->data = 0
could otherwise overwrite parts of the value of ->data itself, if ->data
pointed into the StringInfo. Similarly, str->data could be overwritten
by str->len in some other cases.

Partially the reason we need to add the markers is that we compile with
-fno-strict-aliasing. But even if we weren't, this case wouldn't be
solved without an explicit marker even then, because char * is allowed
to alias...

Besides keeping ->data in a register, the restrict can also just
entirely elide the null byte write in some cases, e.g. because the
resetStringInfo() is followed by a appendStringInfoString(, "constant").

> I don't understand how this can be saving anything. I think the
> restrict definitions here mean that str->data does not overlap with
> str itself, but considering that these are unconditional stores, so
> what? If the code were doing something like memset(str->data, 0,
> str->len) then I'd get it: it might be useful to know for optimization
> purposes that the memset isn't overwriting str->len. But what code can
> we produce for this that wouldn't be valid if str->data = &str? I
> assume this is my lack of understanding rather than an actual problem
> with the patch, but I would be grateful if you could explain.

I hope the above makes this make sene now? It's about subsequent uses of
the StringInfo, rather than the body of resetStringInfo itself.

> It is easier to see why the pg_restrict stuff you've introduced into
> appendBinaryStringInfoNT is potentially helpful: e.g. in
> appendBinaryStringInfoNT, it promises that memcpy can't clobber
> str->len, so the compiler is free to reorder without changing the
> results. Or so I imagine. But then the one in appendBinaryStringInfo()
> confuses me again: if str->data is already declared as a restricted
> pointer, then why do we need to cast str->data + str->len to be
> restricted also?

But str->data isn't declared restricted without the explicit use of
restrict? str is restrict'ed, but it doesn't apply "recursively" to all
pointers contained therein.

> In appendStringInfoChar, why do we need to cast to restrict twice? Can
> we not just do something like this:
>
> char *pg_restrict ep = str->data+str->len;
> ep[0] = ch;
> ep[1] = '\0';

I don't think that'd tell the compiler that this couldn't overlap with
str itself? A single 'restrict' can never (?) help, you need *two*
things that are marked as not overlapping in any way.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2020-07-31 15:51:35 Re: Ought to use heap_multi_insert() for pg_attribute/depend insertions?
Previous Message Robert Haas 2020-07-31 15:14:46 Re: Why is pq_begintypsend so slow?