Re: SendRowDescriptionMessage() is slow for queries with a lot of columns

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SendRowDescriptionMessage() is slow for queries with a lot of columns
Date: 2017-10-03 16:23:17
Message-ID: 20171003162317.ipzojkq3wdhed2ns@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2017-10-03 11:06:08 -0400, Robert Haas wrote:
> On Tue, Oct 3, 2017 at 3:55 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > Attached is a revised version of this patchset.
>
> I don't much like the functions with "_pre" affixed to their names.
> It's not at all clear that "pre" means "preallocated"; it sounds more
> like you're doing something ahead of time. I wonder about maybe
> calling these e.g. pq_writeint16, with "write" meaning "assume
> preallocation" and "send" meaning "don't assume preallocation". There
> could be other ideas, too.

I can live with write, although I don't think it jibes well with
the pq_send* naming.

> > 3) The use of restrict, with a configure based fallback, is something
> > we've not done before, but it's C99 and delivers significantly more
> > efficient code. Any arguments against?

> Also, unless I'm missing something, there's nothing to keep
> pq_sendintXX_pre from causing an alignment fault except unbridled
> optimism...

Fair argument, I'll replace it back with a fixed-length memcpy. At least
my gcc optimizes that away again - I ended up with the plain assignment
while debugging the above, due to the lack of restrict.

> It's pretty unobvious why it helps here. I think you should add
> comments.

Will. I'd stared at this long enough that I thought it'd be obvious. But
it took me a couple hours to get there, so ... yes. The reason it's
needed here is that given:

static inline void
pq_sendint8_pre(StringInfo restrict buf, int8 i)
{
int32 ni = pg_hton32(i);

Assert(buf->len + sizeof(i) <= buf->maxlen);
memcpy((char* restrict) (buf->data + buf->len), &ni, sizeof(i));
buf->len += sizeof(i);
}

without the restrict the compiler has no way to know that buf, buf->len,
*(buf->data + x) do not overlap. Therefore buf->len cannot be kept in a
register across subsequent pq_sendint*_pre calls, but has to be stored
and loaded before each of the the memcpy calls. There's two reasons for
that:

- We compile -fno-strict-aliasing. That prevents the compiler from doing
type based inference that buf and buf->len do not overlap with
buf->data
- Even with type based strict aliasing, using char * type data and
memcpy prevents that type of analysis - but restrict promises that
there's no overlap - which we know there isn't.

Makes sense?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2017-10-03 16:30:02 Re: [sqlsmith] stuck spinlock in pg_stat_get_wal_receiver after OOM
Previous Message Robert Haas 2017-10-03 16:06:46 Re: 64-bit queryId?