Re: [PATCH] unified frontend support for pg_malloc et al and palloc/pfree mulation (was xlogreader-v4)

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] unified frontend support for pg_malloc et al and palloc/pfree mulation (was xlogreader-v4)
Date: 2013-01-13 17:51:26
Message-ID: 20130113175126.GA26173@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2013-01-12 15:39:16 -0500, Tom Lane wrote:
> Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> writes:
> > On 12.01.2013 20:42, Andres Freund wrote:
> >>> I don't care for that too much in detail -- if errstart were to return
> >>> false (it shouldn't, but if it did) this would be utterly broken,
>
> > With the current ereport(), we'll call abort() if errstart returns false
> > and elevel >= ERROR. That's even worse than making an error reporting
> > calls that elog.c isn't prepared for.
>
> No it isn't: you'd get a clean and easily interpretable abort. I am not
> sure exactly what would happen if we plow forward with calling elog.c
> functions after errstart returns false, but it wouldn't be good, and
> debugging a field report of such behavior wouldn't be easy.
>
> This is actually a disadvantage of the proposal to replace the abort()
> calls with __builtin_unreachable(), too. The gcc boys interpret the
> semantics of that as "if control reaches here, the behavior is
> undefined" --- and their notion of undefined generally seems to amount
> to "we will screw you as hard as we can". For example, they have no
> problem with using the assumption that control won't reach there to make
> deductions while optimizing the rest of the function. If it ever did
> happen, I would not want to have to debug it. The same goes for
> __attribute__((noreturn)), BTW.

We could make add an Assert() additionally?

> >> Yea, I didn't really like it all that much either - but anyway, I have
> >> yet to find *any* version with a local variable that doesn't lead to
> >> 200kb size increase :(.
>
> > Hmm, that's strange. Assuming the optimizer optimizes away the paths in
> > the macro that are never taken when elevel is a constant, I would expect
> > the resulting binary to be smaller than what we have now.
>
> I was wondering about that too, but haven't tried it locally yet. It
> could be that Andres was looking at an optimization level in which a
> register still got allocated for the local variable.

Nope, it was O2, albeit with -fno-omit-frame-pointer (for usable
hierarchical profiles).

> > Here's what I got with and without my patch on my laptop:
>
> > -rwxr-xr-x 1 heikki heikki 24767237 tammi 12 21:27 postgres-do-while-mypatch
> > -rwxr-xr-x 1 heikki heikki 24623081 tammi 12 21:28 postgres-unpatched
>
> > But when I build without --enable-debug, the situation reverses:
>
> > -rwxr-xr-x 1 heikki heikki 5961309 tammi 12 21:48 postgres-do-while-mypatch
> > -rwxr-xr-x 1 heikki heikki 5986961 tammi 12 21:49 postgres-unpatched
>
> Yes, I noticed that too: these patches make the debug overhead grow
> substantially for some reason. It's better to look at the output of
> "size" rather than the executable's total file size. That way you don't
> need to rebuild without debug to see reality. (I guess you could also

But the above is spot on. While I used strip earlier I somehow forgot it
here and so the debug overhead is part of what I saw. But, in comparison
to my baseline (which is replacing the abort() with
pg_unreachable()/__builtin_unreachable()) its still a loss
performancewise which is why I didn't doubt my results...

Pavel's testcase with changing ereport implementations (5 runs, minimal
time):
EREPORT_EXPR_OLD_NO_UNREACHABLE: 9964.015ms, 5530296 bytes (stripped)
EREPORT_EXPR_OLD_ABORT: 9765.916ms, 5466936 bytes (stripped)
EREPORT_EXPR_OLD_UNREACHABLE: 9646.502ms, 5462456 bytes (stripped)
EREPORT_STMT_HEIKKI: 10133.968ms, 5435704 bytes (stripped)
EREPORT_STMT_ANDRES: 9548.436ms, 5462232 bytes (stripped)

Where my variant is:
#define ereport_domain(elevel, domain, rest) \
do { \
const int elevel_ = elevel; \
if (errstart(elevel_,__FILE__, __LINE__, PG_FUNCNAME_MACRO, domain))\
{ \
errfinish rest; \
} \
if (elevel_>= ERROR) \
pg_unreachable(); \
} while(0)

So I suggest using that with an additional Assert(0) in the if(elevel_)
branch.

The assembler generated for my version is exactly the same when elevel
is constant in comparison with Heikki's but doesn't generate any
additional code in the case its not constant and I guess thats where it
gets faster?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2013-01-13 18:01:07 Re: Re: [PATCH] unified frontend support for pg_malloc et al and palloc/pfree mulation (was xlogreader-v4)
Previous Message Tom Lane 2013-01-13 17:44:44 Re: Re: logical changeset generation v3 - comparison to Postgres-R change set format