Re: pg_background (and more parallelism infrastructure patches)

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-31 23:12:57
Message-ID: 54541779.1010906@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 10/24/14, 6:17 PM, Jim Nasby wrote:
>>>> - Does anyone have a tangible suggestion for how to reduce the code
>>>> duplication in patch #6?
>>>
>>> Between execute_sql_string() and tcop/exec_simple_query()? Is there stuff in
>>> exec_simple that's not safe for bgwriter? I'm not seeing why we can't use
>>> exec_simple. :/
>>
>> There are some differences if you compare them closely.
>
> Without doing a deep dive, I'm guessing that most of the extra stuff would be safe to re-use; it just wouldn't affect execute_sql_string. Obviously we could add a boolean to exec_simple_query for the case when it's being used by a bgwriter. Though, it seems like the biggest differences have to do with logging
>
> Here's the differences I see:
>
> - Disallowing transaction commands
> - Logging
> - What memory context is used (could we just use that differently in a pg_backend backend?)
> - Portal output format
> - What to do with the output of intermediate commands (surely there's other places we need to handle that? plpgsql maybe?)
>
> I think all of those except logging could be handled by a function serving both exec_simple_query and execute_sql_command that accepts a few booleans (or maybe just one to indicate the caller) and some if's. At least I don't think it'd be too bad, without actually writing it.
>
> I'm not sure what to do about logging. If the original backend has logging turned on, is it that horrible to do the same logging as exec_simple_query would do?
>
> Another option would be factoring out parts of exec_simple_query; the for loop over the parse tree might be a good candidate. But I suspect that would be uglier than a separate support function.
>
> I do feel uncomfortable with the amount of duplication there is right now though...

I took a stab at this by refactoring the guts of exec_simple_query (patch attached) into a new function called exec_query_string (also attached in raw form). As indicated it needs a bit more work. In particular, how it's dealing with excluding transactional commands is rather ugly. Why do we need to do that in pg_background?

Andres was concerned about the performance impact of doing this. I tested this by doing

for i in {1..999999}; do echo 'SELECT 1;' >> test.sql; done

and then

time psql -f test.sql > /dev/nul

It appears there may be a ~1% performance hit, but my laptop isn't repeatable enough to be sure. I did try manually in-lining exec_query_string to see if it was the function call causing the issue; it didn't seem to make a difference.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

Attachment Content-Type Size
0001-Refactor-the-guts-of-exec_simple_query.patch text/plain 11.0 KB
exec_query_string.c text/plain 9.1 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2014-11-01 00:30:38 Re: infinite loop in _bt_getstackbuf
Previous Message Andres Freund 2014-10-31 22:58:51 Re: _mdfd_getseg can be expensive