Re: Perform streaming logical transactions by background workers and parallel apply

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date: 2022-11-28 13:10:42
Message-ID: CAA4eK1JmPz0Z3L3vBsvjfWgpbk5hTT4+d08cEuH+-B1Hm4xHsA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 28, 2022 at 12:49 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
...
>
> 17.
> @@ -388,10 +401,9 @@ static inline void cleanup_subxact_info(void);
> /*
> * Serialize and deserialize changes for a toplevel transaction.
> */
> -static void stream_cleanup_files(Oid subid, TransactionId xid);
> static void stream_open_file(Oid subid, TransactionId xid,
> bool first_segment);
> -static void stream_write_change(char action, StringInfo s);
> +static void stream_write_message(TransactionId xid, char action, StringInfo s);
> static void stream_close_file(void);
>
> 17a.
>
> I felt just saying "file/files" is too vague. All the references to
> the file should be consistent, so IMO everything would be better named
> like:
>
> "stream_cleanup_files" -> "stream_msg_spoolfile_cleanup()"
> "stream_open_file" -> "stream_msg_spoolfile_open()"
> "stream_close_file" -> "stream_msg_spoolfile_close()"
> "stream_write_message" -> "stream_msg_spoolfile_write_msg()"
>
> ~
>
> 17b.
> IMO there is not enough distinction here between function names
> stream_write_message and stream_write_change. e.g. You cannot really
> tell from their names what might be the difference.
>
> ~~~
>

I think the only new function needed by this patch is
stream_write_message so don't see why to change all others for that. I
see two possibilities to make name better (a) name function as
stream_open_and_write_change, or (b) pass a new argument (boolean
open) to stream_write_change

...
>
> src/include/replication/worker_internal.h
>
> 33. LeaderFileSetState
>
> +/* State of fileset in leader apply worker. */
> +typedef enum LeaderFileSetState
> +{
> + LEADER_FILESET_UNKNOWN,
> + LEADER_FILESET_BUSY,
> + LEADER_FILESET_ACCESSIBLE
> +} LeaderFileSetState;
>
> 33a.
>
> Missing from typedefs.list?
>
> ~
>
> 33b.
>
> I thought some more explanatory comments for the meaning of
> BUSY/ACCESSIBLE should be here.
>
> ~
>
> 33c.
>
> READY might be a better value than ACCESSIBLE
>
> ~
>
> 33d.
> I'm not sure what usefulness does the "LEADER_" and "Leader" prefixes
> give here. Maybe a name like PartialFileSetStat is more meaningful?
>
> e.g. like this?
>
> typedef enum PartialFileSetState
> {
> FS_UNKNOWN,
> FS_BUSY,
> FS_READY
> } PartialFileSetState;
>
> ~
>

All your suggestions in this point look good to me.

>
> ~~~
>
>
> 35. globals
>
> /*
> + * Indicates whether the leader apply worker needs to serialize the
> + * remaining changes to disk due to timeout when sending data to the
> + * parallel apply worker.
> + */
> + bool serialize_changes;
>
> 35a.
> I wonder if the comment would be better to also mention "via shared memory".
>
> SUGGESTION
>
> Indicates whether the leader apply worker needs to serialize the
> remaining changes to disk due to timeout when attempting to send data
> to the parallel apply worker via shared memory.
>
> ~
>

I think the comment should say " .. the leader apply worker serialized
remaining changes ..."

> 35b.
> I wonder if a more informative variable name might be
> serialize_remaining_changes?
>

I think this needlessly makes the variable name long.

--
With Regards,
Amit Kapila.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2022-11-28 13:13:16 Re: TAP output format in pg_regress
Previous Message Simon Riggs 2022-11-28 13:08:57 Re: O(n) tasks cause lengthy startups and checkpoints