Re: Proposal : For Auto-Prewarm.

From: Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Beena Emerson <memissemerson(at)gmail(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal : For Auto-Prewarm.
Date: 2017-05-30 04:46:54
Message-ID: CAD__Oujhwz+B9LeVpOSMkQRv_otCX=ZPcRJxYOYo6LX7q=A++g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks Robert,

On Wed, May 24, 2017 at 5:41 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> + *
> + * Once the "autoprewarm" bgworker has completed its prewarm task, it will
> + * start a new task to periodically dump the BlockInfoRecords related to blocks
> + * which are currently in shared buffer pool. Upon next server restart, the
> + * bgworker will prewarm the buffer pool by loading those blocks. The GUC
> + * pg_prewarm.dump_interval will control the dumping activity of the bgworker.
> + */
>
> Make this part of the file header comment.

-- Thanks, I have moved same as part of header description.

> Also, add an enabling GUC.
> The default can be true, but it should be possible to preload the
> library so that the SQL functions are available without a dynamic
> library load without requiring you to get the auto-prewarm behavior.
> I suggest pg_prewarm.autoprewarm = true / false.

-- Thanks, I have added a boolean GUC pg_prewarm.autoprewarm with
default true. I have made it as PGC_POSTMASTER level variable
considering the intention is here to avoid starting the autoprewarm
worker. Need help, have I missed anything? Currently, sql callable
functions autoprewarm_dump_now(), launch_autoprewarm_dump() are only
available after create extension pg_prewarm command will this change
now?
There is another GUC setting pg_prewarm.dump_interval if = -1 we stop
the running autoprewarm worker. I have a doubt should we combine these
2 entities into one such that it control the state of autoprewarm
worker?

> Your SigtermHandler and SighupHandler routines are still capitalized
> in a way that's not very consistent with what we do elsewhere. I
> think all of our other signal handlers have names_like_this() not
> NamesLikeThis().
>

-- handler functions are renamed for example apw_sigterm_handler, as
similar in autovacuum.c

> + * ============== types and variables used by autoprewam =============
>
> Spelling.

-- Fixed, Sorry.

> + * Meta-data of each persistent block which is dumped and used to load.
>
> Metadata

-- Fixed.

> +typedef struct BlockInfoRecord
> +{
> + Oid database; /* database */
> + Oid spcNode; /* tablespace */
> + Oid filenode; /* relation's filenode. */
> + ForkNumber forknum; /* fork number */
> + BlockNumber blocknum; /* block number */
> +} BlockInfoRecord;
>
> spcNode is capitalized differently from all of the other members.

-- renamed from spcNode to spcnode.
>
> + LWLock *lock; /* protects SharedState */
>
> Just declare this as LWLock lock, and initialize it using
> LWLockInitialize. The way you're doing it is more complicated.

-- Fixed as suggested
LWLockInitialize(&state->lock, LWLockNewTrancheId());

> +static dsa_area *AutoPrewarmDSA = NULL;
>
> DSA seems like more than you need here. There's only one allocation
> being performed. I think it would be simpler and less error-prone to
> use DSM directly. I don't even think you need a shm_toc. You could
> just do:
>
> seg = dsm_create(segsize, 0);
> blkinfo = dsm_segment_address(seg);
> Then pass dsm_segment_handle(seg) to the worker using bgw_main_arg or
> bgw_extra, and have it call dsm_attach. An advantage of this approach
> is that you only allocate the memory you actually need, whereas DSA
> will allocate more, expecting that you might do further allocations.

- Thanks Fixed. And we pass following arguments to subwrokers through bgw_extra
/*
* The block_infos allocated to each sub-worker to do prewarming.
*/
typedef struct prewarm_elem
{
dsm_handle block_info_handle; /* handle to dsm seg of block_infos */
Oid database; /* database to connect and load */
uint32 start_pos; /* start position within block_infos from
* which sub-worker start prewaring blocks. */
uint32 end_of_blockinfos; /* End of block_infos in dsm */
} prewarm_elem;

to distribute the prewarm work among subworkers.

>
> + pg_qsort(block_info_array, num_blocks, sizeof(BlockInfoRecord),
> + blockinfo_cmp);
>
> I think it would make more sense to sort the array on the load side,
> in the autoprewarm worker, rather than when dumping.

Fixed Now sorting block_infos just before loading the blocks

> + errmsg("autoprewarm: could not open \"%s\": %m",
> + dump_file_path)));
>
> Use standard error message wordings! Don't create new translatable
> messages by adding "autoprewarm" to error messages. There are
> multiple instances of this throughout the patch.

-- Removed autoprewarm as part of sufix in error message in all of the places.

> + snprintf(dump_file_path, sizeof(dump_file_path),
> + "%s", AUTOPREWARM_FILE);
>
> This is completely pointless. You can get rid of the dump_file_path
-- dump_file_path is removed.

>
> + SPI_connect();
> + PushActiveSnapshot(GetTransactionSnapshot());
>
> It is really unclear why you need this, since the code does not use
> SPI for anything, or do anything that needs a snapshot.

-Sorry Removed it now.

> + goto end_load;
>
> I think you should try to rewrite this function so that it doesn't
> need "goto". I also think in general that this function could be
> written in a much more direct way by getting rid of the switch and the
> BLKTYPE_* constants altogether. connect_to_db() can only happen once,
> so do that the beginning. After that, the logic can look roughly like
> this:

-- Fixed using exact code framework as you have suggested previously.

>
> + errhint("Kill all remaining database processes and restart"
> + " the database.")));
>
> Don't break strings across lines. Just put it all on one (long) line.

-- Fixed; I have tried to trim the string which was going beyond
80chars but has fixed it now as you have suggested.

> I don't think you should really need default_database. It seems like
> it should be possible to jigger things so that those blocks are loaded
> together with some other database (say, let the first worker do it).

-- Fixed, for block_infos with database 0 will be combined with next
database's block_info load. One problem which I have kept open is what
if there are only block_info's with database 0 in dump file, With
default_database we could have handled such case. After removing it I
am ignoring block_infos of 0 databases in such cases. Is that okay?.

--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
autoprewarm_09.patch application/octet-stream 37.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2017-05-30 05:09:22 Re: POC: Sharing record typmods between backends
Previous Message Amit Langote 2017-05-30 04:42:37 Re: "create publication..all tables" ignore 'partition not supported' error