From: | "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | Peter Geoghegan <pg(at)bowt(dot)ie>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Subject: | RE: parallel vacuum comments |
Date: | 2021-11-05 02:15:23 |
Message-ID: | OS0PR01MB57161B5970B657B31379A1CC948E9@OS0PR01MB5716.jpnprd01.prod.outlook.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thur, Nov 4, 2021 1:25 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> On Wed, Nov 3, 2021 at 1:08 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Tue, Nov 2, 2021 at 11:17 AM Masahiko Sawada<sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > On Tue, Nov 2, 2021 at 5:57 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> > > >
> > >
> > > > Rather than inventing PARALLEL_VACUUM_KEY_INDVAC_CHECK (just for
> > > > assert-enabled builds), we should invent PARALLEL_VACUUM_STATS --
> > > > a dedicated shmem area for the array of LVSharedIndStats (no more
> > > > storing LVSharedIndStats entries at the end of the LVShared space
> > > > in an ad-hoc, type unsafe way). There should be one array element
> > > > for each and every index -- even those indexes where parallel
> > > > index vacuuming is unsafe or not worthwhile (unsure if avoiding
> > > > parallel processing for "not worthwhile" indexes actually makes
> > > > sense, BTW). We can then get rid of the bitmap/IndStatsIsNull()
> > > > stuff entirely. We'd also add new per-index state fields to
> > > > LVSharedIndStats itself. We could directly record the status of
> > > > each index (e.g., parallel unsafe, amvacuumcleanup processing
> > > > done, ambulkdelete processing done) explicitly. All code could
> > > > safely subscript the LVSharedIndStats array directly, using idx
> > > > style integers. That seems far more robust and consistent.
> > >
> > > Sounds good.
> > >
> > > During the development, I wrote the patch while considering using
> > > fewer shared memory but it seems that it brought complexity (and
> > > therefore the bug). It would not be harmful even if we allocate
> > > index statistics on DSM for unsafe indexes and “not worthwhile"
> > > indexes in practice.
> > >
> >
> > If we want to allocate index stats for all indexes in DSM then why not
> > consider it on the lines of buf/wal_usage means tack those via
> > LVParallelState? And probably replace bitmap with an array of bools
> > that indicates which indexes can be skipped by the parallel worker.
> >
>
> I've attached a draft patch. The patch incorporated all comments from Andres
> except for the last comment that moves parallel related code to another file.
> I'd like to discuss how we split vacuumlazy.c.
Hi,
I was recently reading the parallel vacuum code, and I think the patch can
bring a certain improvement.
Here are a few minor comments about it.
1)
+ * Reset all index status back to invalid (while checking that we have
+ * processed all indexes).
+ */
+ for (int i = 0; i < vacrel->nindexes; i++)
+ {
+ LVSharedIndStats *stats = &(lps->lvsharedindstats[i]);
+
+ Assert(stats->status == INDVAC_STATUS_COMPLETED);
+ stats->status = INDVAC_STATUS_INITIAL;
+ }
Do you think it might be clearer to report an error here ?
2)
+prepare_parallel_index_processing(LVRelState *vacrel, bool vacuum)
For the second paramater 'vacuum'. Would it be clearer if we pass a
LVIndVacStatus type instead of the boolean value ?
Best regards,
Hou zj
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2021-11-05 02:40:00 | Re: inefficient loop in StandbyReleaseLockList() |
Previous Message | Tomas Vondra | 2021-11-05 01:10:55 | Re: [sqlsmith] Failed assertion in brin_minmax_multi_distance_float4 on REL_14_STABLE |