Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From: Dilip kumar <dilip(dot)kumar(at)huawei(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jan Lentfer <Jan(dot)Lentfer(at)web(dot)de>, "Euler Taveira" <euler(at)timbira(dot)com(dot)br>
Subject: Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
Date: 2014-07-01 04:25:34
Message-ID: 4205E661176A124FAF891E0A6BA913526633B700@szxeml509-mbs.china.huawei.com
Lists: pgsql-hackers

On 01 July 2014 03:48, Alvaro wrote:

> > In particular, pgpipe is almost an exact duplicate between them,
> > except the copy in vac_parallel.c has fallen behind changes made to
> > parallel.c. (Those changes would have fixed the Windows warnings). I
> > think that this function (and perhaps other parts as
> > well--"exit_horribly" for example) need to refactored into a common
> > file that both files can include. I don't know where the best place
> > for that would be, though. (I haven't done this type of refactoring
> > myself.)
>
> I think commit d2c1740dc275543a46721ed254ba3623f63d2204 is apropos.
> Maybe we should move pgpipe back to src/port and have pg_dump and this
> new thing use that. I'm not sure about the rest of duplication in
> vac_parallel.c; there might be a lot in common with what
> pg_dump/parallel.c does too. Having two copies of code is frowned upon
> for good reasons. This patch introduces 1200 lines of new code in
> vac_parallel.c, ugh.
>
> If we really require 1200 lines to get parallel vacuum working for
> vacuumdb, I would question the wisdom of this effort. To me, it seems
> better spent improving autovacuum to cover whatever it is that this
> patch is supposed to be good for --- or maybe just enable having a
> shell script that launches multiple vacuumdb instances in parallel ...

Thanks for looking into the patch,

I think that if we use a shell script to launch multiple vacuumdb instances in parallel, we cannot get complete control over how the work is divided.
If we divide the tables between the processes up front, one process may end up with the very big tables, and the run will then take about as long as if a single process were doing the whole operation.

In this patch we assign only one table to each process at a time, and whichever process finishes first is assigned the next table. This way the work is shared evenly across all processes.
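
To illustrate the difference, here is a minimal simulation sketch in Python. It is not code from the patch: the function names, the round-robin split used for the "divide up front" case, and the per-table costs are all assumptions made up for the example.

```python
import heapq


def static_makespan(costs, workers):
    """Divide tables up front (round-robin); each worker runs its share serially.

    Returns the time at which the slowest worker finishes.
    """
    loads = [0] * workers
    for i, cost in enumerate(costs):
        loads[i % workers] += cost
    return max(loads)


def dynamic_makespan(costs, workers):
    """Hand out one table at a time to whichever worker becomes free first.

    Returns the time at which the last worker finishes.
    """
    # Min-heap holding, for each worker, the time at which it becomes idle.
    free_at = [0] * workers
    heapq.heapify(free_at)
    for cost in costs:
        start = heapq.heappop(free_at)         # earliest idle worker
        heapq.heappush(free_at, start + cost)  # busy until this table is done
    return max(free_at)


# Hypothetical per-table vacuum costs in arbitrary units: two big tables
# among several small ones.
costs = [100, 1, 1, 100, 1, 1, 1, 1]

print("static :", static_makespan(costs, 3))
print("dynamic:", dynamic_makespan(costs, 3))
```

With these made-up costs and three workers, the up-front split puts both big tables on the same worker and finishes at 201 units, while handing out one table at a time finishes at 101 units, since the two big tables end up on different workers.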

Thanks & Regards,
Dilip Kumar
