Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Dilip kumar <dilip(dot)kumar(at)huawei(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Jan Lentfer <Jan(dot)Lentfer(at)web(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Euler Taveira <euler(at)timbira(dot)com(dot)br>
Subject: Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
Date: 2014-09-26 06:53:55
Message-ID: CAA4eK1LrKuWViZ=suL2L+h7RdRs1PTdPe9iNQ1M+7dfVTEvYmQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 24, 2014 at 3:18 PM, Dilip kumar <dilip(dot)kumar(at)huawei(dot)com> wrote:
> On 24 August 2014 11:33, Amit Kapila wrote:
>
> > 7. I think in the new mechanism the cancel handler will not work.
> > In single-connection vacuum it was always set/reset in function
> > executeMaintenanceCommand(). You might need to set/reset it in
> > function run_parallel_vacuum().
>
> Good catch. Now I have called SetCancelConn(pSlot[0].connection) on the
> first connection. This will enable the cancel handler to cancel the
> query on the first connection so that the select loop will come out.
>

I don't think this can handle cancel requests properly, because you are
setting it only in GetIdleSlot(). What if the cancel request comes
during GetQueryResult(), after the SQL has been sent on all connections?
(That is probably the reason why Jeff is not able to cancel vacuumdb
when using the parallel option.)

>
> > 1. I could see one shortcoming in the way the patch currently
> > parallelizes the work for --analyze-in-stages. Basically, the patch
> > performs the work for each stage for multiple tables in concurrent
> > connections. That seems okay when the number of parallel connections
> > is less than or equal to the number of tables, but when the user has
> > asked for more connections than there are tables, this strategy will
> > not be able to use the extra connections.
>
> I think --analyze-in-stages should maintain the order.

Yes, you are right. So let's keep the code as it is for this case.

>
> > 2. Similarly, for the case of multiple databases, it will currently
> > not be able to use more connections than the number of tables in each
> > database, because the parallelizing strategy is to use the concurrent
> > connections only for tables inside a single database.
>
> I think for parallel execution across multiple databases we need to
> maintain multiple connections to multiple databases, and we need to
> maintain a table list for all the databases together to run them
> concurrently. I think this may impact the startup cost, as we need to
> create multiple connections and disconnect while preparing the list,

Yeah, probably the startup cost will be a bit higher, but that is a cost
we will anyway incur during the overall operation.

> and I think it will also increase the complexity.

I understand that the complexity of the code might increase, and the
strategy to parallelize can also be different in case we want to
parallelize the all-databases case, so let's leave it as it is unless
someone else asks for it.

Today, while again thinking about the strategy used in the patch to
parallelize the operation (vacuuming a database), I think we can improve
it for cases when the number of connections is less than the number of
tables in the database (which I presume will normally be the case).
Currently we are sending a command to vacuum one table per connection;
how about sending multiple commands (for example, "VACUUM t1; VACUUM
t2") on one connection? It seems to me there is an extra round trip for
cases when there are many small tables and a few large tables in the
database. Do you think we should optimize for any such cases?
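To make the idea concrete, here is a minimal sketch of how such batching
could look. This is not the patch's code: the function name, table
names, and batch-size limit are all made up for illustration; only the
"several VACUUMs per round trip" idea comes from the proposal above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Pack VACUUM commands for consecutive tables into one query string,
 * so a single round trip on a connection can cover many small tables.
 * Returns how many tables were consumed; the caller would send buf on
 * an idle connection and call again for the remaining tables.
 */
static int
build_vacuum_batch(char *buf, size_t bufsize,
				   const char *const *tables, int ntables)
{
	int			consumed = 0;

	buf[0] = '\0';
	for (int i = 0; i < ntables; i++)
	{
		char		cmd[128];
		int			n = snprintf(cmd, sizeof(cmd), "VACUUM %s;", tables[i]);

		/* stop when the next command would overflow the batch buffer */
		if (n < 0 || strlen(buf) + (size_t) n >= bufsize)
			break;
		strcat(buf, cmd);
		consumed++;
	}
	return consumed;
}
```

In vacuumdb the batched string would then go through the existing
send-query path on one slot; libpq runs semicolon-separated statements
from a single query string in one round trip.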

Few other points
1.
+ vacuum_parallel(const char *dbname, bool full, bool verbose,
{
..
+ connSlot = (ParallelSlot*)pg_malloc(concurrentCons * sizeof(ParallelSlot));
+ connSlot[0].connection = conn;

a. Does the above memory get freed anywhere? If not, isn't it a good
idea to do so?
b. For slot 0, you are not setting PQsetnonblocking, whereas I think it
can be used to run commands like any other connection.
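A minimal sketch of what points (a) and (b) suggest: initialize every
slot uniformly (slot 0 included) and free the array when done. This is
not the patch's code; ParallelSlot here is a stand-in struct, and the
nonblocking flag stands in for calling PQsetnonblocking(conn, 1) on a
real PGconn.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct ParallelSlot
{
	void	   *connection;		/* PGconn * in the real patch */
	int			nonblocking;	/* 1 once PQsetnonblocking() has run */
} ParallelSlot;

static ParallelSlot *
init_slots(int concurrentCons, void *existing_conn)
{
	ParallelSlot *slots = malloc(concurrentCons * sizeof(ParallelSlot));

	for (int i = 0; i < concurrentCons; i++)
	{
		/* slot 0 reuses the already-open connection; the others
		 * would open their own connection here */
		slots[i].connection = (i == 0) ? existing_conn : NULL;
		/* every slot, including slot 0, goes nonblocking */
		slots[i].nonblocking = 1;
	}
	return slots;		/* caller must free() this when the run ends */
}
```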

2.
+ /*
+ * If user has given the vacuum of complete db, then if
+ * any of the object vacuum
failed it can be ignored and vacuuming
+ * of other object can be continued, this is the same
behavior as
+ * vacuuming of complete db is handled without --jobs option
+ */

s/object/object's

3.
+ if(!completedb ||
+ (sqlState && strcmp(sqlState,
ERRCODE_UNDEFINED_TABLE) != 0))
+ {
+
+ fprintf(stderr, _("%s: vacuuming of
database \"%s\" failed: %s"),
+ progname, dbname, PQerrorMessage
(conn));

The indentation in both places is wrong. Check other places for similar
issues.

4.
+ bool analyze_only, bool freeze, int numAsyncCons,

The code still refers to AsyncCons in places; as decided, let's change
it to concurrent_connections | conc_cons.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
