Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From: Dilip kumar <dilip(dot)kumar(at)huawei(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Jan Lentfer <Jan(dot)Lentfer(at)web(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Euler Taveira <euler(at)timbira(dot)com(dot)br>
Subject: Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
Date: 2014-09-24 09:48:50
Message-ID: 4205E661176A124FAF891E0A6BA91352663613E8@szxeml509-mbs.china.huawei.com
Lists: pgsql-hackers

On 24 August 2014 11:33, Amit Kapila wrote:

Thanks for your comments. I have worked on both review comment lists, sent on 19 August and 24 August.

The latest patch is attached to this mail.

On 19 August:
------------
>You can compare against SQLSTATE by using below API.
>val = PQresultErrorField(res, PG_DIAG_SQLSTATE);

>You need to handle *42P01* SQLSTATE, also please refer below
>usage where we are checking SQLSTATE.

>fe-connect.c
>PQresultErrorField(conn->result, PG_DIAG_SQLSTATE),
> ERRCODE_INVALID_PASSWORD) == 0)

DONE
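For reference, the SQLSTATE check boils down to a five-character string comparison. A minimal stand-alone sketch (the `is_missing_relation` helper is an illustrative name, not from the patch; in the real code the string comes from PQresultErrorField(res, PG_DIAG_SQLSTATE)):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* SQLSTATE for "relation does not exist" (ERRCODE_UNDEFINED_TABLE) */
#define ERRCODE_UNDEFINED_TABLE "42P01"

/*
 * In the patch the five-character SQLSTATE would come from
 * PQresultErrorField(res, PG_DIAG_SQLSTATE); it is passed in directly
 * here so the check can be exercised without a live connection.
 */
static int
is_missing_relation(const char *sqlstate)
{
	return sqlstate != NULL &&
		   strcmp(sqlstate, ERRCODE_UNDEFINED_TABLE) == 0;
}
```

This lets vacuumdb skip a table that was dropped between building the table list and issuing the VACUUM, instead of treating it as a hard failure.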

>Few more comments:
>
>1.
>* If user has not given the vacuum of complete db, then if
>
>I think here you have said reverse of what code is doing.
>You don't need *not* in above sentence.

DONE

>2.
>+ appendPQExpBuffer(&sql, "\"%s\".\"%s\"", nspace, relName);
>I think here you need to use function fmtQualifiedId() or fmtId()
>or something similar to handle quotes appropriately.

DONE

>3.
>
>+ */
>+ if (!r && !completedb)
>Here the usage of completedb variable is reversed which means
>that it goes into error path when actually whole database is
>getting vacuumed and the reason is that you are setting it
>to false in below code:
>+ /* Vaccuming full database*/
>+ vacuum_tables = false;

FIXED

>4.
>Functions prepare_command() and vacuum_one_database() contain
>duplicate code, is there any problem in using prepare_command()
>function in vacuum_one_database(). Another point in this context
>is that I think it is better to name function prepare_command()
>as append_vacuum_options() or something on that lines, also it will
>be better if you can write function header for this function as well.

DONE

>5.
>+ if (error)
>+ {
>+ for (i = 0; i < max_slot; i++)
>+ {
>+ DisconnectDatabase(&connSlot[i]);
>+ }
>
>Here why do we need DisconnectDatabase() type of function?
>Why can't we simply call PQfinish() as in base code?

Because the base code only has to handle the main connection, and by the time PQfinish() is called the command has either completed or failed; in both cases control is already back with the client. In the multi-connection case, however, if one connection fails we need to send a cancel to the other connections, and that is what DisconnectDatabase does.
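To make the cancel-then-close behaviour concrete, here is a toy model of the shutdown path, not the patch's actual code: the struct fields stand in for libpq's PGconn state, and the comments note the real calls (PQgetCancel/PQcancel and PQfinish) that DisconnectDatabase would issue.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Illustrative stand-in for a parallel connection slot; in the patch
 * this would wrap a real PGconn.
 */
typedef struct
{
	bool busy;       /* query still running on this connection? */
	bool cancelled;  /* did we send a cancel request? */
	bool closed;     /* has the connection been closed? */
} ParallelSlot;

/*
 * Mimics DisconnectDatabase(): if a query is still in flight, request
 * its cancellation first, then close the connection.  Plain PQfinish()
 * alone would abandon the running query on the server.
 */
static void
disconnect_slot(ParallelSlot *slot)
{
	if (slot->busy)
		slot->cancelled = true;  /* real code: PQgetCancel + PQcancel */
	slot->closed = true;         /* real code: PQfinish */
}
```

On an error in one slot, the loop in the patch would call this for every slot, so busy siblings get cancelled rather than left running.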

>
>6.
>+ /*
>+ * if table list is not provided then we need to do vaccum for whole DB
>+ * get the list of all tables and prpare the list
>+ */
>spelling of prepare is wrong. I have noticed spell mistake
>in comments at some other place as well, please check all
>comments once

FIXED

>
>7. I think in new mechanism cancel handler will not work.
>In single connection vacuum it was always set/reset
>in function executeMaintenanceCommand(). You might need
>to set/reset it in function run_parallel_vacuum().

Good catch. Now I call SetCancelConn(pSlot[0].connection) on the first connection. This enables the cancel handler to cancel the query on the first connection, so that the select loop comes out.

24 August
---------
>
>1. I could see one shortcoming in the way the patch has currently parallelize the
> work for --analyze-in-stages. Basically patch is performing the work for each stage
> for multiple tables in concurrent connections that seems okay for the cases when
> number of parallel connections is less than equal to number of tables, but for
> the case when user has asked for more number of connections than number of tables,
> then I think this strategy will not be able to use the extra connections.

I think --analyze-in-stages should maintain the order.

>2. Similarly for the case of multiple databases, currently it will not be able
> to use connections more than number of tables in each database because the
> parallelizing strategy is to just use the conncurrent connections for
> tables inside single database.

I think that to run multiple databases in parallel we would need to maintain connections to multiple databases at once, and also maintain a combined table list across all the databases so they can be processed concurrently. I think this may hurt the startup cost, as we would have to open and close extra connections just to prepare that list, and I think it would increase the complexity as well.

>I am not completely sure whether current strategy is good enough or
>we should try to address the above problems. What do you think?

>3.
>+ do
>+ {
>+ i = select_loop(maxFd, &slotset);
>+ Assert(i != 0);
>
>Could you explain the reason of using this loop, I think you
>want to wait for data on socket descriptor, but why for maxFd?
>Also it is better if you explain this logic in comments.

We are sending vacuum jobs to the connections, and when none of the connection slots is free, we wait on all of the sockets until one of them is freed.
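As a rough stand-alone illustration of that wait, using a pipe instead of libpq sockets (`wait_for_ready` and `demo_select_wakeup` are made-up names, not from the patch):

```c
#include <sys/select.h>
#include <unistd.h>

/*
 * Block until at least one descriptor in slotset is readable, as
 * select_loop() does for the busy connection sockets.  maxFd is the
 * highest descriptor value in the set; select() wants maxFd + 1.
 * Note: select() rewrites the set so only the ready fds remain.
 */
static int
wait_for_ready(int maxFd, fd_set *slotset)
{
	return select(maxFd + 1, slotset, NULL, NULL, NULL);
}

/*
 * Self-check: write one byte into a pipe and confirm select() reports
 * the read end as ready.  Returns 1 on the expected outcome.
 */
static int
demo_select_wakeup(void)
{
	int		fds[2];
	fd_set	slotset;
	int		n, ready;

	if (pipe(fds) != 0 || write(fds[1], "x", 1) != 1)
		return 0;

	FD_ZERO(&slotset);
	FD_SET(fds[0], &slotset);

	n = wait_for_ready(fds[0], &slotset);
	ready = FD_ISSET(fds[0], &slotset) != 0;

	close(fds[0]);
	close(fds[1]);
	return n == 1 && ready;
}
```

In the patch, every busy slot's socket (PQsocket) goes into the set, so the loop wakes as soon as any connection has something to read.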

>4.
>+ for (i = 0; i < max_slot; i++)
>+ {
>+ if (!FD_ISSET(pSlot[i].sock, &slotset))
>+ continue;
>+
>+ PQconsumeInput(pSlot[i].connection);
>+ if (PQisBusy(pSlot[i].connection))
>+ continue;
>
>I think it is better to call PQconsumeInput() only if you find
>connection is busy.

I think the logic here is a bit different. In other places in the code, PQconsumeInput() is called in a loop until PQisBusy() reports the connection is free, to consume all the data. But in our case, we have already sent the query, so we consume whatever is on the network and then check PQisBusy(); if it is not busy, it means this connection is freed. And since PQconsumeInput() only consumes input when some is actually available, I think we do not need to check PQisBusy() before calling it.

However, in the case where the user has to fetch the complete data, they do need to call PQisBusy() and then PQconsumeInput() in a loop, so that PQconsumeInput() is not called in a tight loop and is only called while the connection is busy.
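A toy model of that per-slot step, to show the control flow only (the `pending`/`busy` fields are stand-ins for what PQconsumeInput() and PQisBusy() report on a real PGconn; this is not the patch's code, and in reality consuming input may leave a query still in progress):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-in for a connection slot. */
typedef struct
{
	int		pending;	/* unread data waiting on the socket */
	bool	busy;		/* query still in progress */
} Slot;

/*
 * Mirror of the per-slot step after select() wakes up: consume
 * whatever arrived, then ask whether the query is still running.
 * In this toy model, draining the pending data is what completes
 * the query.
 */
static bool
slot_now_free(Slot *slot)
{
	if (slot->pending > 0)
	{
		slot->pending = 0;	/* real code: PQconsumeInput(conn) */
		slot->busy = false;	/* last result bytes have arrived */
	}
	return !slot->busy;		/* real code: !PQisBusy(conn) */
}
```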

Regards,
Dilip

Attachment Content-Type Size
vacuumdb_parallel_v14.patch application/octet-stream 24.4 KB
