Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From: Dilip kumar <dilip(dot)kumar(at)huawei(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Jan Lentfer <Jan(dot)Lentfer(at)web(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Euler Taveira <euler(at)timbira(dot)com(dot)br>
Subject: Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
Date: 2015-01-16 03:19:13
Message-ID: 4205E661176A124FAF891E0A6BA913526639D56F@szxeml509-mbs.china.huawei.com
Lists: pgsql-hackers

On 04 January 2015 07:27, Andres Freund wrote:
> On 2014-12-31 18:35:38 +0530, Amit Kapila wrote:
> > + <term><option>-j <replaceable
> class="parameter">jobs</replaceable></option></term>
> > + <term><option>--jobs=<replaceable
> class="parameter">njobs</replaceable></option></term>
> > + <listitem>
> > + <para>
> > + Number of concurrent connections to perform the operation.
> > + This option will enable the vacuum operation to run on
> asynchronous
> > + connections, at a time one table will be operated on one
> connection.
> > + So at one time as many tables will be vacuumed parallely as
> number of
> > + jobs. If number of jobs given are more than number of
> tables then
> > + number of jobs will be set to number of tables.
>
> "asynchronous connections" isn't a very well defined term. Also, the
> second part of that sentence doesn't seem to be grammatically correct.

I have changed this to "concurrent connections"; is this OK?

> > + </para>
> > + <para>
> > + <application>vacuumdb</application> will open
> > + <replaceable class="parameter"> njobs</replaceable>
> connections to the
> > + database, so make sure your <xref linkend="guc-max-
> connections">
> > + setting is high enough to accommodate all connections.
> > + </para>
>
> Isn't it njobs+1?

The main connection that we use for getting the table information is reused as the first slot's connection, so the total number of connections is still njobs.
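
To make that concrete, here is a rough sketch of the setup (not the exact patch code; connSlot, concurrentCons and connectDatabase are taken from the quoted hunks): the connection already opened to fetch the table list becomes slot 0, and only concurrentCons - 1 additional connections are opened.

    connSlot = (ParallelSlot *) pg_malloc(concurrentCons * sizeof(ParallelSlot));

    /* reuse the main connection as the first slot */
    connSlot[0].connection = conn;
    connSlot[0].isFree = true;
    connSlot[0].sock = PQsocket(conn);

    for (i = 1; i < concurrentCons; i++)
    {
        connSlot[i].connection = connectDatabase(dbname, host, port, username,
                                                 prompt_password, progname, false);
        connSlot[i].isFree = true;
        connSlot[i].sock = PQsocket(connSlot[i].connection);
    }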


> > @@ -141,6 +199,7 @@ main(int argc, char *argv[])
> > }
> > }
> >
> > + optind++;
>
> Hm, where's that coming from?

This was wrong; I have removed it.

>
> > + PQsetnonblocking(connSlot[0].connection, 1);
> > +
> > + for (i = 1; i < concurrentCons; i++)
> > + {
> > + connSlot[i].connection = connectDatabase(dbname, host, port,
> username,
> > + prompt_password,
> progname, false);
> > +
> > + PQsetnonblocking(connSlot[i].connection, 1);
> > + connSlot[i].isFree = true;
> > + connSlot[i].sock = PQsocket(connSlot[i].connection);
> > + }
>
> Are you sure about this global PQsetnonblocking()? This means that you
> might not be able to send queries... And you don't seem to be waiting
> for sockets waiting for writes in the select loop - which means you
> might end up being stuck waiting for reads when you haven't submitted
> the query.
>
> I think you might need a more complex select() loop. On nonfree
> connections also wait for writes if PQflush() returns != 0.

1. In GetIdleSlot we make sure that we wait only if a connection is busy, i.e. only if we have already sent a query on that connection.
2. When all the connections are busy, we select() on all the FDs so that we wake up as soon as there is a response on any connection. When select() returns, we consume the input and check whether the connection has become idle or the data was just an intermediate response; if the connection is no longer busy, we process all its results and mark the slot as free.
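
In pseudo-C, the idle check after select() returns looks roughly like this (a sketch under the assumptions of the quoted ParallelSlot struct; mark_slot_if_idle is a hypothetical helper name, not a function in the patch):

    static void
    mark_slot_if_idle(ParallelSlot *slot)
    {
        PGresult   *result;

        if (!PQconsumeInput(slot->connection))
            return;             /* connection error; reported elsewhere */

        if (PQisBusy(slot->connection))
            return;             /* only an intermediate response so far */

        /* query finished: drain all pending results and free the slot */
        while ((result = PQgetResult(slot->connection)) != NULL)
            PQclear(result);
        slot->isFree = true;
    }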

>
> > +/*
> > + * GetIdleSlot
> > + * Process the slot list, if any free slot is available then return
> > + * the slotid else perform the select on all the sockets and wait
> > + * until at least one slot becomes available.
> > + */
> > +static int
> > +GetIdleSlot(ParallelSlot *pSlot, int max_slot, const char *dbname,
> > + const char *progname, bool completedb) {
> > + int i;
> > + fd_set slotset;
>
>
> Hm, you probably need to limit -j to FD_SETSIZE - 1 or so.

I will change this in the next patch.
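
For example, something like this near the option handling (a sketch; the message wording is mine):

    /* select() cannot watch more than FD_SETSIZE descriptors, so cap -j */
    if (concurrentCons > FD_SETSIZE - 1)
    {
        fprintf(stderr, _("%s: too many parallel jobs requested (maximum: %d)\n"),
                progname, FD_SETSIZE - 1);
        exit(1);
    }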

> > + int firstFree = -1;
> > + pgsocket maxFd;
> > +
> > + for (i = 0; i < max_slot; i++)
> > + if (pSlot[i].isFree)
> > + return i;
>
> > + FD_ZERO(&slotset);
> > +
> > + maxFd = pSlot[0].sock;
> > +
> > + for (i = 0; i < max_slot; i++)
> > + {
> > + FD_SET(pSlot[i].sock, &slotset);
> > + if (pSlot[i].sock > maxFd)
> > + maxFd = pSlot[i].sock;
> > + }
>
> So we're waiting for idle connections?
>
> I think you'll have to have to use two fdsets here, and set the write
> set based on PQflush() != 0.

I did not get this point.

The logic here is: we wait for a response on any connection, using select() on all the FDs. When select() returns, we check each socket to see which connections are no longer busy and mark all the finished connections as idle at once. If none of the connections has become free, we go back to select(); otherwise we return the first idle connection.
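
Putting the two pieces together, the wait loop is roughly this (again just a sketch, using the variables from the quoted GetIdleSlot code; mark_slot_if_idle is the hypothetical helper from above):

    for (;;)
    {
        fd_set      workerset = slotset;    /* select() clobbers its argument */

        if (select(maxFd + 1, &workerset, NULL, NULL, NULL) < 0)
        {
            if (errno == EINTR)
                continue;                   /* interrupted; retry */
            return -1;                      /* hard error; caller reports it */
        }

        /* mark every finished connection idle in one pass */
        for (i = 0; i < max_slot; i++)
            if (!pSlot[i].isFree && FD_ISSET(pSlot[i].sock, &workerset))
                mark_slot_if_idle(&pSlot[i]);

        /* return the first idle slot, if any; otherwise wait again */
        for (i = 0; i < max_slot; i++)
            if (pSlot[i].isFree)
                return i;
    }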

> > +/*
> > + * A select loop that repeats calling select until a descriptor in
> > +the read
> > + * set becomes readable. On Windows we have to check for the
> > +termination event
> > + * from time to time, on Unix we can just block forever.
> > + */
>
> Should a) mention why we have to check regularly on windows b) that on
> linux we don't have to because we send a cancel event from the signal
> handler.

I have added the comments.
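
For reference, the shape I have in mind is something like this (a sketch with the comments filled in; CancelRequested is a hypothetical flag set by the console control handler on Windows, not necessarily the name used in the patch):

    static int
    select_loop(int maxFd, fd_set *workerset)
    {
        int         i;
        fd_set      saveSet = *workerset;

    #ifdef WIN32
        /*
         * On Windows a blocking select() cannot be interrupted: the console
         * control handler runs in a separate thread and can only set a flag,
         * so we must wake up periodically to check it.
         */
        for (;;)
        {
            struct timeval tv = {3, 0};

            if (CancelRequested)
                return -1;

            *workerset = saveSet;
            i = select(maxFd + 1, workerset, NULL, NULL, &tv);
            if (i != 0)
                break;          /* got data or an error; stop polling */
        }
    #else
        /*
         * On Unix the SIGINT handler sends a cancel request on the open
         * connections, so the in-progress queries terminate and the sockets
         * become readable; we can simply block until then.
         */
        for (;;)
        {
            *workerset = saveSet;
            i = select(maxFd + 1, workerset, NULL, NULL, NULL);
            if (i < 0 && errno == EINTR)
                continue;       /* interrupted by the signal handler */
            break;
        }
    #endif
        return i;
    }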

> > +static int
> > +select_loop(int maxFd, fd_set *workerset) {
> > + int i;
> > + fd_set saveSet = *workerset;
> >
> > +#ifdef WIN32
> > + /* should always be the master */
>
> Hm?
>
>
> I have to say, this is a fairly large patch for such a minor feature...
>
> Greetings,
>
> Andres Freund
>
> --
> Andres Freund http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
vacuumdb_parallel_v22.patch application/octet-stream 27.6 KB
