Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Dilip kumar <dilip(dot)kumar(at)huawei(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Jan Lentfer <Jan(dot)Lentfer(at)web(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Euler Taveira <euler(at)timbira(dot)com(dot)br>
Subject: Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
Date: 2015-01-15 08:02:34
Message-ID: CAB7nPqQMzLNXWDY7R312ma0Ar-0pruofi453=0bFH5E5DOLRRw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Jan 4, 2015 at 10:57 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2014-12-31 18:35:38 +0530, Amit Kapila wrote:
>> + <term><option>-j <replaceable class="parameter">jobs</replaceable></option></term>
>> + <term><option>--jobs=<replaceable class="parameter">njobs</replaceable></option></term>
>> + <listitem>
>> + <para>
>> + Number of concurrent connections to perform the operation.
>> + This option will enable the vacuum operation to run on asynchronous
>> + connections, at a time one table will be operated on one connection.
>> + So at one time as many tables will be vacuumed parallely as number of
>> + jobs. If number of jobs given are more than number of tables then
>> + number of jobs will be set to number of tables.
>
> "asynchronous connections" isn't a very well defined term. Also, the
> second part of that sentence doesn't seem to be gramattically correct.
>
>> + </para>
>> + <para>
>> + <application>vacuumdb</application> will open
>> + <replaceable class="parameter"> njobs</replaceable> connections to the
>> + database, so make sure your <xref linkend="guc-max-connections">
>> + setting is high enough to accommodate all connections.
>> + </para>
>
> Isn't it njobs+1?
>
>> @@ -141,6 +199,7 @@ main(int argc, char *argv[])
>> }
>> }
>>
>> + optind++;
>
> Hm, where's that coming from?
>
>> + PQsetnonblocking(connSlot[0].connection, 1);
>> +
>> + for (i = 1; i < concurrentCons; i++)
>> + {
>> + connSlot[i].connection = connectDatabase(dbname, host, port, username,
>> + prompt_password, progname, false);
>> +
>> + PQsetnonblocking(connSlot[i].connection, 1);
>> + connSlot[i].isFree = true;
>> + connSlot[i].sock = PQsocket(connSlot[i].connection);
>> + }
>
> Are you sure about this global PQsetnonblocking()? This means that you
> might not be able to send queries... And you don't seem to be waiting
> for sockets waiting for writes in the select loop - which means you
> might end up being stuck waiting for reads when you haven't submitted
> the query.
>
> I think you might need a more complex select() loop. On nonfree
> connections also wait for writes if PQflush() returns != 0.
>
>
>> +/*
>> + * GetIdleSlot
>> + * Process the slot list, if any free slot is available then return
>> + * the slotid else perform the select on all the socket's and wait
>> + * until atleast one slot becomes available.
>> + */
>> +static int
>> +GetIdleSlot(ParallelSlot *pSlot, int max_slot, const char *dbname,
>> + const char *progname, bool completedb)
>> +{
>> + int i;
>> + fd_set slotset;
>
>
> Hm, you probably need to limit -j to FD_SETSIZE - 1 or so.
>
>> + int firstFree = -1;
>> + pgsocket maxFd;
>> +
>> + for (i = 0; i < max_slot; i++)
>> + if (pSlot[i].isFree)
>> + return i;
>
>> + FD_ZERO(&slotset);
>> +
>> + maxFd = pSlot[0].sock;
>> +
>> + for (i = 0; i < max_slot; i++)
>> + {
>> + FD_SET(pSlot[i].sock, &slotset);
>> + if (pSlot[i].sock > maxFd)
>> + maxFd = pSlot[i].sock;
>> + }
>
> So we're waiting for idle connections?
>
> I think you'll have to have to use two fdsets here, and set the write
> set based on PQflush() != 0.
>
>> +/*
>> + * A select loop that repeats calling select until a descriptor in the read
>> + * set becomes readable. On Windows we have to check for the termination event
>> + * from time to time, on Unix we can just block forever.
>> + */
>
> Should a) mention why we have to check regularly on windows b) that on
> linux we don't have to because we send a cancel event from the signal
> handler.
>
>> +static int
>> +select_loop(int maxFd, fd_set *workerset)
>> +{
>> + int i;
>> + fd_set saveSet = *workerset;
>>
>> +#ifdef WIN32
>> + /* should always be the master */
>
> Hm?
>
>
> I have to say, this is a fairly large patch for such a minor feature...
Andres, this patch needs more effort from the author, right? So
marking it as returned with feedback.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2015-01-15 08:04:50 Re: HINTing on UPDATE foo SET foo.bar = ..;
Previous Message Michael Paquier 2015-01-15 08:00:23 Re: WIP: multivariate statistics / proof of concept