|From:||Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>|
|To:||Robert Haas <robertmhaas(at)gmail(dot)com>|
|Cc:||Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>|
|Subject:||Re: Parallel Append implementation|
|Views:||Raw Message | Whole Thread | Download mbox|
On 23 March 2017 at 16:26, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
> On 23 March 2017 at 05:55, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Wed, Mar 22, 2017 at 4:49 AM, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
>>> Attached is the updated patch that handles the changes for all the
>>> comments except the cost changes part. Details about the specific
>>> changes are after the cost-related points discussed below.
>>> For non-partial paths, I was checking following 3 options :
>>> Option 1. Just take the sum of total non-partial child costs and
>>> divide it by number of workers. It seems to be getting close to the
>>> actual cost.
>> If the costs for all children are about equal, then that works fine.
>> But when they are very unequal, then it's highly misleading.
>>> Option 2. Calculate exact cost by an algorithm which I mentioned
>>> before, which is pasted below for reference :
>>> Persubpath cost : 20 16 10 8 3 1, with 3 workers.
>>> After 10 time units (this is minimum of first 3 i.e. 20, 16, 10), the
>>> times remaining are :
>>> 10 6 0 8 3 1
>>> After 6 units (minimum of 10, 06, 08), the times remaining are :
>>> 4 0 0 2 3 1
>>> After 2 units (minimum of 4, 2, 3), the times remaining are :
>>> 2 0 0 0 1 1
>>> After 1 units (minimum of 2, 1, 1), the times remaining are :
>>> 1 0 0 0 0 0
>>> After 1 units (minimum of 1, 0 , 0), the times remaining are :
>>> 0 0 0 0 0 0
>>> Now add up above time chunks : 10 + 6 + 2 + 1 + 1 = 20
>> This gives the same answer as what I was proposing
> Ah I see.
>> but I believe it's more complicated to compute.
> Yes a bit, particularly because in my algorithm, I would have to do
> 'n' subtractions each time, in case of 'n' workers. But it looked more
> natural because it follows exactly the way we manually calculate.
>> The way my proposal would work in this
>> case is that we would start with an array C (since there are three
>> workers], with all entries 0. Logically C[i] represents the amount of
>> work to be performed by worker i. We add each path in turn to the
>> worker whose array entry is currently smallest; in the case of a tie,
>> just pick the first such entry.
>> So in your example we do this:
>> C += 20;
>> C += 16;
>> C += 10;
>> /* C is smaller than C or C at this point, so we add the next
>> path to C */
>> C += 8;
>> /* after the previous line, C is now the smallest, so add to that
>> entry next */
>> C += 3;
>> /* now we've got C = 20, C = 19, C = 18, so add to C */
>> C += 1;
>> /* final result: C = 20, C = 19, C = 19 */
>> Now we just take the highest entry that appears in any array, which in
>> this case is C, as the total cost.
> Wow. The way your final result exactly tallies with my algorithm
> result is very interesting. This looks like some maths or computer
> science theory that I am not aware.
> I am currently coding the algorithm using your method.
While I was coding this, I was considering if Path->rows also should
be calculated similar to total cost for non-partial subpath and total
cost for partial subpaths. I think for rows, we can just take
total_rows divided by workers for non-partial paths, and this
approximation should suffice. It looks odd that it be treated with the
same algorithm we chose for total cost for non-partial paths.
Meanwhile, attached is a WIP patch v10. The only change in this patch
w.r.t. the last patch (v9) is that this one has a new function defined
append_nonpartial_cost(). Just sending this to show how the algorithm
looks like; haven't yet called it.
|Next Message||Fabrízio de Royes Mello||2017-03-23 19:14:58||Re: [PATCH] Move all am-related reloption code into src/backend/access/[am-name] and get rid of relopt_kind for custom AM|
|Previous Message||Alvaro Herrera||2017-03-23 18:58:49||Re: [PATCH] Move all am-related reloption code into src/backend/access/[am-name] and get rid of relopt_kind for custom AM|