Re: Getting sorted data from foreign server for merge join

From: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Getting sorted data from foreign server for merge join
Date: 2015-12-02 11:45:01
Message-ID: CAGPqQf0yBsShamO+8qQhoEUyau6eL09quJGkVrN9jFVnrMpCEg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks Ashutosh.

Re-reviewed and Re-verified the patch, pg_sort_all_pd_v5.patch
looks good to me.

On Fri, Nov 27, 2015 at 3:02 PM, Ashutosh Bapat <
ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:

> Thanks Rushabh for your review and comments.
>
> On Thu, Nov 26, 2015 at 5:39 PM, Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>
> wrote:
>
>> Hi Ashutosh,
>>
>> I reviewed your latest version of patch and over all the implementation
>> and other details look good to me.
>>
>> Here are few cosmetic issues which I found:
>>
>> 1) Patch not getting applied cleanly - white space warning
>>
>>
> Done.
>
>
>> 2)
>>
>> - List *usable_pathkeys = NIL;
>> + List *useful_pathkeys_list = NIL; /* List of all pathkeys
>> */
>>
>> Code alignment is not correct with other declared variables.
>>
>>
> Incorporated the change in the patch.
>
> 3)
>>
>> + {
>> + PathKey *pathkey;
>> + List *pathkeys;
>> +
>> + pathkey = make_canonical_pathkey(root, cur_ec,
>> +
>> linitial_oid(cur_ec->ec_opfamilies),
>> + BTLessStrategyNumber,
>> + false);
>> + pathkeys = list_make1(pathkey);
>> + useful_pathkeys_list = lappend(useful_pathkeys_list,
>> pathkeys);
>> + }
>>
>> Code alignment need to fix at make_canonical_pathkey().
>>
>
> Incorporated the change in the patch.
>
> I have also removed the TODO item in the prologue of this function, since
> none has objected to externalization of make_canonical_pathkeys till now
> and it's not expected to be part of the final commit.
>
>
>>
>> 4)
>>
>> I don't understand the meaning of following added testcase into
>> postgres_fdw.
>>
>> +++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
>> @@ -171,20 +171,35 @@ SELECT COUNT(*) FROM ft1 t1;
>> -- join two tables
>> SELECT t1.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3,
>> t1.c1 OFFSET 100 LIMIT 10;
>> -- subquery
>> SELECT * FROM ft1 t1 WHERE t1.c3 IN (SELECT c3 FROM ft2 t2 WHERE c1 <=
>> 10) ORDER BY c1;
>> -- subquery+MAX
>> SELECT * FROM ft1 t1 WHERE t1.c3 = (SELECT MAX(c3) FROM ft2 t2) ORDER BY
>> c1;
>> -- used in CTE
>> WITH t1 AS (SELECT * FROM ft1 WHERE c1 <= 10) SELECT t2.c1, t2.c2,
>> t2.c3, t2.c4 FROM t1, ft2 t2 WHERE t1.c1 = t2.c1 ORDER BY t1.c1;
>> -- fixed values
>> SELECT 'fixed', NULL FROM ft1 t1 WHERE c1 = 1;
>> +-- getting data sorted from the foreign table for merge join
>> +-- Since we are interested in merge join, disable other joins
>> +SET enable_hashjoin TO false;
>> +SET enable_nestloop TO false;
>> +-- inner join, expressions in the clauses appear in the equivalence
>> class list
>> +EXPLAIN (VERBOSE, COSTS false)
>> + SELECT t1.c1, t2."C 1" FROM ft2 t1 JOIN "S 1"."T 1" t2 ON (t1.c1 =
>> t2."C 1") ORDER BY t1.c1 OFFSET 100 LIMIT 10;
>> +SELECT t1.c1, t2."C 1" FROM ft2 t1 JOIN "S 1"."T 1" t2 ON (t1.c1 = t2."C
>> 1") ORDER BY t1.c1 OFFSET 100 LIMIT 10;
>> +-- outer join, expression in the clauses do not appear in equivalence
>> class list
>> +-- but no output change as compared to the previous query
>> +EXPLAIN (VERBOSE, COSTS false)
>> + SELECT t1.c1, t2."C 1" FROM ft2 t1 LEFT JOIN "S 1"."T 1" t2 ON
>> (t1.c1 = t2."C 1") ORDER BY t1.c1 OFFSET 100 LIMIT 10;
>> +SELECT t1.c1, t2."C 1" FROM ft2 t1 LEFT JOIN "S 1"."T 1" t2 ON (t1.c1 =
>> t2."C 1") ORDER BY t1.c1 OFFSET 100 LIMIT 10;
>> +SET enable_hashjoin TO true;
>> +SET enable_nestloop TO true;
>>
>> Because, I removed the code changes of the patch and then I run the test
>> seem like it has nothing to do with the code changes. Above set of test
>> giving
>> same result with/without patch.
>>
>> Am I missing something ?
>>
>
> Actually, the output of merge join is always ordered by the pathkeys used
> for merge join. That routed through LIMIT node remains ordered. So, we
> actually do not need ORDER BY t1.c1 clause in the above queries. Without
> that clause, the tests will show difference output with and without patch.
> I have changed the attached patch accordingly.
>
>
>>
>> Apart from this I debugged the patch for each scenario (query pathkeys and
>> pathkeys arising out of the equivalence classes) and so far patch looks
>> good
>> to me.
>>
>>
> Thanks.
>
>
>> Attaching update version of patch by fixing the cosmetic changes.
>>
>>
> Attached version of patch contains your changes.
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
>
>
>
>

--
Rushabh Lathia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2015-12-02 11:45:39 Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)
Previous Message Joel Jacobson 2015-12-02 11:39:37 Re: Remaining 9.5 open items