From: | Nandakumar M <m(dot)nanda92(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Pgsql Performance <pgsql-performance(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Need help with optimising simple query |
Date: | 2018-07-10 05:36:59 |
Message-ID: | CANcFUu57CVD_29kh+fmwUT_2fN=gUhrmwRjQoyZNf_JvYnxzGA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Hi Tom,
Is there something that I can do to improve the performance of such
queries (where ordering is done based on child table column and join
is left join)? Maybe a combined index or something like that? Or is it
possible to modify the query to get same result but execute faster.
One ad-hoc optimisation (which gives somewhat better performance) that
came to mind is to have a sub query for child table like
performance_test=# explain analyze select * from parent left join
(select * from child order by name limit 10) as child on
parent.child_id = child.id order by child.name limit 10;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=42714.84..42714.86 rows=10 width=59) (actual
time=311.623..311.624 rows=10 loops=1)
-> Sort (cost=42714.84..45214.84 rows=1000000 width=59) (actual
time=311.622..311.622 rows=10 loops=1)
Sort Key: child.name
Sort Method: top-N heapsort Memory: 26kB
-> Hash Left Join (cost=1.19..21105.20 rows=1000000
width=59) (actual time=0.120..204.386 rows=1000000 loops=1)
Hash Cond: (parent.child_id = child.id)
-> Seq Scan on parent (cost=0.00..17353.00
rows=1000000 width=29) (actual time=0.073..73.052 rows=1000000
loops=1)
-> Hash (cost=1.06..1.06 rows=10 width=19) (actual
time=0.035..0.035 rows=10 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Limit (cost=0.42..0.96 rows=10 width=19)
(actual time=0.014..0.027 rows=10 loops=1)
-> Index Scan using child_name_unique on
child (cost=0.42..5448.56 rows=100000 width=19) (actual
time=0.013..0.024 rows=10 loops=1)
Planning time: 0.505 ms
Execution time: 311.682 ms
(13 rows)
Time: 312.673 ms
Is there something I can do that will improve the query performance
much more than this?
Thanks.
Regards,
Nanda
On Mon, 9 Jul 2018, 19:53 Tom Lane, <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Nandakumar M <m(dot)nanda92(at)gmail(dot)com> writes:
> > I am having a query that has an order by and a limit clause. The
> > column on which I am doing order by is indexed (default b tree index).
> > However the index is not being used. On tweaking the query a bit I
> > found that when I use left join index is not used whereas when I use
> > inner join the index is used.
>
> The reason the index isn't being used is that the sort order the query
> requests isn't the same as the order provided by the index. Here:
>
> > performance_test=# explain analyze select * from parent left join
> > child on parent.child_id = child.id order by child.name limit 10;
>
> you're asking to sort by a column that will include null values for
> child.name anywhere that there's a parent row without a match for
> child_id. Those rows aren't even represented in the index on child.name,
> much less placed in the right order.
>
> regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Lincoln Swaine-Moore | 2018-07-10 15:07:42 | Improving Performance of Query ~ Filter by A, Sort by B |
Previous Message | Bruce Momjian | 2018-07-09 15:19:28 | Re: add default parallel query to v10 release notes? (Re: [PERFORM] performance drop after upgrade (9.6 > 10)) |