Re: Slow join

From: Roman Kushnir <roman(dot)kushnir(at)ad2games(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Slow join
Date: 2018-06-25 19:39:32
Message-ID: 9A5A439B-2C8F-4027-9EFA-297754EC1941@ad2games.com
Lists: pgsql-performance

Hi Justin,

Thank you for your comments.

Since you mentioned the size of shared_buffers, my first thought was to just switch to a larger machine, as this one only has 2 GB of RAM. But then it occurred to me that the whole videos table was being loaded into memory while only two small columns were actually used! So I created a covering index on videos (channel_id, id), and the query now completes in 190 ms!
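For anyone reading along, a sketch of that fix (the index name is illustrative; on PostgreSQL 10 a "covering" index simply means putting both needed columns in the key, since the INCLUDE clause only arrived in 11):

```sql
-- Put both columns the join/count needs into the index key, so the
-- planner can use an index-only scan instead of reading the whole table.
CREATE INDEX videos_channel_id_id_idx ON videos (channel_id, id);

-- Keep the visibility map current so index-only scans stay cheap.
VACUUM ANALYZE videos;
```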

Thanks, you helped me a lot.

> On Jun 25, 2018, at 6:45 PM, Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
>
> Hi,
>
> Thanks for providing all this info :)
>
> On Mon, Jun 25, 2018 at 05:55:49PM +0200, Roman Kushnir wrote:
>> Hello,
>>
>> The following basic inner join is taking too much time for me. (I’m using count(videos.id) instead of count(*) because my actual query looks different, but I simplified it here to the essence).
>> I’ve tried following random people's suggestions and adjusting random_page_cost (decreasing it from 4 to 1.1) without a stable improvement. Any hints on what is wrong here? Thank you.
>
>> Running on Amazon RDS, with default 10.1 parameters
>
> All default ?
> https://wiki.postgresql.org/wiki/Server_Configuration
>
> It looks like nearly the entire time is spent reading this table:
>
> Parallel Seq Scan on videos ... (ACTUAL TIME=0.687..55,555.774...)
> Buffers: shared hit=7138 read=464357
>
> Perhaps shared_buffers should be at least several times larger, and perhaps up
> to 4gb to keep the entire table in RAM. You could maybe also benefit from
> better device readahead (blockdev --setra or lvchange -r or
> /sys/block/sd?/queue/read_ahead_kb)
>
> Also, it looks like there's a row count misestimate, which probably doesn't
> matter for the query you sent, but maybe affects your larger query:
> Hash Join (... ROWS=365,328 ... ) (... ROWS=45,307 ... )
>
> If that matters, maybe it'd help to increase statistics on channel_id.
> Actually, I see both tables have FK into channels.id:
>
>> "fk_rails_11d6d9bea2" FOREIGN KEY (channel_id) REFERENCES channels(id)
>> "fk_rails_257f68ae55" FOREIGN KEY (channel_id) REFERENCES channels(id)
>
> I don't see the definition of "channels" (and it looks like the query I put on
> the wiki doesn't show null_frac), but I think that postgres since 9.6 should be
> able to infer good join statistics from the existence of the FKs. Maybe that
> only works if you actually JOIN to the channels table (?). But if anything
> that's only a secondary problem, if at all.
>
> Justin
