Re: Simple join doesn't use index

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Alex Vinnik <alvinnik(dot)g(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Simple join doesn't use index
Date: 2013-01-29 02:31:32
Message-ID: CAHyXU0wshWt-e5d8rBBu5sMXEaT48nA1sPLvK0ZwE3GjLOgjMQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Mon, Jan 28, 2013 at 5:43 PM, Alex Vinnik <alvinnik(dot)g(at)gmail(dot)com> wrote:
> It sure turned out that default settings are not a good fit. Setting
> random_page_cost to 1.0 made query to run in 2.6 seconds and I clearly see
> that indexes are being used in explain plan and IO utilization is close to
> 0.
>
> QUERY PLAN
> Sort (cost=969787.23..970288.67 rows=200575 width=8) (actual
> time=2176.045..2418.162 rows=241238 loops=1)
> Sort Key: visits.id, views.id
> Sort Method: external sort Disk: 4248kB
> -> Nested Loop (cost=0.00..950554.81 rows=200575 width=8) (actual
> time=0.048..1735.357 rows=241238 loops=1)
> -> Index Scan using visits_created_at_index on visits
> (cost=0.00..5459.16 rows=82561 width=4) (actual time=0.032..178.591
> rows=136021 loops=1)
> Index Cond: ((created_at >= '2012-12-15 00:00:00'::timestamp
> without time zone) AND (created_at < '2012-12-16 00:00:00'::timestamp
> without time zone))
> -> Index Scan using views_visit_id_index on views
> (cost=0.00..11.33 rows=12 width=8) (actual time=0.004..0.006 rows=2
> loops=136021)
> Index Cond: (visit_id = visits.id)
> Total runtime: 2635.169 ms
>
> However I noticed that sorting is done using disk("external sort Disk:
> 4248kB") which prompted me to take a look at work_mem. But it turned out
> that small increase to 4MB from default 1MB turns off index usage and query
> gets x10 slower. IO utilization jumped to 100% from literally nothing. so
> back to square one...
>
> QUERY PLAN
> Sort (cost=936642.75..937144.19 rows=200575 width=8) (actual
> time=33200.762..33474.443 rows=241238 loops=1)
> Sort Key: visits.id, views.id
> Sort Method: external merge Disk: 4248kB
> -> Hash Join (cost=6491.17..917410.33 rows=200575 width=8) (actual
> time=7156.498..32723.221 rows=241238 loops=1)
> Hash Cond: (views.visit_id = visits.id)
> -> Seq Scan on views (cost=0.00..832189.95 rows=8768395 width=8)
> (actual time=0.100..12126.342 rows=8200704 loops=1)
> -> Hash (cost=5459.16..5459.16 rows=82561 width=4) (actual
> time=353.683..353.683 rows=136021 loops=1)
> Buckets: 16384 Batches: 2 (originally 1) Memory Usage:
> 4097kB
> -> Index Scan using visits_created_at_index on visits
> (cost=0.00..5459.16 rows=82561 width=4) (actual time=0.032..175.051
> rows=136021 loops=1)
> Index Cond: ((created_at >= '2012-12-15
> 00:00:00'::timestamp without time zone) AND (created_at < '2012-12-16
> 00:00:00'::timestamp without time zone))
> Total runtime: 33698.000 ms
>
> Basically PG is going through all views again and not using "Index Scan
> using views_visit_id_index on views". Looks like setting work_mem confuses
> planner somehow. Any idea what can be done to do sorting in memory. I
> suspect it should make query even more faster. Thanks -Alex

hm, what happens when you set work_mem a fair amount higher? (say,
64mb). You can set it for one session by going "set work_mem='64mb';
" as opposed to the entire server in postgresql.conf.

merlin

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Heikki Linnakangas 2013-01-29 10:12:41 Re: [PERFORM] pgbench to the MAXINT
Previous Message Filip Rembiałkowski 2013-01-29 00:55:10 Re: Simple join doesn't use index