Re: simple join uses indexes, very slow

From: george young <gry(at)ll(dot)mit(dot)edu>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: simple join uses indexes, very slow
Date: 2006-03-29 15:29:15
Message-ID: 20060329102915.20a82762.gry@ll.mit.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Wed, 29 Mar 2006 01:08:15 -0500
stef <stef(at)ummon(dot)com> threw this fish to the penguins:

>
> If your looking for suggestions, I would suggest updating the 8.1.x you
> have installed to the latest version, as of typing this is 8.1.3 ;) Most
> notable is some of the -bug- fixes that are in since 8.1.0, for example;
>
> * Fix incorrect optimizations of outer-join conditions (Tom)
>
> You know, minor point releases aren't adding new features or changing
> basic functionality, they are pure and simple bugfixes. If I was in
> -your- position, I would run (don't walk ;) and install upto 8.1.3

I just did this(8.1.3). I also moved the server to a host with more
ram and faster cpu. And I did cluster on the main index of the large
parameters table. The result is less than a second instead of 70
seconds.

Sorry I didn't have time to isolate the individual effects
of the above changes, but sometimes you just have to do "a bunch of
good things" and move on. For your enjoyment here's the latest analyze:

newschm3=> explain analyze SELECT ro.run, ro.opset_num, p.step_num FROM run_opsets ro, parameters p WHERE ro.run = p.run AND ro.opset_num = p.opset_num and ro.run='team9';
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..6194.18 rows=9186 width=22) (actual time=0.477..175.554 rows=43050 loops=1)
-> Index Scan using run_opsets_pkey on run_opsets ro (cost=0.00..122.27 rows=68 width=18) (actual time=0.222..1.093 rows=263 loops=1)
Index Cond: (run = 'team9'::text)
-> Index Scan using parameters_idx on parameters p (cost=0.00..88.72 rows=46 width=22) (actual time=0.023..0.498 rows=164 loops=263)
Index Cond: (('team9'::text = p.run) AND ("outer".opset_num = p.opset_num))
Total runtime: 190.821 ms

Thank you all very much for you help!

-- George Young

>
> of course, thats jst my 2c, feel free to ignore :D
> Regards
> Stef
>
> Chris wrote:
>
> > george young wrote:
> >
> >> [PostgreSQL 8.1.0 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.0.1]
> >> I have a simple join on two tables that takes way too long. Can you
> >> help
> >> me understand what's wrong? There are indexes defined on the
> >> relevant columns.
> >> I just did a fresh vacuum --full --analyze on the two tables.
> >> Is there something I'm not seeing?
> >> [CPU is 950Mhz AMD, 256MB RAM, 15k rpm scsi disk]
> >> -- George Young
> >>
> >> Table sizes: parameters has 2.1512e+07 tuples, run_opsets has 211745
> >> tuples.
> >>
> >> explain analyze SELECT ro.run, ro.opset_num, p.step_num FROM
> >> run_opsets ro, parameters p WHERE ro.run = p.run AND ro.opset_num =
> >> p.opset_num and ro.run='team9';
> >>
> >> QUERY PLAN
> >> --------------------------------------------------------------------------------------------------------------------------------------------
> >>
> >> Nested Loop (cost=2.16..7957.40 rows=6707 width=22) (actual
> >> time=14.986..70197.129 rows=43050 loops=1)
> >> -> Index Scan using run_opsets_pkey on run_opsets ro
> >> (cost=0.00..128.75 rows=71 width=18) (actual time=0.386..62.959
> >> rows=263 loops=1)
> >> Index Cond: (run = 'team9'::text)
> >> -> Bitmap Heap Scan on parameters p (cost=2.16..109.93 rows=27
> >> width=22) (actual time=1.591..266.211 rows=164 loops=263)
> >> Recheck Cond: (('team9'::text = p.run) AND
> >> ("outer".opset_num = p.opset_num))
> >> -> Bitmap Index Scan on parameters_idx (cost=0.00..2.16
> >> rows=27 width=0) (actual time=1.153..1.153 rows=164 loops=263)
> >> Index Cond: (('team9'::text = p.run) AND
> >> ("outer".opset_num = p.opset_num))
> >> Total runtime: 70237.727 ms
> >> (8 rows)
> >>
> >> Table "public.run_opsets"
> >> Column | Type | Modifiers
> >> --------------+-----------------------------+-------------------------
> >> run | text | not null
> >> opset | text |
> >> opset_ver | integer |
> >> opset_num | integer | not null
> >> status | opset_status |
> >> date_started | timestamp without time zone |
> >> date_done | timestamp without time zone |
> >> work_started | timestamp without time zone |
> >> lock_user | text | default 'NO-USER'::text
> >> lock_pid | integer |
> >> needs_review | text |
> >> Indexes:
> >> "run_opsets_pkey" PRIMARY KEY, btree (run, opset_num) CLUSTER
> >>
> >>
> >> -- Table "public.parameters"
> >> Column | Type | Modifiers
> >> -----------+---------+-------------------------------
> >> run | text | not null
> >> opset_num | integer | not null
> >> opset | text | not null
> >> opset_ver | integer | not null
> >> step_num | integer | not null
> >> step | text | not null
> >> step_ver | integer | not null
> >> name | text | not null
> >> value | text |
> >> split | boolean | not null default false
> >> wafers | text[] | not null default '{}'::text[]
> >> Indexes:
> >> "parameters_idx" btree (run, opset_num, step_num, opset,
> >> opset_ver, step, step_ver, name, split, wafers)
> >> "parameters_opset_idx" btree (opset, step, name)
> >> "parameters_step_idx" btree (step, name)
> >
> >
> > More for my own information (because nobody else has suggested it),
> > would it make a difference if 'run' was a varchar field rather than text?
> >
> >
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
> http://archives.postgresql.org
>

--
"Are the gods not just?" "Oh no, child.
What would become of us if they were?" (CSL)

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Vivek Khera 2006-03-29 15:32:54 Re: Decide between Postgresql and Mysql (help of comunity)
Previous Message Merlin Moncure 2006-03-29 14:05:12 Re: Slow performance on Windows .NET and OleDb