Re: query speed joining tables

From: Christopher Smith <christopherl_smith(at)yahoo(dot)com>
To: Tomasz Myrta <jasiek(at)klaster(dot)net>, pgsql-sql(at)postgresql(dot)org
Subject: Re: query speed joining tables
Date: 2003-01-13 22:03:00
Message-ID: 20030113220300.69330.qmail@web14107.mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql


my mistakes, zips_max should be zips_300.
Tomasz Myrta <jasiek(at)klaster(dot)net> wrote:Christopher Smith wrote:

> I have 4 tables that I need to query... 3 of the tables are links by the
> field userid.
>
> below are table sql fragments. as well as the query. The subquery
> causes very high
>
> CPU usages. It typically returns ~3000 matches. Is there another way to
> rewrite this?
>
> SELECT user_login.userid FROM user_login,user_details_p,user_match_details
> WHERE user_login.userid = user_details_p.userid AND
> user_details_p.userid = user_match_details.userid AND
> user_details_p.gender ='W' AND
> user_details_p.seekgender ='M' AND
> user_details_p.age >=18 AND
> user_details_p.age <=50 AND
> user_match_details.min_age <= 30 AND
> user_match_details.max_age >= 30 AND
> user_details_p.ethnictype = 'Caucasian (White)' AND
> strpos(user_match_details.ethnicity,'Asian') !=0 AND
> user_details_p.zipcode in (select zips_max.destination from zips_max
> where zips_max.origin='90210' )
> order by user_login.last_login desc;

explicit joins show better idea of your query and helps postgres
choosing indexing.

select userid
from
user_login
join user_details using (userid)
join user_match_details using (userid)
where
user_details_p.gender ='W' AND
user_details_p.seekgender ='M' AND
user_details_p.age between 18 and 50 and
user_match_details.min_age <= 30 AND
user_match_details.max_age >= 30 AND
user_details_p.ethnictype = 'Caucasian (White)' AND
strpos(user_match_details.ethnicity,'Asian') !=0 AND
user_details_p.zipcode in (select zips_max.destination from zips_max
where zips_max.origin='90210' )
order by user_login.last_login desc;

How can I help you with subquery if you didn't write even zips_max
definition?!?

If origin is unique value in that table, you can change subquery into
join on "from" list.

> Table "public.user_login"
> Column | Type | Modifiers
> ------------+--------------------------+---------------
> userid | character varying(30) | not null
> password | character varying(30) | not null
> email | character varying(50) | not null
> last_login | timestamp with time zone | not null
> Indexes: user_login_pkey primary key btree (userid),

Do you really need userid as varchar?
indexing on int4 or int8 would be much faster than varchar

Why do you have 3 tables? It looks like only one table would be enough.
Remember, that null values don't take too much space.

>
>
>
>
> Table "public.user_details_p"
> Column | Type | Modifiers
> -----------------+--------------------------+---------------
> userid | character varying(30) | not null
> gender | character varying(1) |
> age | integer |
> height | character varying(10) |
> ethnicty pe | character varying(30) |
> education | character varying(30) |
> createdate | timestamp with time zone | default now()
> zipcode | character varying(5) |
> birthdate | date | default now()
> zodiac | character varying(40) |
> seekgender | character varying(2) |
> Indexes: user_details_p_pkey primary key btree (userid),
> user_details_p_age_idx btree (age),
> &nb sp; user_details_p_ethnic_idx btree (ethnictype),
> user_details_p_gender_idx btree (gender),

>
> user_details_p_last_login_idx btree (last_login),
> user_details_p_seekgender_idx btree (seekgender),
> user_details_p_state_idx btree (state)

There is too many indexes -
index on gender (2 possible values) is useless,
index on ethnic (how many values - I think not too many?) is possibly
useless

Consider creating single index on several fields for queries like this:
select
...
where
user_details_p.gender ='W' AND
user_details_p.seekgender ='M' AND
user_details_p.age between 18 and 50 and

index on (age,seekgender,gender)

>
> Foreign Key constraints: $1 FOREIGN KEY (userid) REFERENCES
> user_login(userid) ON UPDATE NO ACTI
> ON ON DELETE CASCADE
>
>
> Table "public.user_match_details"
> Column | Type | Modifiers
> ------------------+------------------------+-----------
> userid | character varying(30) | not null
> soughtmate | character varying(200) |
> ethnicity | character varying(200) |
> marital_status | character varying(200) |
> min_age | integer |
> max_age | integer &nbs p; |
> city | character varying(50) |
> state | character varying(2) |
> zipcode | integer |
> match_distance | integer |
> Indexes: user_match_details_pkey primary key btree (userid)
> Foreign Key constraints: $1 FOREIGN KEY (userid) REFERENCES
> user_login(userid) ON UPDATE NO ACTION ON DELETE CASCADE
>
> Table "public.zips_300"
> Column | Type | Modifiers
> -------------+----------------------+-----------
> origin | character varying(5) |
> destination | character varying(5) |
> Indexes: zips_300_origin_idx btree (origin)

If you need more specific answer, you have to add more information - how
many records do you have in your tables and how many possible values do
you use for example for zipcodes, ethnicity etc.

Regards,
Tomasz Myrta

---------------------------------
Do you Yahoo!?
Yahoo! Mail Plus - Powerful. Affordable. Sign up now

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Ron Peterson 2003-01-13 22:11:29 Re: insert rule doesn't see id field
Previous Message Tomasz Myrta 2003-01-13 22:01:39 Re: query speed joining tables