Re: query plan with index having a btrim is different for strings of different length

From: Richard Yen <dba(at)richyen(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: query plan with index having a btrim is different for strings of different length
Date: 2008-12-10 21:54:50
Message-ID: 8FA59A9A-BCFB-4EF3-BD33-097CAA08D0B5@richyen.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Dec 10, 2008, at 11:34 AM, Tom Lane wrote:

> Richard Yen <dba(at)richyen(dot)com> writes:
>> You guys are right. I tried "Miller" and gave me the same result.
>> Is
>> there any way to tune this so that for the common last names, the
>> query run time doesn't jump from <1s to >300s?
>
> If the planner's estimation is that far off then there must be
> something
> very weird about the table statistics, but you haven't given us any
> clue
> what.

Wow, thanks for helping me out here. I don't have much experience
with deconstructing queries and working with stats, so here's what I
could gather. If you need more information, please let me know.

tii=# select * from pg_stat_all_tables where relname =
'm_object_paper' or relname = 'm_assignment';
-[ RECORD 1 ]----+------------------------------
relid | 17516
schemaname | public
relname | m_assignment
seq_scan | 274
seq_tup_read | 1039457272
idx_scan | 372379230
idx_tup_fetch | 2365235708
n_tup_ins | 5641638
n_tup_upd | 520684
n_tup_del | 30339
n_tup_hot_upd | 406929
n_live_tup | 5611665
n_dead_tup | 11877
last_vacuum |
last_autovacuum | 2008-12-04 17:44:57.309717-08
last_analyze | 2008-10-20 15:09:50.943652-07
last_autoanalyze | 2008-08-15 17:16:14.588153-07
-[ RECORD 2 ]----+------------------------------
relid | 17792
schemaname | public
relname | m_object_paper
seq_scan | 83613
seq_tup_read | 184330159906
idx_scan | 685219945
idx_tup_fetch | 222892138627
n_tup_ins | 71564825
n_tup_upd | 27558792
n_tup_del | 3058
n_tup_hot_upd | 22410985
n_live_tup | 71559627
n_dead_tup | 585467
last_vacuum | 2008-10-24 14:36:45.134936-07
last_autovacuum | 2008-12-05 07:02:40.52712-08
last_analyze | 2008-11-25 14:42:04.185798-08
last_autoanalyze | 2008-08-15 17:20:28.42811-07

tii=# select * from pg_statio_all_tables where relname =
'm_object_paper' or relname = 'm_assignment';
-[ RECORD 1 ]---+---------------
relid | 17516
schemaname | public
relname | m_assignment
heap_blks_read | 22896372
heap_blks_hit | 1753777105
idx_blks_read | 7879634
idx_blks_hit | 1157729592
toast_blks_read | 0
toast_blks_hit | 0
tidx_blks_read | 0
tidx_blks_hit | 0
-[ RECORD 2 ]---+---------------
relid | 17792
schemaname | public
relname | m_object_paper
heap_blks_read | 2604944369
heap_blks_hit | 116307527781
idx_blks_read | 133534908
idx_blks_hit | 3601637440
toast_blks_read | 0
toast_blks_hit | 0
tidx_blks_read | 0
tidx_blks_hit | 0

Also, yes, we've kicked around the idea of doing an index on the
concatenation of the first and last names--that would definitely be
more unique, and I think we're actually going to move to that. Just
thought I'd dig deeper here to learn more.

Thanks!
--Richard

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Richard Yen 2008-12-10 22:04:57 Re: query plan with index having a btrim is different for strings of different length
Previous Message David Wilson 2008-12-10 21:37:27 Re: Experience with HP Smart Array P400 and SATA drives?