Re: Query performance strangeness..

From: Steve Spicklemire <steve(at)spvi(dot)com>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: Steve Spicklemire <steve(at)spvi(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Query performance strangeness..
Date: 2007-07-19 10:23:21
Message-ID: 4E21DD73-0FF5-47E0-AB2B-1A2D05DF08E8@spvi.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi Richard,

On Jul 19, 2007, at 12:49 AM, Richard Huxton wrote:

> Steve Spicklemire wrote:
>> I also have a function "get_cem_for_directBurial(personid)" that
>> is expensive to call, but it's also indexed, so I hoped that the
>> index would normally be used (essentially as a cache). It returns
>> a 'cemetery code' so I can search for folks buried in a particular
>> cemetery. (The cemetery code was added to a different table after
>> the 'people' table was more or less frozen.. I'd like to keep it
>> that way if possible.)
>
> How is this function defined? Is it marked "Immutable" or similar?
> The body might be interesting too.

Here is the function body... the data is stored in and XML "pickle".
I had hoped that it would only be called in building the index.

Since the query uses it in the 'filter' step.. I'm not sure if it's
using the index or not.

Does marking the function immutable help the planner know whether it
can use the index or not?

CCOC=# \df+ get_cem_for_directBurial
List of functions
Schema | Name | Result data type | Argument
data types | Owner | Language | Source code | Description
--------+--------------------------+------------------
+---------------------+---------+----------+-------------+-------------
public | get_cem_for_directburial | text | character
varying | webuser | plpgsql |
DECLARE
personID ALIAS for $1;
qResult RECORD;

BEGIN
SELECT INTO qResult get_xml_value('/params/param/value/
struct/member/*[contains(text(),''cemid'')]/parent::*/value/string/
text()','People',personID,'') as cem;
return qResult.cem;
END;
|
(1 row)

>
>> Sometimes I need to search for rows from the view that satisfy
>> certain criteria, sorted in some specific order. Here's where the
>> trouble starts. In the view I compute something I call 'lc3key',
>> defined as: lower_concat3(p."last", p."first", (p.middle::text ||
>> p.personid::text)::character varying) where 'lower_concat3' just
>> returns a lower case version of three strings all concatenated
>> together. The string is basically lastname, firstname, middle and
>> personid (to guarantee uniqueness). It seems like most of the time
>> sorting by last, first, middle should be the same as sorting by
>> lc3key (all of these things are indexed BTW). So here goes:
>
> Definitions for the three tables and their indexes would be nice to
> check against too.

CCOC=# \d people
Table
"public.people"
Column | Type
| Modifiers
------------+-----------------------------
+-----------------------------------------------------------------------
-------------
personid | character varying(40) | not null default
('AUTO'::text || (nextval(('People_seq'::text)::regclass))::text)
modified | timestamp without time zone | default now()
created | timestamp without time zone | default now()
enabled | boolean |
first | character varying(40) |
middle | character varying(15) |
last | character varying(80) |
gender | character varying(2) |
sbirthdate | character varying(30) |
sdeathdate | character varying(30) |
status | character varying(30) |
Indexes:
"people_pkey" PRIMARY KEY, btree (personid)
"idx_people_cemid" btree (get_cem_for_directburial(personid))
"idx_people_lower_concat3_last" btree (lower_concat3("last",
"first", (middle::text || personid::text)::character varying))
"idx_people_servicenum" btree
(get_numeric_servicenumber_for_personid(personid))
"idx_people_status" btree (status)
"idx_people_take4_personid_" btree (take4(personid))

CCOC=# \d intermentspacepersons
Table
"public.intermentspacepersons"
Column | Type
| Modifiers
------------------------+-----------------------------
+-----------------------------------------------------------------------
----------------------------
intermentspacepersonid | character varying(40) | not null
default ('AUTO'::text || (nextval
(('IntermentSpacePersons_seq'::text)::regclass))::text)
modified | timestamp without time zone | default now()
created | timestamp without time zone | default now()
enabled | boolean |
objectid | character varying(30) |
personid | character varying(30) |
roleid | character varying(30) |
Indexes:
"idx_intermentspacepersons_obje" btree (objectid)
"idx_intermentspacepersons_pers" btree (personid)

CCOC=# \d intermentspaceroles
Table
"public.intermentspaceroles"
Column | Type
| Modifiers
----------------------+-----------------------------
+-----------------------------------------------------------------------
--------------------------
intermentspaceroleid | character varying(40) | not null
default ('AUTO'::text || (nextval
(('IntermentSpaceRoles_seq'::text)::regclass))::text)
modified | timestamp without time zone | default now()
created | timestamp without time zone | default now()
enabled | boolean |
name | character varying(30) |
label | character varying(30) |
reverseroleid | character varying(30) |
objecttype | character varying(30) |
Indexes:
"intermentspaceroles_pkey" PRIMARY KEY, btree
(intermentspaceroleid)

>
>> -> Index Scan using idx_people_lower_concat3_last
>> on people p (cost=0.00..130784.91 rows=43872 width=40) (actual
>> time=0.366..47.016 rows=171 loops=1)
>> Index Cond: (lower_concat3("last", "first",
>> (((middle)::text || (personid)::text))::character varying) >=
>> 'jonesAAAAAAAAAAAAAAAAAAAAAAAAA'::text)
>> Filter: (('STJ'::text =
>> get_cem_for_directburial(personid)) AND ((status)::text <>
>> 'R'::text) AND ((status)::text <> 'F'::text))
>
>> OK.. not too bad. If I do the same query... but ask for 'HCC'
>> rather than 'STJ', just a different cemetery code, I get 91
>> seconds... about 1000 times longer!
>
>> -> Index Scan using idx_people_lower_concat3_last
>> on people p (cost=0.00..130784.91 rows=759 width=40) (actual
>> time=8.722..91396.606 rows=256 loops=1)
>> Index Cond: (lower_concat3("last", "first",
>> (((middle)::text || (personid)::text))::character varying) >=
>> 'jonesAAAAAAAAAAAAAAAAAAAAAAAAA'::text)
>> Filter: (('HCC'::text =
>> get_cem_for_directburial(personid)) AND ((status)::text <>
>> 'R'::text) AND ((status)::text <> 'F'::text))
>
> In this case, look at the actual times. There are two possibilities:
> 1. The first query had its data/indexes in cache whereas the second
> didn't. Run each three times in a row and see if the times stay
> roughly constant.
>
> 2. Calls to get_cem_for_directburial() can vary widely in their
> execution time.
>

If the query is actually calling get_cem_for_directburial during the
query... then I'm sunk. I'll have to move that data to a regular
indexed field. I can do that... it's just a lot of work. ;-(

thanks for your help!

-steve

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Richard Huxton 2007-07-19 11:06:42 Re: Query performance strangeness..
Previous Message Ottavio Campana 2007-07-19 10:15:17 memory leakage in libpg?