Re: 8.3devel slower than 8.2 under read-only load

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Guillaume Smet" <guillaume(dot)smet(at)gmail(dot)com>
Cc: "Gregory Stark" <stark(at)enterprisedb(dot)com>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>, "Greg Smith" <gsmith(at)gregsmith(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.3devel slower than 8.2 under read-only load
Date: 2007-11-26 00:35:23
Message-ID: 23854.1196037323@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

"Guillaume Smet" <guillaume(dot)smet(at)gmail(dot)com> writes:
> Sure, it's the same queries I posted earlier. My pgbench script is the
> following:
> BEGIN

> select libvil from vilsitelang where codelang='FRA' and codevil='LYO'
> select TL.motsclesmetatags, TL.descriptifmeta, TL.motcleoverture_l,
> TL.motcleoverture_c, TL.baselinetheme from themelang TL where
> TL.codeth = 'ASS' and TL.codelang = 'FRA'
> SELECT libvilpubwoo, codelang, codepays, petiteville FROM vilsite
> WHERE codevil = 'LYO'
> select libvil from vilsitelang where codelang='FRA' and codevil='LYO'

> END

I poked into this a bit, and it seems the extra overhead is all coming
from resolving the ambiguous "=" operators. That didn't show up in my
test because my query had "int4_column = int4_const" which is an exact
match to a pg_operator entry. But since your columns are varchar,
which doesn't have any operators of its own, we have to go through
oper_select_candidate(), which is noticeably slower than before. The
slowdown seems to have two causes:

1. Datatype bloat: there are 58 "=" operators in pg_operator today,
versus 54 at the beginning of the year. That's 7% more work right
there to sort through the additional operators.

2. Removal of pg_cast entries associated with explicit varchar
coercions: when there's not a pg_cast entry for the desired coercion,
find_coercion_pathway does a second catalog lookup to see if it
might be an array case. That happens more often in this test case
than it did at the start of the year, because I got rid of pg_cast
entries that could be replaced by the generic CoerceViaIO mechanism.

I'm not sure how big a hit #2 really is. Presumably the removal of the
redundant entries has some distributed savings associated with it, which
would partially counteract the extra lookup; but I don't have any tools
that can isolate the cost of those particular SearchSysCache calls out
of all the rest. In any case, #2 is specific to varchar and text while
effect #1 is an issue for just about everything.

The cost of resolving ambiguous operators has been an issue for a long
time, of course, but it seems particularly bad in this case --- gprof
blames 37% of the runtime on oper_select_candidate(). It might be time
to think about caching the results of operator searches somehow. Too
late for 8.3 though.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Guillaume Smet 2007-11-26 00:54:44 Re: 8.3devel slower than 8.2 under read-only load
Previous Message Andreas 'ads' Scherbaum 2007-11-25 22:35:46 Re: quote_literal(integer) does not exist