Re: string casting for index usage

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Adler <adler(at)pobox(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: string casting for index usage
Date: 2004-03-19 23:28:50
Message-ID: 2125.1079738930@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Michael Adler <adler(at)pobox(dot)com> writes:
> On Thu, Mar 18, 2004 at 03:39:12PM -0500, Tom Lane wrote:
>> A desultory test didn't show any difference between 7.2.4 and 7.3.6
>> in this respect, however. Perhaps you forgot to ANALYZE yet in the
>> new database?

> I have a test with sample data and queries to demonstrate what I'm seeing.

Ah. I had been testing the equivalent of this query with an INNER join
instead of a LEFT join. Both 7.2 and 7.3 pick a plan with an inner
indexscan on t1 in that case. The LEFT join prevents use of such a
plan, and the only way to do it quickly in those releases is to use an
inner indexscan on t2.

7.2 is really cheating here, because what is happening under the hood is
that the parser resolves the query as "textcol texteq varcharcol::text",
there not being any direct text=varchar operator. (text is chosen as
the preferred type over varchar when it would otherwise be a coin flip.)
But then the planner would simply assume that it's okay to substitute
varchareq for texteq, apparently on the grounds that if the input types
are binary compatible then the operators must be interchangeable. That
made it possible to match the join clause to the varchar-opclass index
on t2. But of course this theory is ridiculous on its face ... it
happens to be okay for varchar and text but in general you'd not have
the same comparison semantics for two different operators. (As an
example, int4 and OID are binary compatible but their index operators
are definitely not interchangeable, because one is signed comparison and
the other unsigned.)

7.3 is an intermediate state in which we'd ripped out the bogus planner
assumption but not developed fully adequate substitutes.

7.4 is substantially smarter than either, and can generate merge and
hash joins as well as ye plain olde indexed nestloop for this query.
In a quick test, it seemed that all three plan types yielded about the
same runtimes for this query with this much data. I didn't have time
to try scaling up the amount of data to see where things went, but I'd
expect the nestloop to be a loser at large scales even with an inner
indexscan.

Anyway, bottom line is that 7.4 and CVS tip are competitive with 7.2
again, only they do it right this time ...

regards, tom lane

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Kevin Brown 2004-03-20 03:48:17 Re: [HACKERS] fsync method checking
Previous Message Michael Adler 2004-03-19 22:22:17 Re: string casting for index usage