Re: [HACKERS] 'a' == 'a '

From: "Dann Corbit" <DCorbit(at)connx(dot)com>
To: <josh(at)agliodbs(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Cc: "Stephan Szabo" <sszabo(at)megazone(dot)bigpanda(dot)com>, "Terry Fielder" <terry(at)ashtonwoodshomes(dot)com>, "Tino Wildenhain" <tino(at)wildenhain(dot)de>, "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, <Richard_D_Levine(at)raytheon(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: [HACKERS] 'a' == 'a '
Date: 2005-10-20 00:33:38
Message-ID: D425483C2C5C9F49B5B7A41F8944154757D210@postal.corporate.connx.com
Lists: pgsql-general pgsql-hackers

If there is a significant performance benefit to not expanding text columns in comparison operations, then it seems it should be OK.

I probably read the standard wrong, but it seems to me that varchar, char, and bpchar columns should all behave the same (e.g. if you do not expand with <blank> or the PAD character (whatever that is), then all char type columns should behave the same). I guess that there could be different default collations for different column types, though (that is clearly allowed in the standard). Perhaps it just needs to be documented in such a way that even a blockhead like me can comprehend it easily.
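
To make the distinction concrete, here is a minimal sketch (in Python, not SQL) of the two comparison behaviors under discussion. The function names are hypothetical and this is not how PostgreSQL implements its operators; it only illustrates the idea that a PAD SPACE style comparison extends the shorter string with the pad character before comparing, while a NO PAD style comparison treats trailing blanks as significant.

```python
def equal_no_pad(a: str, b: str) -> bool:
    """NO PAD style: trailing blanks are significant."""
    return a == b


def equal_pad_space(a: str, b: str, pad: str = " ") -> bool:
    """PAD SPACE style: extend the shorter string with the pad
    character to the length of the longer one, then compare."""
    width = max(len(a), len(b))
    return a.ljust(width, pad) == b.ljust(width, pad)


print(equal_no_pad("a", "a "))     # False
print(equal_pad_space("a", "a "))  # True
```

Under the first function 'a' and 'a ' differ; under the second they compare equal, which is the behavior the subject line is asking about.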

> -----Original Message-----
> From: Josh Berkus [mailto:josh(at)agliodbs(dot)com]
> Sent: Wednesday, October 19, 2005 5:06 PM
> To: pgsql-hackers(at)postgresql(dot)org
> Cc: Dann Corbit; Stephan Szabo; Terry Fielder; Tino Wildenhain; Marc G.
> Fournier; Richard_D_Levine(at)raytheon(dot)com; pgsql-general(at)postgresql(dot)org
> Subject: Re: [HACKERS] 'a' == 'a '
>
> Dann,
>
> > I think that whatever is done ought to be whatever the standard says.
> > If I misinterpret the standard and PostgreSQL is doing it right, then
> > that is fine.  It is just that PostgreSQL is very counter-intuitive
> > compared to other database systems that I have used in this one
> > particular area.  When I read the standard, it looked to me like
> > PostgreSQL was not performing correctly.  It is not unlikely that I read
> > it wrong.
>
> AFAICT, the standard says "implementation-specific". So we're standard.
>
> The main cost for comparing trimmed values is performance; factoring an
> rtrim into every comparison will add significant overhead to the already
> CPU-locked process of, for example, creating indexes. We're looking for
> ways to make the comparison operators lighter-weight, not heavier.
>
> My general perspective on this is that if trailing blanks are a
> significant hazard for your application, then trim them on data input.
> That requires a *lot* less performance overhead than doing it every
> time you compare something.
>
> Changing the behaviour would break backwards compatibility for some users.
> For that matter, I've been subscribed to 8 PostgreSQL mailing lists since
> 1999, and this is the first time I can recall someone complaining about
> this comparison behavior. So it's obviously not a widespread issue.
>
> --
> --Josh
>
> Josh Berkus
> Aglio Database Solutions
> San Francisco
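
The suggestion quoted above (trim on data input rather than on every comparison) can be sketched roughly as follows. This is a Python illustration under assumed names (`normalize_on_input`, `find`), not PostgreSQL code: trailing blanks are stripped once when a value is stored, so later comparisons against stored rows are plain equality checks with no per-row trimming.

```python
def normalize_on_input(value: str) -> str:
    """Strip trailing blanks once, at the point the value is stored."""
    return value.rstrip(" ")


# Values are normalized a single time on the way in.
stored = [normalize_on_input(v) for v in ["a ", "b", "c   "]]


def find(rows: list[str], key: str) -> list[str]:
    """Look up a key using plain equality.

    Only the search key is normalized (once per query); the stored
    rows are compared as-is, so no trim runs per comparison.
    """
    key = normalize_on_input(key)
    return [row for row in rows if row == key]


print(stored)              # ['a', 'b', 'c']
print(find(stored, "a  "))  # ['a']
```

The trade-off is that the original trailing blanks are not preserved, which only suits applications where they carry no meaning.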
