Re: Duplicate Values or Not?!

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Greg Stark <gsstark(at)mit(dot)edu>
Cc:	John Seberg <johnseberg(at)yahoo(dot)com>, pgsql-general(at)postgresql(dot)org
Subject:	Re: Duplicate Values or Not?!
Date:	2005-09-17 14:22:42
Message-ID:	7576.1126966962@sss.pgh.pa.us
Lists:	pgsql-general

Greg Stark <gsstark(at)mit(dot)edu> writes:
> John Seberg <johnseberg(at)yahoo(dot)com> writes:
>> This tells me that Postgresql is not using the same
>> method for determining duplicates when GROUPING and
>> INDEXing.

> You might try running the GROUP BY query after doing:
> set enable_hashagg = false;

> With that false it would have to sort the results which should be exactly the
> same code as the index is using. I think.

If that does change the results, it indicates you've got strings which
are bytewise different but compare equal according to strcoll(). We've
seen this and other misbehaviors from some locale definitions when faced
with data that is invalid per the encoding the locale expects.
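
A rough way to check for this condition outside the database is sketched below. It is only an illustration, assuming the suspect values have been exported as text; the function name `find_collation_collisions` is hypothetical, and Python's `locale.strcoll` stands in for the C library call that the sort and index code rely on.

```python
import locale
from itertools import combinations


def find_collation_collisions(strings):
    """Return pairs of distinct strings that strcoll() reports as equal.

    Uses whatever LC_COLLATE locale is currently active in this process.
    Hypothetical helper for illustration; not part of PostgreSQL.
    """
    collisions = []
    # set() drops exact duplicates, so every remaining pair is a pair of
    # different strings (and therefore different byte sequences).
    for a, b in combinations(sorted(set(strings)), 2):
        if locale.strcoll(a, b) == 0:
            collisions.append((a, b))
    return collisions
```

To try it against the locale the database was initialized with, call `locale.setlocale(locale.LC_COLLATE, ...)` with that locale name first. Any pair it returns is a candidate for values that a sort-based comparison treats as duplicates while a bytewise or hash-based one does not.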

So, yeah, the answer is to fix your encoding problems. In particular,
don't ever use a locale like that with a SQL_ASCII database encoding,
because then Postgres won't prevent strcoll from seeing data it fails
on. The only safe locale setting for a SQL_ASCII database is C,
I think.
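
The reason C is safe is that its collation is plain byte-value order, so strcoll has no encoding-specific rules to trip over. A small sketch of that property, again using Python's `locale` module as a stand-in and a hypothetical helper name:

```python
import locale
from functools import cmp_to_key


def collation_matches_byte_order(strings, encoding="utf-8"):
    """True if sorting with strcoll() gives the same order as sorting raw bytes.

    Hypothetical helper for illustration; compares under the active LC_COLLATE.
    """
    by_collation = sorted(strings, key=cmp_to_key(locale.strcoll))
    by_bytes = sorted(strings, key=lambda s: s.encode(encoding))
    return by_collation == by_bytes


# Under the C locale the two orderings agree for this ASCII sample.
locale.setlocale(locale.LC_COLLATE, "C")
print(collation_matches_byte_order(["b", "a", "B", "A", "ab"]))
```

Under other locales the function may return False for the same input, since their collation rules are not byte order.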

regards, tom lane

In response to

Re: Duplicate Values or Not?! at 2005-09-17 05:36:50 from Greg Stark

Responses

Re: Duplicate Values or Not?! at 2005-09-17 14:51:10 from Greg Stark
