
Bug #610: collation fails sorting because of strcoll() bug

From: pgsql-bugs(at)postgresql(dot)org
To: pgsql-bugs(at)postgresql(dot)org
Subject: Bug #610: collation fails sorting because of strcoll() bug
Date: 2002-03-07 19:36:16
Message-ID: 20020307193616.1D835476A44@postgresql.org
Lists: pgsql-bugs
Mathias August Gruber (mgruber) reports a bug with a severity of 2
The lower the number the more severe it is.

Short Description
collation fails sorting because of strcoll() bug

Long Description
Hi there,

About two years ago I tried to migrate an MS SQL Server database to PostgreSQL and could not make it work because I needed collation.
Although the documentation states that collation works, this is not true for strings that contain blanks.
Such strings are sorted as if they had no spaces.
This was really bad.
I have now picked the project up again and noticed that the problem is still there. So I read all the docs and the source code and ran a lot of tests.
Your regression tests are also lacking on this topic: they only sort single-word strings.
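
To make "sorted as if they had no spaces" concrete, here is a minimal sketch in plain C that only mimics the symptom; it is not how strcoll() is implemented. The helper names (cmp_names, cmp_blank_aware, cmp_blank_blind) are made up for this illustration. Both comparators ignore case; the second one also skips blanks.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: case-insensitive comparison of two names.
   When skip_blanks is non-zero, spaces are dropped before comparing. */
static int cmp_names(const char *a, const char *b, int skip_blanks)
{
	for (;;) {
		if (skip_blanks) {
			while (*a == ' ') a++;
			while (*b == ' ') b++;
		}
		int ca = tolower((unsigned char) *a);
		int cb = tolower((unsigned char) *b);
		if (ca != cb)
			return ca - cb;
		if (ca == 0)
			return 0;	/* both strings ended together */
		a++;
		b++;
	}
}

/* qsort() comparators for the rows of a char[N][32] array */
static int cmp_blank_aware(const void *a, const void *b)
{
	return cmp_names(a, b, 0);
}

static int cmp_blank_blind(const void *a, const void *b)
{
	return cmp_names(a, b, 1);
}
```

Sorting the four names from the sample code with cmp_blank_aware puts "Jose Valter" before "Joseval Almeida", because the blank sorts below 'v'. With cmp_blank_blind the two swap places, because "josevalalmeida" sorts before "josevalter". The second order is the kind of result described above.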

Now I have a verdict: the problem is in the GNU C library's strcoll()
function.

I have attached a little C program that reproduces this behavior. Just
compile and run it (and don't forget to set LC_ALL to any Western language
locale; I've tested with pt_BR, but the problem occurs with almost any
other configuration).
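
Since the result depends on LC_ALL actually taking effect, it may be worth checking what setlocale() returns before comparing anything. The following is a sketch under assumptions: apply_locale, sign and collation_matches_bytes are hypothetical helper names, not part of the attached program. If the requested locale is not installed, setlocale() returns NULL and the program stays in the "C" locale, where strcoll() orders strings the same way strcmp() does.

```c
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the requested locale was installed,
   0 if setlocale() rejected it. The name "" means "use the environment". */
static int apply_locale(const char *name)
{
	const char *active = setlocale(LC_ALL, name);

	if (active == NULL) {
		fprintf(stderr, "locale \"%s\" is not available\n", name);
		return 0;
	}
	printf("active locale: %s\n", active);
	return 1;
}

/* Reduce a comparison result to -1, 0 or 1 */
static int sign(int v)
{
	return (v > 0) - (v < 0);
}

/* Returns 1 if strcoll() and strcmp() agree on the order of a and b */
static int collation_matches_bytes(const char *a, const char *b)
{
	return sign(strcoll(a, b)) == sign(strcmp(a, b));
}
```

Calling apply_locale("") at the start of a test program and then collation_matches_bytes() on a pair such as "Jose Valter" and "Joseval Almeida" shows whether the active locale orders that pair differently from a plain byte comparison.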

I hope this helps you with this superb project.

Very Best Regards


Sample Code

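#include <stdio.h>
#include <stdlib.h>	/* qsort */
#include <string.h>	/* strcmp, strcoll, memcpy */
#include <strings.h>	/* strcasecmp */
#include <locale.h>

/* qsort() hands the comparator pointers to the array elements as
   const void *, so the string functions need thin wrappers. Each
   element is a char[32], so the pointer is the string itself. */
static int cmp_strcmp(const void *a, const void *b)
{
	return strcmp((const char *) a, (const char *) b);
}

static int cmp_strcasecmp(const void *a, const void *b)
{
	return strcasecmp((const char *) a, (const char *) b);
}

static int cmp_strcoll(const void *a, const void *b)
{
	return strcoll((const char *) a, (const char *) b);
}

int main(void)
{
	int i;
	char src[4][32] =
	{
		"Joseval Almeida",
		"Jose Valter",
		"JOSE CAMARGO",
		"Jose Americo",
	};
	char arr[4][32];

	memcpy(arr, src, sizeof(src));

	/* Use current locale settings (in my case LC_ALL=pt_BR), which use
	conventional Latin-1 collation settings. */
	setlocale(LC_ALL, "");

	/* Print current array */
	puts("The input array is:\n");
	for(i = 0; i < 4; i++)
		puts(arr[i]);

	/* Sort the array by byte value */
	qsort(arr, 4, sizeof(arr[0]), cmp_strcmp);

	/* Print the output */
	puts("\nThe strcmp sorted array is:\n");
	for(i = 0; i < 4; i++)
		puts(arr[i]);

	/* Restore the input order and sort ignoring case */
	memcpy(arr, src, sizeof(src));
	qsort(arr, 4, sizeof(arr[0]), cmp_strcasecmp);

	/* Print the output */
	puts("\nThe strcasecmp sorted array is:\n");
	for(i = 0; i < 4; i++)
		puts(arr[i]);

	/* Restore the input order and sort using the locale's collation */
	memcpy(arr, src, sizeof(src));
	qsort(arr, 4, sizeof(arr[0]), cmp_strcoll);

	/* Print the output */
	puts("\nThe strcoll sorted array is:\n");
	for(i = 0; i < 4; i++)
		puts(arr[i]);

	return 0;
}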


No file was uploaded with this report

