Re: Patch: add conversion from pg_wchar to multibyte

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc:	Erik Rijkers <er(at)xs4all(dot)nl>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Patch: add conversion from pg_wchar to multibyte
Date:	2012-05-02 12:50:04
Message-ID:	CA+TgmoYgS-EC4cV5rFw1ebD=uPJYn_vUdz7+XU-N0KXBgqXEYw@mail.gmail.com
Lists:	pgsql-hackers

On Tue, May 1, 2012 at 6:02 PM, Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
> Right. When the number of trigrams is large, scanning the posting lists of
> all of them is slow. The solution in this case is to exclude the most frequent
> trigrams from the index scan. But that requires some kind of statistics on
> trigram frequencies, which we don't have. We could estimate frequencies using
> some hard-coded assumptions about natural languages. Or we could exclude
> arbitrary trigrams. But I don't like either of these ideas. This problem is
> also relevant for LIKE/ILIKE search using trigram indexes.

I was thinking you could perhaps do it just based on the *number* of
trigrams, not necessarily their frequency.
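
[Editorial illustration, not part of the original message.] One way to read the count-based suggestion is: if the query yields more than some fixed number of trigrams, scan the index for only a capped subset and rely on a recheck to discard false positives. The sketch below is a toy Python model of that idea under stated assumptions, not pg_trgm's implementation: the in-memory `index` dict, the function names, the cap of 4, and the "take the first N" selection rule are all hypothetical, and word padding as done by pg_trgm is ignored. It assumes a substring-style (LIKE '%pattern%') query, where every trigram of the pattern must be present, so scanning fewer trigrams can only widen the candidate set.

```python
def extract_trigrams(text):
    """Return the distinct 3-character substrings of text, in first-seen order."""
    seen = []
    for i in range(len(text) - 2):
        tri = text[i:i + 3]
        if tri not in seen:
            seen.append(tri)
    return seen


def build_index(docs):
    """Toy inverted index: trigram -> set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for tri in extract_trigrams(text):
            index.setdefault(tri, set()).add(doc_id)
    return index


def choose_scan_trigrams(trigrams, max_trigrams):
    """Cap the scan purely by count (hypothetical rule: keep the first N)."""
    return trigrams[:max_trigrams]


def search(index, docs, pattern, max_trigrams=4):
    """Substring search: capped index scan followed by a full recheck."""
    scan = choose_scan_trigrams(extract_trigrams(pattern), max_trigrams)
    if scan:
        # Intersect posting lists of the chosen trigrams only.
        candidates = set.intersection(*(index.get(t, set()) for t in scan))
    else:
        # Pattern too short to produce a trigram: every document is a candidate.
        candidates = set(docs)
    # The recheck removes false positives admitted by the reduced scan.
    return sorted(d for d in candidates if pattern in docs[d])
```

With a cap of 2, a search for "postgresql" scans only two posting lists; any document that matches those two trigrams but not the full pattern is filtered out by the recheck. The trade-off is that a count-only cap may happen to keep very common trigrams, which is the concern frequency statistics would address.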

> Perhaps you have some comments on the idea of conversion from pg_wchar to
> multibyte? Is it acceptable at all?

Well, I'm not an expert on encodings, but it seems like a logical
extension of what we're doing right now, so I don't really see why
not. I'm confused by the diff hunks in pg_mule2wchar_with_len,
though. Presumably either the old code is right (in which case, don't
change it) or the new code is right (in which case, there's a bug fix
needed here that ought to be discussed and committed separately from
the rest of the patch). Maybe I am missing something.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Re: Patch: add conversion from pg_wchar to multibyte at 2012-05-01 22:02:23 from Alexander Korotkov

Responses

Re: Patch: add conversion from pg_wchar to multibyte at 2012-05-02 13:35:03 from Alexander Korotkov
