
BUG #5219: Segfault in to_tsvector

From: "Kenaniah Cerny" <kenaniah(at)gmail(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #5219: Segfault in to_tsvector
Date: 2009-11-29 02:56:44
Lists: pgsql-bugs
The following bug has been logged online:

Bug reference:      5219
Logged by:          Kenaniah Cerny
Email address:      kenaniah(at)gmail(dot)com
PostgreSQL version: 8.4.1
Operating system:   Centos5.2 -- Linux 2.6.18-92.1.10.el5 #1 SMP i686 athlon
i386 GNU/Linux
Description:        Segfault in to_tsvector

Full backtrace:

The issue takes place running this query:

Crash is attributed to this index definition:
CREATE INDEX "anime_titles_idx_name_simple_text" ON "public"."anime_titles"
  USING gin ((to_tsvector('simple'::regconfig, name)));

I believe the issue may be caused by non-UTF-8 data. Both the server and
the client (a PHP script using PDO's pgsql driver) are using UTF-8. The
string causing this issue is stored in the database in a text field and
looks like this:

After output into an HTML input field and resubmission through Firefox, the
string that is passed through to the DB looks like this:

(The &# characters were manually omitted in submission)
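
For anyone wanting to check a suspect string before it reaches the
database, here is a minimal sketch of a strict UTF-8 validity test. It is
written in Python rather than the PHP client used here, and the function
name and sample byte strings are hypothetical; they are not the string
from this report.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        # bytes.decode uses strict error handling by default, so any
        # malformed or truncated sequence raises UnicodeDecodeError.
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Illustrative inputs only, not the data that triggered the crash.
print(is_valid_utf8("アニメ".encode("utf-8")))  # well-formed multi-byte text
print(is_valid_utf8(b"\xe3\x82"))               # truncated 3-byte sequence
```

A check like this only shows whether the bytes are well-formed UTF-8. It
says nothing about why to_tsvector crashes on them.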

I don't profess to know anything about encodings, but I don't think this is
valid UTF-8 input. I might be wrong. All I do know is that this causes the
to_tsvector part of the gin index to throw a segfault in the insert
statement, rather than returning an invalid UTF-8 input error or just plain

