BUG #14038: substring cuts unicode char in half, allowing to save broken utf8 into table

From: rpegues(at)tripwire(dot)com
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #14038: substring cuts unicode char in half, allowing to save broken utf8 into table
Date: 2016-03-21 14:39:27
Message-ID: 20160321143927.2903.55302@wrigleys.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 14038
Logged by: Reece
Email address: rpegues(at)tripwire(dot)com
PostgreSQL version: 9.5.1
Operating system: Centos6
Description:

We have a table with an update trigger where if you modify a certain column,
we change the name of the row by calling a function.

In the function, substring() the name and then add a random string to that.
However, the substring appears to cut a unicode character in half, and the
update trigger then updates the name with the broken string.

After this, just doing a select * on the table returns:
invalid byte sequence for encoding "UTF8": 0xe5 0x5b 0x44

I don't think the update trigger should allow saving an invalid utf8 string
to the table for one, but the substring() cutting the unicode character in
half also might be a bug?

REPRODUCE: run the sql below.

CREATE OR REPLACE FUNCTION public.deleted_name(text, integer)
RETURNS text AS
$BODY$
declare
v_old_name alias for $1;
v_max_length alias for $2;
l_new_name text;
l_tag varchar(16);
l_tag_length integer := 0;
begin
l_tag := '[DEL ' || int4(extract(epoch from now())) || ']';
l_tag_length := length(l_tag);
l_new_name := substring(v_old_name, 1, v_max_length - l_tag_length) ||
l_tag;
return l_new_name;
end;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION public.deleted_name(text, integer)
OWNER TO postgres;


CREATE OR REPLACE FUNCTION public.test_bug_update()
RETURNS trigger AS
$BODY$
declare
begin
if ( old.deleted = 'f' and new.deleted = 't') then
new.name := deleted_name(new.name, 64);
end if;
return new;
end;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION public.nc_scan_profile_update()
OWNER TO postgres;




DROP TABLE test_bug;
CREATE TABLE public.test_bug
(
id bigint NOT NULL,
name character varying(256) NOT NULL,
deleted boolean NOT NULL DEFAULT false
);
ALTER TABLE public.test_bug
OWNER TO postgres;
GRANT ALL ON TABLE public.test_bug TO postgres;

CREATE TRIGGER test_bug_trigger
BEFORE UPDATE
ON public.test_bug
FOR EACH ROW
EXECUTE PROCEDURE public.test_bug_update();


INSERT INTO test_bug (id, name) values (1, 'JST Standard Scan
Profile-外部機関用情報共有システム');
UPDATE test_bug SET deleted=true where id=1;
SELECT * FROM test_bug;

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2016-03-21 14:53:46 Re: BUG #14033: cross-compilation to ARM fails
Previous Message Srivathson KK 2016-03-21 11:02:43 Re: BUG #14028: FATAL: cannot perform encoding conversion outside a transaction