Re: How do I remove selected words from text field?

From: Frank Bax <fbax(at)sympatico(dot)ca>
To: pgsql-sql(at)postgresql(dot)org
Subject: Re: How do I remove selected words from text field?
Date: 2010-07-01 15:56:34
Message-ID: BLU0-SMTP7888ACA3DFCEDA21C88B69ACCD0@phx.gbl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql


Create some tables; then add some data:

create table t1 (i int, v varchar);
insert into t1 values(1,'A B C D');
insert into t1 values(2,'B D E F');
insert into t1 values(3,'G H I J');
insert into t1 values(4,'E');
create table t2 (q varchar, z varchar);
insert into t2 values('A','vowel');
insert into t2 values('B','consonant');
insert into t2 values('E','vowel');
insert into t2 values('K','consonant');

I am trying to write a function which will:
* split the argument into "words" (separated by blanks);
* remove words that meet a certain condition in another table
(in this example 'vowel');
* reassemble "words" into a string;
* return the result
This query does that job (Thanks Osvaldo):

SELECT i, array_to_string(array_agg(word), ' ') "new-v" FROM
(SELECT * FROM (SELECT i, regexp_split_to_table(v, E'\\s+') AS word FROM
t1) bar
LEFT OUTER JOIN t2 ON (bar.word=t2.q)
WHERE z IS DISTINCT FROM 'vowel') foo
GROUP BY i;

i | new-v
---+---------
1 | B C D
3 | G H I J
2 | B D F
(3 rows)

When I try to create a function to do the same thing; it only works for
(4,'E') and not the other tuples.

CREATE OR REPLACE FUNCTION notvowel(text) RETURNS text AS $$
SELECT array_to_string(array_agg(word),' ') FROM
(SELECT * FROM (SELECT regexp_split_to_table($1, E'\s+') AS word) bar
LEFT OUTER JOIN t2 ON (bar.word=t2.q)
WHERE z IS DISTINCT FROM 'vowel') foo
GROUP BY word
$$ LANGUAGE SQL;

select *,notvowel(v::text) from t1;
i | v | notvowel
---+---------+----------
1 | A B C D | A B C D
2 | B D E F | B D E F
3 | G H I J | G H I J
4 | E |
(4 rows)

I wonder if it has something to do with pattern passed to
regexp_split_to_table() since inside the function, E'\\s+' results in:
i | v | notvowel
---+---------+----------
1 | A B C D | C
2 | B D E F | B
3 | G H I J | G
4 | E |
(4 rows)

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Osvaldo Kussama 2010-07-01 23:11:32 Re: How do I remove selected words from text field?
Previous Message Eduardo Schweigert 2010-06-30 18:28:26 Invitation to connect on LinkedIn