Re: How do I remove selected words from text field?

From: Frank Bax <fbax(at)sympatico(dot)ca>
To: pgsql-sql(at)postgresql(dot)org
Subject: Re: How do I remove selected words from text field?
Date: 2010-06-26 13:03:39
Message-ID: BLU0-SMTP15F262059DF646A24B8853ACC80@phx.gbl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Osvaldo Kussama wrote:
> 2010/6/25 Frank Bax <fbax(at)sympatico(dot)ca>:
>> I'm not quite sure how to ask for the query I want, so let's start with
>> data:
>>
>> create table t1 (i int, val varchar);
>> insert into t1 values(1,'A B C D');
>> insert into t1 values(2,'B D E F');
>> insert into t1 values(3,'G H I J');
>> create table t2 (q varchar, z varchar);
>> insert into t2 values('A','vowel');
>> insert into t2 values('B','consonant');
>> insert into t2 values('E','vowel');
>> insert into t2 values('K','consonant');
>>
>> t1.val will contain "words" separated by blanks. It might be better if each
>> "word" were a separate row in another table; but that's not how the legacy
>> database was built. I understand this can be simulated by:
>>
>> select i,a[s] from (select i, generate_subscripts(string_to_array(val,'
>> '),1) as s, string_to_array(val,' ') as a from t1) foo;
>>
>> In my "real life" situation, the "words" are not single letters.
>>
>> I'd like to write a function that removes selected "words" from t1.val based
>> on select on t2.
>> In the above example; let's exclude all vowels, so I end up with:
>> 1 'B C D'
>> 2 'B D F'
>> 3 'G H I J'
>>
>> For some "words" in val; there may not be a row when joining to t2.q; these
>> words must be included in final result. In the above example; there is no
>> row in t2 where q="I"; so it is included in result.
>>
>> How do I write such a function? Can it be done with SQL only?
>>
>
>
> Try:
>
> SELECT i, array_to_string(array_agg(foo), ' ') "Val"
> FROM (SELECT * FROM (SELECT i, regexp_split_to_table(val, E'\\s+') AS
> foo FROM t1) bar
> LEFT OUTER JOIN t2 ON (bar.foo = t2.q) WHERE z IS DISTINCT FROM 'vowel') foobar
> GROUP BY i;
> i | Val
> ---+---------
> 1 | B C D
> 3 | G H I J
> 2 | B D F
> (3 linhas)
>
> Osvaldo
>

Excellent! Thanks! I've never seen "is distinct from" before. Looks
like that was the missing piece to my puzzle.

When I write this as a function...

CREATE OR REPLACE FUNCTION novowel(text) RETURNS text AS $$
SELECT array_to_string(array_agg(word),' ') FROM
(SELECT * FROM (SELECT regexp_split_to_table($1, E'\\s+') AS word) bar
LEFT OUTER JOIN t2 ON (bar.word=t2.q)
WHERE z IS DISTINCT FROM 'vowel') foo
GROUP BY word
$$ LANGUAGE SQL

It only works when t1.val is a single word/vowel...

insert into t1 values (4,'E');

select *,novowel(val) from t1;
i | val | novowel
---+-----------+-----------
1 | A B C D | A B C D
2 | B C D E F | B C D E F
3 | G H I J K | G H I J K
4 | E |
(4 rows)

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Jasen Betts 2010-06-26 13:09:05 Re: Round integer division
Previous Message Frank Bax 2010-06-26 12:30:03 Re: How do I remove selected words from text field?