Re: How do I remove selected words from text field?

From: Frank Bax <fbax(at)sympatico(dot)ca>
To: pgsql-sql(at)postgresql(dot)org
Subject: Re: How do I remove selected words from text field?
Date: 2010-06-26 12:30:03
Message-ID: BLU0-SMTP60238A150A2C1F00567762ACC80@phx.gbl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Osvaldo Kussama wrote:
> 2010/6/25 Frank Bax <fbax(at)sympatico(dot)ca>:
>> I'm not quite sure how to ask for the query I want, so let's start with
>> data:
>>
>> create table t1 (i int, val varchar);
>> insert into t1 values(1,'A B C D');
>> insert into t1 values(2,'B D E F');
>> insert into t1 values(3,'G H I J');
>> create table t2 (q varchar, z varchar);
>> insert into t2 values('A','vowel');
>> insert into t2 values('B','consonant');
>> insert into t2 values('E','vowel');
>> insert into t2 values('K','consonant');
>>
>> t1.val will contain "words" separated by blanks. It might be better if each
>> "word" were a separate row in another table; but that's not how the legacy
>> database was built. I understand this can be simulated by:
>>
>> select i,a[s] from (select i, generate_subscripts(string_to_array(val,'
>> '),1) as s, string_to_array(val,' ') as a from t1) foo;
>>
>> In my "real life" situation, the "words" are not single letters.
>>
>> I'd like to write a function that removes selected "words" from t1.val based
>> on select on t2.
>> In the above example; let's exclude all vowels, so I end up with:
>> 1 'B C D'
>> 2 'B D F'
>> 3 'G H I J'
>>
>> For some "words" in val; there may not be a row when joining to t2.q; these
>> words must be included in final result. In the above example; there is no
>> row in t2 where q="I"; so it is included in result.
>>
>> How do I write such a function? Can it be done with SQL only?
>>
>
>
> Try:
>
> SELECT i, array_to_string(array_agg(foo), ' ') "Val"
> FROM (SELECT * FROM (SELECT i, regexp_split_to_table(val, E'\\s+') AS
> foo FROM t1) bar
> LEFT OUTER JOIN t2 ON (bar.foo = t2.q) WHERE z IS DISTINCT FROM 'vowel') foobar
> GROUP BY i;
> i | Val
> ---+---------
> 1 | B C D
> 3 | G H I J
> 2 | B D F
> (3 linhas)
>
> Osvaldo
>

Excellent! Thanks! I've never seen "is distinct from" before. Looks
like that was the missing piece to my puzzle.

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Frank Bax 2010-06-26 13:03:39 Re: How do I remove selected words from text field?
Previous Message Adrian Klaver 2010-06-26 01:51:26 Re: COPY command and required file permissions