From: | Colin Wetherbee <cww(at)denterprises(dot)org> |
---|---|
To: | Jeff Frost <jeff(at)frostconsultingllc(dot)com> |
Cc: | pgsql-sql(at)postgresql(dot)org |
Subject: | Re: finding columns that have three or fewer distinct characters |
Date: | 2008-03-05 17:56:01 |
Message-ID: | 47CEDEB1.3050902@denterprises.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
Jeff Frost wrote:
> I've got an interesting one...I'm trying to find columns that have three
> or fewer distinct characters (for example, "aaaaaaaaaa"). Wondering if
> I need to write a function or if someone has an idea how to do it with
> built in functions and/or pattern matching?
>
> I think the thing to do would be to lowercase everything, then remove
> all duplicate chars and spaces, then use length() on that, but it's not
> obvious to me how I might remove the duplicate chars with the pattern
> matching support in the docs.
It's interesting, indeed.
Here's how you might do it with a PL/Perl function. :)
CREATE OR REPLACE FUNCTION remove_duplicates(TEXT) RETURNS TEXT AS
$$
my ($text) = @_;
while ($text =~ s/(.)(.*)\1/$1$2/g != 0) {};
return $text;
$$ LANGUAGE plperl;
cww=# SELECT
remove_duplicates('ffffoooooooobbbbaaaarrrr(dot)!(at)#$(dot)foobar-baz-qux');
remove_duplicates
-------------------
fobar(dot)!(at)#$-zqux
(1 row)
Colin
From | Date | Subject | |
---|---|---|---|
Next Message | Osvaldo Rosario Kussama | 2008-03-05 18:46:59 | Re: finding columns that have three or fewer distinct characters |
Previous Message | Steve Midgley | 2008-03-05 17:38:20 | Re: using copy from in function |