Re: crosstab function

From: Morris de Oryx <morrisdeoryx(at)gmail(dot)com>
To: Martin Mueller <martinmueller(at)northwestern(dot)edu>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: crosstab function
Date: 2019-02-28 11:02:11
Message-ID: CAKqncchz8m1cMvCjz8QACLPoLLWONgSDrsf9jM=O_XvsN+HM2Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Professor Mueller! I believe that we met, long ago. I graduated from your
department in 1984 where I worked closely with the wonderful, late Prof.
Dipple.

Postgres.app is a very easy way to work with Postgres, and it does include
support for tablefunc. If you ever want to check which extensions are
installed, run this line:

select * from pg_available_extensions order by name;

Your code looks correct on the face of it:

CREATE EXTENSION IF NOT EXISTS tablefunc;

Or, if you have set up schemas other than the default "public", you can
install into a specific schema:

CREATE EXTENSION IF NOT EXISTS tablefunc WITH SCHEMA extensions;

If you aren't already using custom schemas...I'll leave it alone for now.

As noted, you're installing into a specific database, so make sure that
you've connected where you expect and are in the database you mean. It's
fairly easy for a tool to default to something other than your custom
database. If it's not clear from the UI, or you just feel like testing by
hand, run this line:

SELECT current_database();

It's worth knowing that a Postgres extension is a packaging system. An
extension may include C code, setup scripts, straight SQL, a variety of
resources. Sometimes, you can open one up and harvest little bits of SQL
you want. For details:

https://www.postgresql.org/docs/10/extend-extensions.html

After a quick googling, it looks like you may be interested in textual
analysis. If so, Postgres has a *lot* of tools that can be of assistance.
Within Postgres.app, I can see at least the following:

citext
If you haven't noticed, and care, Postgres' default varchar/text field type
is case-sensitive. Ugh. The citext extension is searchable
case-insensitively out of the box. I use this for alpha/text fields when I
don't care about case-sensitive searches. For where that is, read
"everywhere".

fuzzystrmatch
https://www.postgresql.org/docs/10/fuzzystrmatch.html

Basic, name/word-matching fuzzy algorithms. The "phonetic" ones are not so
great, but Levenshtein is quite good, if a bit expensive to run.

Full Text Search
Huge subject, lots of options, modern versions of Postgres are quite strong
here.

unaccent
The description reads, "text search dictionary that removes accents." I
haven't needed it, and wonder if specifying a collation might not work
better?

pg_pgtrgm
https://www.postgresql.org/docs/10/pgtrgm.html

*N*-grams of length 3. This is a fantastic tool. N-grams have proven
themselves down the years for fuzzy string matching in multiple domains.
I've mostly used it historically on name data, but it works well on larger
text blocks as well. This holds up with many languages other than English.
It's pretty easy to use this extension.

There's another appealing extension named pg_similarity that includes a
huge range of text comparison and fuzzy ranking tools, but I do not know
how to compile it for macOS or get it to run with Postgres.app. If you are
interested in a specific algorithm, many are easily implemented in a SQL
statement or stored function. For example, Jaccard (and similar) ranking
metrics are produced arithmetically, so they're easy to reimplement.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Thomas Güttler 2019-02-28 11:47:49 Where **not** to use PostgreSQL?
Previous Message Thomas Kellerer 2019-02-28 10:22:33 Re: Overloaded && operator from intarray module prevents index usage.