Re: text fields and performance for ETL

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: grega(dot)jesih(at)gmail(dot)com, Pg Docs <pgsql-docs(at)lists(dot)postgresql(dot)org>
Subject: Re: text fields and performance for ETL
Date: 2021-11-03 14:38:19
Message-ID: CAKFQuwa_02fEtL8zJMt7MR=xvrp-De1pQtHFtg7fSHhSm1ajnA@mail.gmail.com
Lists: pgsql-docs

On Wed, Nov 3, 2021 at 6:55 AM PG Doc comments form <noreply(at)postgresql(dot)org>
wrote:

> But performance in ETL processes related to such data type is decreased
> dramatically, because any process that takes this kind of data needs to
> calculate its size on a row level and cannot take bigger chunks of data
> based on max size.
>

All of my ETL simply reads in the entire contents of a text field. There
is no chunking. The documentation assumes that the sizes involved here are
reasonable for such behavior. If you have a situation where you've chosen
to use varchar(n) and can defend that choice, more power to you. Those
special circumstances are not of particular interest here. The vast
majority of users use varchar(n) because they (or, more likely, their
teachers) come from systems where it is required. The goal in our docs is
to point out that using an arbitrary length specification is not required
in PostgreSQL.

David J.
