Re: have you feel anything when you read this ?

From: Markus Schaber <schabi(at)logix-tt(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: Re: have you feel anything when you read this ?
Date: 2006-04-10 16:31:12
Message-ID: 443A8850.8050103@logix-tt.com
Lists: pgsql-sql

Hi, Eugene,

Eugene E. wrote:

> he did not request this representation. it is _by_default_

He used a function that provided it by default. He could use the other
function that allows him to select which representation he wants.

> if you wish to provide it by request, please do it.

I cannot provide anything, because I'm not a PostgreSQL developer. To be
honest, I can provide the user with nice Java Objects, because I'm the
maintainer of the PostGIS Java extension, but that's all.

>> A user that does not have a need for C strings can fetch the binary
>> representation, getting higher efficiency for all datatypes.
> and lose the pretty good representation of all other columns in the same
> request.

It is not "pretty good". It is human readable, but it is still just
C strings with some text. It uses much more CPU power on both sides, date
formats can be misleading, and PostGIS even reverted to a binary-like
format for their canonical text representation because users complained
about coordinate drift due to rounding errors in the floating-point
input/output routines, so dumping with pg_dump and restoring the database
changed the data.
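
To illustrate the drift, here is a minimal Python sketch. It assumes a
text output routine that prints doubles with 15 significant digits; the
function names are made up for the illustration and are not PostgreSQL
or PostGIS code.

```python
import struct

def text_round_trip(value, digits=15):
    """Print a double with a limited number of significant digits, then parse it back."""
    return float("%.*g" % (digits, value))

def binary_round_trip(value):
    """Pack a double into its 8-byte IEEE 754 form, then unpack it again."""
    return struct.unpack("!d", struct.pack("!d", value))[0]

coordinate = 0.1 + 0.2  # not exactly representable in 15 decimal digits

# The text path silently changes the value, the binary path does not.
print(text_round_trip(coordinate) == coordinate)
print(binary_round_trip(coordinate) == coordinate)
```

The first comparison fails and the second succeeds, which is the kind
of change a dump and restore through such a text form can introduce.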

So I cannot see that the textual form is superior for _any_ of the
datatypes. IMHO, its sole purpose is to have a generic way to represent
the data for generic tools such as pg_dump -F p, pgadmin3, and psql,
which cannot know the (possibly user-defined) datatypes in advance.

>> There is no philosophy but orthogonality.
>> There's a textual and a binary form of datatypes. For varchar, bytea,
>> int4, float, PostGIS geometries etc...
> good. i ask you to slightly change "textual" representation of bytea.

This will achieve at least the following:

- It will break all libraries, tools and applications that rely on the
current behaviour.

- It will break the guarantee for generic applications that the text
representation of every datatype can be handled as text.

- It will break pg_dump -F p (which is the default for pg_dump by the
way), thus making it _impossible_ to have "plaintext" dumps of a
database, with no easy way of reinventing this behaviour. Those dumps
are meant to be edited with text editors, which don't cope easily with
null bytes and other garbage...
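
As a small illustration of the second point, this Python sketch uses
ctypes to show what a consumer that treats a buffer as a C string sees
when the data contains a null byte. The payload is an arbitrary
example, not real bytea output.

```python
import ctypes

payload = b"ab\x00cd"  # five bytes with a null byte in the middle

# create_string_buffer copies the bytes and appends a terminating NUL.
buf = ctypes.create_string_buffer(payload)

as_c_string = buf.value          # what C-string handling sees: stops at the first NUL
all_bytes = buf.raw[:len(payload)]  # the full content, only reachable with an explicit length

print(as_c_string)
print(all_bytes)
```

Everything after the first null byte is invisible to code that only
knows how to handle text.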

>> The text representation is pretty useful for human readers for _most_
>> datatypes, the binary representation is much easier to parse for
>> programs.
> You are right.
> but
> Who said that i can not display something ?
> i thought, human-readability of some data depends completely on how
> CLIENT-SIDE interpret it.
> server do not know and should not know
> what data is human readable or printable... etc.

So what you say is that the canonical text representation should be
abandoned completely. Fine.

The problem is that all generic applications that don't know about the
concrete datatypes will become impossible: psql, pg_dump, pgadmin and others.

Unlike "normal" applications, which can have their specific
datatypes hardwired in the code or whatever, those applications cannot
be taught how to present the data to a user in a generic way if
there's nothing in the backend.

Users and extensions can invent new datatypes as they want. How do you
expect the authors of pgadmin or psql to cope with proprietary in-house
datatypes of a certain PostgreSQL user?

>> So use the binary representation for everything if you don't want to
>> display the data to the user directly.
> The problem we discuss is not about displaying or printing at all.
> Some applications want "textual-form" -- most applications
> but not only to display
> and in the _same_ query the same applications want bytea...

Why do you try so hard to resist understanding the whole point?

Those applications _get_ bytea. They just get a C-string-safe
representation of it. It's just like having to put "quotes" around a
string and \escapes into it in your program sources if it contains any
of the weird characters.
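
A rough Python sketch of such a C-string-safe form, modelled on the
escape-style text output of bytea (backslash doubled, non-printable
bytes as three-digit octal escapes). The function names are
hypothetical and the decoder assumes well-formed ASCII input; this is
not the server's code.

```python
def bytea_escape(data: bytes) -> str:
    """Encode arbitrary bytes as printable text without any null bytes."""
    out = []
    for b in data:
        if b == 0x5C:              # backslash is doubled
            out.append("\\\\")
        elif 32 <= b <= 126:       # printable ASCII is kept as-is
            out.append(chr(b))
        else:                      # everything else becomes \ooo (octal)
            out.append("\\%03o" % b)
    return "".join(out)

def bytea_unescape(text: str) -> bytes:
    """Reverse bytea_escape; assumes well-formed input."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] != "\\":
            out.append(ord(text[i]))
            i += 1
        elif text[i + 1] == "\\":
            out.append(0x5C)
            i += 2
        else:
            out.append(int(text[i + 1:i + 4], 8))
            i += 4
    return bytes(out)

raw = b"a\x00b\\\xff"
escaped = bytea_escape(raw)
print(escaped)
print(bytea_unescape(escaped) == raw)
```

The client gets every byte back after unescaping, and the escaped form
can still travel through anything that expects plain text.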

You have the choice between text and binary format for your query. As
libpq is a low-level API, it does not hide this difference from you.

You can use a higher-level API that abstracts over the whole issue
and gives you nice objects (like the JDBC library); then you don't have
to cope with those representations at all.

It may also make sense to provide an extension for libpq that lets you
select binary and textual representation column-wise (which might need a
protocol extension, I don't have the specs in mind).

But it absolutely does not make any sense to break the whole concept of
text representations by making it binary for a single datatype.

HTH,
Markus
--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org
