COPY BINARY file format proposal

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgresql(dot)org
Subject: COPY BINARY file format proposal
Date: 2000-12-06 20:26:37
Message-ID: 12674.976134397@sss.pgh.pa.us
Lists: pgsql-hackers

Well, no one seemed very unhappy at the idea of changing the file format
for binary COPY, so here is a proposal.

The objectives of this change are:

1. Get rid of the tuple count at the front of the file. Producing the
count requires an extra pass over the relation, which is a lot more
trouble than the count is worth. Use an explicit EOF marker instead.
2. Send fields of a tuple individually, instead of dumping out raw tuples
(complete with alignment padding and so forth) as is currently done.
This is mainly to simplify TOAST-related processing.
3. Make the format somewhat self-identifying, so that the reader has at
least some chance of detecting it when the data doesn't match the table
it's supposed to be loaded into.

The proposed format consists of a file header, zero or more tuples, and a
file trailer.

The file header will just be a 32-bit magic number; it's present so that a
reader can reject non-COPY-binary input data, as well as detect problems
like incompatible endianness. (We could also use changes in the magic
number as a flag for future format changes.)
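
As a rough illustration of the header check, here is a minimal Python
sketch. The magic value 0x01020304 is a placeholder chosen only for the
example (the proposal does not pick one), and check_header is a
hypothetical helper name. The sketch assumes the writer emits the magic
number in its own native byte order, so a reader on a machine of the
opposite endianness sees the bytes reversed.

```python
import struct

# Placeholder magic number: an assumption, not part of the proposal.
# It must not read the same in both byte orders, or the swap test is useless.
MAGIC = 0x01020304

def check_header(buf: bytes) -> str:
    """Classify the first four bytes of a candidate COPY BINARY file."""
    if len(buf) < 4:
        return "too short"
    head = buf[:4]
    # "=I" means native byte order, unsigned 32-bit, no padding.
    (native,) = struct.unpack("=I", head)
    if native == MAGIC:
        return "ok"
    # Reinterpret the same four bytes in the opposite byte order.
    (swapped,) = struct.unpack("=I", head[::-1])
    if swapped == MAGIC:
        return "wrong endianness"
    return "not COPY BINARY"
```

A magic number whose bytes are all distinct lets the reader tell "wrong
byte order" apart from "not this format at all", which makes for a more
helpful error message.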

Each tuple begins with an int16 count of the number of fields in the
tuple. (Presently, all tuples in a table will have the same count, but
that might not always be true.) Then, repeated for each field in the
tuple, there is an int16 typlen word possibly followed by field data.
The typlen field is interpreted thus:

    Zero    Field is NULL. No data follows.

    > 0     Field is a fixed-length datatype. Exactly N
            bytes of data follow the typlen word.

    -1      Field is a varlena datatype. The next four
            bytes are the varlena header, which contains
            the total value length including itself.

    < -1    Reserved for future use.

For non-NULL fields, the reader can check that the typlen matches the
expected typlen for the destination column. This provides a simple
but very useful check that the data is as expected.
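
The typlen rules and the per-column check could be sketched in Python
roughly as follows. The names read_exact and read_field are
hypothetical, and the sketch assumes native byte order and treats the
varlena header as a plain signed 32-bit length with no flag bits. It
returns the raw field bytes and leaves datatype decoding to the caller.

```python
import io
import struct

def read_exact(stream, n):
    """Read exactly n bytes or fail; a short read means truncated input."""
    data = stream.read(n)
    if len(data) != n:
        raise ValueError("unexpected end of data")
    return data

def read_field(stream, expected_typlen):
    """Read one field. Returns None for NULL, else the raw data bytes."""
    (typlen,) = struct.unpack("=h", read_exact(stream, 2))
    if typlen == 0:
        return None                      # NULL: no data follows
    if typlen < -1:
        raise ValueError(f"reserved typlen {typlen}")
    if typlen != expected_typlen:
        raise ValueError(
            f"typlen {typlen} does not match column typlen {expected_typlen}")
    if typlen > 0:
        return read_exact(stream, typlen)  # fixed-length: exactly N bytes
    # typlen == -1: varlena header holds the total length including itself
    (total,) = struct.unpack("=i", read_exact(stream, 4))
    if total < 4:
        raise ValueError(f"bad varlena length {total}")
    return read_exact(stream, total - 4)   # payload without the header

# Usage: a varlena field carrying the five bytes "hello".
demo = io.BytesIO(struct.pack("=hi", -1, 9) + b"hello")
print(read_field(demo, -1))  # b'hello'
```

Note that a NULL field cannot be checked against the column, since its
typlen word is always zero.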

There is no alignment padding or any other extra data between fields.
Note also that the format does not distinguish whether a datatype is
pass-by-reference or pass-by-value. Both of these provisions are
deliberate: they might help improve portability of the files (although
of course endianness and floating-point-format issues can still keep
you from moving a binary file across machines).
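
To make the "no padding" point concrete, a writer for one tuple might
look like the Python sketch below. encode_tuple and its (typlen, data)
input convention are assumptions made for the example. Fields are
simply concatenated, and the "=" prefix in the struct format strings
asks for native byte order with no alignment padding.

```python
import struct

def encode_tuple(fields):
    """Encode one tuple.

    fields is a list of (typlen, data) pairs, where data is the raw
    bytes of the value, or None for a NULL field.
    """
    out = [struct.pack("=h", len(fields))]       # int16 field count
    for typlen, data in fields:
        if data is None:
            out.append(struct.pack("=h", 0))     # NULL: typlen word only
        elif typlen > 0:
            if len(data) != typlen:
                raise ValueError("fixed-length data has the wrong size")
            out.append(struct.pack("=h", typlen) + data)
        elif typlen == -1:
            # varlena header counts its own four bytes
            out.append(struct.pack("=hi", -1, len(data) + 4) + data)
        else:
            raise ValueError(f"unsupported typlen {typlen}")
    return b"".join(out)                         # back to back, no padding
```

For example, a tuple with a 4-byte value, a 2-byte varlena payload, and
a NULL occupies 2 + (2 + 4) + (2 + 4 + 2) + 2 = 18 bytes.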

The file trailer consists of an int16 word containing -1. This is
easily distinguished from a tuple's field-count word.

A reader should report an error if a field-count word is neither -1
nor the expected number of columns. This provides a pretty strong
check against somehow getting out of sync with the data.
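
The tuple framing and the trailer check could be sketched in Python as
below. read_tuples is a hypothetical name, and field decoding is
delegated to a caller-supplied read_field(stream, colno) function, which
is an assumption of this example rather than part of the proposal.
read_int4 is a toy decoder that handles only non-NULL 4-byte integers,
included just to make the example runnable.

```python
import io
import struct

TRAILER = -1  # int16 word that ends the file

def read_tuples(stream, ncols, read_field):
    """Yield one Python tuple per data tuple until the trailer is seen."""
    while True:
        word = stream.read(2)
        if len(word) != 2:
            raise ValueError("end of input before trailer")
        (count,) = struct.unpack("=h", word)
        if count == TRAILER:
            return
        if count != ncols:
            # Neither the trailer nor the expected count: likely out of sync.
            raise ValueError(f"field count {count}, expected {ncols}")
        yield tuple(read_field(stream, i) for i in range(ncols))

def read_int4(stream, colno):
    """Toy decoder: typlen word followed by a native-order 32-bit integer."""
    typlen, value = struct.unpack("=hi", stream.read(6))
    if typlen != 4:
        raise ValueError(f"column {colno}: typlen {typlen}, expected 4")
    return value

# Usage: two tuples of two int4 columns, then the trailer.
data = (struct.pack("=hhihi", 2, 4, 10, 4, 20)
        + struct.pack("=hhihi", 2, 4, 30, 4, 40)
        + struct.pack("=h", TRAILER))
print(list(read_tuples(io.BytesIO(data), 2, read_int4)))  # [(10, 20), (30, 40)]
```

Because the trailer is required, running out of input without seeing it
is reported as an error rather than treated as a normal end of file.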

Comments?

regards, tom lane
