Re: numbering rows on import from file

From: Steve Crawford <scrawford(at)pinpointresearch(dot)com>
To: Alexy Khrabrov <deliverable(at)gmail(dot)com>
Cc: sql pgsql <pgsql-sql(at)postgresql(dot)org>
Subject: Re: numbering rows on import from file
Date: 2008-05-03 00:18:27
Message-ID: 481BAF53.50101@pinpointresearch.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Alexy Khrabrov wrote:
> Greetings -- I have a huge table of the form
> (integer,integer,smallint,date). Its origin is an ASCII file which I
> load with \copy. Now I want to number the rows, adding an id column
> as an autoincrement from a sequence. How should I do the import now
> for the sequence to work -- should I add the id column last, so it
> will not be filled by copy and presumably autoincrement?
>
> Or, once the table is already in, can I add a column and force it to
> be filled with consecutive numbers, effectively numbering the rows?
>
Depends on your goal.

Do you just want a unique ID, make the id column a serial. Done. If you
already have the table, add the column, update the table setting
id=nextval('the serial columns sequence name') before bringing in
additional data. If you were using plain \copy, you will now have to
name the input columns since you don't have data for the id column:
\copy (int1, int2, smallint3, date4) from ..... This is usually good
practice anyway as additions to your table or column ordering changes
won't affect your import.

Do you want the IDs to match row numbers in the source file? Use
something like nl or whatever scripting language you like to add numbers
in the source file.

Alternately, if the table is static, you can create a temporary sequence
to fill the id column on import.

You can also fill the id column after import if necessary by updating
the table setting id=nextval('yoursequence') but this will generate lots
of empty space by updating all tuples (not good with a "huge" table) and
may not associate tuples with source-file line-numbers.

Regardless of how you initially fill the id column, if you add/delete
records your numbers will not be consecutive. Deletes and rollbacks will
cause holes.

If you just need consecutive row-numbering on output (not in the table)
and if the row numbering doesn't need to match the same record each
time, you can create a temporary sequence and select
nextval('tempsequence'),.... from yourtable.

Cheers,
Steve

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Alexy Khrabrov 2008-05-03 08:05:07 update with multiple fields as aggregates
Previous Message Scott Marlowe 2008-05-02 23:00:33 Re: numbering rows on import from file