bcp.exe Fix Triggers

From: "yazicivo" <yazicivo(at)ttmail(dot)com>
To: <pgsql-sql(at)postgresql(dot)org>
Subject: bcp.exe Fix Triggers
Date: 2008-06-02 08:25:08
Message-ID: E27F5064C8BEA44EBCBF216E6A70A43B19BE70@STT1EVS44.TTHMC.LOCAL
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Hi,

I'm trying to import some relatively huge (~300GiB) set of data from
Microsoft SQL Server database to PostgreSQL. For this purpose, I use
bcp.exe (bulk copy utility) comes with MSSQL. Despite there are command
line arguments which you can specify batch size, escape characters, null
fields, etc.; %90 of these arguments are unsuprisingly ignored by
bcp.exe. The problem is, bcp.exe

- Uses nothing to specify null fields, which is equivalent to

COPY ... WITH NULL AS ''

in PostgreSQL.

- Uses \x00 character to specify empty strings.

(Sorry not any single part of this mess is configurable.) To
successfully accept such an input during "COPY ... FROM ..." in
PostgreSQL, I decided to write middleware triggers to tables including
columns of type character varying. Below is the procedure I come up with

CREATE OR REPLACE FUNCTION create_bcp_fix_triggers (_table_schema text)
RETURNS void AS $$
DECLARE
_table record;
_column record;
_create_stmt text;
BEGIN
SET standard_conforming_strings TO off;
SET escape_string_warning TO off;
-- Find every table having a column of type 'character varying'.
FOR _table
IN SELECT table_name
FROM information_schema.columns
WHERE table_schema = _table_schema AND
data_type = 'character varying'
GROUP BY table_name
ORDER BY table_name
LOOP _create_stmt = 'BEGIN;\n' ||
'CREATE OR REPLACE\n' ||
'FUNCTION ' || _table.table_name || '_bcp_fix ()\n' ||
'RETURNS "trigger" AS $bcp-fix$\n' ||
'BEGIN\n';
-- Create appropriate bcp.exe fix clauses for every found column.
FOR _column
IN SELECT column_name
FROM information_schema.columns
WHERE table_schema = _table_schema AND
table_name = _table.table_name
LOOP _create_stmt = _create_stmt ||
' IF NEW.' || _column.column_name || ' = E''\0'' THEN\n' ||
' NEW.' || _column.column_name || ' = ''''\n' ||
' END IF;\n';
END LOOP;
_create_stmt = _create_stmt ||
' RETURN NEW;\n' ||
'END;' ||
'$bcp-fix$ LANGUAGE plpgsql;\n' ||
'CREATE TRIGGER ' || _table.table_name || '_bcp_fix\n' ||
' BEFORE INSERT ON ' || _table.table_name || '\n' ||
' FOR EACH ROW EXECUTE ' ||
' PROCEDURE ' || _table.table_name || '_bcp_fix();\n' ||
'COMMIT;';
EXECUTE _create_stmt;
END LOOP;
END;
$$ LANGUAGE plpgsql;

But executing this procedure fails for some reason I couldn't
understand.

> SELECT public.create_bcp_fix_triggers('commsrv');
ERROR: syntax error at or near "AS $"
LINE 4: RETURNS "trigger" AS $bcp-fix$
^
QUERY: BEGIN;
CREATE OR REPLACE
FUNCTION xyz_bcp_fix ()
RETURNS "trigger" AS $bcp-fix$
BEGIN
IF NEW.foo = E' NEW.foo = ''
END IF;
IF NEW.bar = E' NEW.bar = ''
END IF;
RETURN NEW;
END;$bcp-fix$ LANGUAGE plpgsql;
CREATE TRIGGER xyz_bcp_fix
BEFORE INSERT ON xyz
FOR EACH ROW EXECUTE PROCEDURE xyz_bcp_fix();
COMMIT;
CONTEXT: PL/pgSQL function "create_bcp_fix_triggers" line 41 at
EXECUTE statement

Can anybody help me to spot the problem?

Regards.

P.S. For data transfer during migratition, I first considered using "EMS
Data Export for SQL Server" tool, but it lacks of performance.
(~1000rows/sec) Namely, I'm open to any bcp.exe alternatives.

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Joe 2008-06-02 12:36:17 Re: bcp.exe Fix Triggers
Previous Message Rodrigo E. De León Plicet 2008-05-30 17:19:55 Re: Weeks elapsed to Months elapsed conversion