| From: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
|---|---|
| To: | Srinivas Karthik V <skarthikv(dot)iitb(at)gmail(dot)com> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Bulk Insert into PostgreSQL |
| Date: | 2018-06-27 11:25:06 |
| Message-ID: | CAFj8pRA+3xfHkTWnEr89xXWhC9dBMMF9a1z5ZdaQMmEemghDkg@mail.gmail.com |
| Lists: | pgsql-hackers |
2018-06-27 13:18 GMT+02:00 Srinivas Karthik V <skarthikv(dot)iitb(at)gmail(dot)com>:
> Hi,
> I am performing a bulk insert of 1TB TPC-DS benchmark data into PostgreSQL
> 9.4. It's taking around two days to insert 100 GB of data. Please let me
> know your suggestions to improve the performance. Below are the
> configuration parameters I am using:
> shared_buffers =12GB
> maintenance_work_mem = 8GB
> work_mem = 1GB
> fsync = off
> synchronous_commit = off
> checkpoint_segments = 256
> checkpoint_timeout = 1h
> checkpoint_completion_target = 0.9
> checkpoint_warning = 0
> autovacuum = off
> Other parameters are set to default value. Moreover, I have specified the
> primary key constraint during table creation. This is the only possible
> index being created before data loading and I am sure there are no other
> indexes apart from the primary key column(s).
>
The main factor is using COPY instead of INSERTs.
Loading 100GB of data should take a few hours, not two days.
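As a minimal sketch, a COPY-based load driven from psql might look like this (the database, table, and file names are hypothetical; TPC-DS dsdgen output is typically pipe-delimited):

```shell
# Bulk-load one TPC-DS table with a single COPY instead of row-by-row INSERTs.
# \copy streams the file through the client connection, so it also works
# when the data file lives on the client machine rather than the server.
psql -d tpcds -c "\copy store_sales FROM 'store_sales.dat' WITH (FORMAT csv, DELIMITER '|')"
```

Dropping the primary key before the load and recreating it afterwards usually speeds things up further, since the index is then built once in bulk rather than maintained row by row.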
Regards
Pavel
> Regards,
> Srinivas Karthik
>
>
>
| From | Date | Subject | |
|---|---|---|---|
| Next Message | ROS Didier | 2018-06-27 11:46:38 | RE: Bulk Insert into PostgreSQL |
| Previous Message | Pavel Stehule | 2018-06-27 11:22:10 | Re: [HACKERS] proposal: schema variables |