I'm trying to load a CSV of about 100M records (around 8GB on disk) into Postgres via the copy command: copy mytable from 'path/to/myfile.csv' with CSV;
I have been monitoring the progress by checking the reported table size in pgAdmin and comparing it with the CSV size. I know that's going to be a loose comparison at best, and I'd love to hear if there's a better way to monitor progress.
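For reference, the number I'm watching in pgAdmin is essentially what the built-in size functions report, so the same check from psql looks like this:

-- Total on-disk size of mytable, including its index and TOAST data,
-- which I compare against the ~8GB CSV.
SELECT pg_size_pretty(pg_total_relation_size('mytable'));
-- (Newer servers, PostgreSQL 14+, also expose a pg_stat_progress_copy view for a running COPY.)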
Here's the issue: the load has been running for quite a long time (too long, I think), and as I keep checking the table size, it seems to be decelerating. That is, it now takes much longer to load another 100 MB of data than it did earlier in the load. Why?
Is there any tuning, configuration, or alternate approach I can take for a faster load other than breaking up my CSV into many smaller files?
Update: schema/data specifics
One representative data row:
1234567890,FOOBARF,2010-01-15 03:07:05,0.924700,0.925000
Complete schema definition:
CREATE TABLE mytable
(
  id integer NOT NULL,
  rname character varying(7) NOT NULL,
  ts timestamp without time zone NOT NULL,
  stat1 numeric NOT NULL,
  stat2 numeric NOT NULL,
  CONSTRAINT pk_id PRIMARY KEY (id)
)
WITH (
  OIDS=FALSE
);
ALTER TABLE mytable OWNER TO postgres;
You might have better luck if you can disable or drop indexes for the duration of the load; maintaining an index on every inserted row gets slower as the index grows, which would fit the deceleration you're seeing (see the sketch below). It's hard to give a definitive answer without complete information about the table, though.
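A minimal sketch, assuming the pk_id primary key from the schema above is the only index on the table (the file path is your placeholder):

-- Drop the primary key so COPY can append rows with no index maintenance at all.
ALTER TABLE mytable DROP CONSTRAINT pk_id;

COPY mytable FROM 'path/to/myfile.csv' WITH CSV;

-- Rebuild the primary key once the data is in: one bulk index build instead of
-- ~100M incremental index insertions (this will also surface any duplicate ids).
ALTER TABLE mytable ADD CONSTRAINT pk_id PRIMARY KEY (id);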
Since your update shows only the pk_id primary key, please confirm there are no other constraints, indexes, or triggers on the table. Also, are you sure the CSV data is correct and matches your table?
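If you want to sanity-check the file before committing to another long load, here is a rough sketch (it assumes PostgreSQL 9.3+ for COPY ... FROM PROGRAM, head available on the database server, and reuses your placeholder path):

-- Throwaway table with the same columns as mytable but no primary key.
CREATE TEMP TABLE mytable_sample (LIKE mytable);

-- Load just the first 1000 lines to verify the column count, types, and CSV quoting parse cleanly.
COPY mytable_sample FROM PROGRAM 'head -n 1000 path/to/myfile.csv' WITH CSV;

SELECT * FROM mytable_sample LIMIT 10;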