Indexing Bottleneck.

Post Reply
foosh101
Posts: 61
Joined: Tue Oct 22, 2002 2:13 pm

Indexing Bottleneck.

Post by foosh101 »

When Importing a large amount of data into a table from a csv file using timport, and then creating several complex indexes on that data, which is most likely the thing to slow it down the most - the CPU or the IO?
User avatar
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH
Contact:

Indexing Bottleneck.

Post by John »

For the import it will depend on the size of the record. You should see the lower of several MB/s or several thousand records per second during import.

What do you mean by complex indexes?
John Turnbull
Thunderstone Software
foosh101
Posts: 61
Joined: Tue Oct 22, 2002 2:13 pm

Indexing Bottleneck.

Post by foosh101 »

Our records are somewhat large, containing many small fields (int, date, varchar(~50)) and a couple large ones (varchar(~8000)). We have several compund indexes on these tables, which contain several fields.

d:\morph3\tsql -d d:\morph3\texis\mktDB "CREATE INDEX mkt1011999ListingID ON mkt1011999 (ListingID)"
d:\morph3\tsql -d d:\morph3\texis\mktDB "SET ignorecase=1;CREATE INDEX mkt1011999TableID ON mkt1011999 (TableID)"

d:\morph3\tsql -d d:\morph3\texis\mktDB "SET ignorecase=1;set addexp='\alnum+[\/\.]{1}\alnum+';CREATE METAMORPH INVERTED INDEX mkt1011999Inverted ON mkt1011999 (Headline\ListingDescription\CustomListingDescription,DateStart,Price,DirmaID,ID1,ID2,ID3,ID4,ID5,ID6,ID7,ID8)"
d:\morph3\tsql -d d:\morph3\texis\mktDB "SET ignorecase=1;set addexp='\alnum+[\/\.]{1}\alnum+';CREATE METAMORPH INVERTED INDEX mkt1011999Headline ON mkt1011999 (Headline,DateStart,Price,DirmaID,ID1,ID2,ID3,ID4,ID5,ID6,ID7,ID8)"
d:\morph3\tsql -d d:\morph3\texis\mktDB "SET ignorecase=1;set addexp='\alnum+[\/\.]{1}\alnum+';CREATE METAMORPH INVERTED INDEX mkt1011999Custom ON mkt1011999 (CustomListingDescription,DateStart,Price,DirmaID,ID1,ID2,ID3,ID4,ID5,ID6,ID7,ID8)"
User avatar
Kai
Site Admin
Posts: 1272
Joined: Tue Apr 25, 2000 1:27 pm

Indexing Bottleneck.

Post by Kai »

If the data you're importing will be the vast majority of the final table size (ie. it was small or non-existent before the import), drop any indexes not needed during the import and re-build them afterwards. This will improve import speed somewhat as the indexes don't need to be updated during the import (mostly significant for Metamorph indexes).

During regular index creation, I/O is typically the limiting factor for large-row-size tables, as the entire table must be read but only a fraction of the data indexed. Metamorph index creation is usually limited by CPU, as more processing is going on.
Post Reply