Page 1 of 1

Indexing Bottleneck.

Posted: Thu Jul 31, 2003 4:19 pm
by foosh101
When Importing a large amount of data into a table from a csv file using timport, and then creating several complex indexes on that data, which is most likely the thing to slow it down the most - the CPU or the IO?

Indexing Bottleneck.

Posted: Mon Aug 04, 2003 10:04 am
by foosh101
Our records are somewhat large, containing many small fields (int, date, varchar(~50)) and a couple large ones (varchar(~8000)). We have several compund indexes on these tables, which contain several fields.

d:\morph3\tsql -d d:\morph3\texis\mktDB "CREATE INDEX mkt1011999ListingID ON mkt1011999 (ListingID)"
d:\morph3\tsql -d d:\morph3\texis\mktDB "SET ignorecase=1;CREATE INDEX mkt1011999TableID ON mkt1011999 (TableID)"

d:\morph3\tsql -d d:\morph3\texis\mktDB "SET ignorecase=1;set addexp='\alnum+[\/\.]{1}\alnum+';CREATE METAMORPH INVERTED INDEX mkt1011999Inverted ON mkt1011999 (Headline\ListingDescription\CustomListingDescription,DateStart,Price,DirmaID,ID1,ID2,ID3,ID4,ID5,ID6,ID7,ID8)"
d:\morph3\tsql -d d:\morph3\texis\mktDB "SET ignorecase=1;set addexp='\alnum+[\/\.]{1}\alnum+';CREATE METAMORPH INVERTED INDEX mkt1011999Headline ON mkt1011999 (Headline,DateStart,Price,DirmaID,ID1,ID2,ID3,ID4,ID5,ID6,ID7,ID8)"
d:\morph3\tsql -d d:\morph3\texis\mktDB "SET ignorecase=1;set addexp='\alnum+[\/\.]{1}\alnum+';CREATE METAMORPH INVERTED INDEX mkt1011999Custom ON mkt1011999 (CustomListingDescription,DateStart,Price,DirmaID,ID1,ID2,ID3,ID4,ID5,ID6,ID7,ID8)"

Indexing Bottleneck.

Posted: Mon Aug 04, 2003 11:23 am
by Kai
If the data you're importing will be the vast majority of the final table size (ie. it was small or non-existent before the import), drop any indexes not needed during the import and re-build them afterwards. This will improve import speed somewhat as the indexes don't need to be updated during the import (mostly significant for Metamorph indexes).

During regular index creation, I/O is typically the limiting factor for large-row-size tables, as the entire table must be read but only a fraction of the data indexed. Metamorph index creation is usually limited by CPU, as more processing is going on.