Indexes on large data

Post Reply
skalyanaraman
Posts: 109
Joined: Tue May 29, 2001 9:13 pm

Indexes on large data

Post by skalyanaraman »

hi, Thanks for the help with the indexing problem. It seems to be working now(I will wait to see what happens). But, looking at the amount of data and the maintanence of it we are are trying to find out other solutions here to alleviate the time it takes to index. It takes 24+hrs to index the 41.9GB blb + 2.3GB TBl file.

We were exploring other ideas and we had the following questions:

1. Could we somehow, run multiple indexes from different machine looking at the same database? would there be a locking or contention on the SYSINDEX or other sys tables?

2. can we use the indexspace set to create index on other drives and ultimately copy the index to the database and change the sysindex somehow?

3. I know that Thunderstone is working on a solution where the database could be partitioned across multiple machines physically but logically looks like a single database. What is the status on this?

4. Could we ourselves split the big table into smaller pieces and index separately and somehow merge them all to become the index for the whole table?

5. How do other Texis users get around this problem of long index times on large data?

Any help or pointers would be helpful!!

Thanks
skalyanaraman
Posts: 109
Joined: Tue May 29, 2001 9:13 pm

Indexes on large data

Post by skalyanaraman »

Thanks.
1. So, for multi CPU box we do not need any special code right. just run multiple indexing commands at the same time? or we can somehow pin the indexes to a particular CPU.

2. In this case, we can just index the data on separate machines and copy to the main db later and just run an update statement on the SYSINDEX? What fields should we be changing on the SYSINDEX table? any other table need to be changed? How do I set the temp file directories? why would that improve performance?

5. But, if a table already has the indexes, the insert into that table seems slow as the indexes are being touched for every insert or delete right? I think we can set the index not to be touched during this processes,can we? That is why we drop the index, insert the data and reindex the whole thing. is it a bad idea?

thanks
User avatar
Kai
Site Admin
Posts: 1272
Joined: Tue Apr 25, 2000 1:27 pm

Indexes on large data

Post by Kai »

When creating multiple indexes simultaneously be sure to reduce indexmem for each index so the machine doesn't thrash. The default is 40%; the sum total of indexmem for all simultaneous indexes shouldn't exceed, say 50-60% to leave room for disk caching etc. (Reducing indexmem does not generally have any adverse affect on creation speed.)
User avatar
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Indexes on large data

Post by mark »

2. there's no need to move anything around. specify "set indexspace='somedir'" before creating an index and that index will be created there and texis will always expect it to be there. having indices and temp files in various disks can improve performance by reducing disk head seek times.

5. partial indexing is generally much faster than full indexing if you're not updating most of the database. the exact value of "most" will depend on your setup.
skalyanaraman
Posts: 109
Joined: Tue May 29, 2001 9:13 pm

Indexes on large data

Post by skalyanaraman »

After reading the above answers and relating it to what we do, I still have these questions:

1. To run simultaneous indexes on the same machine(with multiprocessors), except for indexmem's, do we have to use any special set commands to force the index to use different processors?

2. if I set up the temp directory to be on different set of drives, maybe a local set of drives on the box, how will it improve performance? Such a broad question, but still curious.

Thanks!!
User avatar
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Indexes on large data

Post by mark »

1. No. It's up to the OS to distribute the processing load amongst the processors.

2. it can read/write tmp files without causing head motion on the drives with the table or resulting index. ideally the table would be on drive, the temp on another, and the index on a third.
Post Reply