I have a text file that lists ~30 million URLs, one per line. I want to load it into a Texis database table that has id and url columns. I tried using <READLN> and an insert into the table, but after 14 hours I have only loaded 6 million URLs. Is there a faster way?
Using the standalone timport executable should be faster. Do you have any indexes on the table? Also, we have noticed that if the table is on a Veritas file system, inserts slow down as the table gets bigger. We do have some workarounds for that which should help, but they may currently require a program using the C API.
On a normal filesystem using timport you should be able to load all the records in about 90 minutes, and using the C API maybe 30-45 minutes.
Just to confirm: did you use the ROW option on <READLN>, and NOVARS on the SQL insert? Those options help ensure you don't accumulate excessive amounts of data in RAM, which can also slow things down.
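The point of ROW plus NOVARS is to stream one line at a time into the insert instead of buffering rows in memory. Here is the same pattern sketched in Python with sqlite3 as a stand-in (not Vortex; the table and batch size are made up for illustration): the file is iterated lazily and each line is inserted as it is read.

```python
import sqlite3

def load_urls(path, db_path="urls.db", batch=10000):
    """Stream URLs from a text file into a table one line at a time,
    committing every `batch` rows so nothing accumulates in RAM."""
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS urls (id INTEGER, url TEXT)")
    with open(path) as f:
        for i, line in enumerate(f, start=1):  # file object yields lines lazily
            con.execute("INSERT INTO urls VALUES (?, ?)",
                        (i, line.rstrip("\n")))
            if i % batch == 0:
                con.commit()
    con.commit()
    con.close()
```

The batching is the sqlite3 analogue of keeping transactions small; in Texis the equivalent win comes from ROW/NOVARS (and from dropping indexes until after the load).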
I'm working on a Vortex script to automate this process. My current process uses a Perl script to extract the host portion of the URL; I would like to do this in the Vortex script itself using <TIMPORT>.
The problem I am having is coming up with a recexpr that captures both the entire URL and the host portion of it. So if I have http://somesite.com/dir/file.html, I would want to get http://somesite.com/dir/file.html and somesite.com.
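This is not REX syntax (Texis's recexpr dialect differs from PCRE), but the capture logic being asked for can be sketched as an ordinary Python regex: one outer group for the whole URL and a nested group for the host.

```python
import re

# Outer group 1 = entire URL; inner group 2 = host (everything
# between "://" and the first "/" or whitespace).
URL_RE = re.compile(r'(https?://([^/\s]+)\S*)')

def split_url(line):
    """Return (full_url, host) from a line, or None if no URL matches."""
    m = URL_RE.search(line)
    if not m:
        return None
    return m.group(1), m.group(2)
```

For example, `split_url("http://somesite.com/dir/file.html")` yields the full URL and `somesite.com`. The same idea in a timport recexpr would use REX's own capture/subexpression syntax, which is documented separately.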