Indexing Single Page or Multiple Pages

jasondwitt
Posts: 7
Joined: Tue Jul 17, 2001 4:19 pm

Post by jasondwitt »

I know about the single page (-g) option but I
was hoping for perhaps another way of pulling
single pages from a site... with a little more
flexibility.

I'm writing a Perl script to automate some spider
stuff. Sometimes a disk-based URL list will only
require a spider of the single page specified in
the URL on each line in the file. But at other
times the spider will need to pull "X" number of
pages from the site, starting with the home page
URL.

Can I not use some combination of depth and breadth
to accomplish this, so I just need to change the
actual depth/breadth values in the command line
that is generated by the Perl script?

For example,

-b1 -D1 for single pages
-b1 -D2 for pages 2 deep

It's late and I do admit that I'm kinda guessing
here on these examples.

thanks
-jason
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Yes, but -b doesn't take a number, and the starting level is 0, not 1.

To get just the specified URL, use -D0.
Also do -wipetodo afterwards to prevent its children from being fetched by a subsequent deeper walk of another URL.
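
For anyone generating these command lines from a script, here is a rough sketch of the idea, written in Python for brevity (the same logic should carry over to Perl). It only builds the argument lists; it does not run anything. The executable name "gw", the helper names, and running -wipetodo as its own invocation with no other options are assumptions for illustration and are not confirmed in this thread, so check them against your install. Since levels start at 0, -D0 is the single-page case and a larger value presumably walks that many levels below the starting URL.

```python
def read_url_list(lines):
    """Strip whitespace and skip blank lines from a disk-based URL list."""
    return [line.strip() for line in lines if line.strip()]


def build_commands(urls, depth=0, walker="gw"):
    """Return argv lists: one fetch per URL, each followed by a -wipetodo run.

    walker is a placeholder for the spider executable (assumed name).
    depth is passed as -D<depth>; 0 means only the URL itself.
    """
    if depth < 0:
        raise ValueError("depth must be 0 or greater")
    commands = []
    for url in urls:
        # Fetch the URL down to the requested depth.
        commands.append([walker, f"-D{depth}", url])
        # Assumed form: clear the to-do list so this URL's children are
        # not picked up by a later, deeper walk of another URL.
        commands.append([walker, "-wipetodo"])
    return commands


if __name__ == "__main__":
    urls = read_url_list(["http://www.example.com/\n", "\n"])
    for argv in build_commands(urls, depth=0):
        print(" ".join(argv))
```

With this shape, switching between single-page and deeper walks only means changing the depth value passed in, which is what the original question was after.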