relative path links

aitchon
Posts: 119
Joined: Mon Jan 22, 2007 10:30 am


Post by aitchon »

I'm fetching a page that has links with relative paths. When I run the following code, I get a link that looks like http://testurl.com/dir1/dir2/filename/. Instead, the path should be http://testurl.com/dir2/filename. (I don't want to use urllinks off of the fetch of the $mainpath page, since I'm having problems extracting links that way.) How can I make the <fetch $mainpath $htmllink> statement give me the correct path?

<$mainpath="http://testurl.com/dir1/">
<fetch $mainpath>
<$htmlpage=$ret>
<!-- sample relative path href would be "dir2/filename/" -->
<rex row "<a\space=!</a>+</a>" $htmlpage>
<$htmllink=$ret>
<fetch $mainpath $htmllink>
<urllinks>
<loop max=1 $ret>
<$relativepathlink=$ret>
</loop>
</rex>
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm


Post by mark »

Given a URL of "http://testurl.com/dir1/" and an href of "dir2/filename/", the correct full URL is "http://testurl.com/dir1/dir2/filename/", so I don't see the problem.
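
To illustrate the resolution rule described above outside of Vortex, here is a minimal Python sketch using the standard library's urljoin. It is not the Vortex implementation, only a demonstration of how a relative href is normally resolved against a base URL that ends in a slash. The variable names are made up for the example.

```python
from urllib.parse import urljoin

# Values taken from the question above.
page_url = "http://testurl.com/dir1/"
href = "dir2/filename/"

# A relative href with no leading slash is appended to the base URL's directory.
resolved = urljoin(page_url, href)

# For contrast: an href with a leading slash is resolved from the site root.
root_relative = urljoin(page_url, "/dir2/filename")

print(resolved)
print(root_relative)
```

The first result keeps dir1 in the path, which matches what <fetch $mainpath $htmllink> produced. Only a root-relative href (or a different base) would drop dir1.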

Also, what kind of problem are you having using urllinks directly from the first fetch? You're doing a ton of extra work to accomplish the same thing.
aitchon
Posts: 119
Joined: Mon Jan 22, 2007 10:30 am


Post by aitchon »

If you look at this page, http://www.autoexperience.de/20-Audi-TT-Forum/, the hrefs look like:

<a href="thread/117/Auch-Gaeste-koennen-Themen-und-Beitraege-erstellen/">Auch G&auml;ste k&ouml;nnen Themen und Beitr&auml;ge erstellen!!</a>

When I do <fetch $mainpath $htmllink> using these two values, I get:

http://www.autoexperience.de/20-Audi-TT ... erstellen/

But the value should be:

http://www.autoexperience.de/thread/117 ... erstellen/

This is the value that urllinks produces. The reason I'm not using urllinks is that I've found it unreliable on pages that might have broken HTML.
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH


Post by John »

The page itself has a <base> tag in it:

<base href="http://www.autoexperience.de/" />

so relative links will be relative to that, not the page's URL. <urllinks> sees that, and gets the right answer.
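
As a rough illustration of that behavior (in Python rather than Vortex, and not how <urllinks> is actually implemented), the sketch below looks for a <base href> in the fetched HTML and resolves the link against it, falling back to the page's URL when no base is present. BaseHrefFinder and resolve_link are hypothetical names invented for this example, and the HTML string is a trimmed stand-in for the real page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class BaseHrefFinder(HTMLParser):
    """Hypothetical helper: records the href of the first <base> tag seen."""

    def __init__(self):
        super().__init__()
        self.base_href = None

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <base ... />.
        if tag == "base" and self.base_href is None:
            self.base_href = dict(attrs).get("href")


def resolve_link(page_url, html, href):
    """Resolve href against the page's <base href> if present, else page_url."""
    finder = BaseHrefFinder()
    finder.feed(html)
    base = urljoin(page_url, finder.base_href) if finder.base_href else page_url
    return urljoin(base, href)


page_url = "http://www.autoexperience.de/20-Audi-TT-Forum/"
href = "thread/117/Auch-Gaeste-koennen-Themen-und-Beitraege-erstellen/"
# Trimmed stand-in for the fetched page; only the <base> tag matters here.
html = '<html><head><base href="http://www.autoexperience.de/" /></head><body></body></html>'

with_base = resolve_link(page_url, html, href)
without_base = resolve_link(page_url, "<html><head></head></html>", href)

print(with_base)
print(without_base)
```

With the base tag, the link resolves from the site root, as <urllinks> reports. Without it, the forum directory stays in the path, which is what the manual <fetch $mainpath $htmllink> produced.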
John Turnbull
Thunderstone Software