
Canonical URLs


jvarsoke
12-16-2005, 03:55 AM
Like most people, I'm on an ISP that blocks port 80. To get around that I have my own domain, and I have Zoneedit.com forward requests to port 5000.

The client then sees "http://www.foo.com:5000/index.php" in the URL bar instead of "http://www.foo.com/index.php". This can be solved with Zoneedit's "cloak forwarding", where they answer the request on their own httpd server and return a page with a frameset that loads the port 5000 address. So it looks like port 80 is answering.

Unfortunately, relative URLs still use the port 5000 address, and you don't really want someone bookmarking that. In fact, you'd rather they didn't know there was a port 5000 at all (it looks unprofessional and confuses people).

I haven't found a slick way to solve this (mod_rewrite just produced cyclic requests). So I'm planning on making a "Canonical URL" hack for pixelpost.

It would basically just make every relative URL an absolute URL for whatever you say the real URL is for your site.

What I'd like to know is if someone has already done this, or if pixelpost 1.4.3 or 1.5 will have this functionality in the near future?

Thanks,

-j

Connie
12-16-2005, 04:38 AM
no hack please, we only suggest ADDONS!!!!!

jvarsoke
12-16-2005, 08:25 AM
Considering all the auto-generated URLs like <IMAGE_PREVIOUS> need to be re-written as absolute URLs instead of relative, I'm not sure an add-on will suffice.

Creating an <ABSOLUTE_URL> tag is about a 10-line add-on, but it really only helps for links in the template.

Unless there's some HTML / JavaScript trick to point all the relative links on a page at an absolute base URL. (I'm not an HTML guru.)

-j

tinyblob
12-16-2005, 09:18 AM
not a problem i've encountered really, most people with enough bandwidth to host-from-home have ISPs that are okay with it.. and generally the rest have passages in their EULAs that forbid it.

making an addon for this is easy-ish.

the addons are executed at the last stage of runtime, then the pixelpost webpages are created. all you'd need to do is write a statement that replaces absolute urls.
so in my case i'd replace all instances of "http://www.touchnothing.net/" with nothing.

off the top of my head, that would work provided that it's the last addon executed, any addons executed afterwards might create absolute urls.

i believe that the addons are fired in alphabetical order, so just call yours something like "zzzz_cannonical.php" :D
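
To make the prefix-stripping idea above concrete, here is a minimal sketch. It is written in Python purely for illustration (a real Pixelpost addon would be PHP), and the function name `strip_base` and the sample link are made up for the example; only the site address comes from the post.

```python
# Site root taken from the post above; everything else is illustrative.
BASE_URL = "http://www.touchnothing.net/"


def strip_base(html: str, base: str = BASE_URL) -> str:
    """Turn absolute links under `base` into relative ones by dropping the prefix."""
    return html.replace(base, "")


# Hypothetical snippet of generated page output.
page = '<a href="http://www.touchnothing.net/index.php?showimage=3">next</a>'
print(strip_base(page))
```

A plain string replace like this touches every occurrence of the address, including ones in visible text, so it is only a starting point.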

jvarsoke
12-16-2005, 11:56 AM
tinyblob wrote: so in my case i'd replace all instances of "http://www.touchnothing.net/" with nothing.

Yes, that's the eas[ier/y] version. I'd like to replace all relative URLs with absolute URLs. (just the reverse of your suggestion).

I guess I'll just whip up some regex to get /<a href="[^hH]/

hmm, yeah, that might be do-able.

Zz

tinyblob
12-16-2005, 12:37 PM
ah, sorry, i obviously misread.
if you already know regex then i won't preach to the choir. finding relative urls should be fairly elementary, just remember not to replace the url, but replace the entire href="*" part, or you might find that you replace an element of an already absolute url :)
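
The advice above (match the whole href="..." attribute, then decide what to do with the URL inside it) could look roughly like this. This is a sketch in Python rather than PHP, just to show the shape of the regex; `CANONICAL_BASE`, `ATTR_RE` and `absolutize` are names invented for the example, and the pattern assumes double-quoted attribute values.

```python
import re
from urllib.parse import urljoin

# Assumed public address of the site (the "real" URL visitors should see).
CANONICAL_BASE = "http://www.foo.com/"

# Capture the whole attribute, not just the URL, so an already absolute
# URL is never partially rewritten.
ATTR_RE = re.compile(r'(href|src)="([^"]*)"', re.IGNORECASE)


def absolutize(html: str, base: str = CANONICAL_BASE) -> str:
    """Rewrite relative href/src values as absolute URLs under `base`."""
    def fix(match):
        attr, url = match.group(1), match.group(2)
        # Leave empty values and in-page anchors such as "#top" alone.
        if not url or url.startswith("#"):
            return match.group(0)
        # urljoin returns URLs that are already absolute unchanged.
        return f'{attr}="{urljoin(base, url)}"'
    return ATTR_RE.sub(fix, html)


print(absolutize('<a href="index.php?showimage=7">prev</a>'))
```

Letting urljoin decide what counts as relative avoids the `[^hH]` guess, which would also skip relative links that happen to start with "h".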

GeoS
12-16-2005, 07:10 PM
You can mask whatever you want, e.g. using mod_rewrite :)

jvarsoke
12-16-2005, 10:28 PM
GeoS wrote: You can mask whatever you want, e.g. using mod_rewrite :)

I've found that using Apache's mod_rewrite gives the browser circular references.

main = www.foo.com (on port 80)
fake = ww2.foo.com:5000

If you do the following:

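<IfModule mod_rewrite.c>
RewriteEngine On
# Only act on requests that arrive under the "fake" host name...
RewriteCond %{HTTP_HOST} ^ww2\..*\.com [NC]
# ...and on any port other than 80 (here, 5000)
RewriteCond %{SERVER_PORT} !^80$
# The target is an absolute URL on another host name, so mod_rewrite answers with an external redirect
RewriteRule ^/(.*) http://www.foo.com/$1 [L]
</IfModule>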


you get circular references. I believe the browser asks for the fake address, then the server redirects it to the main address, and so the browser sends another request to the main address, which is forwarded by Zoneedit to the fake address again, etc etc etc.
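
The loop described above can be modelled with a toy simulation. This is only a sketch of my understanding, not a trace of real traffic: it assumes Zoneedit's forward behaves like a redirect from the main name to the port 5000 address, and that the rewrite rule redirects back. The function names `next_hop` and `follow` are invented for the example.

```python
MAIN = "http://www.foo.com/"
FAKE = "http://ww2.foo.com:5000/"


def next_hop(url):
    """Return the URL this hop redirects to, or None if content would be served."""
    if url.startswith(MAIN):
        # Assumed Zoneedit behaviour: forward the main name to the home server.
        return FAKE + url[len(MAIN):]
    if url.startswith(FAKE):
        # The rewrite rule: send the visitor back to the main name.
        return MAIN + url[len(FAKE):]
    return None


def follow(url, max_hops=10):
    """Follow redirects; return (trail, looped)."""
    trail = [url]
    for _ in range(max_hops):
        url = next_hop(url)
        if url is None:
            return trail, False
        if url in trail:
            # Back at an address we have already visited: a cycle.
            return trail + [url], True
        trail.append(url)
    return trail, False


trail, looped = follow("http://www.foo.com/index.php")
print(" -> ".join(trail), "| loop:", looped)
```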

Note: if my understanding of mod_rewrite is off, let me know how to fix it.

About 3/4 of the way through writing zzz_canonical, I found a much easier way to do this:


<base href="http://www.foo.com"/>


That seems to solve the problem.
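
For anyone curious what the browser does with that tag: relative links get resolved against the base address instead of the address the page was loaded from. Python's urljoin follows the same resolution rules, so a quick sketch can show the effect. The sample links and the helper name `resolve` are made up; only the base address comes from the tag above.

```python
from urllib.parse import urljoin

# Base address from the <base href> tag above.
BASE_HREF = "http://www.foo.com"


def resolve(link: str, base: str = BASE_HREF) -> str:
    """Resolve a relative link against the base address."""
    return urljoin(base, link)


# Hypothetical relative links as they might appear in a template.
for link in ["index.php", "index.php?showimage=12", "images/photo.jpg"]:
    print(link, "->", resolve(link))
```

None of the resolved addresses mention port 5000, which is the whole point.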

-j

GeoS
12-17-2005, 10:54 AM
I would say that mod_rewrite works like this:
1) you GET some address
2) the server checks whether it is a normal or a rewritten address
3) if it is a rewritten one, it fetches the correct content and serves it

Judging from the Live HTTP Headers add-on for Firefox, it looks like the above. No additional redirects (I was testing within the same domain and server, so maybe that is the reason).