Pixelpost

Authentic Photoblog Flavour



Thread Closed
 
  #1  
Old 02-10-2007, 03:58 PM
Connie
Guest
 
Posts: n/a
How to protect your photos from hotlinking - A Primer

This post is meant as a tutorial, and it will come in "chapters".
That is why there is no public reply function. You are free to discuss it in other threads, but please treat this one as a tutorial.

Also: I collected this information and it works for me on several accounts, but I cannot guarantee that everything will work for everybody!

=====================================================

A photoblog is meant to show your photos.
You want to present them in the context that you create, design and lay out.
You do not want to find them on other websites, in other people's blogs or on community portals.

Recently I noticed something odd about my photos at photografitti.net, and the server load there increased:

hits at photografitti.net were going up, but hits on index.php were not.

What was going on?
I did a web search and found that many of my images were "hotlinked" from other sites.
I checked these sites and found that some of my photos sat in very disgusting, obscene or stupid teeny postings and other blah-blah pages.

These people simply entered the URL of my photos when they added images to their blog posts.

But how did they find my photos? I never saw these people in my stats.

The answer: they found my images at images.google.com and just copied the URL from there. They never visited my site.
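
One way to spot this pattern in your own logs is sketched below in Python. It is only a minimal sketch and rests on assumptions that are not part of the story above: that your host gives you access logs in Apache's "combined" format, and that counting foreign referers for image requests is a good enough indicator. The function name and the sample host names are made up for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Apache "combined" log format (assumed):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?:GET|HEAD) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "(?P<referer>[^"]*)"'
)
IMAGE_PATH = re.compile(r"\.(gif|jpe?g)$", re.IGNORECASE)


def external_image_referers(lines, own_host="photografitti.net"):
    """Count image requests per foreign referring host."""
    hosts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or not IMAGE_PATH.search(match.group("path")):
            continue  # not a parsable line, or not an image request
        host = (urlparse(match.group("referer")).hostname or "").lower()
        # Skip empty referers ("-") and referers from the own site
        if host and host != own_host and not host.endswith("." + own_host):
            hosts[host] += 1
    return hosts
```

Feeding the lines of an access log into external_image_referers() and looking at the most common hosts should show quickly which sites embed your images without ever requesting index.php.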

What to do?

There are different steps to prevent hotlinking.
I will describe them in the following posts, step by step.
  1. how to stop hotlinking
  2. how to stop images.google.com from indexing your photos and presenting them
  3. how to decrease the searchability of your photos (not of your website!)

Last edited by Connie; 02-15-2007 at 07:13 PM.
  #2  
Old 02-10-2007, 04:29 PM
Connie
Guest
 
Posts: n/a
#1 How to prevent hotlinking with .htaccess

Wikipedia describes hotlinking as follows:

Quote:
Inline linking, also known as hotlinking, leeching, or direct linking is the placing of a linked object, often an image, from one site in a web page belonging to a second site. The second site is said to have an inline link to the one where the object is located.
  1. The easiest way: your Webadministration Panel
  2. Do it yourself

Check whether your web host offers a way to stop hotlinking.
If your webspace sits on an Apache server, the chances are very good.

If you use cPanel, use the option Hotlink Protection.
If you do not find it at first glance, look in the advanced menu.
In that section you will first find a description of hotlinks and then a form to define your rules.

Quote:
HotLink protection prevents other websites from directly linking to files (as specified below) on your website. Other sites will still be able to link to any file type that you don't specify below (ie. html files). An example of hotlinking would be using a <img> tag to display an image from your site from somewhere else on the net. The end result is that the other site is stealing your bandwidth. You should ensure that all sites that you wish to allow direct links from are in the list below. This system attempts add all sites it knows you own to the list, however you may need to add others.
You define which URLs are allowed to access which file formats (identified by their extensions), and you decide which URL is delivered to all other requests.

Example: all requests for files with the extension .jpg that come from pages on the domain myimages.com (the server judges this by the HTTP Referer header) are served with the requested file; requests from all other domains are redirected to another URL, for example to another image file.

So you enter myimages.com in the list of allowed URLs, the extensions gif,jpg,GIF,JPG in the allowed extensions, and the URL for the redirect.

As a result, you get a .htaccess file in the root of your server with rules like these:

Quote:
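RewriteEngine on
# Requests without a Referer (for example direct visits) are let through
RewriteCond %{HTTP_REFERER} !^$
# Requests referred by pages on your own domain are let through
RewriteCond %{HTTP_REFERER} !^http://(www\.)?myimages\.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?myimages\.com$ [NC]
# Never redirect the replacement image itself, otherwise the redirect loops
RewriteCond %{REQUEST_URI} !/donothotlink\.jpg$ [NC]
RewriteRule .*\.(gif|jpg|GIF|JPG)$ http://www.myimages.com/donothotlink.jpg [R,NC]
What does this mean?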

All requests for your photo files that come from pages on myimages.com are served correctly, and the browser shows the files. Requests without any referer, for example when somebody types the image URL directly into the browser, are served as well.

All other requests, for example from an HTML page or a forum at www.thisisnotmydomain.com, are redirected to the file donothotlink.jpg on your domain.
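
To see what the conditions decide for a given request, the same logic can be sketched in Python. This is only an illustration of the rules, not how Apache evaluates them internally. The function name is made up, and the dots in the domain name are escaped so that they match only a literal dot.

```python
import re

# Mirrors the two RewriteCond lines for the allowed domain (case-insensitive, like [NC])
ALLOWED_REFERER = re.compile(r"^http://(www\.)?myimages\.com(/.*)?$", re.IGNORECASE)
# Mirrors the RewriteRule pattern for protected file types
PROTECTED_FILE = re.compile(r".*\.(gif|jpg)$", re.IGNORECASE)


def is_redirected(path: str, referer: str) -> bool:
    """Return True if the request would be sent to the replacement image."""
    if not PROTECTED_FILE.match(path):
        return False  # the rule only covers .gif and .jpg files
    if referer == "":
        return False  # first condition: an empty referer is let through
    if ALLOWED_REFERER.match(referer):
        return False  # referred by a page on the allowed domain
    return True       # everything else is treated as a hotlink


print(is_redirected("/images/photo.jpg", "http://www.thisisnotmydomain.com/forum"))  # True
print(is_redirected("/images/photo.jpg", "http://www.myimages.com/index.php"))       # False
print(is_redirected("/images/photo.jpg", ""))                                        # False
```

Note that the check relies entirely on the Referer header, which browsers and proxies may omit; such requests pass because of the first condition.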

The .htaccess file at www.photografitti.net allows access from a list of domains; all other requests are redirected to a replacement JPG file.

This JPG file says, in German, that the image which was supposed to appear here was stolen from my website www.photografitti.de

Here is an excerpt from that .htaccess file:

Quote:
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?avantart.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?avantart.com$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?avantart.de/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?avantart.de$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?pixelpost.org/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://(www\.)?pixelpost.org$ [NC]
RewriteRule .*\.(gif|jpg|GIF|JPG)$ http://www.zweiterblick.de/aetsch.jpg [R,NC]
What is the effect?

People who steal your photos will be exposed.
They will learn something from that, I hope.

Your web server will still deliver an image file, so the hotlinkers still use some of your traffic. But they are no longer stealing your photos!


Remember that this only works with Apache, not with Microsoft IIS servers!

Last edited by Connie; 02-15-2007 at 07:13 PM. Reason: typo
  #3  
Old 02-15-2007, 07:11 PM
Connie
Guest
 
Posts: n/a
#2 How to prevent crawling of your website with robots.txt

From Wikipedia:
The robots exclusion standard or robots.txt protocol is a convention to prevent cooperating web spiders and other web robots from accessing all or part of a website which is, otherwise, publicly viewable.

This part is not a complete introduction to the use and benefits of robots.txt, which is a good tool to control bots and spiders (among other purposes). It is a short introduction that lists useful directives.

You can set different directives in that file, which must be placed in the root of your website (edit it with a plain-text editor and upload it to your webspace in ASCII mode).

It would make no sense to block your website for all bots, indexing robots and search engines, but it does make sense to block some of them explicitly.

To stop Microsoft Search (Windows Live) from crawling your site completely, you can add this:
Quote:
User-Agent: MSNBot
Disallow: /
To stop Microsoft Search (Windows Live) from crawling your website like an idiot running amok, you can add this to slow it down:
Quote:
User-Agent: MSNBot
Crawl-Delay: 36000
And another exotic directive especially for Microsoft: to stop Microsoft Search from showing your website as a website preview, add this to your robots.txt:
Quote:
User-agent: searchpreview
Disallow: /
To stop Googlebot from indexing your site completely:
Quote:
User-agent: Googlebot
Disallow: /
To stop all bots from indexing the images folder and the thumbnails folder, set these:
Quote:
User-agent: *
Disallow: /images
Disallow: /thumbnails
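
Before uploading, the rules can be checked with the robots.txt parser from Python's standard library. The following is a minimal sketch: the combination of groups is just one possible selection of the directives above, www.example.com is a placeholder host, and the helper name load_rules is made up. The crawl_delay() call needs Python 3.6 or newer.

```python
from urllib.robotparser import RobotFileParser

# One possible combination of the directives above (illustrative only)
ROBOTS_TXT = """\
User-agent: MSNBot
Crawl-delay: 36000

User-agent: Googlebot
Disallow: /

User-agent: *
Disallow: /images
Disallow: /thumbnails
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


SITE = "http://www.example.com"  # placeholder host
rules = load_rules(ROBOTS_TXT)

print(rules.can_fetch("Googlebot", SITE + "/index.php"))            # False
print(rules.can_fetch("SomeOtherBot", SITE + "/images/photo.jpg"))  # False
print(rules.can_fetch("SomeOtherBot", SITE + "/index.php"))         # True
print(rules.crawl_delay("MSNBot"))                                  # 36000
print(rules.can_fetch("MSNBot", SITE + "/images/photo.jpg"))        # True
```

The last line shows a pitfall: a bot that has its own group (here MSNBot) follows only that group and not the rules under "User-agent: *". If MSNBot should also stay out of the image folders, the Disallow lines have to be repeated in its group.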
  #4  
Old 02-15-2007, 07:45 PM
Connie
Guest
 
Posts: n/a
#3 How to control bots with meta tags

Last but not least, the meta tags.

Mother Wikipedia says: Meta elements are HTML elements used to provide structured metadata about a web page. Such elements are placed as tags in the head section of an HTML document.

They help to stop robots and crawlers, at least the well-behaved ones, because they are part of the web standards.

For Pixelpost, meta tags must be placed in the head section of the templates which you activated in the admin section.

The head section of, for example, the simple template looks like this:
Quote:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
<title><SITE_TITLE></title>
<!-- Link for ATOM feed autodiscovery -->
<ATOM_AUTODETECT_LINK>
<!-- Link for RSS feed autodiscovery -->
<RSS_AUTODETECT_LINK>
<!-- META -->
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
<meta http-equiv="content-type" name="keywords" content="PhotoBlog,<SITE_TITLE>,<IMAGE_TITLE>,Pixelpost" />
<meta http-equiv="content-type" name="description" content="<SITE_TITLE>-PhotoBlog: <IMAGE_TITLE>, <IMAGE_NOTES_CLEAN>" />
<!-- CSS -->
<link rel="stylesheet" type="text/css" href="templates/simple/styles/light.css" title="light" />
<link rel="alternate stylesheet" type="text/css" href="templates/simple/styles/dark.css" title="dark" />
<!-- SCRIPTS -->
<script type="text/javascript" src="templates/simple/scripts/styleswitcher.js"></script>
</head>
So why not add these lines to the head section?
Quote:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
....
<!-- Spider-Control -->
<meta name="robots" content="noimageindex,nomediaindex" />
<meta name="robots" content="noarchive" />
<meta http-equiv="pragma" content="no-cache" />
<meta http-equiv="imagetoolbar" content="false" />
</head>
What do these directives mean?

Microsoft itself recommends this tag to stop the indexing of image files and media files (I did not find a specification of which media files are excluded from indexing...)
Quote:
<meta name="robots" content="noimageindex,nomediaindex" />
This stops the bots from offering an "archived version" of your page, which is especially useful for dynamic content, and Pixelpost pages are dynamically generated!
Quote:
<meta name="robots" content="noarchive" />
Proxies should not cache your content on proxy servers:
Quote:
<meta http-equiv="pragma" content="no-cache" />
And the last one is especially for our good friend, Internet Explorer: do not show that annoying image toolbar whenever an image is included on the site:
Quote:
<meta http-equiv="imagetoolbar" content="false" />

Add these lines to the head sections of image_template.html, about_template.html, browse_template.html, comment_template.html, that is, to all template files in your template folder.
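
With that many template files it is easy to forget one. A small Python check can list which robots directives a template actually carries. This is a minimal sketch using only the standard library; the class and function names are made up, and reading the files from your template folder is left to you.

```python
from html.parser import HTMLParser


class RobotsMetaCollector(HTMLParser):
    """Collects the directives of all <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        for part in content.split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def robots_directives(html: str) -> set:
    """Return the set of robots directives found in an HTML template."""
    collector = RobotsMetaCollector()
    collector.feed(html)
    collector.close()
    return collector.directives


def missing_directives(html: str) -> set:
    """Return the directives from this tutorial that the template lacks."""
    wanted = {"noimageindex", "nomediaindex", "noarchive"}
    return wanted - robots_directives(html)
```

Calling missing_directives() with the content of each template file returns an empty set when all three robots directives are present.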

This was the third and last part of this small tutorial. When I find time (after adding all this code to all my .htaccess, robots.txt and template files), I will add it to the Pixelpost Wiki as well.
  #5  
Old 02-16-2007, 09:13 AM
blinking8s
über loafer
 
Join Date: Oct 2004
Location: Bowling Green, Ky
Posts: 3,428
Nice write-up, Connie! I'm sure some users will find this useful.

We need to get a Pixelpost blog up for content like this. That can be next weekend's chore for me. This one is booked up.
__________________
i should say more clever stuff
  #6  
Old 02-19-2007, 12:20 PM
GeoS
Team Pixelpost
 
Join Date: Apr 2005
Location: Warsaw, Poland
Posts: 3,613
We are looking forward to more texts like this, Connie. Really nice work.
  #7  
Old 02-19-2007, 01:31 PM
Connie
Guest
 
Posts: n/a
I will try, whatever will annoy me next...

I will add this to the wiki as well