Prevent hotlinking without killing images in my RSS feed?
January 11, 2010 7:51 PM
Help! I'm being hosed! But, how do I prevent hotlinking without killing the pictures in my RSS feed?
Sorry if this is a little too geeky for Meta, but googling has (mostly) failed me, and when I look around at people asking similar questions on coder / web design forums they all seem to be pretty harsh places.
You and me may have had our (minor) differences, Metafilter, but I still think you're a nice persons!
So... I have a website. Some lovely people on a load of Chinese forums have started hotlinking to entire rather large galleries on said website. While my host allegedly allows unlimited bandwidth, I fear that a couple of hundred GB a day may be enough for them to start getting arsey with me.
I've added some stuff to htaccess. Now in place of a hotlinked image, users are served a picture with a snarky message asking them to visit the site instead. The problem with this is, my RSS subscribers are now also served this image in place of the pictures in my feed.
I know how to allow specific sites to hotlink, say just Google Reader or whatever, but what I'd really like to be able to do is add a rule using wildcards saying that any referrer that has "reader", "feed" or "RSS" somewhere in the URL can get images, and then also perhaps instead of having to work out the specific addresses every reader uses I can just add "yahoo", "google", "netnewswire" with some wildcards around them and go with that.
Does anyone have experience with this and know what the syntax is? Is this something I should not do because excessive wildcard usage will slow things down or something? Is there a more elegant solution? Blocking specific URLs will unfortunately not work for me because the hotlinking is coming from a bewildering array of domains.
For your reference, here's the code I'm using in my htaccess right now:
RewriteEngine On
RewriteCond %{HTTP_REFERER} !^http://(.+\.)?mydomain\.com/ [NC]
RewriteCond %{HTTP_REFERER} !^$
RewriteRule .*\.(jpe?g|gif|bmp|png)$ http://mydomain.com/images/nohotlink.jpe [L]
Thanks!
Response by poster: I should perhaps clarify that what I'd really like is for someone to help me out with the syntax for using wildcards here. Whenever I try it seems to break stuff. D~:
Any help super appreciated!
posted by TheTorns at 8:05 PM on January 11, 2010
Instead of matching the referrer, you can match the image URL, and add, say, '?via_rss' to the ends of image URLs in your RSS feed. Someone intent on hotlinking can easily bypass this, but it should be enough to stop casual forum posters who found your image on Google, and no RSS readers get blocked.
posted by domnit at 8:26 PM on January 11, 2010
Response by poster: That sounds great domnit, but I must have given the impression I am smarter than I am - because I really don't know how to do that. X~D
Any chance you could explain in more detail?
posted by TheTorns at 8:31 PM on January 11, 2010
This may be an opportunity to use blacklisting rather than whitelisting.
Which is to say, rather than saying "anything that *doesn't* match this, send the hotlink image", change it to "anything that does match this, send the hotlink image". You say that they are coming from a bewildering number of domains, but it may be a more limited range of IPs. At the very least you may be able to confine it to just a few. So then you can write code like this:
RewriteEngine On
RewriteCond %{REMOTE_ADDR} ^123\.(12|34|56)\.[0-9]+\.[0-9]+$ [OR]
RewriteCond %{REMOTE_ADDR} ^121\.[0-9]+\.[0-9]+\.[0-9]+$ [OR]
RewriteCond %{REMOTE_ADDR} ^120\.(56|78|90|101)\.[0-9]+\.[0-9]+$
RewriteRule .*\.(jpe?g|gif|bmp|png)$ http://mydomain.com/images/nohotlink.jpe [L]
posted by Deathalicious at 8:41 PM on January 11, 2010
Response by poster: Thanks Deathalicious!
So am I right in thinking that the first line is saying "block 123.(12 or 34 or 56).*.*"
And the next one is more like 121.*.*.*
?
Whatever variety of wildcarding is being used here is not one that's familiar to me I'm afraid.
posted by TheTorns at 8:49 PM on January 11, 2010
Sorry, Torns, my idea would require you to modify or extend your blog software--you'd have to somehow append some tag to image URLs, but only in the feed.
And if you're up for a headache, the wildcard things you refer to are called regular expressions.
posted by domnit at 9:12 PM on January 11, 2010
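To answer the question above more directly: yes, that reading is right. Here is a minimal annotated sketch of the first condition, using the same placeholder IP ranges as the example (they are illustrative, not real offenders), with the dots escaped so each one matches only a literal dot.

```apache
RewriteEngine On
# ^ and $       anchor the pattern to the whole address
# \.            a literal dot (an unescaped "." matches any single character)
# (12|34|56)    exactly one of 12, 34 or 56
# [0-9]+        one or more digits, i.e. any value in that position
RewriteCond %{REMOTE_ADDR} ^123\.(12|34|56)\.[0-9]+\.[0-9]+$
RewriteRule \.(jpe?g|gif|bmp|png)$ http://mydomain.com/images/nohotlink.jpe [L]
```

So the shell-style `123.12.*.*` becomes `[0-9]+` for each wildcard position, and the "or" is written as a parenthesised alternation.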
Response by poster: Oh! I do have some experience with regex actually, I just didn't recognize that was what was being used there. Knowing that it's regex I might be alright, but it's still my preferred solution, new domains are popping up all the time as we speak.
Oddly enough I'm using regex via Yahoo Pipes to strip out some proprietary markup for images from my feed and replace it with common or garden HTML, so I could perhaps quite easily append something there if I had to. I just don't quite understand how I would match the image URL in htaccess, and then how I would get htaccess to recognize the ?via_rss bit and let it through...
posted by TheTorns at 9:28 PM on January 11, 2010
Response by poster: Whoops! I meant to say above that it's still /not/ my preferred solution, in reference to blacklisting IPs or domain names. Sorry about that.
posted by TheTorns at 9:36 PM on January 11, 2010
Everything you need is documented thoroughly on the Apache site: mod_rewrite. Everything after the ? in the URL is part of %{QUERY_STRING}, so you have to use that in your rule if you want to write such a condition.
posted by Rhomboid at 9:37 PM on January 11, 2010
RewriteEngine on
RewriteCond %{HTTP_REFERER} ^http://(forum\.)?badsite\.com/.*$ [NC]
RewriteRule \.(gif|jpg|png)$ http://www.yoursite.com/hello.jpg [R,L]
posted by Nameless at 10:11 PM on January 11, 2010
Response by poster: Here's what I'm going with for now:
[code]
RewriteEngine On
RewriteCond %{HTTP_REFERER} !^http(s)?://(.+\.)?mydomain\.com/ [NC]
RewriteCond %{HTTP_REFERER} !^.feed. [NC]
RewriteCond %{HTTP_REFERER} !^.google. [NC]
RewriteCond %{HTTP_REFERER} !^.read. [NC]
RewriteCond %{HTTP_REFERER} !^.rss. [NC]
RewriteCond %{HTTP_REFERER} !^.zilla. [NC]
RewriteCond %{HTTP_REFERER} !^.yahoo. [NC]
RewriteCond %{HTTP_REFERER} !^.news. [NC]
RewriteCond %{HTTP_REFERER} !^.opera. [NC]
RewriteCond %{HTTP_REFERER} !^.pipes. [NC]
RewriteCond %{HTTP_REFERER} !^.space. [NC]
RewriteCond %{HTTP_REFERER} !^$
RewriteRule .*\.(jpe?g|gif|bmp|png)$ http://mydomain.com/images/nohotlink.jpe [L]
[/code]
I'm thinking that might have me covered for the moment. Domnit's solution sounds a lot more elegant though!
posted by TheTorns at 12:02 AM on January 12, 2010
That doesn't really make sense. "." means any single character, and ^ means anchor to the beginning of the field, so those would not match e.g. www.google.com. If you intend to match a literal period you need to use "\." and not anchor the match to the beginning; if you intend to match one or more characters you need ".+".
posted by Rhomboid at 12:08 AM on January 12, 2010
Response by poster: Crikey, yes. Thanks for spotting that Rhomboid.
Here's what I replaced that with:
RewriteCond %{HTTP_REFERER} !^.?feed.? [NC]
RewriteCond %{HTTP_REFERER} !^.?google.? [NC]
RewriteCond %{HTTP_REFERER} !^.?read.? [NC]
RewriteCond %{HTTP_REFERER} !^.?rss.? [NC]
RewriteCond %{HTTP_REFERER} !^.?zilla.? [NC]
RewriteCond %{HTTP_REFERER} !^.?yahoo.? [NC]
RewriteCond %{HTTP_REFERER} !^.?news.? [NC]
RewriteCond %{HTTP_REFERER} !^.?opera.? [NC]
RewriteCond %{HTTP_REFERER} !^.?pipes.? [NC]
RewriteCond %{HTTP_REFERER} !^.?space.? [NC]
posted by TheTorns at 12:41 AM on January 12, 2010
Response by poster: Actually, on reflection:
RewriteCond %{HTTP_REFERER} !^.+feed.+ [NC]
RewriteCond %{HTTP_REFERER} !^.+google.+ [NC]
RewriteCond %{HTTP_REFERER} !^.+read.+ [NC]
RewriteCond %{HTTP_REFERER} !^.+rss.+ [NC]
RewriteCond %{HTTP_REFERER} !^.+zilla.+ [NC]
RewriteCond %{HTTP_REFERER} !^.+yahoo.+ [NC]
RewriteCond %{HTTP_REFERER} !^.+news.+ [NC]
RewriteCond %{HTTP_REFERER} !^.+opera.+ [NC]
RewriteCond %{HTTP_REFERER} !^.+pipes.+ [NC]
RewriteCond %{HTTP_REFERER} !^.+torn.+ [NC]
RewriteCond %{HTTP_REFERER} !^.+space.+ [NC]
posted by TheTorns at 12:46 AM on January 12, 2010
"?" means "zero or one" so ".?" means either nothing or any one character. I still don't think that's what you want, as that still wouldn't match www.whatever. If your intent is to match "any referrer that contains the string .google." then you want "\.google\.", and the ! in front inverts the logic, so "RewriteCond %{HTTP_REFERER} !\.google\. [NC]" means "the following RewriteRule applies unless the referrer contains the string '.google.', case-insensitively."
posted by Rhomboid at 12:48 AM on January 12, 2010
Oh, and if you just want to match any string with "google" then there's no need for anything else: RewriteCond %{HTTP_REFERER} !google [NC]
posted by Rhomboid at 12:50 AM on January 12, 2010
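A side-by-side reading of the patterns tried so far may help. The sample referrer below is an assumption chosen purely for illustration; the comments describe how each pattern behaves against it.

```apache
# Sample referrer (assumed): http://www.google.com/reader/view/
#
# ^.google.     "^" anchors at the start, then exactly one character, then "google"
#               -> no match, because the string starts with "http://www."
# ^.?google.?   at most one character before "google", still anchored at the start
#               -> no match, for the same reason
# ^.+google.+   one or more characters, "google", then at least one more character
#               -> matches, but the anchor and both ".+" add nothing useful
# google        unanchored substring
#               -> matches
RewriteCond %{HTTP_REFERER} !google [NC]
```

An unanchored pattern already means "anywhere in the string", so no surrounding wildcards are needed.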
You want to display the correct image to users with an empty referrer or a referrer which matches your site URL.
posted by devnull at 12:54 AM on January 12, 2010
Response by poster: Thanks for the help guys!
Here's what I'm rolling with:
RewriteEngine On
RewriteCond %{HTTP_REFERER} !^http(s)?://(.+\.)?mydomain\.com/ [NC]
RewriteCond %{HTTP_REFERER} !feed [NC]
RewriteCond %{HTTP_REFERER} !google [NC]
RewriteCond %{HTTP_REFERER} !read [NC]
RewriteCond %{HTTP_REFERER} !rss [NC]
RewriteCond %{HTTP_REFERER} !zilla [NC]
RewriteCond %{HTTP_REFERER} !yahoo [NC]
RewriteCond %{HTTP_REFERER} !news [NC]
RewriteCond %{HTTP_REFERER} !opera [NC]
RewriteCond %{HTTP_REFERER} !pipes [NC]
RewriteCond %{HTTP_REFERER} !space [NC]
RewriteCond %{HTTP_REFERER} !^$
RewriteRule .*\.(jpe?g|gif|bmp|png)$ http://thetorns.com/images/nohotlink.jpe [L]
Seems to be working!
posted by TheTorns at 1:03 AM on January 12, 2010
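The keyword conditions above can also be folded into one alternation. This is only a sketch of the same rule set under that assumption, with mydomain.com as the placeholder domain used earlier in the thread.

```apache
RewriteEngine On
# Let the request through when the referrer is empty...
RewriteCond %{HTTP_REFERER} !^$
# ...or is this site itself...
RewriteCond %{HTTP_REFERER} !^https?://(.+\.)?mydomain\.com/ [NC]
# ...or contains any of the whitelisted keywords
RewriteCond %{HTTP_REFERER} !(feed|google|read|rss|zilla|yahoo|news|opera|pipes|space) [NC]
# Everything else asking for an image gets the placeholder instead
RewriteRule \.(jpe?g|gif|bmp|png)$ http://mydomain.com/images/nohotlink.jpe [L]
```

One caveat with substring keywords: "read" also matches any referrer containing "thread", which is common in forum URLs, so short keywords may let through some of the very hotlinkers the rule is meant to stop.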
OK, so for my solution the htaccess part should be:
RewriteCond %{QUERY_STRING} !via_rss
after the RewriteCond lines you already have.
As for rewriting the URLs, that would be an actual programming task--maybe Yahoo Pipes would work.
Periodically check for sites scraping your feed and individually block those.
posted by domnit at 9:57 AM on January 12, 2010
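Put together, the query-string approach might look like the sketch below. It assumes the feed's image URLs have been rewritten to end in ?via_rss (for example a hypothetical http://mydomain.com/gallery/pic.jpg?via_rss), and it keeps only the empty-referrer and own-site conditions for brevity.

```apache
RewriteEngine On
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(.+\.)?mydomain\.com/ [NC]
# The RewriteRule pattern only sees the URL path, so the marker
# has to be tested separately against the query string
RewriteCond %{QUERY_STRING} !via_rss
RewriteRule \.(jpe?g|gif|bmp|png)$ http://mydomain.com/images/nohotlink.jpe [L]
```

With this in place the referrer keyword list is no longer needed for feed readers, since the image URL itself carries the exemption.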
Don't forget to add whatever referrers Firefox and Safari use.
posted by jjwiseman at 8:01 PM on January 11, 2010