<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:admin="http://webns.net/mvcb/"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
	<channel> 

	<title>Comments on: How do I block Googlebot and other bots from indexing a certain part of a webpage?</title>
	<link>http://ask.metafilter.com/93650/How-do-I-block-Googlebot-and-other-bots-from-indexing-a-certain-part-of-a-webpage/</link>
	<description>Comments on Ask MetaFilter post How do I block Googlebot and other bots from indexing a certain part of a webpage?</description>
	<pubDate>Mon, 09 Jun 2008 19:12:13 -0800</pubDate>
	<lastBuildDate>Mon, 09 Jun 2008 19:12:13 -0800</lastBuildDate>
	<language>en-us</language>
	<docs>http://blogs.law.harvard.edu/tech/rss</docs>
	<ttl>60</ttl>

	<item>
		<title>Question: How do I block Googlebot and other bots from indexing a certain part of a webpage?</title>
		<link>http://ask.metafilter.com/93650/How-do-I-block-Googlebot-and-other-bots-from-indexing-a-certain-part-of-a-webpage</link>	
		<description>How do I block Googlebot and other bots from indexing a certain part of a webpage? &lt;br /&gt;&lt;br /&gt; I run an online music magazine. On every article, show review and feature, we use one column on the webpage to list 15-20 upcoming shows and venue information. Unfortunately, due to the repetitive text, Google inevitably thinks each webpage is about the 15-20 shows listed (particularly the repeated venue information) and not about the article, show review or feature. &lt;br&gt;
&lt;br&gt;
We&apos;re now using other techniques to emphasize the content in the articles (title tags, header tags, meta description/keywords, bolding) but I&apos;d really like it if I could exclude the column with the upcoming shows and venue information from being indexed. If that could be achieved, Google might instead focus on the true content and the keywords inside!&lt;br&gt;
&lt;br&gt;
Any ideas?</description>
		<guid isPermaLink="false">post:ask.metafilter.com,2008:site.93650</guid>
		<pubDate>Mon, 09 Jun 2008 19:04:14 -0800</pubDate>
		<dc:creator>jrholt</dc:creator>
		
			<category>SEO</category>
		
			<category>Google</category>
		
	</item> <item>
		<title>By: knave</title>
		<link>http://ask.metafilter.com/93650/How-do-I-block-Googlebot-and-other-bots-from-indexing-a-certain-part-of-a-webpage#1369921</link>	
		<description>&lt;a href=&quot;http://www.google.com/support/webmasters/bin/answer.py?hl=en&amp;answer=40360&quot;&gt;robots.txt&lt;/a&gt;</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.93650-1369921</guid>
		<pubDate>Mon, 09 Jun 2008 19:12:13 -0800</pubDate>
		<dc:creator>knave</dc:creator>
	</item><item>
		<title>By: spiderskull</title>
		<link>http://ask.metafilter.com/93650/How-do-I-block-Googlebot-and-other-bots-from-indexing-a-certain-part-of-a-webpage#1369922</link>	
		<description>Apparently, you can just add a META tag (I didn&apos;t know they were relevant) in your &amp;lt;HEAD&amp;gt; section:&lt;br&gt;
&amp;lt;META NAME=&quot;ROBOTS&quot; CONTENT=&quot;NOINDEX, NOFOLLOW&quot;&amp;gt;&lt;br&gt;
&lt;br&gt;
Another way, if your host supports it, is through &lt;a href=&quot;http://www.trap17.com/index.php/htaccess-block-bots_t28949.html&quot;&gt;&lt;code&gt;.htaccess&lt;/code&gt;&lt;/a&gt;</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.93650-1369922</guid>
		<pubDate>Mon, 09 Jun 2008 19:12:34 -0800</pubDate>
		<dc:creator>spiderskull</dc:creator>
	</item><item>
		<title>By: jrholt</title>
		<link>http://ask.metafilter.com/93650/How-do-I-block-Googlebot-and-other-bots-from-indexing-a-certain-part-of-a-webpage#1369930</link>	
		<description>Robots.txt and .htaccess seem to work only if you want to block bots from indexing your *entire* page. I want to exclude only a certain part of the page.&lt;br&gt;
&lt;br&gt;
Am I missing something?</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.93650-1369930</guid>
		<pubDate>Mon, 09 Jun 2008 19:20:03 -0800</pubDate>
		<dc:creator>jrholt</dc:creator>
	</item><item>
		<title>By: McSly</title>
		<link>http://ask.metafilter.com/93650/How-do-I-block-Googlebot-and-other-bots-from-indexing-a-certain-part-of-a-webpage#1369937</link>	
		<description>You can&apos;t do that with just &quot;robots.txt&quot; or the .htaccess file. The only way to do it is to have two versions of your page. In the code of the page, you check the User-Agent of the request, and if it identifies itself as Googlebot, you leave out certain parts of the page. It&apos;s called &lt;a href=&quot;http://www.webreference.com/authoring/search_engines/cloaking/&quot;&gt;cloaking&lt;/a&gt;.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.93650-1369937</guid>
		<pubDate>Mon, 09 Jun 2008 19:27:21 -0800</pubDate>
		<dc:creator>McSly</dc:creator>
	</item><item>
		<title>By: majick</title>
		<link>http://ask.metafilter.com/93650/How-do-I-block-Googlebot-and-other-bots-from-indexing-a-certain-part-of-a-webpage#1369938</link>	
		<description>&lt;i&gt;&quot; I want to exclude only a certain part of the page.  Am I missing something?&quot;&lt;/i&gt;&lt;br&gt;
&lt;br&gt;
Don&apos;t serve that part of the page to Google. Its address ranges and User-Agent headers are easily recognizable, so just serve up that section of the page conditionally.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.93650-1369938</guid>
		<pubDate>Mon, 09 Jun 2008 19:27:33 -0800</pubDate>
		<dc:creator>majick</dc:creator>
	</item><item>
		<title>By: Lazlo</title>
		<link>http://ask.metafilter.com/93650/How-do-I-block-Googlebot-and-other-bots-from-indexing-a-certain-part-of-a-webpage#1369945</link>	
		<description>Just wondering: would using &quot;nofollow&quot; on the pages with the show lists (or rel=nofollow on the show links) reduce the effect those links have on the page&apos;s overall rank?</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.93650-1369945</guid>
		<pubDate>Mon, 09 Jun 2008 19:33:23 -0800</pubDate>
		<dc:creator>Lazlo</dc:creator>
	</item><item>
		<title>By: spiderskull</title>
		<link>http://ask.metafilter.com/93650/How-do-I-block-Googlebot-and-other-bots-from-indexing-a-certain-part-of-a-webpage#1369949</link>	
		<description>Oh, I see; I misunderstood the question. Does anyone here know if Googlebot runs JavaScript at all? If it doesn&apos;t, you can look into adding the show information via a client-side include.&lt;br&gt;
&lt;br&gt;
Scroll down about midway on &lt;a href=&quot;http://www.boutell.com/newfaq/creating/include.html&quot;&gt;this page&lt;/a&gt;, which shows how you can include other files. Then block the included file via .htaccess or robots.txt.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.93650-1369949</guid>
		<pubDate>Mon, 09 Jun 2008 19:39:57 -0800</pubDate>
		<dc:creator>spiderskull</dc:creator>
	</item><item>
		<title>By: soma lkzx</title>
		<link>http://ask.metafilter.com/93650/How-do-I-block-Googlebot-and-other-bots-from-indexing-a-certain-part-of-a-webpage#1369985</link>	
		<description>Use JavaScript to write the non-indexable info to the page on the fly! Googlebot won&apos;t get that.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.93650-1369985</guid>
		<pubDate>Mon, 09 Jun 2008 20:15:31 -0800</pubDate>
		<dc:creator>soma lkzx</dc:creator>
	</item><item>
		<title>By: mumkin</title>
		<link>http://ask.metafilter.com/93650/How-do-I-block-Googlebot-and-other-bots-from-indexing-a-certain-part-of-a-webpage#1370025</link>	
		<description>I&apos;m pretty sure that there&apos;s no way to accomplish this that will ultimately make Google happy with you. You might find a method that&apos;ll work for a time, but the bot is ever-evolving, and when Google finds that you&apos;re trying to hide things from Googlebot, I gather they go a bit nuclear on your PageRank under the assumption that you have ulterior motives. You don&apos;t want that to happen, obviously. Better to continue with the white hat methods you&apos;re currently using.&lt;br&gt;
&lt;br&gt;
Googlebot makes some assumptions about content importance based on its position within the document. Closer to the top = more topical, closer to the bottom = not so much. So, if you can make the column of gigs one of the very last things in your HTML, and then position it where you want it via CSS, that might help to reduce its importance in the eyes of Googlebot.&lt;br&gt;
&lt;br&gt;
If you haven&apos;t already, get hooked up with &lt;a href=&quot;https://www.google.com/webmasters/tools/&quot;&gt;Google Webmaster Tools&lt;/a&gt; and start submitting sitemaps, to help the bot find its way around. Also read the &lt;a href=&quot;http://googlewebmastercentral.blogspot.com/&quot;&gt;Google Webmaster Central Blog&lt;/a&gt;.&lt;br&gt;
&lt;br&gt;
When I was running Google Ads, there was a way to assign page areas different weights, to help the AdSense engine determine what your page was &lt;em&gt;most&lt;/em&gt; about, so that it could more closely target the ads. I&apos;m not sure if Googlebot pays any attention to that or not, but it might be worth investigating.&lt;br&gt;
&lt;br&gt;
If all else fails, you might consider presenting the textual upcoming show content in a non-text format that the bot can&apos;t parse: a graphic, Flash, etc. It&apos;s certainly not ideal, but you could develop something to automate image generation server-side.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.93650-1370025</guid>
		<pubDate>Mon, 09 Jun 2008 20:48:46 -0800</pubDate>
		<dc:creator>mumkin</dc:creator>
	</item><item>
		<title>By: Class Goat</title>
		<link>http://ask.metafilter.com/93650/How-do-I-block-Googlebot-and-other-bots-from-indexing-a-certain-part-of-a-webpage#1370035</link>	
		<description>Here&apos;s what my robots.txt file looks like:&lt;br&gt;
&lt;br&gt;
User-Agent: *&lt;br&gt;
Disallow: /&lt;br&gt;
&lt;br&gt;
User-Agent: Googlebot&lt;br&gt;
Disallow: /magicdir1/&lt;br&gt;
Disallow: /magicdir2/&lt;br&gt;
Allow: /&lt;br&gt;
&lt;br&gt;
User-Agent: MSNBot&lt;br&gt;
Disallow: /magicdir1/&lt;br&gt;
Disallow: /magicdir2/&lt;br&gt;
Allow: /&lt;br&gt;
&lt;br&gt;
User-Agent: AskJeeves&lt;br&gt;
Disallow: /magicdir1/&lt;br&gt;
Disallow: /magicdir2/&lt;br&gt;
Allow: /&lt;br&gt;
&lt;br&gt;
The first two lines block all compliant bots except the three I explicitly permit. (&quot;AskJeeves&quot; is ask.com.) You can have as many Disallow lines as you want, followed by the &quot;Allow: /&quot;, and the bot in question will avoid all the disallowed paths and crawl everything else.&lt;br&gt;
&lt;br&gt;
The path is relative to the web root. On my Linux server, that&apos;s &quot;/home/groups/home/web&quot;. So if the above were interpreted literally, it would exclude /home/groups/home/web/magicdir1/ and /home/groups/home/web/magicdir2/ but permit Googlebot to see everything else in the web directory.&lt;br&gt;
&lt;br&gt;
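Rules like the ones above can be checked offline before they go live. This is only a minimal sketch, assuming Python and its standard urllib.robotparser module; the host example.com, the test paths and the helper names are placeholders for illustration:&lt;br&gt;
&lt;pre&gt;
```python
from urllib.robotparser import RobotFileParser

# Rules copied from the robots.txt above (MSNBot and AskJeeves entries omitted for brevity).
ROBOTS_TXT = """\
User-Agent: *
Disallow: /

User-Agent: Googlebot
Disallow: /magicdir1/
Disallow: /magicdir2/
Allow: /
"""


def build_parser(text):
    """Parse robots.txt text directly, without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser, agent, path):
    """Return True if the given user agent may fetch the given path."""
    # example.com is a placeholder host; only the path matters for rule matching.
    return parser.can_fetch(agent, "http://example.com" + path)


parser = build_parser(ROBOTS_TXT)
print(is_allowed(parser, "Googlebot", "/articles/review.html"))     # True
print(is_allowed(parser, "Googlebot", "/magicdir1/page.html"))      # False
print(is_allowed(parser, "SomeOtherBot", "/articles/review.html"))  # False
```
&lt;/pre&gt;
This parser applies the first matching rule within an entry, whereas Google&apos;s documentation describes using the most specific matching path; for the file above both approaches give the same answers.&lt;br&gt;
&lt;br&gt;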
It isn&apos;t a perfect solution. Many bots ignore the robots.txt file entirely. Oddly, some of them read it, and then ignore it. But my experience is that those three are well behaved, because I&apos;ve stopped getting search hits on the directories I excluded this way from all three of them.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.93650-1370035</guid>
		<pubDate>Mon, 09 Jun 2008 21:03:20 -0800</pubDate>
		<dc:creator>Class Goat</dc:creator>
	</item><item>
		<title>By: Class Goat</title>
		<link>http://ask.metafilter.com/93650/How-do-I-block-Googlebot-and-other-bots-from-indexing-a-certain-part-of-a-webpage#1370037</link>	
		<description>By the way, the trailing slash on the disallow lines is essential.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.93650-1370037</guid>
		<pubDate>Mon, 09 Jun 2008 21:04:39 -0800</pubDate>
		<dc:creator>Class Goat</dc:creator>
	</item><item>
		<title>By: toomuchpete</title>
		<link>http://ask.metafilter.com/93650/How-do-I-block-Googlebot-and-other-bots-from-indexing-a-certain-part-of-a-webpage#1370052</link>	
		<description>&lt;i&gt;&quot;I&apos;m pretty sure that there&apos;s no way to accomplish this that will ultimately make Google happy with you.&quot;&lt;/i&gt;&lt;br&gt;
&lt;br&gt;
Not so. Check &lt;b&gt;soma lkzx&lt;/b&gt;&apos;s answer, which will work flawlessly until Googlebot starts executing JavaScript (don&apos;t count on it). That&apos;s the method I&apos;d recommend.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.93650-1370052</guid>
		<pubDate>Mon, 09 Jun 2008 21:16:02 -0800</pubDate>
		<dc:creator>toomuchpete</dc:creator>
	</item><item>
		<title>By: bprater</title>
		<link>http://ask.metafilter.com/93650/How-do-I-block-Googlebot-and-other-bots-from-indexing-a-certain-part-of-a-webpage#1370057</link>	
		<description>Use CSS magic. You can float content to any part of the page. So the HTML would physically have the content first and links second, but the content would appear on the right side of the page. Nothing an average web designer can&apos;t hack together for you in an afternoon.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.93650-1370057</guid>
		<pubDate>Mon, 09 Jun 2008 21:19:44 -0800</pubDate>
		<dc:creator>bprater</dc:creator>
	</item><item>
		<title>By: mmascolino</title>
		<link>http://ask.metafilter.com/93650/How-do-I-block-Googlebot-and-other-bots-from-indexing-a-certain-part-of-a-webpage#1370072</link>	
		<description>Another solution would be to do the non-indexable stuff as an IFrame and load the IFrame from a URL that is excluded from Googlebot.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.93650-1370072</guid>
		<pubDate>Mon, 09 Jun 2008 21:49:00 -0800</pubDate>
		<dc:creator>mmascolino</dc:creator>
	</item><item>
		<title>By: zippy</title>
		<link>http://ask.metafilter.com/93650/How-do-I-block-Googlebot-and-other-bots-from-indexing-a-certain-part-of-a-webpage#1370149</link>	
		<description>Seconding mmascolino&apos;s suggestion. This is the cleanest, in my opinion.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.93650-1370149</guid>
		<pubDate>Tue, 10 Jun 2008 00:41:20 -0800</pubDate>
		<dc:creator>zippy</dc:creator>
	</item><item>
		<title>By: Smoosh Faced Lion</title>
		<link>http://ask.metafilter.com/93650/How-do-I-block-Googlebot-and-other-bots-from-indexing-a-certain-part-of-a-webpage#1370773</link>	
		<description>Thirding the iframe answer, because it&apos;s the simplest answer that degrades gracefully, without getting into server-side IP sniffing wizardry.</description>
		<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.93650-1370773</guid>
		<pubDate>Tue, 10 Jun 2008 13:27:57 -0800</pubDate>
		<dc:creator>Smoosh Faced Lion</dc:creator>
	</item>
	</channel>
</rss>
