<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Introduction to Web Crawling</title>
	<atom:link href="http://www.grok.in/blog/2008/06/07/introduction-to-web-crawling/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.grok.in/blog/2008/06/07/introduction-to-web-crawling/</link>
	<description>(ignorance killed the cat, curiosity was framed)</description>
	<lastBuildDate>Wed, 16 Jun 2010 18:48:41 +0000</lastBuildDate>
	
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Michael Wolf</title>
		<link>http://www.grok.in/blog/2008/06/07/introduction-to-web-crawling/comment-page-1/#comment-472</link>
		<dc:creator>Michael Wolf</dc:creator>
		<pubDate>Sun, 01 Nov 2009 11:06:28 +0000</pubDate>
		<guid isPermaLink="false">http://www.grok.in/?p=35#comment-472</guid>
		<description>The &quot;read more...&quot; link from *this* article goes to the full text of a *different* article.</description>
		<content:encoded><![CDATA[<p>The &#8220;read more&#8230;&#8221; link from *this* article goes to the full text of a *different* article.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jack</title>
		<link>http://www.grok.in/blog/2008/06/07/introduction-to-web-crawling/comment-page-1/#comment-388</link>
		<dc:creator>Jack</dc:creator>
		<pubDate>Thu, 16 Oct 2008 09:48:24 +0000</pubDate>
		<guid isPermaLink="false">http://www.grok.in/?p=35#comment-388</guid>
		<description>Check out this link about web crawling; it&#039;s very interesting:

http://crawltheweb.blogspot.com/</description>
		<content:encoded><![CDATA[<p>Check out this link about web crawling; it&#8217;s very interesting:</p>
<p><a href="http://crawltheweb.blogspot.com/" rel="nofollow">http://crawltheweb.blogspot.com/</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Siddhartha Reddy</title>
		<link>http://www.grok.in/blog/2008/06/07/introduction-to-web-crawling/comment-page-1/#comment-171</link>
		<dc:creator>Siddhartha Reddy</dc:creator>
		<pubDate>Mon, 09 Jun 2008 05:45:40 +0000</pubDate>
		<guid isPermaLink="false">http://www.grok.in/?p=35#comment-171</guid>
		<description>@sanket,

Converting relative URLs to absolute ones is necessary, but I was referring to canonicalization of URLs: different URLs can refer to the same resource. For example, http://www.grok.in:80/ and http://www.grok.in/ refer to the same resource. Canonicalizing URLs ensures that these different URLs are treated as the same; in the previous example, both URLs will be normalized to one of the two forms.</description>
		<content:encoded><![CDATA[<p>@sanket,</p>
<p>Converting relative URLs to absolute ones is necessary, but I was referring to canonicalization of URLs: different URLs can refer to the same resource. For example, <a href="http://www.grok.in:80/" rel="nofollow">http://www.grok.in:80/</a> and <a href="http://www.grok.in/" rel="nofollow">http://www.grok.in/</a> refer to the same resource. Canonicalizing URLs ensures that these different URLs are treated as the same; in the previous example, both URLs will be normalized to one of the two forms.</p>
]]></content:encoded>
	</item>
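	<!--
	The canonicalization described in the comment above can be sketched
	roughly as follows. This is a minimal illustration, assuming Python's
	standard urllib.parse module; the helper name canonicalize and the
	DEFAULT_PORTS table are hypothetical and do not come from the original
	post. It only lowercases the scheme and host, drops a port that matches
	the scheme's default, and turns an empty path into "/". It ignores
	userinfo and IPv6 literals, which a real crawler would need to handle.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed table of default ports; a port equal to the default is redundant.
DEFAULT_PORTS = {"http": 80, "https": 443}


def canonicalize(url: str) -> str:
    """Return one normalized spelling of url (hypothetical helper)."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    # hostname is already lowercased by urlsplit; it is None when absent.
    host = parts.hostname or ""
    port = parts.port
    # Keep the port only when it differs from the scheme's default.
    if port is None or port == DEFAULT_PORTS.get(scheme):
        netloc = host
    else:
        netloc = f"{host}:{port}"
    # An empty path and "/" name the same resource on an HTTP server.
    path = parts.path or "/"
    # Query and fragment are passed through unchanged in this sketch.
    return urlunsplit((scheme, netloc, path, parts.query, parts.fragment))
```

	With this sketch, canonicalize("http://www.grok.in:80/") and
	canonicalize("http://www.grok.in/") both return "http://www.grok.in/",
	so a crawler that stores canonical forms would treat the two URLs from
	the comment as one already-seen page.
	-->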
	<item>
		<title>By: sanket</title>
		<link>http://www.grok.in/blog/2008/06/07/introduction-to-web-crawling/comment-page-1/#comment-170</link>
		<dc:creator>sanket</dc:creator>
		<pubDate>Mon, 09 Jun 2008 03:55:51 +0000</pubDate>
		<guid isPermaLink="false">http://www.grok.in/?p=35#comment-170</guid>
		<description>What is *URL Normalization*? Converting relative to absolute addresses or something like that?</description>
		<content:encoded><![CDATA[<p>What is *URL Normalization*? Converting relative to absolute addresses or something like that?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ujj</title>
		<link>http://www.grok.in/blog/2008/06/07/introduction-to-web-crawling/comment-page-1/#comment-169</link>
		<dc:creator>Ujj</dc:creator>
		<pubDate>Sat, 07 Jun 2008 17:22:17 +0000</pubDate>
		<guid isPermaLink="false">http://www.grok.in/?p=35#comment-169</guid>
		<description>Good to know you write crawlers. I was doing research in grid computing some time back when I dived into crawlers and fell in love with the whole concept and the math involved; it is a field to spend a lifetime in. Good post.</description>
		<content:encoded><![CDATA[<p>Good to know you write crawlers. I was doing research in grid computing some time back when I dived into crawlers and fell in love with the whole concept and the math involved; it is a field to spend a lifetime in. Good post.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
