<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Introduction to Web Crawling</title>
	<atom:link href="http://www.grok.in/blog/2008/06/07/introduction-to-web-crawling/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.grok.in/blog/2008/06/07/introduction-to-web-crawling/</link>
	<description>(ignorance killed the cat, curiosity was framed)</description>
	<lastBuildDate>Tue, 17 Jan 2012 10:08:29 +0000</lastBuildDate>
	
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<item>
		<title>By: Michael Wolf</title>
		<link>http://www.grok.in/blog/2008/06/07/introduction-to-web-crawling/comment-page-1/#comment-472</link>
		<dc:creator>Michael Wolf</dc:creator>
		<pubDate>Sun, 01 Nov 2009 11:06:28 +0000</pubDate>
		<guid isPermaLink="false">http://www.grok.in/?p=35#comment-472</guid>
		<description>The &quot;read more...&quot; link from *this* article goes to the full text of a *different* article.</description>
		<content:encoded><![CDATA[<p>The &#8220;read more&#8230;&#8221; link from *this* article goes to the full text of a *different* article.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jack</title>
		<link>http://www.grok.in/blog/2008/06/07/introduction-to-web-crawling/comment-page-1/#comment-388</link>
		<dc:creator>Jack</dc:creator>
		<pubDate>Thu, 16 Oct 2008 09:48:24 +0000</pubDate>
		<guid isPermaLink="false">http://www.grok.in/?p=35#comment-388</guid>
		<description>Check out this link about web crawling; it&#039;s very interesting:

http://crawltheweb.blogspot.com/</description>
		<content:encoded><![CDATA[<p>Check out this link about web crawling; it&#8217;s very interesting:</p>
<p><a href="http://crawltheweb.blogspot.com/" rel="nofollow">http://crawltheweb.blogspot.com/</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Siddhartha Reddy</title>
		<link>http://www.grok.in/blog/2008/06/07/introduction-to-web-crawling/comment-page-1/#comment-171</link>
		<dc:creator>Siddhartha Reddy</dc:creator>
		<pubDate>Mon, 09 Jun 2008 05:45:40 +0000</pubDate>
		<guid isPermaLink="false">http://www.grok.in/?p=35#comment-171</guid>
		<description>@sanket,

Converting relative URLs to absolute ones is necessary, but I was referring to canonicalization of URLs: different URLs can refer to the same resource. For example, http://www.grok.in:80/ and http://www.grok.in/ refer to the same resource. Canonicalizing URLs ensures that such URLs are treated as identical; in this example, both URLs would be normalized to one of the two forms.</description>
		<content:encoded><![CDATA[<p>@sanket,</p>
<p>Converting relative URLs to absolute ones is necessary, but I was referring to canonicalization of URLs: different URLs can refer to the same resource. For example, <a href="http://www.grok.in:80/" rel="nofollow">http://www.grok.in:80/</a> and <a href="http://www.grok.in/" rel="nofollow">http://www.grok.in/</a> refer to the same resource. Canonicalizing URLs ensures that such URLs are treated as identical; in this example, both URLs would be normalized to one of the two forms.</p>
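<p>A minimal sketch of this kind of canonicalization in Python, using only the standard library. The function name canonicalize and the DEFAULT_PORTS table are hypothetical names chosen for illustration, not part of any crawler. The sketch assumes three rules: lowercase the scheme and host, drop a port that is the default for the scheme, and treat an empty path as "/". It also drops the fragment and any userinfo, which is a simplification rather than a requirement.</p>

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed table of default ports; an explicit default port is redundant.
DEFAULT_PORTS = {"http": 80, "https": 443}


def canonicalize(url: str) -> str:
    """Return one normalized form for URLs that differ only trivially."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()

    # Keep the port only when it is not the default for the scheme.
    netloc = host
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        netloc = f"{host}:{parts.port}"

    # An empty path and "/" name the same resource.
    path = parts.path or "/"

    # The fragment is dropped (assumption: it is never sent to the server).
    return urlunsplit((scheme, netloc, path, parts.query, ""))


if __name__ == "__main__":
    print(canonicalize("http://www.grok.in:80/"))
    print(canonicalize("http://www.grok.in/"))
```

<p>With these rules, both URLs from the example above map to the same string, so a crawler that stores canonical forms in its "seen" set would fetch the page only once.</p>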
]]></content:encoded>
	</item>
	<item>
		<title>By: sanket</title>
		<link>http://www.grok.in/blog/2008/06/07/introduction-to-web-crawling/comment-page-1/#comment-170</link>
		<dc:creator>sanket</dc:creator>
		<pubDate>Mon, 09 Jun 2008 03:55:51 +0000</pubDate>
		<guid isPermaLink="false">http://www.grok.in/?p=35#comment-170</guid>
		<description>What is *URL Normalization*? Is it converting relative addresses to absolute ones, or something like that?</description>
		<content:encoded><![CDATA[<p>What is *URL Normalization*? Is it converting relative addresses to absolute ones, or something like that?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ujj</title>
		<link>http://www.grok.in/blog/2008/06/07/introduction-to-web-crawling/comment-page-1/#comment-169</link>
		<dc:creator>Ujj</dc:creator>
		<pubDate>Sat, 07 Jun 2008 17:22:17 +0000</pubDate>
		<guid isPermaLink="false">http://www.grok.in/?p=35#comment-169</guid>
		<description>Good to know you write crawlers. I was doing research in grid computing some time back when I dived into crawlers and fell in love with the whole concept and the math involved; it is a field to spend a lifetime in. Good post.</description>
		<content:encoded><![CDATA[<p>Good to know you write crawlers. I was doing research in grid computing some time back when I dived into crawlers and fell in love with the whole concept and the math involved; it is a field to spend a lifetime in. Good post.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

