Author Archives: Siddhartha Reddy

Tag Mirror

LibraryThing (an online service to help people catalogue their books easily) recently launched a very useful feature that they call “Tag Mirror”. This is one of the more interesting things that has been done with tags. In fact, I would wager that this is one of the best things to happen to tagging since tag [...]

Posted in Information Extraction

The Dude Experiment

A friend had once shown me something very interesting: no matter how many [u]s you use in [dude], Google always has results for it! (One of those results for some twenty-odd [u]s was a blog entry of his — that’s how he came up with this). I just decided to run this experiment using the [...]

Posted in Information Retrieval | 2 Comments

Tools/Libraries for IR

We’ve created a page to gather together some of the tools that a researcher or engineer working on IR problems might find useful. Hopefully this will be useful to many.
Tools/Libraries for Information Retrieval
We will be updating that page as we come across more tools.
We are enabling comments on the page; please leave comments about any [...]

Posted in Tools | 2 Comments

Interesting Papers on Web Spam at AIRWeb 2007

AIRWeb (Adversarial Information Retrieval on the Web) is a workshop on IR in the world of Web Spam. From the call for papers page:
Adversarial Information Retrieval addresses tasks such as gathering, indexing, filtering, retrieving and ranking information from collections wherein a subset has been manipulated maliciously. On the Web, the predominant form of such manipulation is [...]

Posted in Information Extraction | 1 Comment

Hello world!

There are already so many blogs on search, so why another? (Just check out the blogroll on searchengineland.com.)
There are at least a couple of reasons.
Information Retrieval (or IR) is not exactly search. It is search and more. Much more.
Semantics aside, there is a second and probably more important reason: none of all those blogs discuss the [...]

Posted in Uncategorized
  • About grok.in

    This is a blog primarily focussed on the subjects of Information Engineering—Retrieval, Extraction & Management, Machine Learning, Scalability and Cloud Computing.