Software/Libraries for Full-text Search

Many applications need to search the full text of their content for particular words. This kind of functionality is usually called full-text search. For example, blogging software might need a search feature that looks through the blog posts for the query terms a user enters.

Regular database indexes (usually B-trees or hash tables) are a poor fit for this purpose because they are built to match the whole value of the column you are searching in. A hash index supports only equality lookups; a B-tree also handles range and prefix lookups, but it still cannot find a word in the middle of a long text. In the blogging example, the user would have to type in the entire blog post verbatim in order to find it, and even the most patient of users, if they already know the entire post by heart, would hardly need to search for it!
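To make the difference concrete, here is a minimal sketch in Python. It contrasts an equality index keyed on the whole post body with an inverted index, the structure full-text engines commonly build, which maps each term to the posts that contain it. The sample posts and the names `posts`, `equality_index`, `tokenize`, `inverted_index` and `search` are all made up for illustration; real engines add ranking, stemming, stop words and on-disk storage that this sketch leaves out.

```python
import re
from collections import defaultdict

# Hypothetical sample data: post id -> post body.
posts = {
    1: "Lucene is a library for full-text search.",
    2: "B-Tree indexes speed up equality and range lookups.",
    3: "Full-text search in a database is often slow.",
}

# An equality index keyed on the full column value, much like a hash index.
equality_index = {body: post_id for post_id, body in posts.items()}


def tokenize(text):
    """Lowercase the text and split it into word-like terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


# Inverted index: term -> set of ids of the posts containing that term.
inverted_index = defaultdict(set)
for post_id, body in posts.items():
    for term in tokenize(body):
        inverted_index[term].add(post_id)


def search(query):
    """Return the sorted ids of posts that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return []
    matches = set.intersection(*(inverted_index.get(t, set()) for t in terms))
    return sorted(matches)


print(equality_index.get("search"))   # None: only the verbatim post would match
print(search("full-text search"))     # [1, 3]
```

The equality index finds a post only when handed its complete text, while the inverted index answers the query from a couple of words. Requiring every query term to be present (an AND query) is just one simple choice made for this sketch.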


This entry was posted in Information Retrieval.

2 Comments

  1. Posted November 6, 2008 at 9:30 am

    Thanks for the writeup Sids! For a guy that's been wanting to work on these systems for a long time, this is a godsend.
    Keep writing such.

  2. Posted November 6, 2008 at 11:03 am

    @kopos: Glad you found this useful!

