Is the Craigslist Search Engine Broken?
Posted on Sunday, March 29, 2009 by Ian Drake
As I mentioned in the last blog post, I'll be changing Notifywire.com's search engine to mimic Craigslist's search engine. FYI, this is the search engine language for Craigslist.
Here's the problem as I encountered it (NOTE: brackets are used to show the start and end of the search): [Honda Civic -(2001 | 2000)]
What this search should return is every ad that contains the words "honda" AND "civic" but does NOT contain the words "2001" OR "2000". Instead, the search is parsed as [Honda Civic -2001 | 2000], as if the parentheses weren't there, and thus I get results that contain the word "2000", which is flat wrong!
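To make the difference concrete, here is a small Python sketch of the two readings. It is not Craigslist's code. The function names and sample ads are made up for illustration, and the "observed" version assumes the flattened query is read as honda AND civic AND (NOT 2001 OR 2000), which is just one plausible way to get the results I saw.

```python
def words(ad):
    """Lowercase an ad's text and split it into a set of words."""
    return set(ad.lower().split())


def intended_match(ad):
    """Intended reading: honda AND civic AND NOT (2001 OR 2000)."""
    w = words(ad)
    return "honda" in w and "civic" in w and not ("2001" in w or "2000" in w)


def observed_match(ad):
    """Assumed buggy reading: the minus sign applies only to 2001,
    i.e. honda AND civic AND (NOT 2001 OR 2000)."""
    w = words(ad)
    return "honda" in w and "civic" in w and ("2001" not in w or "2000" in w)


# Hypothetical ad titles, not real listings.
ads = [
    "2000 Honda Civic for sale",
    "2001 Honda Civic low miles",
    "1999 Honda Civic runs great",
]

for ad in ads:
    print(ad, "| intended:", intended_match(ad), "| observed:", observed_match(ad))
```

Under these assumptions the two readings disagree only on the 2000 ad: the intended search rejects it, while the flattened search lets it through.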
But wait, that's not the really weird part! What is truly strange is that the search ["Honda Civic" -(2001 | 2000)] FIXES Craigslist's search engine parser. The only real difference is that now the ad must contain the phrase "Honda Civic", not just the individual words. The phrase has nothing to do with the negation part of the search [-(2001 | 2000)], but somehow this small change makes it work correctly.
Even though I'd like to get my search engine as closely aligned to Craigslist as possible, I'm going to stop short of coding in bugs.
Try a similar search for yourself.
Strange, huh?
