We recently migrated a database from SQL Server 2005 to SQL Server 2008 and ran into unexpected problems with the search results returned by FREETEXTTABLE queries that had worked fine for years in SQL 2005. Specifically, the FREETEXT TABLE RANK column returned zero values for certain search terms. It worked fine for other terms, but for some more general or broad terms, the RANK value bonked. When the RANK value bonked, our SPROCS subsequently crashed because we were performing math against the integer values returned.
Nevermind the fact that we should have been properly catching situations were a MAX(RANK) value was zero before performing division on that value (which we later fixed), but in three years of production use within SQL 2005 we never had a problem. It was only after the conversion/upgrade to SQL 2008 that the RANK value began returning zero’s for every row in the result set.
Search terms like “late night dining” or “babysitting services” were unique enough and worked just fine within SQL 2008’s FREETEXTTABLE methodology. However, more general terms like “Disney” or “parking” bonked and caused the search results to crash. After a few weeks of back-and-forth with Microsoft’s SQL Dev team we believe we found the answer.
SQL Server 2008 and SQL Server 2005 differ in the method they use to calculate the RANK value based on the stemming performed by FREETEXTTABLE. In SQL 2005 if the word “park” was searched and multiple stemmed words (“parking”, “parked”, etc.) were found within a single row, it returned only one result in its internal calculations. In SQL 2008 that same scenario would multiple rows, one for each instance of the stemmed words. This helps SQL 2008 calculate a better relevance value for the actual RANK. However, that can result in a very large number of rows being returned internally. When that number of rows grows to be greater than 50% of the number of source rows contained within the full text index this bug surfaces and kills your rank values.
Our temporary solution has been to use CONTAINSTABLE which doesn’t seem to have the same bug. It’s not the same search but for our purposes it has proven to be a suitable alternative. And let’s face it; a functioning search is a better alternative to one that crashes for random and unknown search terms.
A more thorough and much better explanation of this issue is on its way from Microsoft and they will be providing that link to me once the related blog posts have been made. I’ll update this post when I receive that link.