AND Search using SQL Server Full Text Search

DISCLAIMER: I’m new to SQL Server Full Text Search, so if what I’m saying here is wrong then let me know.

Sticking with the Knowledge Base project, we had to implement search. We decided to use SQL Server Full Text Search for this. Once the fulltext catalogs were setup, the query my colleague wrote looked like this:

SELECT [Key], [Rank]
FROM FREETEXTTABLE(dbo.Question, QuestionText, 'building consent form', 100)

The problem with this search is that it will return results which contain any one of the keywords. So if a user enters “building consent form”, a page only needs to contain building OR consent OR form and it will be included in the results. I call this an OR search. This reminds me of the early days of search engines, (AltaVista.com et al), which used to work the same way.

AltaVista c.1999. Remember?

Back in the AltaVista days, if you wanted only pages with every search term you had to use AND between each search term e.g. [building AND consent AND form]. Nowadays you don’t need to type the ANDs; if you enter [building consent form] into Google, it will only return results which contain building, consent, and form. This is what I call an AND search.

SQL Server Full Text Search seems to be stuck in the late 1990s days of OR searches. As demonstrated above it will return a hit for any one of the search words. And there doesn’t seem to be an easy way to tell it to do an AND search.

AND Search

To do an AND search with SQL Server Full Text Search, you need to query the CONTAINSTABLE.

SELECT	[Key], [Rank], 1
FROM	CONTAINSTABLE(dbo.Question, QuestionText, '"building" AND "consent" AND "form"', 100)

Because we’re querying the CONTAINSTABLE and not the FREETEXTTABLE we lose all Full Text Search’s nice fuzziness around the search terms, i.e. in the above example, Questions which contain “buildings”, “consent”, and “form” won’t be returned. To help we can do the following:

SELECT	[Key], [Rank], 1
FROM	CONTAINSTABLE(dbo.Question, QuestionText, '"building*" AND "consent*" AND "form*"', 100)

Stop right there

Another problem with this is stopwords, aka noise words (SQL Server 2005 calls them noise words, 2008 stopwords). SQL Server has a list of around 150 common words, which are ignored by the Full Text Search engine. e.g. about, after, all, also, an, and, another etc. You can view the English stopwords in SQL Server 2008 like so:

select * from sys.fulltext_system_stopwords where language_id = 1033

In my table there is a Question titled “What are building consents?”. If we search for “what are building consents” like so:

SELECT	[Key], [Rank], 1
FROM	CONTAINSTABLE(dbo.Question, QuestionText, '"what*" AND "are*" AND "building*" AND "consents*"', 100)

because “are” is a stopword (and so is “what”) we won’t get any results!
There is a workaround – to wipe the list of stop words! The problem with doing this is that the list of stop words is setup server-wide on the SQL Server itself. In our production environment other applications are using the same SQL Server instance (but of course we have our own KnowledgeBase database), so this wasn’t an option.

Then the customer added a new requirement – if a Question contains the exact search phrase then that should be returned first. Adding the code for that use case fixes my search.

CREATE PROCEDURE dbo.Search
(
@keywords NVARCHAR(4000), — e.g. ‘”building*” AND “consent*” AND “form*”‘
@searchPhrase NVARCHAR(4000) — e.g. ‘building consent form’
)
AS
BEGIN
— Tmp table —
DECLARE @searchResults TABLE
(
QuestionID int,
Rank int,
ExactMatchToQuestionText bit default 0
)

— Question contains keywords in order —
INSERT @searchResults (QuestionID, ExactMatchToQuestionText)
SELECT QuestionID, 1
FROM Question
WHERE QuestionText LIKE ‘%’ + @searchPhrase + ‘%’

— Free Text Search matches —
INSERT @searchResults (QuestionID, Rank, ExactMatchToQuestionText)
SELECT [Key], [Rank], 0
FROM CONTAINSTABLE(dbo.Question, QuestionText, @keywords, 100)
AND [Key] not in (SELECT QuestionID from @searchResults)

— Return —
SELECT Question.*
FROM @searchResults [Results]
JOIN Question ON Results.QuestionID = Question.QuestionID
ORDER BY
Results.ExactMatchToQuestionText DESC,
Results.Rank DESC
END

Still wrong

This code still has a bug – a search for [what are building consents] returns 1 hit, but a search for [what building consents] returns 0, because “what” is a SQL Server stopword. D’oh!
At this point I am better off throwing away SQL Full Text Search and completely rolling my own query.
So why doesn’t SQL Server Full Text Search include a simple option for doing AND searches against the FREETEXTSEARCH table?
And, does anyone know a better way to do AND searches using SQL Server Full Text Search?

Enabling browser back button on cascading dropdowns with jQuery BBQ plugin

What’s this? I know you thought this was supposed to be a .NET blog, yet so far all I’ve posted is jQuery and iPhone stuff. Well the idea of this blog is that I will post stuff I’m learning on the job, and at the moment I’m learning a lot about jQuery, so here we go.

This continues on from the Knowledge Base project I used in a previous post. In this example, I start out on the “Help” page. I navigate to the Browse page, and then within the Browse page I we have a page which shows/hides different sections when the mouse is clicked.

In this example, I start on the Help page and navigate to a FAQ using cascading drop-downs. I choose a value from my “level1” dropdown, Animal Control. The “level2” drop down is populated with Animal Control-related topics, and I choose Dogs. The results pane is loaded with Dog-related FAQs, and I expand the first result, about micro-chipping dogs. I click a link to view the Dogs topic, and then click the browser back button. The browser takes me back to the Help page, but my previous state (viewing the “Do the dogs need to be microchipped” FAQ) isn’t displayed, instead the Help page is in its initial state.

We want to make it so that when the back button is clicked, we’ll be returned to the FAQ we were viewing on the Help page.

Existing Code

$(document).ready(function() {
  $('.dropdown').change(onSelectChange);
  loadServices(); // loads all the Services into the level1 dropdown
}

function onSelectChange() {     
   var selected = $("#" + this.id + " option:selected").val(); // the new selected value     
   if (this.id === 'level1') { // 'this' is the dropdown that was changed       
      loadTopicsForService(selected); // loads topics into the level2 dropdown with AJAX     
   } else if (this.id === 'level2') {
      showAnswers(selected); // shows FAQs for the selected topic with AJAX
   }
}

Plugins

To get the back button working, we need to kind of trick the browser into thinking that we’re navigating up and down within the same page by adding # tags to the address. There are a number of jQuery plugins for doing this and I tried out a few – jquery history plugin, history, and jQuery BBQ.

I originally got it working with jQuery history, but it wouldn’t work properly for IE7. So I tried again with jQuery BBQ and it worked fine across all our target browsers – IE6, IE7, IE8 and FireFox.

Code changes

First, I changed the dropdown change event. Instead of populating the drop downs by calling loadTopicsForService() or showing FAQs by calling showAnswers, I’ll call bbq.pushState:

$(document).ready(function() {     
  $(window).bind("hashchange", historyCallback); // needed for bbq plugin to work     
  
  $('.dropdown').change(function() {
    var dropdown = $(this);
    var id = this.id;
    var selected = dropdown.find("option:selected").val();
    if (selected != 0) {
      if (dropdown === "level1") {
        $.bbq.pushState({ level1: selected }, 2); // merge_mode=2 means wipe out other params (e.g. level2)
      } 
      else if (dropdown === "level2") {
        $.bbq.pushState({ level2: selected });
    }
  });

  loadServices(); // loads all the Services into the level1 dropdown
  historyCallback(); // manually call the bqq history callback, to process the # tags if any were previously 
                          // added. This is called when we load the page by pressing back, fwd, or refresh.
}

Now when I select “Animal Control” from the level 1 (aka Service) drop-down, jQuery.BBQ will 1. add “#level1=1” to the current page location, i.e. the address bar in the browser, and 2. call my historyCallback function which will do the work that used to be defined in the dropdown change event. #1 tricks the web browser into thinking that I’m now on a different page, which adds a new entry into the browser’s history so that clicking Back in the browser will take me to the previous page.

Now we need to define our callbacks which actually do the work of populating the cascading drop downs and displaying FAQs. Remember, historyCallback() will now do the work that used to be defined in onSelectChange().

function historyCallback(e) {
  levelOneCallback();     
  levelTwoCallback();
}

function levelOneCallback() {
  var L1 = $.bbq.getState("level1");     
  if (L1 != undefined && L1 != 0) {
    $('#level1').val(L1); // set dropdown value - needed for when fwd is clicked
    loadTopicsForService(L1); // loads topics into the level2 dropdown with AJAX
  }
  else { // level1 not specified so reset the page         
    $('#level1').val(0);
    $('#row2').hide();
  }
}

levelOneCallback() asks jQuery.BBQ if the level1 has been selected, and if it has been it loads up the level2 drop down. If level1 isn’t specified then we should explicitly reset the level1 dropdown.

The rest of the code is here:

function levelTwoCallback() {
  var L2 = $.bbq.getState("level2");
  if (L2 != undefined && L2 != 0) {
    $('#level2').val(L2); // set dropdown - needed when fwd is clicked
    showAnswers($("#level1").val(), selected);
  } 
  else { // level2 not in querystring, so reset level2
    $('#level2').val(0);
  }
}

So, when I choose a value from the first dropdown, #level1=3 is added to the URL of the page. When I choose a value from the second dropdown, #level2=7 is added. For brevity I have omitted the code which adds #question=55 to the URL when a question is clicked.

Here’s a video of the final result:

Once again I navigate through the drop downs, as before, but this time when I click back, we’re brought back to the previous state which is viewing the question about microchipping dogs. As I keep clicking back, each previous state is loaded. And then when I click Fwd, my previous actions with the dropdowns are repeated!

I’d like to thank “Cowboy” Ben Alman for the jQuery BBQ library and also for suggesting some improvements to my code.

Hope that helps someone.

Visual Studio 2008 SP1 locks up after installing Office 2010 RC

Since yesterday I’ve been having weird issue with Visual Studio 2008 SP1 locking up whenever I edit HTML files. Basically, the interface stops responding and any click anywhere within Visual Studio (e.g. Solution Explorer, within the HTML file itself, the menus) causes a system beep – ding!

It’s really annoying obviously and the only workaround is to kill Visual Studio in Task Manager.

Then when you restart Visual Studio it’ll load up the HTML file you were working on, and if you’re not careful it locks up again.

This started happening yesterday, when I uninstalled the Beta of Office 2010 and installed the RC.

Update: I found the solution, thanks to this blog post.

When Visual Studio hangs it spawns a setup process that never exits. The setup.exe is located at:
C:\Program Files (x86)\Common Files\microsoft shared\OFFICE12\Office Setup Controller\Setup.exe

All I did was run that setup.exe, and chose Repair. It repaired “Microsoft Visual Studio Web Authoring Component”, and once it was done Visual Studio stopped locking up.

Generating and injecting HTML content with Javascript and jQuery

Although I am a .NET developer, a recent small project I was working on wasn’t .NET but just plain ol’ HTML pages. I’m our team’s jQuery expert but that’s not saying much at all – I only earned that title by buying jQuery in Action and I have it sitting on my desk.

DISCLAIMER: I’m not a Javascript or jQuery expert! I’m still learning this stuff.

The project was a Knowledge Base. The requirements were that the user would choose a topic from a drop-down, then a subtopic, then a sub-sub topic and finally the “Questions” relating to that topic would be displayed. So away we went, writing and implementing “GetSubTopicsForTopic” web services which would populate the cascading drop downs. Fine and dandy.

Obviously, depending on which sub-subtopic they choose, the number of “Questions” returned will vary. Some subtopics (e.g. Contact Us”) will only have one Question while others would have 5 or 6. We would need to dynamically generate the HTML to display each Question (and it’s Answer) and then inject it.

Step 1: Getting the Answers

Here’s our jQuery.ajax call to get the Questions for the sub-subtopic. We first used .getJSON but were having some concurrency issues with it, so changed to .ajax and set async to false.


function showAnswers(subSubTopicID) {
$.ajax({
    async: false,
    url: '/_layouts/kb/services/getquestions.ashx',
    data: { subSubTopicid: subSubTopicID },
    type: 'GET',
    dataType: 'json',
    success:
       function(questions) {
          if (questions == null || questions.length == 0) {
             return;
          }

          // set header text
          var resultText = "Your query returned " + questions.length +  ((questions.length == 1) ? " result." : " results.");
          $('.header').text(resultText);
          renderQuestions(questions);
       }
   });
}

Now that we’ve gotten an array of Questions back we need to render them.

Here’s what it looks like again:

The HTML we need to generate for each Question and Answer (Q&A) is quite long, about 70 lines, as follows:

<ul id="results-list">
   <li style="border-bottom: #c4c3c3 1px dotted" id="Question77">
      <a id="question_77" href="#">
         What developments will contributions be charged on?
      </a>
      <div style="zoom: 1; display: block">
         <div class="answerText">
             Contributions will be charged on any development or change in use or service connection that creates demand on the city’s infrastructure. The only exception is for residential
   ...

A parent <li> followed by a bunch of child nested <div>, all within a <ul>.

Generating HTML from Javascript was something I’d never done before,  but I had worked on existing projects which do this.

Option 1: Embedding HTML in the Javascript

I’ve seen this done on a number of projects. e.g.

function renderQuestion(question) {
   // generate the html for the question
   var html = '<li style="border-bottom: #c4c3c3 1px dotted" id="Question' + question.QuestionID + '">';
   html += '<a id="question_' + question.QuestionID + '" href="#">';
   html += question.QuestionText;
   html += '</a>';
   html += '<div style="filter: ; zoom: 1; display: block">';
   html += '<div class="answerText">';
   html += question.Answer;
   html += '</div></div></li>';

   // add it to the DOM tree
   var results = document.getElementById('results-list');
   results.innerHTML += html;
}

The thing I don’t like about this is you’re mixing the HTML in with the Javascript, so that if the HTML design needs to change then you need to change the Javascript too. On the bright side, it should be reasonably simple to change the HTML, but as a programmer it smells bad.

Option 2: Creating DOM elements

Use Javascript (or jQuery) to create the <li> and <div> and then append them, like so.

function renderQuestion(question) {
   var listItem = document.createElement('li');
   listItem.setAttribute('style', 'border-bottom: #c4c3c3 1px dotted');
   listItem.setAttribute('id', 'Question' + question.QuestionID);

   var a = document.createElement('a');
   a.setAttribute('href', '#');
   a.setAttribute('id', 'question_' + question.QuestionID);
   a.innerHTML = question.QuestionText;
   listItem.appendChild(a);

   var answerDiv = document.createElement('div');
   answerDiv.className = 'answerText';
   answerDiv.innerHTML = question.Answer;
   listItem.appendChild(answerDiv);

   var results = document.getElementById('results-list');
   results.appendChild(listItem);
} 

Option 2 is more code than Option 1. As a programmer it feels cleaner, since I’m manipulating properties on objects rather than having hard-coded HTML. But it’s less readable than Option 1 and would be more difficult to change the HTML.

Option 3: Duplicate an HTML sample row and add it to the DOM

This was an idea that I came up with, it’s probably been done somewhere else before but I’ve never seen it.

Update: Ha, although I came up with this idea myself, Rick Strahl blogged this exact same technique way back in October 2008, he calls it ‘Manual’ Templating.

I’ll have an invisible “sampleQuestion” which contains the <li> and all its <div> embedded in the HTML.

help.html:
<ul id="results-list">
   <!--    Sample Question used for cloning   -->
   <li id="sampleQuestion" style="display: none;border-bottom:1px dotted #c4c3c3">
      <a href="#" class="questionLink"></a>
      <div style="zoom: 1; display: block">
      <div class="answerText"></div>
...

I’ll use jQuery to make a copy of this sample Question, change some attributes, make it visible, and then add it to the DOM.

help.js:
function renderQuestion(question) {
   var sampleQuestion = $("#sampleQuestion");
   var listItem = sampleQuestion.clone(false);
   listItem.attr('id', 'Question' + question.QuestionID);

   var questionLink = listItem.find('.questionLink');
   questionLink.attr('id', 'question_' + question.QuestionID);
   questionLink.text(question.QuestionText);

   listItem.find(".answerText").text(question.Answer);

   listItem.show();
   listItem.insertBefore(sampleQuestion);
}

I like this approach because the page design and layout is kept in the HTML file where it belongs. If your designer wants to change the layout he can, as long as he doesn’t change the ids and the class names too much. The only thing the Javascript does is set the IDs and set the content.

Update – Option 4: jQuery Templating

I only just found out about this, but a number of existing jQuery solutions exist for dealing with this exact scenario. Rick Strahl’s blog post looks at a few, and now Microsoft have proposed their own solution which might be added to jQuery in the future.

Summary

On the Knowledge Base project I implemented a ‘manual’ templating solution because I didn’t know better, but jQuery has a number of solutions for generating and injecting HTML content which are worth further investigation.

  1. John Resig’s Micro-Templating looks like a good option
  2. jTemplates
  3. PURE (Pure Unobtrusive Rendering Engine), a Javascript library (not jQuery specific)