Google Custom Search – Restricted to specific directory, and filetype…

I had trouble getting a new Google Custom Search engine to search only a specific directory on my web server and to return only specific file types. It seemed like this should have been SO easy, but it took me a while to figure out. So here's what worked, in case anyone else needs it.

  1. Set up a new search engine in Google Custom Search.
  2. Make sure that you have the root directory in the “Sites” section.  I used “www.mydomain.com”.
  3. You can enter refinements, but they only seem to take effect when the user clicks on them, which is not what I wanted.
  4. Go down to “Get Code” and grab the code block that is created for you.  For me it was this:
    • [code]<div id="cse" style="width: 100%;">Loading</div>
      <script src="http://www.google.com/jsapi" type="text/javascript"></script>
      <script type="text/javascript">
      google.load('search', '1', {language : 'en', style : google.loader.themes.MINIMALIST});
      google.setOnLoadCallback(function() {
        var customSearchOptions = {};
        var customSearchControl = new google.search.CustomSearchControl(
          'Custom Search ID', customSearchOptions);
        customSearchControl.setResultSetSize(google.search.Search.FILTERED_CSE_RESULTSET);
        customSearchControl.draw('cse');
      }, true);
      </script>[/code]
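  5. This gives you the basic search.  To add the directory restriction (an "as_sitesearch" value passed through the search options) and the filetype restriction (a query addition set just before each search runs), I changed it to: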
    • [code]<div id="cse" style="width: 100%;">Loading</div>
      <script src="http://www.google.com/jsapi" type="text/javascript"></script>
      <script type="text/javascript">
      google.load('search', '1', {language : 'en', style : google.loader.themes.MINIMALIST});
      google.setOnLoadCallback(function() {
        var customSearchOptions = {};
        /* Add Custom Search Option to restrict directory */
        customSearchOptions[google.search.Search.RESTRICT_EXTENDED_ARGS] = {"as_sitesearch": "www.myDomain.com/subDirectory1/subDirectory2/"};
        var customSearchControl = new google.search.CustomSearchControl("Custom Search ID", customSearchOptions);
        customSearchControl.setResultSetSize(google.search.Search.FILTERED_CSE_RESULTSET);
        customSearchControl.draw('cse');
        /* Add query addition to restrict filetype */
        customSearchControl.setSearchStartingCallback(
          this,
          function(control, searcher, query) {
            searcher.setQueryAddition("filetype:pdf OR filetype:PDF");
          }
        );
      }, true);
      </script>[/code]
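
    If you need more than one file type, or want to reuse the snippet for other directories, the two restriction strings from step 5 can be built instead of hard-coded. This is only a rough sketch: buildFiletypeAddition and buildSiteRestriction are hypothetical helper names of my own, not part of the Google API, and the trailing slash on the directory is an assumption copied from the value I used above.

```javascript
// Hypothetical helpers (NOT part of the Google API). They only build the
// strings that step 5 passes to setQueryAddition and "as_sitesearch".

// Turn ['pdf', '.doc'] into "filetype:pdf OR filetype:doc".
function buildFiletypeAddition(extensions) {
  return extensions
    .map(function (ext) {
      // Drop surrounding whitespace and a leading dot, if any.
      return 'filetype:' + String(ext).trim().replace(/^\./, '');
    })
    .join(' OR ');
}

// Turn a directory URL into the "host/path/" form used in step 5.
// Assumption: the value names a directory, so a trailing slash is added.
function buildSiteRestriction(url) {
  var value = String(url).trim().replace(/^https?:\/\//i, '');
  return value.charAt(value.length - 1) === '/' ? value : value + '/';
}

// Intended use inside the step 5 callback (commented out here because the
// "google" object only exists once the loader script has run in the page):
//
//   customSearchOptions[google.search.Search.RESTRICT_EXTENDED_ARGS] = {
//     "as_sitesearch": buildSiteRestriction('http://www.myDomain.com/subDirectory1/subDirectory2')
//   };
//   ...
//   searcher.setQueryAddition(buildFiletypeAddition(['pdf', 'doc']));
```

    Keeping the string building separate from the Google calls means you can check it in a plain browser console before wiring it into the search control.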

    I hope that helps anyone else who is trying to solve a similar problem…