
Googlesearch

Have you ever wanted to search across all the subdomains on your server at once, for example when you run a wiki, a forum and a CMS? With Google, you can: add 'site:<yourdomain>' to the search and you'll get results from your own site only. But how do you use that on your website? No big deal, as there are the Google web APIs.

The PHP script below uses the Google search API to fetch search results. With the supplied config, you can easily restrict the results to your website (or leave them unrestricted, so users can run a normal Google search from your page). The config allows several other adjustments, and the resulting HTML can be styled via CSS if needed (though no template is supplied for now).
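As a rough illustration of the restriction described above, the sketch below shows how a 'site:' operator can be appended to a user's query before it is sent to Google. The function name build_site_query is hypothetical and not part of the download; search.php builds the same kind of string inline.

```php
<?php
// Hypothetical helper (not part of the shipped script): appends a
// "site:" restriction to the user's query when a site is configured.
function build_site_query( $query, $site )
{
  $query = trim( $query );
  $site  = trim( $site );
  if( $site === '' ) {
    return $query;              // no restriction: plain Google search
  }
  return "$query site:$site";   // restrict results to the given domain
}

print build_site_query( 'cow', 'marssoft.de' ) . "\n";  // cow site:marssoft.de
```

Passing an empty site returns the query unchanged, which corresponds to the unrestricted mode mentioned above.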

Download Latest Version

Want to give it a try?

Here are two examples of the search page. The first one is restricted to marssoft.de (so you have to use common words like 'server' or 'cow' in order to find results). :-) The second one is the Linux-NTFS project's current search engine, so it comes with a little more design.

Requirements

  • Note: NuSOAP works with PHP 4 only. The script will probably work with PEAR::SOAP as well, so use that on PHP 5.
  • A Google search API key, which you'll get for free. It's good for 1000 searches a day, which is quite enough even for big sites.
  • My Latest Version, containing the config, the search page and a copy of the NuSOAP library.
  • The NuSOAP library, a tiny SOAP implementation available for free from SourceForge. This is only needed if you'd like to update the copy of the library that comes with the archive; in that case, be sure to download all the *.php files into the nusoap directory.

Code Details: search_config.php

<?php
 
////////////////////////////////////////////////////////////
// This is example code of how to query the Google API using
// Web Services, SOAP, and PHP.
//
// Author: Geoff Peters, January 6th 2004.
// Updated by Dan Karran, 10th March 2005 to utilise nuSOAP instead of PEAR.
// Updated by Dan Karran, 20th February 2006 to use $_GET variable instead of expecting
//                        to be able to access vars directly as ${var}
// Updated by Mario Emmenlauer, 3rd March 2006 to use config file, and to use new Google soapclient,
//                              also lots of smaller changes
 
if( !defined( 'SEARCH_DEFINE' ) ) {
  die( 'Direct access not allowed.' );
}
 
// Options are explained in detail at the google web api page,
//   http://www.google.com/apis/reference.html
 
$key = '';				// put your Google API key here.
$site = 'yourdomain.org';		// specify the site you want to search (or use ...&site=<site> in query).
$site_from_url = true;			// allow specifying the site from the url? Might invite abuse.
$maxresults = '10';			// number of results desired per query. The maximum value is 10.
$dupfilter = 'false';			// hides similar results. Leave false or you won't get all hits.
$restrict = '';				// restricts the search to a subset, e.g. a country or topic.
$adultfilter = 'false';			// eliminates sites that contain adult information.
 
$nusoap_path = './nusoap';		// path to nusoap installation. Relative or absolute path allowed.
$show_errors = false;			// do you want to see error messages? probably not.
$max_retries = 10;			// how often to retry Google before giving up. Use at least 5.
$compress = 'false';			// Should the output HTML be compressed (gzip)? Not working yet.
 
$html_header = '';			// header is shown above search. Leave empty if you include search.php.
$html_footer = '';			// footer is shown below search. Leave empty if you include search.php.
 
?>
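Several options above are written as the strings 'true' and 'false'. In PHP, any non-empty string other than '0' evaluates as true, so code that tests such an option directly would treat 'false' as enabled. A minimal sketch of a conversion helper follows; the name config_flag and the accepted spellings are assumptions for illustration, not part of the download.

```php
<?php
// Hypothetical helper (not part of the shipped script): turns a config
// value such as 'true', 'false', true or false into a real boolean.
function config_flag( $value )
{
  if( is_string( $value ) ) {
    $value = strtolower( trim( $value ) );
    // Only these spellings count as "on"; everything else is "off".
    return ( $value === 'true' || $value === '1' || $value === 'yes' );
  }
  return (bool) $value;
}

var_dump( (bool) 'false' );          // bool(true): the pitfall
var_dump( config_flag( 'false' ) );  // bool(false)
var_dump( config_flag( 'true' ) );   // bool(true)
```

A script could pass options like $show_errors or $site_from_url through such a helper once, right after loading the config, and use the resulting booleans from then on.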

Code Details: search.php

<?php
 
////////////////////////////////////////////////////////////
// This is example code of how to query the Google API using
// Web Services, SOAP, and PHP.
//
// taken from http://www.dankarran.com/googleapi-phpsitesearch/
//
// Author: Geoff Peters, January 6th 2004.  
// Updated by Dan Karran, 10th March 2005 to utilise nuSOAP instead of PEAR.
// Updated by Dan Karran, 20th February 2006 to use $_GET variable instead of expecting
//                        to be able to access vars directly as ${var}
// Updated by Mario Emmenlauer, 3rd March 2006 to use config file, and to use new Google soapclient,
//                              also lots of smaller changes
 
 
define( 'SEARCH_DEFINE', 'SEARCH_DEFINE' );
 
require_once ("search_config.php");
require_once ("$nusoap_path/nusoap.php");
 
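// Read the request parameters; missing ones default to an empty string.
$givensite = isset( $_GET['site'] )  ? $_GET['site']  : '';
$start     = isset( $_GET['start'] ) ? $_GET['start'] : '';
$query     = isset( $_GET['query'] ) ? $_GET['query'] : '';
$type      = '';  // handed through to do_search(), only shown in its debug output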
 
$soapclient = new soapclient('http://api.google.com/GoogleSearch.wsdl', 'wsdl');
$soapoptions = 'urn:GoogleSearch';
 
// print any header first so the resulting html is well-formed
print "$html_header";
 
// Sanity checks:
if( $key == "" ) {
  print "Google API key is empty. Set it in the config file first.<br />\n";
  print "This error is also possibly due to the script being unable to";
  print " read the config file.<br />\n";
  die();
}
 
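// Search the site given from the url?
// (The config may hold the string 'false', which PHP would treat as true.)
if( $site_from_url && $site_from_url !== 'false' && $givensite != "" ) {
  $site = $givensite;
}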
 
// Ensure there is a start value. The url carries a 1-based index,
// while the Google API expects a 0-based one.
if( !$start ) {
  $start = 0;
} else {
  $start = max( 0, intval($start) - 1 );
}
 
 
////////////////////////////////////////////////////////////
// Calls the Google API and retrieves the search results in $ret 
//
function do_search( $query, $type, $key, $site, $maxresults, $dupfilter,
                    $restrict, $adultfilter, $start, &$ret )
{
  global $soapclient;
  global $soapoptions;
  global $show_errors;
 
  // The config may hold the string 'false', which PHP would treat as true.
  $verbose = ( $show_errors && $show_errors !== 'false' );
 
  // Note that we pass in an array of parameters into the Google search.
  // The parameters array has to be passed by reference.
  // The parameters are well documented in the developer's kit on the
  // Google site http://www.google.com/apis
 
  // limit searches to this server
  if( $site != "" ) {
    $sitequery = "$query site:$site";
  } else {
    $sitequery = "$query";
  }
 
  $params = array(
    'key' => $key,
    'q' => $sitequery,
    'start' => $start,
    'maxResults' => $maxresults,
    'filter' => $dupfilter,
    'restrict' => $restrict,
    'safeSearch' => $adultfilter,
    'lr' => '',
    'ie' => '',
    'oe' => ''
  );
 
  // Show very verbose information before each search try, but only if
  // error output is enabled. The API key is left out on purpose.
  if( $verbose ) {
    print "do_search( query=&quot;" . htmlspecialchars($query) . "&quot;,";
    print " type=&quot;" . htmlspecialchars($type) . "&quot;,";
    print " site=&quot;" . htmlspecialchars($site) . "&quot;,";
    print " maxresults=&quot;$maxresults&quot;, dupfilter=&quot;$dupfilter&quot;,";
    print " restrict=&quot;$restrict&quot;, adultfilter=&quot;$adultfilter&quot;,";
    print " start=&quot;$start&quot; )<br />\n";
    print " resulting query=&quot;" . htmlspecialchars($sitequery) . "&quot;.<br /><br />\n";
  }
 
  // Here's where we actually call Google using SOAP.
  // doGoogleSearch is the name of the remote procedure call.
  $ret = $soapclient->call('doGoogleSearch', $params, $soapoptions);
 
  $err = $soapclient->getError();
  if( $err ) {
    if( $verbose ) {
      print "<br />An error occurred!<br />\nError: " . htmlspecialchars($err) . "<br />\n";
    }
    return false;
  }
 
  return true;
}
 
 
////////////////////////////////////////////////
// Does Google search with retry. 
// Retry is useful because sometimes the connection will
// fail for some reason but will succeed when retried.
function search( $query, $type, $key, $site, $maxresults, $dupfilter,
                 $restrict, $adultfilter, $max_retries, $start, &$ret )
{
  global $show_errors;
 
  // The config may hold the string 'false', which PHP would treat as true.
  $verbose = ( $show_errors && $show_errors !== 'false' );
 
  $result = false;
  $retry_count = 0;
 
  while( !$result && $retry_count < $max_retries ) {
    $result = do_search( $query, $type, $key, $site, $maxresults, $dupfilter,
                         $restrict, $adultfilter, $start, $ret );
    $retry_count++;
    if( !$result && $verbose )
      print "Attempt $retry_count failed.<br />\n";
  }
  if( !$result ) {
    print "<br />Sorry, connection to Google failed. Tried $retry_count times.<br />\n";
    if( $verbose )
      print "Did you enter your Google API key correctly?<br />\n";
  }
  return $result;
}
 
 
////////////////////////////////////////////////////////////
// Calls the Google API and retrieves the suggested spelling correction 
// 
function do_spell( $query, $key, &$spell )
{
  global $soapclient;
  global $soapoptions;
  global $show_errors;
 
  $params = array(
    'key' => $key,
    'phrase' => $query
  );
 
  $spell = $soapclient->call('doSpellingSuggestion', $params, $soapoptions);
 
  $err = $soapclient->getError();
  if( $err ) {
    // The config may hold the string 'false', which PHP would treat as true.
    if( $show_errors && $show_errors !== 'false' ) {
      print "<br />An error occurred!<br /> Error: " . htmlspecialchars($err) . "<br />\n";
    }
    return false;
  }
 
  return true;
}
 
 
//////////////////////////////////////////////////////////
// The main part of this script
 
// remove the slashes that PHP adds before each quotation mark
// when magic_quotes_gpc is enabled
if( function_exists('get_magic_quotes_gpc') && get_magic_quotes_gpc() ) {
  $query = stripslashes($query);
}
 
// print the search box and details of search results
// (the query is escaped so it cannot inject markup into the page)
print "<form method=\"GET\" class=\"search_main\">";
print "<input type=\"text\" name=\"query\" size=\"18\" value=\"" . htmlspecialchars($query) . "\">";
print "<input type=\"submit\" value=\"search\">";
print "</form><br />\n\n";
 
if( $query != "" )
{
 
  if( search( $query, $type, $key, $site, $maxresults, $dupfilter,
              $restrict, $adultfilter, $max_retries, $start, $ret ) )
  {
    $count = $ret['estimatedTotalResultsCount'];  // total number of results
    $secs  = round($ret['searchTime'], 3);        // time taken to search (in seconds)
    $min   = $ret['startIndex'];                  // first record returned
    $max   = $ret['endIndex'];                    // last record returned
 
    if ($max) {
      // Truncate query for display and escape it for use in HTML
      if (strlen($query) > 36) {
        $short_q = substr($query,0,33)."...";
      } else {
        $short_q = $query;
      }
      $short_q = htmlspecialchars($short_q);
 
      print "<span class=\"search_top\">";
      print "results <b>$min</b> - <b>$max</b> of about <b>$count</b> for <b>$short_q</b> ";
      if ( $site != "" )
        print "on site <b>" . htmlspecialchars($site) . "</b> ";
      print "(<b>$secs</b> seconds)&nbsp;</span>\n\n";
 
      // list results
      foreach($ret['resultElements'] as $result) {
        // Make URLs more friendly for user by removing http:// and highlighting where necessary
        $friendly_URL = htmlspecialchars( str_replace("http://","",$result['URL']) );
        $safe_q = htmlspecialchars($query);
        $friendly_URL = str_replace($safe_q,"<b>$safe_q</b>",$friendly_URL);
 
        print "<p class=\"search_result\">";            
        if (!$title = $result['title']) {
          $title = $result['URL'];
          print "<a href=\"".$result['URL']."\">".$friendly_URL."</a>\n";
        } else {
          print "<a href=\"".$result['URL']."\">".$title."</a><br />\n";
          if ($result['snippet']) {
            print $result['snippet']."<br />\n";
          }
          print "<span class=\"search_url\">$friendly_URL</span>";
        }
        print "</p>\n\n";
      }
    } else {
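      $safe_q = htmlspecialchars($query);
      if ( $site != "" ) {
        print "Sorry, no results were found on site &quot;<b>" . htmlspecialchars($site) . "</b>&quot; for &quot;<b>$safe_q</b>&quot;.\n";
      } else {
        print "Sorry, no results were found on Google for &quot;<b>$safe_q</b>&quot;.\n";
      }
      // the spelling suggestion comes back as a plain string
      if( do_spell($query, $key, $spell) && is_string($spell) && $spell != "" ) {
        print "Did you mean <b><a href=\"?query=" . urlencode($spell) . "\">" . htmlspecialchars($spell) . "</a></b>?\n";
      }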
      print "<p><small><b>Detailed Information:</b><br>\n";
      print "Occasionally no results will be returned when there are problems";
      print " with the link between this site and Google. Also, new entries on this";
      print " site might not yet be indexed by Google, in which case you can";
      print " still try the individual search functions of the wiki, forum or CMS,";
      print " if available. If you are certain the information should be on the";
      print " site, but the search tells you that it's not, you could also just";
      print " try again.</small></p>\n";
    }
  }
}
 
// Show forward/backward navigation if there are enough results
if( !empty($count) ) {
  // encode the values once for use in the navigation links
  $link_q    = urlencode($query);
  $link_site = urlencode($site);
  $pagesize  = max( 1, intval($maxresults) );
 
  print "<div class=\"search_bottom\">";
  if ($min>1) {
    // step back by one page, but never before the first result
    $prevpage = max( 1, $min - $pagesize );
    print " <b><a href=\"?query=".$link_q."&amp;start=".$prevpage."&amp;site=".$link_site."\">";
    print "previous page</a></b> | ";
  } else {
    print " previous page | ";
  }
  if ($count>$max) {
    $nextpage = $max+1;
    print " <b><a href=\"?query=".$link_q."&amp;start=".$nextpage."&amp;site=".$link_site."\">";
    print "next page</a></b> ";
  } else {
    print " next page ";
  }
  print "</div>\n";
}
print "$html_footer";
 
?> 
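The previous/next navigation at the end of search.php comes down to a little index arithmetic. The sketch below isolates it as one way it could be written; the function name page_starts is hypothetical, and it assumes that the first and last result numbers are 1-based, as the script treats them.

```php
<?php
// Hypothetical helper (not part of the shipped script): computes the
// 1-based "start" values for the previous and next result pages.
// $min/$max are the first/last result shown, $count the estimated total.
// A value of null means that there is no such page.
function page_starts( $min, $max, $count, $maxresults )
{
  $prev = null;
  $next = null;
  if( $min > 1 ) {
    $prev = max( 1, $min - $maxresults );  // never before the first result
  }
  if( $count > $max ) {
    $next = $max + 1;
  }
  return array( 'prev' => $prev, 'next' => $next );
}

// Showing results 11-20 of about 35, with 10 results per page:
$p = page_starts( 11, 20, 35, 10 );
print "prev=" . $p['prev'] . " next=" . $p['next'] . "\n";  // prev=1 next=21
```

Stepping back by $maxresults rather than by a fixed 10 keeps the links consistent when the config asks for fewer results per page.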
software/googlesearch.txt · Last modified: 2014/04/02 22:39 (external edit)