Sunday, 5 February 2012

Get a List of Google Search Result URLs with PHP

Want to get all the top Google search results in an array with PHP? Just read on and do it yourself!
This script outputs the URLs of the top results of a Google search, but it can be modified to output other details as well.
We will be using the Google AJAX Search API in this script. As you might have found, by default it returns only 4 results, or 8 if you add one more parameter. With some tweaks, though, you can get the top 64 results! (64 is the upper limit, because Google doesn't like bots.)
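To make the arithmetic concrete, here is a rough sketch of the request URLs involved. It assumes 8 pages of 8 results each, so the "start" offsets run 0, 8, ..., 56; in the real script the offsets are read from the API response rather than computed. The base URL and the search term are assumptions used only for illustration.

```php
<?php
// Sketch only: build one request URL per page of results.
// Assumptions: the base URL below is the AJAX Search API web endpoint,
// and the offsets are 0, 8, ..., 56 (8 pages x 8 results = 64).
$base  = 'http://ajax.googleapis.com/ajax/services/search/web';
$query = 'php curl tutorial'; // hypothetical search term

$urls = array();
for ($page = 0; $page < 8; $page++) {
    $args = array(
        'v'     => '1.0',      // API version
        'q'     => $query,     // search term
        'rsz'   => '8',        // results per request
        'start' => $page * 8,  // offset of the first result on this page
    );
    $urls[] = $base . '?' . http_build_query($args, '', '&');
}

echo count($urls) . " requests\n";
echo $urls[0] . "\n";
echo $urls[7] . "\n";
```

Each URL differs only in its "start" value, which is why the script has to make one request per page.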

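<?php
// Send one request to the Google AJAX Search API and return the decoded JSON as an array.
function google_search_api($args, $referer = 'http://localhost/testing/', $endpoint = 'web')
{
    // AJAX Search API base URL; $endpoint selects the kind of search (here: web).
    $url = 'http://ajax.googleapis.com/ajax/services/search/' . $endpoint;
    if (!array_key_exists('v', $args)) {
        $args['v'] = '1.0';
    }
    //$args['key']="ABQIAAAArMTuM-CBxyWL0PYBLc7SuhT2yXp_ZAY8_ufC3CFXhHIE1NvwkxT-uD75NXlWUsDRBw-8aVAlQ29oCg";
    //$args['userip']=$_SERVER['REMOTE_ADDR'];
    $args['rsz'] = '8'; // ask for 8 results per request instead of the default 4
    $url .= '?' . http_build_query($args, '', '&');
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    // note that the referer *must* be set
    curl_setopt($ch, CURLOPT_REFERER, $referer);
    $body = curl_exec($ch);
    curl_close($ch);
    // decode and return the response
    return json_decode($body, true);
}

// PHP has already URL-decoded $_GET values, so no urldecode() is needed here.
$query = isset($_GET['q']) ? $_GET['q'] : "none";
echo "Results for: " . htmlspecialchars($query) . "<br />-----<br />";

// The first request is only used to read the pagination information.
$res = google_search_api(array('q' => $query));
$pages = isset($res['responseData']['cursor']['pages']) ? $res['responseData']['cursor']['pages'] : array();
$nres = 0;
for ($i = 0; $i < count($pages); $i++)
{
    // Request each page, using the "start" offset taken from $pages.
    $res = google_search_api(array('q' => $query, 'start' => $pages[$i]['start']));
    $results = isset($res['responseData']['results']) ? $res['responseData']['results'] : array();
    for ($j = 0; $j < count($results); $j++)
    {
        $nres++;
        echo htmlspecialchars(urldecode($results[$j]['url'])) . "<br />";
    }
}
echo "<br />---<br />Total number of results: $nres";
?>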
Here is what we have done in this script:
  • First, we have a function that requests search results from Google:
    google_search_api($args, $referer = 'http://localhost/testing/', $endpoint = 'web')
    It is the usual code provided by Google for fetching search results with PHP.
  • We have made one tweak to get 8 results at a time instead of 4:
    $args['rsz'] = '8';
    This tells Google to return 8 results (the maximum for one query) instead of the default 4.
  • The next step is to get all 64 results for a single search. We do this by querying repeatedly, passing a different "start" parameter in each request URL:
    $res = google_search_api(array('q' => $query, 'start' => $pages[$i]['start']));
  • But before we request paged results, we need to set up a loop:
    $res = google_search_api(array('q' => $query));
    $pages = $res['responseData']['cursor']['pages'];
    $nres = 0;
    for ($i = 0; $i < count($pages); $i++)
    {  … }
    The first request to Google gives us the values of the start parameter. The whole pagination information ends up in the array $pages.
  • The rest is just displaying the URLs of all the search results and the number of results returned.
  • You can display the description, title, cached URL, etc. from the
    $res['responseData']['results']
    array. To see the structure of the results array, use
    print_r($res['responseData']['results']);
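
As a rough sketch of that last point, the snippet below collects a few fields from each result into a plain array instead of echoing them. The sample data is made up so the snippet runs without a network call, and the field names 'titleNoFormatting' and 'content' are assumptions about the response; check them against your own print_r() output. The helper name extract_results is hypothetical.

```php
<?php
// Hypothetical sample shaped like $res['responseData']['results'].
// The field names are assumptions; verify them with print_r() first.
$results = array(
    array(
        'url'               => 'http://example.com/a%20b',
        'titleNoFormatting' => 'Example A',
        'content'           => 'First <b>snippet</b>',
    ),
    array(
        'url'               => 'http://example.org/',
        'titleNoFormatting' => 'Example B',
        'content'           => 'Second snippet',
    ),
);

// Collect selected fields of each result into a plain array.
function extract_results($results)
{
    $out = array();
    foreach ($results as $r) {
        $out[] = array(
            // URLs are decoded the same way the script above does it
            'url'     => isset($r['url']) ? urldecode($r['url']) : '',
            'title'   => isset($r['titleNoFormatting']) ? $r['titleNoFormatting'] : '',
            // strip the highlighting markup from the description
            'snippet' => isset($r['content']) ? strip_tags($r['content']) : '',
        );
    }
    return $out;
}

$rows = extract_results($results);
foreach ($rows as $row) {
    echo $row['title'] . ' - ' . $row['url'] . "\n";
}
```

Calling extract_results() on each page inside the paging loop and merging the returned arrays with array_merge() would give you all the results in a single array.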
