

Python script to get links from Yahoo search

This is a quick script I made to pull links from Yahoo search using the BOSS search API and then list the unique domains.

If you want the full links, just modify the script so that whole links are appended to the list instead. Yahoo does not let you retrieve all the results, only a certain predefined number, so this code extracts only about 800 domains. But that is still good enough for a start and for most uses.
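As a rough sketch of that change, the only difference is whether the de-duplicated value is the scheme plus host or the whole URL. The helper names `unique_domains` and `unique_links` and the sample URLs are my own illustrative assumptions, not part of the script below. The import fallback is there so the snippet runs on either Python 2 or Python 3.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse      # Python 2, as used in the script


def unique_domains(urls):
    """Return scheme://host for each URL, first occurrence only, in order."""
    seen = []
    for url in urls:
        parts = urlparse(url)
        domain = parts.scheme + "://" + parts.netloc
        if domain not in seen:
            seen.append(domain)
    return seen


def unique_links(urls):
    """Return each full URL once, in the order first seen."""
    seen = []
    for url in urls:
        if url not in seen:
            seen.append(url)
    return seen


# Hypothetical sample data standing in for search results.
sample = [
    "http://example.com/a",
    "http://example.com/b",
    "https://example.org/",
    "http://example.com/a",
]
print(unique_domains(sample))
print(unique_links(sample))
```

For a few hundred results a plain list is fine; with much larger result sets a `set` alongside the list would make the membership check faster.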

I am also working on getting citation values from Google Scholar for a friend. I will post that here soon. Here's the code for now.

#! /usr/bin/python
# Python 2 script: page through Yahoo BOSS web search results
# and print the unique domains (scheme://host) found.
import urllib,json
from urlparse import urlparse

yahoo_application_id="Ht18VqTV34EMRWTJKOOh4rNBWTqkrjTSSQj9JwWlsqTMK41_3oFWFnhivJipX0wnvU4qzXc9VAw-"
nextresult=0
links=list()      # unique scheme://host entries
linksdump=list()  # every full result URL, duplicates included

while(True):
	print "trying result from " + str(nextresult)
	f = urllib.urlopen("http://boss.yahooapis.com/ysearch/web/v1/search+engine+optimization+software?appid="+yahoo_application_id+"&format=json&count=100&start="+str(nextresult))
	ssjson = json.loads(f.read())
	totalhits=int(ssjson["ysearchresponse"]["totalhits"])
	print totalhits
	results = ssjson["ysearchresponse"].get("resultset_web", [])
	# Stop on an empty page; without this check the loop never ends
	# once Yahoo stops returning results.
	if not results:
		break
	for x in results:
		url = x["url"]
		o = urlparse(url)
		linksdump.append(url)
		# o[0] is the scheme, o[1] is the host
		link = o[0]+"://"+o[1]
		if link not in links:
			links.append(link)
		nextresult=nextresult+1
	if (nextresult>10000 or nextresult>=totalhits):
		break
print "Obtained results: " + str(nextresult) + " of which " + str(len(links)) + " were unique."
for x in links:
	print x

Cool huh? If you want any help modifying this, drop me a line.
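If you do modify it, the paging logic is the part worth isolating from the network call. The following is only a sketch under my own assumptions: `collect_pages` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for the BOSS request so the loop can be run offline. It stops on an empty page, which is what you would expect to see once the service's result cap is reached.

```python
def collect_pages(fetch_page, page_size=100, max_results=10000):
    """Call fetch_page(start, count) repeatedly until a page comes back
    empty or max_results items have been collected."""
    results = []
    start = 0
    while start < max_results:
        page = fetch_page(start, page_size)
        if not page:
            # An empty page means the service has nothing more to give.
            break
        results.extend(page)
        start += len(page)
    # The last page may overshoot the limit, so trim to max_results.
    return results[:max_results]


def fake_fetch(start, count, available=250):
    """Hypothetical stand-in for the search request: pretends the service
    only ever exposes `available` results."""
    end = min(start + count, available)
    return ["http://example.com/%d" % i for i in range(start, end)]


urls = collect_pages(fake_fetch)
print(len(urls))
```

Advancing `start` by the number of items actually returned, rather than by the requested page size, keeps the offset correct even when a page comes back short.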

Posted in Tutorial.



One Response


  1. Alberto Labarga says

    Hi, I am interested in accessing Google Scholar. If you can share the code, I would really appreciate it.


