The PageRank feature in Google Toolbar works by querying Google's server for the PageRank of a specific page. That might seem easy enough to reproduce in your own program or website, but the toolbar calculates a checksum of the page URL before querying the server, and the server only responds if the checksum is correct. Fortunately, the checksum algorithm has been reverse engineered from Google Toolbar 7. A friend gave me a hand-decompiled version of the algorithm in C, and I rewrote it in PHP for web development use. You can find both versions below.
As an example, the query URL for the page 'http://en.wikipedia.org/wiki/Cypherpunk' is http://toolbarqueries.google.com/tbr?client=navclient-auto&features=Rank&q=info:http://en.wikipedia.org/wiki/Cypherpunk&ch=783735859783
A query for this page with any checksum other than 783735859783 results in a '403 Forbidden' response.
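To make the shape of the request concrete, here is a minimal sketch in C that assembles the query URL from a page URL and a checksum that has already been computed. The helper name build_query_url is hypothetical and does not come from the toolbar code. The sketch does not compute the checksum, and it copies the page URL in unescaped, as in the example above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the decompiled toolbar code):
 * writes the toolbar query URL for page_url into out, using a checksum
 * string that the caller has already computed.
 * Returns the number of characters written, or -1 if out is too small.
 */
int build_query_url(char *out, size_t out_size,
                    const char *page_url, const char *checksum)
{
    int n = snprintf(out, out_size,
                     "http://toolbarqueries.google.com/tbr"
                     "?client=navclient-auto&features=Rank"
                     "&q=info:%s&ch=%s",
                     page_url, checksum);

    /* snprintf reports the length it needed; reject truncated output. */
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    return n;
}
```

Calling build_query_url(buf, sizeof buf, "http://en.wikipedia.org/wiki/Cypherpunk", "783735859783") fills buf with the query URL shown above.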
Enjoy.
C Version (original):
PHP Version:
~ Dmitry
Very cool, but I wonder how quickly they'll modify their checksum now that this is out.
This is the Ruby implementation I did for the gem PageRankr: https://github.com/blatyo/page_rankr/blob/master/lib/page_rankr/ranks/google/checksum.rb
Google hasn't changed the algorithm since I first wrote my implementation, which was in 2010.
AFAIK it's been changed in 2011. http://www.seroundtable.com/pagerank-update-may12-15097.html