What a silly article. PageRank is based on eigenvector centrality[1], which is one of a number of well-understood centrality measures that have been around for a long time. Google's innovation was noticing the importance of the link structure of the internet and knowing to apply EC to it - not inventing the graph theory underlying EC.
Edit: and knowing to apply EC to that graph structure.
What they patented, however, is more broad than that still:
1. A computer implemented method of scoring a plurality of linked documents, comprising:
obtaining a plurality of documents, at least some of the documents being linked documents, at least some of the documents being linking documents, and at least some of the documents being both linked documents and linking documents, each of the linked documents being pointed to by a link in one or more of the linking documents;
assigning a score to each of the linked documents based on scores of the one or more linking documents and
processing the linked documents according to their scores.
Patent claims usually begin with the broadest claim possible, then narrow it in following claims. If the first is found invalid, the other claims may still succeed. Thus, it makes no sense for the first claim to be certain to stand on its own; strategically, it should be over-broad.
From a computer science/mathematics point of view, this is also a convenient way to factor out commonalities of several claims, instead of duplicating them ad nauseam. For example, in the PageRank patent, claim 1 is immediately followed by six dependent claims giving further detail on it (as do ten other claims scattered through the rest):
https://www.google.com/patents/US6285999
Thus, it is inaccurate to treat the first claim as "what they patented".
While it's true that the independent claims try to be as broad as possible and dependent claims narrow them down, each claim is still evaluated -- and must be valid -- on an individual basis. And if infringement of a patent is to be determined, you'd start with the independent claims and work your way down, as those have the broadest scope.
So, yes, it does make sense to say that the patent claims what nullc quoted. It's only slightly inaccurate in that the patent also claims a lot more, but the independent claims define the broadest scope.
Oh dear, you and nullc are right, and I am wrong. My apologies. This patent was granted, and so they do have a patent for claim 1 (and also for the other claims). I was confusing it with a patent application.
It's surprising to me that even the cursory prior art search of examination didn't turn up this idea, since it's so simple, doesn't use any of the meat of the invention described, and the idea of ranking by citation was well-known, e.g. for academic papers. I think applying that to hypertext links is clever and insightful... but not patent-worthy (like many other patents). Upon litigation, I think claim 1 might not stand. But regardless, the patent was granted, so they do have a patent on claim 1.
.. And yet still not invalidated by this prior art (if you were to stretch the concept) due to key phrases like 'computer implemented method'.
Computers in the 1940s were things like ENIAC and more likely to be involved in computing firing tables or cryptanalysis.
Now - if you were to look at Kleinberg's HITS instead you may possibly have a better argument, but I'm betting Kleinberg's prioritization method is still technically different enough than PageRank for that patent to stand. Given that Brin and Page actually reference the paper, I'm sure they've had patent lawyers make sure it wasn't infringing.
> Google's innovation was noticing the importance of the link structure of the internet - not inventing the graph theory underlying EC.
Actually, I'd say that Google's innovation was applying the eigenvector centrality computation to the Web graph, not discovering any particularly meaningful property (the "random surfer") of the Web graph.
I'd call it an extremely skillful application of an old algorithm to a new problem domain.
Exactly. Page and Brin cite some papers (or atleast Kleinberg's paper does -- which inspired Pagerank) which deal with ranking authors based on their citation graph. This article makes no sense whatsoever.
I'm really curious how on Earth they tracked all these down. Imagine how many obscure papers were published they didn't look through that had the same idea.
>One important question is: what is the value of each sector when they are so tightly integrated? Leontief’s answer was to develop an iterative method of valuing each sector based on the importance of the sectors that supply it.
That's actually pretty interesting. But I think in economics a circular graph doesn't really make sense, does it? It's more two ways with one person trading with another person (with any number of intermediate nodes.)
> But I think in economics a circular graph doesn't really make sense, does it?
Leontief functions are a list of what inputs and outputs each technology a country uses as inputs and yields as outputs. So farming a region might require 100 tractors (inputs), of which 95 will be in good condition at the year's end (outputs), and need 50 000 hours of labourer time (input) and yield 10 000 tons of corn (output).
Then a network of trade will usually give you circular graphs: Country A produces corn that is imported by country B to feed its workers. Country B produces iron that is imported by country C as an industrial raw material. Country C produces tractors that are imported by country A for farming.
I think Leontief functions are one of the cooler ideas in economics: they get very close to the empirical data and they allow you to leverage linear algebra effectively. They've been important in some fundamental debates in economics, particularly the Cambridge capital controversy.
Surely this is innovation when applied to a new domain (ranking websites). Also Google never stopped innovating PageRank and other ranking techniques.
PG applying Bayes' theorem to e-mail spam was innovative.
If we get very loose with our definition of algorithm, one could even say that ants implemented the first PageRank-like algorithm, when they ranked sources of food, using pheromone trails.
Well I haven't read the actual page ranking patent but apply existing math to solve a problem is not innovation.
However your example about ants is very true. Ants should be the patent holders of page ranking. But that also means the patent should have already expired a long time ago.
Edit: and knowing to apply EC to that graph structure.
[1] http://en.wikipedia.org/wiki/Centrality