System builders look at CPU benchmarks to verify vendor specifications printed on the box (or online product page) in real-world contexts before buying hardware. I thought it would be interesting to benchmark the benchmark sites.
About a dozen of them look legitimate, or are at least kept up to date. Independent testing sites give PC builders a handy vetting tool for comparing processors. I was curious which of these sites is the most trusted.
But how do you measure something intangible like trustworthiness?
For web content, enterprise-grade analytical tools dig into search data and link structures across the internet to gauge Google ranking authority, which is a pretty good indicator of trust. These tools are extraordinarily important in e-commerce, so we are quite handy with them at Newegg.
Crash course in authority, trust, SEO, and Moz Analytics
For this experiment I used Moz Analytics, one of many tools available for generating this type of data. Run any URL through the Open Site Explorer tool and it spits back analysis of how authoritative a domain and its pages are.
To do this, Moz scores domains and pages with three main metrics:
- The overall metric is Domain Authority (DA), scored on a 100-point scale, which predicts how likely a domain and its pages are to rank high in a Google search. Page Authority (PA), also on a 100-point scale, measures individual pages on a domain.
- A key factor in generating DA and PA is MozRank (10-point scale), which gives a logarithmic accounting of all links that direct back to the domain in question, or ‘backlinks’ as they are called. Not all backlinks are equal; their value is weighted by site popularity and relevance to certain keywords.
- Similarly, MozTrust (10-point scale) analyzes backlinks from a set of “high trust” domains like reputable publications and .gov or .edu sites.
This is all done to plumb the Google algorithm for determining quality and relevance in its search results. There are certainly other more esoteric factors behind the scenes, but DA/PA, Rank, and Trust are the big three.
How do the CPU benchmark sites stack up?
To gather my initial list, I probed six Google searches around the term “CPU benchmarks”, and I went 10 pages deep on each. I came out with 12 contenders.
Here is how the trust metrics shake out in the Moz Open Site Explorer tool, sorted by MozTrust.
| Primate Labs / Geekbench | 61 | 57 | 6.39 | 5.97 |
| 3D Fluff Cinebench 15 Database | 23 | 27 | 2.7 | 0 |
Overall trustworthiness, all metrics considered, would shape up something like this:
- AnandTech Bench
- CPU World
- – 10. Toss up
GIMPS win! GIMPS win!
The Great Internet Mersenne Prime Search (GIMPS) is a worthy winner of the title of the web’s most trusted CPU benchmarks. GIMPS is a distributed computing project founded in 1996 by computer programmer and mathematician George Woltman, and its main objective is to find ever-larger Mersenne primes, which are prime numbers of the form 2^p − 1.
Every time GIMPS discovers a new prime number, the news generates backlinks from scholarly journals, Wikipedia, and reputable technology websites like Ars Technica. This explains the heightened trust scores for the domain and its benchmarks page.
You have to disqualify low-DA sites like UserBenchmark, 3D Fluff and the PassMark knock-off even though they have comparable trust scores. The low DA suggests Google is penalizing them for shady backlinking schemes, leftovers from the bad old days of SEO before Google Penguin. You will notice the drop-off in quality if you visit those sites.
The 85 DA score says AnandTech is a reputable website, which it is; it is definitely a leading tech/hardware publication. It has a reputable benchmark page with a solid PA, but not as many links point back to that particular page. Speculating here, but this is probably because the benchmarks page is overshadowed by other site content in terms of calculated relevance. By contrast, benchmarks are the main focus for the sites with moderate DA and higher trust metrics.
But wait, why does GIMPS do CPU benchmarking?
Finding and vetting Mersenne primes relies on an algorithm called the Lucas-Lehmer primality test. Applying the test to numbers with millions of digits is a compute-intensive number-crunch that Woltman’s Prime95 software performs on mainstream consumer PCs. Through PrimeNet, a community of numbers geeks from around the globe, users pool together their processing power in probing for the next big Mersenne prime. They call it grassroots supercomputing.
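As a rough illustration of the test named above, here is a minimal Python sketch of the Lucas-Lehmer recurrence. It is a textbook version, not Prime95's implementation, which relies on heavily optimized FFT-based multiplication. The function name `is_mersenne_prime` is made up for this example.

```python
def is_mersenne_prime(p: int) -> bool:
    """Lucas-Lehmer test: report whether 2**p - 1 is prime.

    Assumes p itself is prime; the test is only meaningful for prime exponents.
    """
    if p == 2:
        return True  # 2**2 - 1 = 3 is prime; the recurrence below needs an odd p
    m = (1 << p) - 1  # the Mersenne number 2**p - 1
    s = 4  # starting value of the Lucas-Lehmer sequence
    for _ in range(p - 2):
        s = (s * s - 2) % m  # square, subtract 2, reduce modulo the candidate
    return s == 0  # the candidate is prime exactly when the final term is 0


# Try a handful of small prime exponents.
exponents = [p for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 31) if is_mersenne_prime(p)]
print(exponents)
```

Each pass through the loop squares a number as large as the candidate itself, so with exponents in the tens of millions the work is dominated by multiplying multi-million-digit integers, which is why the search keeps a CPU under sustained load.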
GIMPS members are encouraged to torture test their CPUs because hardware flaws create false positives and other glitches in Prime95. They take benchmarking seriously, and Prime95 is a good litmus test for CPU performance and reliability.
From the GIMPS home page:
Today’s computers are not perfect. Even brand new systems from major manufacturers can have hidden flaws. If any of several key components such as CPU, memory, cooling, etc. are not up to spec, it can lead to incorrect calculations and/or unexplained system crashes.
Overclocking is the practice of increasing the speed of the CPU and/or memory to make a machine faster at little cost. Typically, overclocking involves pushing a machine past its limits and then backing off just a little bit.
For these reasons, both non-overclockers and overclockers need programs that test the stability of their computers. This is done by running programs that put a heavy load on the computer. Though not originally designed for this purpose, [Prime95] is one of a few programs that are excellent at stress testing a computer.
- The venerable Intel Core i7-4790 powered the computer that found the most recent (49th) Mersenne prime number in January 2016. The number is 22,338,618 digits long. The PC ran calculations for 31 straight days in a lab at the University of Central Missouri under the guidance of Prof. Curtis Cooper.
- The highest clock speed documented in GIMPS benchmarks is 4.722 GHz, achieved by an AMD FX-9590.
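The digit count quoted in the first item can be sanity-checked with a logarithm. The sketch below assumes the exponent of that prime is 74,207,281, a figure not stated in this article, and uses a hypothetical helper name, `mersenne_digits`.

```python
import math


def mersenne_digits(p: int) -> int:
    """Number of decimal digits in 2**p - 1."""
    # 2**p is never a power of 10, so 2**p - 1 has as many digits as 2**p.
    return math.floor(p * math.log10(2)) + 1


# Assumed exponent of the January 2016 prime (not given in the article).
EXPONENT = 74_207_281
print(f"{mersenne_digits(EXPONENT):,}")
```

Under that assumption the result agrees with the 22,338,618 digits reported above.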
Best practices for viewing PC benchmarks
The right way to translate CPU benchmarks into actual performance is to look for tests that apply to your prospective system build using the applications you plan to run. System builders probably will find PassMark’s site the most useful for this. It offers the most extensive and granular benchmarking lists in my opinion. I was surprised that it did not win the Trust experiment, but I understand why.
Geekbench and the other top sites, from what I can tell, provide decent value as well by offering several configurations and applications by which they conduct testing.
Just make sure to avoid the PassMark knock-off site for CPU benchmarks, which apparently has tricked a few reputable sites into linking back to it.