The performance and battery life of smartphones and tablets may not match the numbers provided by device makers, but development is under way on a tool that could bring consistency to the measurement of those metrics.
A "user experience" benchmark being developed by Berkeley Design Technology (BDTI) analyzes system-level efficiency to predict performance and battery life in mobile devices. The ratings will be based on device configurations and usage modes such as Web browsing, video and phone calls.
"When we looked at the kinds of benchmarks people were using, we were horrified at how bad some of them are. We saw an opportunity to improve the situation and provide the industry with better metrics," said Jeff Bier, president and founder of BDTI.
The overall mobile device experience is what matters, and honest measurements are needed for apples-to-apples comparisons with other devices, Bier said. He likened the benchmark to measuring the performance of a car, in which fuel efficiency and overall performance are measured while taking all components into account.
As the mobile industry matures, chip makers and analysts agreed there is a need for more accurate benchmarks.
Mobile-device benchmarking has been in "a horrible state for a decade" and needs to be addressed, said Patrick Moorhead, founder and president of Moor Insights and Strategy.
"Consumers are being misled by some of the benchmarks, resulting in a bad purchase. Many benchmarks are synthetic in nature and most are unable to correlate to real-world usage models. Others are easy to manipulate because they use the honor system to keep code clean from someone gaming the benchmark," Moorhead said.
For more than a decade, Moorhead has been vocal about the need to establish honest benchmarks for PCs, and he is now pushing for metrics that represent real-world mobile usage scenarios.
BDTI is trying to differentiate its benchmark by not measuring the performance of individual components such as CPUs and GPUs, which is common in other tools. A single component does not necessarily reflect the performance of a device, and the sum of all parts needs to be taken into account when determining power efficiency and user experience, Bier said.
A graphics processor may perform well, but a screen may consume a lot of power to render 3D polygons, Bier said. Or a device may simply lack the bandwidth to deliver the graphics.
"This is one of the perils: in isolation, it may look good, but when lapped into a system, it may look different. If you're a component provider ... you want to test in a whole system with a whole set of applications," Bier said.
Many benchmarks have recently been abused, and Bier said he was "horrified" by the measurement options available, which partly triggered the development of the new benchmark. As an example, he cited the widely used AnTuTu mobile benchmark, which he said lacked transparency.
In the last few months, some benchmarks and performance tools have come under scrutiny for misrepresenting the performance of mobile devices. Chip and reviews site AnandTech raised questions about the GPU specifications and performance results reported for Samsung's Galaxy S4. Last month, analyst Jim McGregor exposed AnTuTu for showing inconsistent results when comparing the performance of Intel Atom chips to ARM-based chips.
AnTuTu Labs said criticisms from media and companies were unwarranted. It hopes that more people will join in benchmarking work, which could benefit smartphone users, according to a statement posted Wednesday on its website. AnTuTu has been deeply affected by the criticism and said the "slurs" amount to "malicious competition, slander, dissemination [of] false information," according to the statement.
Existing benchmarks are necessary, but BDTI's tool could make it more difficult to mask inefficiencies in system implementations, said McGregor, who is principal analyst at Tirias Research.
"I don't see component benchmarks going away anytime soon, but this should help sway the argument toward system efficiency," McGregor said, adding that multiple benchmarks are needed so component makers can compare their parts to the competition.
BDTI hopes to release the final benchmark early next year. A no-fee license will be offered to those interested in evaluating the benchmark, but a license fee will apply to those who adopt it.
BDTI has the backing of chip maker Qualcomm and Chinese Internet service provider Tencent. BDTI is trying to attract more members, but chip makers like Intel, Nvidia and Advanced Micro Devices have not received information about the benchmark. The chip makers said they were interested in evaluating the tool.
BDTI is not interested in establishing an industry-wide consortium around the benchmark, especially with competing parties having their own interests in mind. The company will welcome input on possible improvements, but it will make the final decision on whether to make changes.
"[Consortiums] tend to move slowly. ... In the end they have a much lower chance of designing a valid benchmark because the people participating in the process are motivated by many factors," Bier said. "We're not trying to make everyone happy. We're trying to make a benchmark."
Bier is respected in the chip industry for his past work on benchmarks for embedded devices. But companies back benchmarks that make their products look good, and analysts were skeptical about BDTI's chances of gaining industry-wide support.
"I think the only way to really improve your chances is to get support through some form of industry group or association. The more influential the organization, the more likely companies will be forced to pay attention," McGregor said.
Getting companies like AMD, Intel and Nvidia involved would be a good thing, but it will be tough, Moorhead said.
"Intel prefers large, consortium-style benchmarking groups. The last time AMD, Intel and Nvidia worked in a client benchmark consortium on BAPCO, it didn't end very well, so I'm not necessarily optimistic," Moorhead said. BAPCO is a PC benchmarking organization that AMD and Nvidia left in 2011 after disagreements over measurement techniques.
But Bier hopes potential members can put aside politics and think of users as they look to evaluate entire mobile platforms and usage models.
"If we can get significant adoption, we can move the needle," Bier said.