Why is Net Applications (Hitslink) changing its browser stats after publishing them?

I was curious to see what the uptake of Google's Chrome browser would be, considering that they were promoting it on their front page. Before I left for work, I looked it up on a special page set up by Net Applications to track Chrome usage. It seemed to be doing pretty well, as it climbed above 1%, passing Opera's alleged market share. Not really surprising considering the massive media coverage it was getting.

When I got back later and reloaded the page, I noticed that it had gone down to 0.5% or so over the last few hours. I still left the page open, and returned a little later. To my surprise, the page was no longer showing the same numbers for the same time. It's as if it had never shown 0.5%.

I tried to get my hands on a cached copy of the page to make sure that it wasn't just a mistake on my part, and indeed, it was not. Apparently Net Applications decided to change the numbers after they had been published. …

You can actually still access the original graph from their site. Compare this to the current graph:

Edit: It looks like the original image has been removed from the server, so I uploaded both of them here instead:

Original graph: Current graph:

The numbers are 0.56% and 1.18% respectively for 9/4/2008 11:00:00 AM (EDT).

See the full images in my Net Applications/Chrome stats photo album.

This is not the first or even second time the numbers published by Net Applications have "mysteriously" changed from one day to the next. A few years ago, Opera showed up with up to 5% in their stats. Apparently they figured that this was too high, so the numbers were slashed. Then, last year, Opera for Desktop was climbing above 1%, and Opera Mini was up to 0.6% or so, and climbing fast. Overnight, the figures changed completely, and Opera Mini was down to 0.1% or so, while desktop was at about 0.5%.

Now, what is the reason for this apparent tampering with the stats? I really don't know. It could have a perfectly valid explanation (even though I can't think of one off the top of my head), but Net Applications seems to stay quite tight-lipped about how their stats are measured. For example, they claim that theirs are "global stats", but aren't their customers mainly located in the United States?

In any case, this highlights just how unreliable browser statistics are. While one browser might re-fetch resources from the server fairly often, another browser might use the cache more aggressively, and thus cause fewer "hits", and show up lower in the statistics. This depends on how the stats are measured, of course, but many stats companies refuse to share their actual methodology.
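To make the caching point concrete, here is a minimal Python sketch. Everything in it is hypothetical: the two browsers "A" and "B", their user counts, and the requests-per-page figures are made up for illustration, and the hit-counting rule is an assumption, not the method of Net Applications or any other stats company (which, as noted, is not published).

```python
from dataclasses import dataclass


@dataclass
class Browser:
    name: str
    users: int                # hypothetical number of people using the browser
    pages_per_user: int       # page views per user during the measurement window
    requests_per_page: float  # requests that actually reach the server per page view


def share(values):
    """Convert raw counts into percentages of the total."""
    total = sum(values.values())
    return {name: 100.0 * count / total for name, count in values.items()}


def user_share(browsers):
    """Share of actual users, which is what people usually want to know."""
    return share({b.name: b.users for b in browsers})


def hit_share(browsers):
    """Share of server hits, which is what a naive hit counter would report."""
    return share(
        {b.name: b.users * b.pages_per_user * b.requests_per_page for b in browsers}
    )


# Assumed numbers: same user base and browsing habits, different caching behaviour.
browsers = [
    Browser("A", users=1000, pages_per_user=20, requests_per_page=10),  # re-fetches often
    Browser("B", users=1000, pages_per_user=20, requests_per_page=4),   # caches aggressively
]

users = user_share(browsers)
hits = hit_share(browsers)
for b in browsers:
    print(f"{b.name}: {users[b.name]:.1f}% of users, {hits[b.name]:.1f}% of hits")
```

With these assumed inputs both browsers have exactly half of the users, yet browser A ends up with roughly 71% of the hits simply because it asks the server more often. A stats service that counts visitors or page views instead of raw hits would not be skewed in this particular way, which is exactly why the methodology matters.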

And when their numbers are seemingly manipulated without any explanation whatsoever, this raises serious questions about how reliable they are, and whether they should be quoted at all when discussing browser usage.

19 thoughts on “Why is Net Applications (Hitslink) changing its browser stats after publishing them?”

  1. Regarding one of the other cases of manipulation, see the old numbers with Opera Mini at 0.66% and desktop at 1.11% here, in case you want to see that I'm not just making this up 🙂

    Other data from before the overnight change:

    August 2007: Opera Mini 0.27%, Opera Desktop 0.88%
    July 2007: Opera Mini 0.24%, Opera Desktop 0.89%
    June 2007: Opera Mini 0.21%, Opera Desktop 0.91%
    May 2007: Opera Mini 0.16%, Opera Desktop 0.74%
    April 2007: Opera Mini 0.13%, Opera Desktop 0.76%

    And just before the sudden change:

    October 2007: Opera Mini 0.41%, Opera Desktop 0.99%
    November 2007: Opera Mini 0.66%, Opera Desktop 1.11%

    It's worth noting that the old stats (percentage growth) for 2006 and 2007 seem to make more sense, since the number of Opera users has doubled since 2006. As you can see, Opera Desktop grew by 67% in 2006 and 45% in 2007, according to the old statistics. That confirms the number from the press release.

    In other words, it would seem that the numbers from Net Applications today are less accurate than they were before, which is interesting indeed.

  2. Yes, but the number of internet users increased as well. Maybe they recalculated the data 🙂 I don't know, just guessing. You could be right though 😉

  3. I still find no reason not to think they lacked sufficient data for the last hour, and they might have gotten the 0.5% from the ratio of the hits received in that part of the hour to the overall hit count of the previous hour (the last complete hourly stat). Though I don't defend them, I don't even trust them. Just look at this stat of OS market share: note that the iPhone has 0.30% of the market share. Does this mean that out of every 1000 computers that access them, 3 are iPhones (or have iPhone software installed somehow, or are even spoofing as iPhone OS)? I really doubt it. P.S. I already know Hitslink is an Opera partner. I am sorry, but that was just my personal opinion.

  4. If you look at the screenshots, it was down to 0.5x% for more than one hour. Those numbers, in fact, remained at the same level for several hours. I just wasn't able to get my hands on a cached copy of that. This site is currently using Hitslink, but considering their strange and unreliable stats, I hope we will be looking elsewhere for a better solution.

  5. Anonymous writes:

    It is a matter of slowly parsing logs from various sources (stat systems use MASSIVE logs distributed over MANY servers around the world, and parsing them takes a considerable amount of time; Google Analytics claims their results to be reliable after about 24 hours). They might have waited longer and published the data only after it was completely calculated, but they decided to do it as fast as possible, since Chrome was/is 'the news' that day. It isn't a scientific way of doing stuff, but it isn't the crime you are making it out to be. Generally, look at stats that are at least one day old and you'll never be confused again.

  6. Anonymous writes:

    Oh, another thing. Modern stat gathering is strictly a JavaScript matter that anyone can debug using Firebug or Fiddler. One can eliminate caching using properly formed HTTP requests (with 'no cache' directives of some sort). There are some problems with 'user activity' due to cookie issues etc., but these are fairly reliable anyway. I think that anyone from your webdev coding department can shed some light on the topic if you think that you need to know more. Note: it isn't that I think that stats are reliable with very good accuracy, but they are good enough to spot trends and the overall market situation quite well. Yeah, it depends somewhat on the market where the stats are measured (which webpages have these codes), but given a large enough sample all results will obey a natural distribution, and that makes them reliable enough. (This does not include errors in counting algorithms, but let's put these aside.)

  7. Anonymous writes:

    "it isn't a scientific way of doing stuff, but it isn't the crime you are making it out to be" How convenient that you ignore the other cases of this company manipulating its stats LONG after the fact.

  8. And yet you are still paying them, haha. They are not the ones to blame! I wonder how difficult it would be to build your own counter and statistics code? You built a complete forum engine, and you can't build your own statistics code? I really don't get it.

  9. I do think Net Applications are to blame for manipulating their own statistics. And yes, I'd love to see us move away from them, but it probably isn't that simple. As for doing everything in-house, that is not always the best solution. In the same way, not all phone manufacturers want to build their own browsers, and contact Opera to license one instead, because it might be cheaper, easier and better than building their own.

  10. Look at the bottom of the page. It says “This report has been reviewed by Quality Assurance”, so they are already admitting that humans are indeed reviewing and modifying their metrics. I wonder whether Opera would have a larger share if it had not been for the Content blocking feature. I, for one, block web metrics such as Hitslink because I feel it is a waste of bandwidth. (I am old school: Go HTTP server access logs for metrics! Woho!)

  11. Anonym writes:

    @danaleks Uh, the only thing that text says is that it's been viewed. Nowhere does it say that they actively edit the stats to their own liking.

  12. Ideally browsers should incorporate an anonymous reporting standard, and cut out all of this opt-in for-pay page-tracking nonsense.

    Ask the browser how many pageloads it's executed. Ask for an estimate of the data volume covered. If someone uses the Opera Mail client, is it possible to report that as traffic to email providers? Mail client reporting is godawful in itself.

    But for example, the download client uTorrent has a floating bank of aggregate use statistics. It lists how often and for how long it's been run, how much it's sent and received… I would hope that these numbers are already being anonymously reported to vendors by their install base.

    It would be nice to see a report on traffic saved due to page caching, or any number of other statistics that are much more valid than, say, how many people are visiting one small company's limited client base.

    I have no need to hear what a site thinks it's being viewed by. I want to know how many people are using each viewer and how many pages they're looking at. If a news provider's IE hits land on the main page 90% of the time, but Opera hits only land there 40% of the time, this may be due to a difference in browser behavior AND NOT Internet Explorer users fleeing deadly words/non-pornographic images. Capisce?

    PS: User-Agent spoofing isn't much of a concern to me. By sheer metric of volume growth among Opera users, the majority of them have new-enough copies to rule out any weighting in the statistics. Or is Opera to this day failing to represent itself? If Opera is still spoofing, then the slim number of reported hits comes from sites that have taken extra effort to correctly identify the browser.

    Maybe I'm low on sleep, but I can't seem to find raw numbers on that Net Applications graph. I'd like to see how many total hits they've indicated.

    For laughs, the current results of the "best browser" survey pane on their site are:

    Firefox: 63.17%
    Opera: 15.74%
    Safari: 11.34%
    Internet Explorer: 9.76%

    So either there are a lot of unreported Opera users or they're 15.5 times as vocal as your average Safari user. For that matter, Firefox users voted at three times their respective market share.

  13. Originally posted by hellspork:

    User-Agent spoofing isn't much of a concern to me. By sheer metric of volume growth among Opera users, the majority of them have new-enough copies to rule out any weighting in the statistics. Or is Opera to this day failing to represent itself?

    Actually Opera identifies as itself by default, but it needs to spoof as other browsers for lots of major sites. And it needs to apply JS patches for several hundred sites as well. Check out the documentation on Browser JS.

  14. Huh. Should have seen this reply sooner. Already looked at browser.js not long after posting. Certain other features like page element suppression should cut deeply into hits reported by advertisers. I know that dial-home behavior is regarded as a bad thing, but raw page-and-data-count numbers would be better than third-party hit-tracking.

Comments are closed.