Putting Surface Reliability in Perspective

Posted on August 18, 2017 by Paul Thurrott in Microsoft Surface with 22 Comments

This week’s blockbuster report on Surface reliability triggered some fascinating discussions. But now I have some new information to share.

To recap, Microsoft communicated the following points internally in the wake of a damning report from Consumer Reports, in which the trusted consumer advocate organization removed its recommendation of Surface products:

  • Microsoft acknowledges internally that Surface Book and Surface Pro 4 launched with unprecedented reliability problems. And the Consumer Reports data is skewed by that fact, as it should be, since so many existing Surface users did experience problems over the past year or more.
  • Microsoft’s internal data shows that Surface reliability has improved since then. This is true of both the devices that were impacted by reliability issues and of new devices like Surface Laptop and the new Surface Pro.
  • Microsoft’s internal data shows that Surface customer satisfaction rates are very high. I have made the case that customer satisfaction is not the same as reliability, but it’s fair to say, too, that these things are related.

I also theorized that the reliability issues are the real reason that Microsoft shipped Surface Laptop and the new Surface Pro sans modern technology like USB-C/Thunderbolt 3: It was afraid that using a new platform could trigger another round of reliability issues, so it stuck with the well-understood (if out-of-date) Surface Connect/USB-3 platform it used in prior generation products instead. This is just my own theory, but I believe it’s supported by the facts.

Since then, I’ve received a bit more context about the information in my report. And I’d like to share that with you now.

First, Skylake.

Based on background conversations with multiple highly-placed Microsoft executives over late 2015 and 2016, I placed the blame for Surfacegate firmly in Intel’s lap. Microsoft, by being first out of the gate with this then-new chipset, suffered from an unprecedentedly buggy new generation of Core processors, with the now well-understood results. No other PC makers, I noted, experienced these kinds of issues.

But in my most recent report, I noted that a different trusted source at Microsoft had a different story: The real problem was Surface-specific custom drivers and settings that the Microsoft hardware team cooked up.

As it turns out, both of these stories are, in effect, true. And the blame for Surfacegate can be split somewhat evenly between Intel and Microsoft.

That is, Skylake really was very buggy. So buggy, in fact, that a key processor feature, its compatibility with the Windows 10 “Instant On” (previously Connected Standby) power management functionality, was literally broken. And other PC makers did experience some reliability issues with their Skylake-based PCs too. Just not to the level that Microsoft did.

And that is so because the other PC makers, long used to how Intel does things, simply did not enable Instant On in their Skylake-based PCs. They did not ship PCs using a feature that they knew would not work properly.

Microsoft, a relative newcomer to the PC business, and still trusting Intel, its biggest partner, believed the company when it told them that the power management issues would be fixed. So it shipped Surface Book and Surface Pro 4 with Instant On enabled. Even though it did not work. The theory being that Intel would quickly fix this issue. Which it did not.

This, I think, explains Microsoft’s anger at Intel. But it’s fair to note that Microsoft was naive to trust Intel too. I believe they did this because it would have been embarrassing to both companies for Microsoft to ship its first Windows 10 PCs without enabling a key new Windows 10 feature. But they paid the price.

Second, the Lenovo story.

I also buried a story, of sorts, when I noted that Microsoft CEO Satya Nadella met with key Lenovo executives and asked them how they were faring with all the Skylake issues. Lenovo was confused, I wrote. No one was having any issues, he was told. And then I theorized that this must have triggered some interesting conversations inside of Microsoft.

This story is true, and I now believe that the person Nadella spoke with was, in fact, the CEO of Lenovo. But regardless, some context will help explain why this conversation shouldn’t be surprising.

Inside of Lenovo, there is a large group of teams that works on the firm’s PCs. With every new Intel CPU generation, that group, like its counterparts at other PC makers, tests the new chips before implementing them in its products. And like the other PC makers, Lenovo, of course, discovered that Instant On wasn’t working. So it disabled that functionality and shipped PCs that were much more reliable than Microsoft’s.

But why would the CEO of Lenovo, or any other high-level executives at that firm, be aware of such implementation details for specific PC models or configurations? After all, there are bugs in all Intel chips, and those bugs are corrected with firmware issued later by Intel, over time, or by software made by the PC makers. From Lenovo’s perspective, it shipped new PCs like it always does, with whatever features. And those PCs achieved whatever level of reliability and market acceptance. To Lenovo’s leadership, everything proceeded normally. It’s no wonder they were confused by Nadella’s question.

Ultimately, what we’re left with here is that Microsoft suffered from some major reliability issues with Surface Book and Surface Pro 4. And it feels that it has corrected those issues since then, though we will need a lot more time and data before we know that to be true.

More important, Microsoft clearly cares deeply about Surface reliability and about its customers having a great experience. And that is quite heartening. I’ve seen other blogs try to undercut the Consumer Reports data, which is a losing strategy given that publication’s decades of unbiased experience testing consumer products. Microsoft, to its credit, is not engaging in that effort, at least not directly. (I can only imagine what cherry-picked data it might have provided to less sophisticated bloggers to help them make its case for them.) Instead, it accepts its role in the reliability issues of the past, and it contends that it will simply continue to try to do better.

And that, folks, is the Microsoft I know.

 

22 responses to “Putting Surface Reliability in Perspective”

  1. DaveHelps

    Long term reader observation: how many truth-bombs contain the word "folks"?


    Somehow, any topic that brings Paul's F-word out really seems to resonate with me :)


    Anyone else?

  2. North of 49th

    Out of curiosity, does 'Instant On' work now in any of the current processors, or has this functionality been de-scoped?

    I can appreciate how angry Microsoft would be if this functionality didn't work and wasn't going to be fixed given their desire to bridge the PC/tablet market where the competition's products on ARM would have 'Instant On' capabilities. If I were Microsoft, I might be angry enough to want to demo Photoshop on an ARM processor (where the low power instant on functionality does work) to perhaps highlight to Intel why prioritizing getting 'Instant On' to be functional is so important.

    It is a sad commentary on Intel when you state "But it’s fair to note that Microsoft was naive to trust Intel too." Maybe AMD's resurgence will focus Intel.

    • Polycrastinator

      In reply to North of 49th:

      Works on the Surface Book now. Interestingly, I've presumed it's been broken since the CU, but I recently discovered that Microsoft's setting to let Windows Save Power when it knows you're away (or similar) causes Windows to ignore your sleep timer settings and in my case was switching my Book into hibernation almost instantly rather than sleeping. Which is just a dumb, dumb, dumb thing for it to do.

  3. John Scott

    I think this has been beaten to death, I guess because technology is so boring these days? Bottom line: when people buy a premium advertised product with hard-earned money, they have some higher expectations that everything is more polished, perfect, and less problematic. If that does not happen, you blame the maker of that product. When my car has a part failure or recall, I blame the maker of the car, not the part maker. Yes, we can deflect blame away from Microsoft to Intel, but in the end Microsoft will take much of the blame. Since most SkyLake devices work perfectly fine, I am not sure Intel has a lot of blame to accept? I don't think Microsoft has been singled out or unfairly judged, either. If Apple produced a MacBook Pro and had similar issues, customers would also be blaming Apple. It's the same as with a car maker, who chooses who makes their parts and therefore takes responsibility for their poor quality. The customer is the great equalizer for quality: if you don't improve your product it will fail, and if you ignore the customer feedback you will also fail.

  4. Bill R

    I'm not inclined to accept the Surface team's effort to deflect responsibility for what are essentially their quality problems. If other vendors can discover issues and work around or fix them, why can't Microsoft? It seems like the fix is simply more rigorous testing, validation, and vendor management. There are many possible configurations for CPU/Chipset/memory/IO; it's up to the company to qualify their product in their configuration using their systems.

  5. Jules Wombat

    "More important, Microsoft clearly cares deeply about Surface reliability and about its customers having a great experience. And that is quite heartening. "

    They only care NOW, because the reputable Consumer Reports has outed them.

    They should have tested and QA'd their products, and LISTENED to their consumers over the last two years. But then I guess power management is a 'difficult software problem'. Sorry, but I have little sympathy for Microsoft here. They are marketing and pricing Surface as premium products, and it still takes them more than a product cycle to reconcile their device driver level issues.

  6. Geoff

    I can't help thinking these reports are massively overblown.

    I've had a Surface Pro 4 at work for about 6 months. I use it every (work) day. I see, literally, dozens of them. Every day.

    The truth is, it's an excellent device. I've never seen an issue on mine, or heard of problems with anyone else's.

    I've been in corporate IT departments for 25 years. I know quality when I see it. This is a stand-out device.

    'Perfect' doesn't exist, of course. But this is the best PC money can buy right now.


    When the new '5' came out, I saw the '4' was on sale, so I bought one for my daughter. That's how good it is.

    If the '6' has USB-C, I'll probably get another one then. No USB-C is a show-stopper for me, so I'll stick with the '4' for now.

  7. VMax

    Some Lenovo machines certainly have similar issues to the Surfaces - the Helix 2, for example. If you have a full charge and unplug it, when you go to power up again, it usually will already be on. Sometimes it'll be off, but a bunch of charge has vanished. Two different units, same issue since new. Never fixed.

  8. Awhispersecho

    I get the feeling a certain someone was contacted by a certain company about a specific story he wrote and was influenced or persuaded to "clarify" said story.

  9. nbplopes

    What about years and years of denying the problems and promising fixes that never came? What about the keyboard issues? Was all that down to Instant On? What about promising that fixes would come, and that this time around all would be fixed, when it wasn't?


    Let's not forget SP3, shall we? Wifi problems, sleep issues, keyboard problems ... the CPU was not Kabylake.

    Of all the devices I have, SP3 is the one that gives me the most erratic behavior establishing a Wifi connection.


    This is not a perspective ... it's something else.


    Heartening because they care? Of course they do, but that is the minimum. People don't buy care, we buy a device that is not cheap by any means.





  10. Wizzwith

    This is interesting info on the whole Surface/Skylake fiasco, and pretty much solves the mystery of what was wrong with the Skylake chips on some and not other machines; if all accurate, bravo.  One thing that should be added to really close the loop is Apple's 2016 Macbooks w/Skylake chips.  Apple took so long to add Skylake, and then got rid of it so quick.  Seems like Skylake was really problematic to Apple as well, but what really happened? 


    Now on another note, "[CR's] decades of unbiased experience"... sigh... "unbiased" does not mean accurate or correct.  And "trusted consumer advocate organization" - trusted by whom, the 50+ crowd maybe; but trust needs to be continually earned, and always scrutinized.  That's not to say CR's report on this is all wrong, and we know there is truth to it, but their data collection, analysis, and conclusions (as presented w/out further information) do not pass the test by any measure of data analytics, scientific process or even just critical thinking. They don't deserve any passes because of their reputation from 20 years ago.  CR bears the burden of proof for its claims, and it does not satisfy that in this case nor in many other areas.  I'll bet those other blogs you take digs at here were not "defending Microsoft" so much as just taking a critical look at what are pretty obvious flaws in CR's reporting.  So why do you feel the need to "defend" Consumer Reports as you've been doing in most of your articles about this? 


  11. Waethorn

    "The real problem was Surface-specific custom drivers and settings that the Microsoft hardware team cooked up."


    Exactly.


    Didn't I say this already?

  12. MSFTVulcan

    And of course, anything Mechanical can fail!

  13. ChesterChihuahua

    You are gradually uncovering layers of the Surface reliability problems. Keep digging, and you will eventually get to the truth: it was Steve Ballmer, wearing a monster suit. And he would have gotten away with it, if it wasn't for you meddling kids !

  14. CaedenV

    Still better than Xbox 360's nearly 55% failure rate... #success

  15. glenn8878

    "power management issues would be fixed"


    But when? That's the problem. Microsoft shipped a faulty product that was confirmed to not work. But there are also many issues with hibernation and power standby that have existed for years, since Windows 98. I assume from "custom drivers and settings that the Microsoft hardware team cooked up" that Microsoft sucks at fixing years-old issues.

  16. Tomasz Sowinski

    But why would the CEO of Lenovo, or any other high-level executives at that firm, be aware of such implementation details for specific PC models or configurations? 


    For the same reason CEO of Microsoft would?

    • Darmok N Jalad

      In reply to xyz123:

      Yeah, I would have to think that the CEO of one PC company would be very aware of a competing company's highly-publicized problems, though I guess that would depend on when the CEOs spoke. If nothing else, it would invoke a question down the chain regarding Skylake. Maybe that was all that happened, and the report from below was that they didn't have any problems because they didn't trust Intel's new feature and didn't enable it.

      I think MS is trying to push Surface to the bleeding edge of technology, which is not what other OEMs do. They likely want to ship well-proven and tested products so that they don't have to spend so much on support. All this time, MS thought OEMs weren't pushing the envelope enough. Well, look what it got them, and so their next products were an internally conservative Surface Studio and an ultrabook that looks nice.

    • david.thunderbird

      In reply to xyz123: When things are working that's all a CEO needs to know, when things go south then he needs to know why and who is gonna fix it and fast.


  17. chump2010

    "More important, Microsoft clearly cares deeply about Surface reliability and about its customers having a great experience. And that is quite heartening. "


    That is not what you were saying all the time you had the problems. They were massive problems and they were plaguing you for ages. So no, Microsoft does not care about Surface reliability. Microsoft cares about profits and sales...and now that Consumer Reports has come out and slammed them, they might buck up their ideas.


    By the way, why did Microsoft not test the hardware, like Lenovo did? Or they ran many tests and thought that Intel would fix it? I don't know why they could not release a driver that did not use the instant on feature and then later on when it was patched by intel, release a new one to enable it. It seems to me like quality control sucked, because I don't think they were aware that the feature was not working properly. It was only time and time again when you brought up the issue, that eventually they admitted the problem.


    Maybe I am being a bit harsh, but only because I am reading all the frustrations you had with Surface. I have never owned the device, but to come out and say Microsoft cares deeply about reliability is just wrong. They only care in so far as it affects sales.

    • lvthunder

      In reply to chump2010:

      Both of those can be true. Of course the Surface Team cares about reliability and their customers. While at the same time they shipped a buggy product. It happens. That doesn't mean you don't care. It means you made a mistake. Maybe they launched Surface with an agreement from Intel to have this fixed by the shipping date and they could just replace the driver via Windows Update the first time most people used it. Maybe that date slipped by 6 months. No one around here really knows what really happened. My point is you can care about someone's problem and not be able to do much about it.

  18. Cain69

    I really do not care if it's Intel or this doohickey or that. Systemic problems are the bane of every OEM. The technology OEMs rely on is not always perfect - especially when using "new" anything.

    Here is when I have a problem - when the OEMs disavow well-documented systemic issue(s). If an OEM owns up to the problem - communicates the issues and works on resolving them... I am not sure you can ask for more.
