Dark Magic: What Caused Google’s Nexus 6P Death Loop (and How to Fix It With a Hairdryer)

My friend’s Nexus 6P died while acting as navigator and DJ on a road trip from upstate New York to Manhattan in February 2017. The 6P froze, rebooted, displayed the Google logo, then rebooted dozens and dozens of times, offering no clue what was wrong.

I tried to help over speakerphone from the driver’s phone, but none of the usual power/volume-button/safe-mode tricks worked. The 6P was two weeks out of warranty. They bought a $200 Moto G4 at a store in Union Square to get through the trip. Later on, they would trade in a replacement 6P for $113 toward a new Pixel 2. They are still quite irked about the whole saga.

It wasn’t all bad news, though. Less than a month ago, three years after their 6P lost the will to live, my friend received a check for $400 from Google and the 6P’s manufacturer, Huawei, through a class action lawsuit settlement. Another friend got a $400 check and turned it into a cool gravel bike. LG similarly settled a separate class action involving bootlooping phones, including the Nexus 5X. Phones are complex, lesson learned—end of story?

Not for me. I see all the blacked-out sections of the 6P settlement filing and I’m left with so many questions. What would cause an Android phone to suddenly be unable to fully boot, often months or years after first purchase, in such a way that Google couldn’t fix it with software? You can wipe and fix just about any software issue on an Android phone. If it was a simple hardware fault, why didn’t either company own up to the defect and recall it?

I contacted Google, Huawei, LG, and Qualcomm for comment on this post, but did not hear back from any of those companies. Actually, Huawei’s inbox for global press communications responded that it was full and could not deliver messages, twice in two weeks; messages to individual press handlers that I could find were not returned.

How does it happen that Apple, of all companies, looks positively transparent by comparison? Apple has repaired and swapped defective phones while admitting, however cagily, that something was wrong with them.

After weeks of research, including buying my own bootlooped Nexus 6P, talking to software hackers and board repair pros, and reading way too many articles about system-on-chip architectures, I’ve compiled here what I believe is the most likely cause of the 6P bootloop issue and, just as important, ruled out some others. I even found a “fix” that, while a bit sad, might work if you want to rescue a Nexus 6P from a bootloop.

Here’s why a really hot hairdryer ended up being the best tool for fixing a seemingly bricked Nexus 5X or 6P.

The Hot, Weird Chips Inside the 6P and 5X

Qualcomm makes modems, graphics processors, and CPUs, sometimes combined into a neat system-on-chip (SOC) package. In 2015, Qualcomm’s Snapdragon platform was pretty much the only game in town for a flagship smartphone core (at least, if you weren’t making your own chips, like Apple or Samsung). Google, working with Huawei to make one of two Nexus phones, went with the Snapdragon 810 for the Nexus 6P, its larger and costlier Nexus. It picked the 810’s diminished sibling, the Snapdragon 808, for the Nexus 5X made by LG.

The Snapdragon 810, in red, on the Nexus 6P motherboard (heat shields have been removed).

The most important things you should know about the Snapdragon 808/810 inside many bootlooping phones are the talk of fabrication problems, the heat issues, and the “big.LITTLE” CPU setup.

Talk of fabrication problems and heat issues isn’t a smoking gun or even a known cause, but these are interesting data points. What’s really interesting, for the owner of a bootlooping phone, is the “big.LITTLE” CPU setup. In theory, it’s an elegant system for maximizing performance while saving battery life. Your phone uses four slower, lower-power “little” cores to do non-intensive and background tasks, then switches to the four performance, or “big,” cores for demanding, active tasks.

Please note the usage of “in theory” in that paragraph as we move on.
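For readers who think better in code, here is a toy sketch of the big.LITTLE split described above. It is not how Android's real scheduler works. The core numbering (0-3 little, 4-7 big), the load scale, and the 0.6 threshold are all assumptions made up for illustration.

```python
from dataclasses import dataclass

# Assumed numbering for illustration only: 0-3 are "little", 4-7 are "big".
LITTLE_CORES = (0, 1, 2, 3)
BIG_CORES = (4, 5, 6, 7)


@dataclass
class Task:
    name: str
    load: float  # hypothetical 0.0-1.0 estimate of how demanding the task is


def pick_cluster(task, threshold=0.6):
    """Return the cores a toy scheduler would consider for this task."""
    # Light work stays on the low-power cores; heavy work moves to the big ones.
    return BIG_CORES if task.load >= threshold else LITTLE_CORES


def place(tasks, threshold=0.6):
    """Map each task name to the cluster it would land on."""
    return {t.name: pick_cluster(t, threshold) for t in tasks}


if __name__ == "__main__":
    demo = [Task("email sync", 0.1), Task("3D game", 0.9)]
    for name, cores in place(demo).items():
        print(name, "->", cores)
```

The point of the sketch is the dependency it exposes: anything routed to the big cluster assumes those cores will answer when called.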

The Bootlooping Conundrum: Turns On, Doesn’t Care

What it looks like when a Nexus 6P bootloops, sped up 2x.

Owners of the 5X and 6P, many of them Android enthusiasts eager to experience the vanguard phone Google recommended for developers, were stumped when their phones stopped working. Normally, a data reset of an Android phone solves glitchy startup or freezing and crashing issues. Worst case scenario, you have to download the original image for your device, boot into a “fastboot” or recovery mode by holding down certain buttons, and execute some terminal commands to patch in the factory-fresh firmware.

Except with this bootlooping issue, you can’t get into recovery mode, because trying to boot into that just sends the phone back into its logo/off/logo/off loop. If you’re an Android developer, or just messed with third-party ROMs before, you might have clicked the toggles for “Enable OEM Unlock” and “Enable USB debugging” in your phone’s settings. You could get into fastboot mode to flash new firmware, but your phone would still loop when you were done.
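For context, the "terminal commands" in a normal factory reflash look roughly like the plan below. This is a hedged sketch that only builds and prints the commands rather than running them. The `fastboot` subcommands are the standard ones, but the three file names are placeholders, not real image names, and the exact sequence for any given device should come from the instructions shipped with its factory image.

```python
import shlex


def flash_plan(bootloader_img, radio_img, system_zip):
    """Build (but do not run) fastboot commands for a full factory reflash."""
    return [
        ["fastboot", "flash", "bootloader", bootloader_img],
        ["fastboot", "reboot-bootloader"],
        ["fastboot", "flash", "radio", radio_img],
        ["fastboot", "reboot-bootloader"],
        # -w wipes user data; "update" flashes the partitions packed in the zip.
        ["fastboot", "-w", "update", system_zip],
    ]


if __name__ == "__main__":
    # Placeholder file names, assumed for illustration.
    plan = flash_plan("bootloader.img", "radio.img", "image.zip")
    for cmd in plan:
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

On a phone caught in the Boot Loop of Death, every one of these steps could succeed and the phone would still loop, which is what pointed people away from a software cause.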

It’s almost worse that the Google logo shows up, and the phone seems to boot for just a bit, instead of the phone just being inexplicably dead. It’s also unfortunate for Google that the malfunctioning phone reminds you, hundreds of times, which company sold it to you.

The Fix: Disable the Faulty Half of the CPU

XDA-Developers is a forum where Android enthusiasts and developers go to offer up their experiments, troubleshoot devices, and perform amazing feats of software to extend a phone’s useful life. The Nexus 6P had a very active sub-forum at XDA, and it wasn’t long before complaints about bootlooping phones led to investigations and potential solutions.

XCnathan32 delivered the first working fix for the “Boot Loop of Death” (BLOD). Somewhere (in a forum thread, IRC channel, or device log), it was suggested that the crisis occurred after the device tried to enable the “big” performance cores for booting. The big cores were not responding, or had become “detached.” The phone’s standard boot code didn’t anticipate those cores failing to respond under normal circumstances, so the phone crashed and rebooted.

A portion of XCnathan32’s initialization script, with work assigned only to the bootlooping phone’s four “small” cores (0-3).

XCnathan32’s fixes are versions of the phone’s boot software, Linux kernel, and recovery mode, rewritten so that none of them reference or call on the phone’s “big” cores, ever. They also made a fix for the Nexus 5X that did the same thing: disable the big cores so the phone can boot. Read through the replies on either forum thread, and you’ll see people reporting back that their phones are booting again for the first time, perhaps with hitches, but for real. Other developers, including osm0sis and squabbi, made XCnathan32’s fixes easier to install and carried the work forward into newer versions of Android.
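To make "never reference the big cores" concrete, here is a minimal sketch of the idea, not XCnathan32's actual code. It takes core lists written in the compact Linux style ("0-7", "0-3,6") and clamps them to the little cores. The group names and their original core lists in the demo are hypothetical, as is the assumption that cores 4-7 are the big ones.

```python
LITTLE_CORES = (0, 1, 2, 3)  # assumed: cores 0-3 are the "little" cluster


def parse_cpu_list(spec):
    """Parse a Linux-style CPU list such as "0-3,6" into sorted ints."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def format_cpu_list(cpus):
    """Format ints back into the compact "0-3,6" form."""
    ranges = []
    start = prev = None
    for c in sorted(set(cpus)):
        if start is None:
            start = prev = c
        elif c == prev + 1:
            prev = c
        else:
            ranges.append((start, prev))
            start = prev = c
    if start is not None:
        ranges.append((start, prev))
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def restrict_to_little(spec, little=LITTLE_CORES):
    """Drop every big core from a CPU list."""
    allowed = [c for c in parse_cpu_list(spec) if c in little]
    # A group pinned only to big cores falls back to all little cores.
    return format_cpu_list(allowed or little)


if __name__ == "__main__":
    # Hypothetical groups and core lists, for illustration only.
    groups = {"foreground": "0-7", "background": "0", "heavy": "4-7"}
    for name, spec in groups.items():
        print(name, spec, "->", restrict_to_little(spec))
```

The fallback in `restrict_to_little` is a design choice for this sketch: a group left with an empty core list would have nowhere to run, so it is moved onto the little cluster instead.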

After revisions to better optimize the four little cores, some users suggested their phones seemed to run about the same, or even with better battery life. The big cores were often running so hot, it seemed, that they were throttled or disabled anyway. Others noticed the performance hit, but were glad they could at least get into their phones and recover their data.

These quirky fixes are more than Google or Huawei were offering most customers who reported their bootlooping phones. My friend with the bootlooped 6P contacted Google, which referred them to Huawei, which sent them back to Google, after noting that the phone was out of warranty. They escalated the issue twice with Google, citing a Reddit reply from a verified Google employee about “a hardware related issue,” but no replacement or refund was offered. A month later, a coworker told them that Google had replaced their own 6P, so my friend tried once more. They got a refurbished 6P replacement, then traded it in as soon as the Pixel 2 was announced.

Some people I know received replacements, even newer first-generation Pixels when they were available. Some were stonewalled if they were out of warranty. None, so far as I’ve seen, were told what might be the cause.

The Dirtier Fix: A Hairdryer to Scare the Phone

If you didn’t unlock your phone and enable debugging before the bootlooping occurred, you couldn’t do it later, because you couldn’t get into your phone’s software settings. But there is a way to trick the phone into disabling the big, power-hungry, hot-running cores. You have to make those cores so hot before booting that the phone is afraid they’ll be damaged if they start up.

If the CPU’s thermal sensors read high enough (as they might when you run heavy apps with your phone in direct sunlight), the phone boots into a kind of safety mode, using only the little cores until it cools down enough. If you move fast and you’re lucky, that cautious interval is just enough to enable unlocking and debugging, to flash the XDA firmware that disables the big cores, or to grab your photos and texts and saved games.

The most proven way to do this is with a hairdryer (or adjustable heat gun set to a hairdryer-like temperature). You aim the heat at the space just above the fingerprint sensor where the Snapdragon 810 lives, and blast it while the phone is bootlooping. 
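The thermal fallback that the hairdryer exploits can be pictured as a simple cutoff. This is only a sketch of the behavior as described above. The 80-degree threshold and the core numbering are invented for illustration, and the phone's real thermal logic is not documented here.

```python
LITTLE_CORES = (0, 1, 2, 3)  # assumed numbering, for illustration
BIG_CORES = (4, 5, 6, 7)


def cores_allowed_at_boot(temp_c, big_cutoff_c=80.0):
    """Return which cores a toy thermal policy would bring up at boot."""
    # Hypothetical cutoff: at or above it, the big cores stay offline.
    if temp_c >= big_cutoff_c:
        return LITTLE_CORES
    return LITTLE_CORES + BIG_CORES


if __name__ == "__main__":
    for temp in (35.0, 95.0):
        print(temp, "C ->", cores_allowed_at_boot(temp))
```

In this picture, a cool phone asks for all eight cores and trips over the faulty ones, while a heated phone never asks for them at all.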

One young man on YouTube hairdryer-blasts his 6P for more than 6 minutes, sometimes in a bag, sometimes in his hand. He is grimacing after a while, likely because the phone is getting too hot to hold, while the Google logo shows up again and again. But just then, the Google logo turns into multi-colored swirling dots. The phone boots to a lock screen. Another thermal warrior with a camera sets a heat gun to 160 degrees Celsius (320 degrees Fahrenheit) and gets to a booting logo in about 4 minutes.

Inspired by this evidence (and the dozens upon dozens of comments of others saying it worked for them), I purchased a bootlooped Nexus 6P from eBay. I wanted to feel that Lazarus moment for myself, and add some first-person validation.

Unfortunately, despite the phone getting so hot that I had to wear gloves, the safety boot never seemed to happen. I also tried two other methods suggested in XDA threads, freezing the phone in a plastic bag and letting the battery run out drastically low, but neither prevailed.

A few of my attempts to make a Nexus 6P break out of boot loop. Not shown: attempts with a completely drained battery (I got frustrated and forgot where the focus point was).

I may next try a more drastic method: opening up the phone and exposing the motherboard more directly to hairdryer heat. Or I might use a heat gun; Hackaday suggests some hairdryers just aren’t hot enough. Nothing to lose now! I’ll update this post if I have success after either surgery or upgrading my heating arsenal.

So, Who’s to Blame?

Having recently finished a week-long microsoldering and board repair class, I thought that maybe the issue with the 6P was flexion—the big phone bending in some way that made the CPU or a nearby component crack and pop the solder joint connecting it to the motherboard. It’s what caused Touch Disease on the iPhone 6 Plus, and Audio IC issues on iPhone 7. Flexion disconnections are one way you can cause internal damage without notable external evidence. And, to be honest, having spent a week thinking about solder and pads, everything looked like a soldering issue now.

I asked Mark Shafer, one of my instructors at iPad Rehab, if he thought some kind of board/solder/chip disconnect was causing the CPU core separation. “Nope, but I wish,” he said. Shafer has a 6P hanging around his home workshop, and has looked at others under microscopes. If there was a repairable board issue with the 6P bootloop, or he heard a credible rumor of one, he would offer to fix it. I asked if that meant the issue was likely deep, dark, chip-making magic. “Dark magic, for sure,” he said.

Flexion or soldering faults also fail to explain the other phones that suffered the Boot Loop of Death. Of the five phones included in LG’s bootloop settlement, three (the Nexus 5X, LG G4, LG V10) used the Snapdragon 808 SOC, with its 20nm fabrication platform made by TSMC and a core-swapping big.LITTLE setup. Two of the bootlooping phones (LG V20 and G5), however, used a Snapdragon 820, fabricated by a different company (Samsung), without the big.LITTLE architecture.

XCnathan32, the original disable-the-big-core fixer, spent a couple weeks deep inside his 6P, trying to figure out exactly why the big cores were failing when called upon. Reading through the thread, there’s a lot of optimism that there’s some kind of voltage or software fix just out of reach. But the recurring response is that there’s just something wrong with the way this CPU works; it’s a minor grace that it’s wrong in a way that allows a tricky work-around. I traded a message with osm0sis, the XDA admin who kept up some of the 6P fixes, but he had no deeper insight, either.

Perhaps the SOCs were binned (selected as viable product, despite faults) a bit too aggressively. Without numbers from the Google or LG settlements on how many phones were affected, it’s hard to say how many people might have been part of a bad batch.

I can’t say with absolute certainty what caused the bootloops that killed a lot of phones in the mid-2010s. While hot, quirky Snapdragon models are common to all of them, there are many other parts that power, interact with, and regulate the operation of a SOC. Early on in the saga, LG told customers that a booting issue with the G4 was caused by “a loose contact between components.” One Google employee cited a hardware issue in an oft-linked Reddit thread. But the G4 was still included in the later bootloop settlement. Phones, of course, can also have more than one fault.

Even if Qualcomm’s SOCs were to blame, companies like Google and Huawei and LG might not want to point fingers or demand too much of the company. Qualcomm, declared a “monopolist” by the FTC for its aggressive cellular modem business, still dominates the SOC market. Apple doesn’t offer its SOCs outside its products. The alternatives are Samsung, Huawei’s (relatively new) HiSilicon, and not much else. Google and LG continue to use Snapdragon SOCs in their phones.

I can say, however, that some truly dedicated fixers fought through the most inscrutable malfunction, with little to no support from the device makers, in admirable fashion. It’s hard not to root for the person wielding a hair dryer against the conglomerates. Let’s hope next time they don’t have to work quite so hard.


Note: iFixit has a business relationship with Google. Google did not have input or access to this post before it was published.