Field Report Postmortem · Native Dependencies · Win32 Loader
The DLL that was not.
A release stopped finding msvcr120.dll on customer machines while QA stared at green dashboards. Three days, four registry hives, one forgotten manifest, and a quiet apology to anyone still running Windows 7 SP1 without KB2999226. A debrief from the engineers who lived inside the loader.
Monday morning, two pagers, and a smile that did not last.
The 07:14 page arrived in the middle of a coffee that had not yet been poured. By 07:22 there were nine tickets in Zendesk and by 07:40 there were forty-one, all of them carrying the same screenshot — the small grey Win32 dialog that says a DLL is missing, an OK button, and behind it our installer politely waiting to be killed. None of us had touched the C++ runtime in two years. None of us had recompiled against a different toolset in eighteen months. And nobody, anywhere in QA, could reproduce it.
By 09:00 the Slack channel had the usual shape of an early incident: people typing over each other, two unrelated theories, one well-meaning suggestion to roll back. We did not roll back, because the rollback would have taken the new payment provider integration with it, and that had been promised to a customer in São Paulo who was already on the phone. Theo Ranganathan, our build lead, asked the question that ended up shaping the next three days: "If QA cannot reproduce it, what does QA have that customers don't?"
The answer turned out to be embarrassing. QA had, in their C:\Windows\WinSxS cache, a side-by-side assembly directory called x86_microsoft.vc120.crt_1fc8b3b9a1e18e3b_12.0.40664.0_none_* that had been installed years ago by Visual Studio 2013 itself. Customers, in many cases, did not. The DLL was there on every developer machine because Visual Studio had put it there, and we had quietly come to depend on its presence without ever shipping it. The application built. The application ran. The application worked beautifully — on machines that had been developer machines at some point in the last decade.
The first wrong theory
Our first theory, at 09:18, was the antivirus. It is almost always the antivirus. Three customers in the first batch were running Bitdefender Endpoint 7.2.1.96, and there was a known case from 2024 where its file-system minifilter would quarantine unsigned native binaries from a temp directory during installation. We asked a customer to whitelist our installer path. The DLL did not appear. The theory died at 09:46.
Our second theory, at 09:51, was a corrupted MSI. We had recently moved from WiX 3.11 to WiX 3.14, and although the merge module for the VC++ 12.0 runtime had not changed, the new toolchain emitted a slightly different Component GUID. Margit pulled the .msi apart with Orca on a clean Hyper-V image of Windows 7 SP1 build 7601 with no servicing stack updates and confirmed the merge module was present, the file was present in the cabinet, and the install action ran to completion. The DLL still did not appear on the customer fleet. Theory two died at 11:30.
We spent the first four hours arguing about the antivirus, the MSI, and the build server. The actual answer was sitting in a manifest that nobody had looked at since 2019.
— Theo Ranganathan, Build Lead · Slack message, 12:04 EST
A short walk through the Windows loader, so the rest of this makes sense.
The Windows loader is the part of ntdll.dll that, before main ever runs, has already walked the import table of your executable and resolved every __declspec(dllimport) to an address in memory. It does this by, for each named DLL, asking a series of questions in a particular order. Most of the time, when the question is asked aloud in code review, somebody gets the order wrong.
The question the loader asks first is not "is there a file called msvcr120.dll on disk?" The question is "does the current activation context have a redirect for the name msvcr120.dll?" If the answer is yes (and on a developer machine, with Visual Studio installed, the answer is almost always yes for the VC++ runtime), the loader is sent to a specific directory under C:\Windows\WinSxS that holds a specific, signed, versioned copy. If the answer is no, the loader falls through to a search of, in this order, the application directory, the System32 folder, the 16-bit system folder (which has not existed in any meaningful way since 2003 but the code still checks), the Windows directory, the current working directory, and finally the PATH.
This is the part where the failure mode becomes obvious in retrospect. If our application's manifest requested the SxS-redirected version of msvcr120, by embedding a <dependency> entry with the right public key token and version, and the SxS cache held that assembly, the loader was happy. If our application's manifest did not request it, the loader fell through to the application directory and System32. Our manifest did request it, on every machine. On QA machines the request was satisfied, because the SxS cache was populated. On customer machines the same request was made against a cache that did not hold the requested assembly, because nobody had ever installed Visual Studio 2013 on a point-of-sale terminal in rural Quebec.
The two-line lie in the manifest
The exact mechanism, which we did not understand until late Tuesday, was that our manifest still contained an entry from a 2019 build:
    <!-- application.manifest, lines 47-48 -->
    <dependentAssembly>
      <assemblyIdentity type="win32" name="Microsoft.VC120.CRT" version="12.0.21005.1"
                        processorArchitecture="x86" publicKeyToken="1fc8b3b9a1e18e3b" />
    </dependentAssembly>
That version, 12.0.21005.1, was the original RTM of the Visual C++ 2013 redistributable. Our build machines, at some point in 2021, had been upgraded to Visual Studio 2013 Update 5, which installs 12.0.40664.0. The SxS cache on every developer and QA box held both. On customer machines that had ever run the official VC++ 2013 redistributable installer published after January 2014, only the newer version was present. The loader, asked for an exact match on 12.0.21005.1, said no and gave up. It never looked in the application directory or in System32.
It is worth stopping for a moment on that missing fall-through, because a lot of engineers (including, on Monday morning, three of us) assume that if SxS resolution fails, the loader will still find msvcr120.dll sitting next to the application or in System32. It will not. When the manifest contains an explicit dependentAssembly request, the loader treats the SxS lookup as authoritative. A failed SxS lookup is a hard failure for that DLL name. The "would have worked anyway" copy three folders away is invisible to it.
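The rule in the paragraph above can be written down as a toy model. This is a sketch of the behaviour as this report describes it, not of ntdll internals; `AssemblyIdentity`, `SxsAssemblyNotFound`, `resolve`, and the sample cache and directory labels are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssemblyIdentity:
    """The four fields the report says an SxS request is matched on."""
    name: str
    version: str
    arch: str
    public_key_token: str


class SxsAssemblyNotFound(Exception):
    """Raised when an explicit manifest request cannot be satisfied."""


def resolve(dll, manifest_request, sxs_cache, search_dirs):
    """Return a label for where `dll` resolves from, or None if nowhere.

    manifest_request: AssemblyIdentity named in the manifest, or None.
    sxs_cache: dict mapping AssemblyIdentity -> label of its WinSxS directory.
    search_dirs: ordered list of (label, set_of_file_names) fallbacks.
    """
    if manifest_request is not None:
        # An explicit request is authoritative: exact match or hard failure.
        if manifest_request in sxs_cache:
            return sxs_cache[manifest_request]
        raise SxsAssemblyNotFound(
            f"{manifest_request.name} {manifest_request.version}"
        )
    # No request in the manifest: walk the ordinary search order.
    for label, files in search_dirs:
        if dll in files:
            return label
    return None


TOKEN = "1fc8b3b9a1e18e3b"
rtm = AssemblyIdentity("Microsoft.VC120.CRT", "12.0.21005.1", "x86", TOKEN)
update5 = AssemblyIdentity("Microsoft.VC120.CRT", "12.0.40664.0", "x86", TOKEN)

# A customer-like host: only the newer assembly is in the cache,
# and a copy of the DLL sits next to the executable.
customer_cache = {update5: "WinSxS (12.0.40664.0)"}
app_dir = [("application directory", {"msvcr120.dll"})]
```

With these inputs, `resolve("msvcr120.dll", rtm, customer_cache, app_dir)` raises `SxsAssemblyNotFound` even though a usable copy sits in the application directory, while passing `None` as the request resolves from the application directory.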
The loader is not looking for a file. The loader is looking for an assembly, by name, version, architecture, and public key token. The file is a side effect.
— Internal training note, Desktop Platform team · September 2021
What Dependency Walker told us, and what it conspicuously did not.
By Tuesday morning the incident channel had moved from theories to tools. Margit, who started her career on Windows XP and had not stopped thinking in PE headers since, ran the customer's executable through Dependency Walker on a freshly imaged Windows 7 SP1 virtual machine that had received only the security updates required to reach KB4474419 (the SHA-2 code-signing update) and nothing else. The result was a long red-and-yellow tree that no person should have to look at before lunch.
The tree showed, at the top, our application posclient.exe with a green check. Below it, indented, a list of direct imports: kernel32.dll in green, user32.dll in green, advapi32.dll in green, ole32.dll in green, and then a series of yellow question marks against api-ms-win-core-rtlsupport-l1-1-0.dll, api-ms-win-core-processthreads-l1-1-1.dll, and roughly thirty other apiset stubs that on Windows 7 SP1 simply do not exist as files on disk. Those yellow flags were noise. Depends.exe is famously bad at understanding that apisets are a contract, not a file.
Below the noise, near the bottom, was the real find. A line in red: MSVCR120.DLL, marked as not found, with a small footnote that read "Error opening file. The system cannot find the file specified (2)." Below it, a single child entry — also red — listed MSVCP120.DLL. That second line was the one that finally pointed us at the right answer, but not for the reason we thought.
What the table actually contained
What Margit produced on Tuesday afternoon, after about six hours with depends.exe and the kernel debugger, was a table. The table was the first artifact in the incident that did not contradict itself. We are reproducing it below, lightly cleaned up, because it is the clearest evidence that the question "where does this DLL come from?" has more than one answer at the same time.
| Module | Version | Arch | Size | Source | Role on customer host |
|---|---|---|---|---|---|
| posclient.exe | 6.1.7601.24545 | x86 | 11,734,016 | %ProgramFiles% | our binary |
| kernel32.dll | 6.1.7601.24214 | x86 | 1,165,824 | System32 | resolved |
| ntdll.dll | 6.1.7601.24545 | x86 | 1,310,224 | System32 | resolved |
| msvcr120.dll | 12.0.21005.1 | x86 | — | WinSxS (manifest) | not present |
| msvcr120.dll | 12.0.40664.0 | x86 | 974,952 | WinSxS (CRT update) | present, ignored |
| msvcp120.dll | 12.0.40664.0 | x86 | 532,592 | WinSxS (CRT update) | present, ignored |
| msvcr120.dll | 12.0.40664.0 | x86 | 974,952 | App folder (post-fix) | resolved (Tuesday 22:18) |
| api-ms-win-core-com-l1-1-0.dll | — | x86 | — | apiset (contract) | false flag (depends.exe) |
| vcruntime140.dll | 14.16.27033.0 | x86 | 87,816 | WinSxS (VC2017) | unrelated; on host |
| ucrtbase.dll | 10.0.10240.16384 | x86 | 1,012,440 | System32 (UCRT) | unrelated; KB2999226 |
Table 03-A · Customer host: Lenovo M73 Tiny, Win7 SP1 x86, build 7601, no Convenience Rollup, KB2999226 absent.
The thing that the table makes obvious — and that nothing in the depends.exe GUI made obvious — is that the customer machine had a perfectly good copy of msvcr120.dll sitting in WinSxS. It was the wrong version. Or rather, our manifest had asked for the wrong version. Three weeks of testing on developer machines never noticed because our developer machines had both versions side by side. Eight hundred QA-bot runs never noticed because the QA image was a long-lived snapshot that had Visual Studio 2013 in its installation history.
Depends.exe is a museum piece that nonetheless still tells you, in roughly twelve seconds, what every other tool takes forty minutes to confirm. Just do not believe the apiset rows.
— Margit Halvorsen · post-incident write-up, p. 6
Registry forensics, the hive that lied, and a SxS counter that has been wrong since 2014.
Two of the worst hours of the incident were spent staring at HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall on a customer's offline disk image. The customer — a Brazilian pharmacy chain running point-of-sale on Windows 7 SP1 x86, build 7601 with the April 2019 security-only rollup as their last update — had let us pull a forensic image off one of their machines. We mounted it read-only under \\.\Disk2 on a Windows 11 workstation and loaded the SOFTWARE hive with reg load HKLM\OFFLINE C:\mnt\Windows\System32\config\SOFTWARE.
The hive showed, in the Uninstall key, an entry labeled "Microsoft Visual C++ 2013 Redistributable (x86) - 12.0.40664" with a DisplayVersion of 12.0.40664.0 and an InstallDate of 20180317. The registry was telling us, with complete confidence, that this customer had installed the C++ 2013 runtime on the seventeenth of March 2018. The loader was telling us that the assembly version 12.0.21005.1 requested by our manifest was not available. Both were correct.
This is the part of the story that is hard to internalize until you have lived through it. The presence of an Uninstall entry says one thing, and one thing only: that an MSI installer ran once and registered a product code with Windows Installer. It says nothing about whether the runtime was uninstalled later, whether a system-restore operation rolled the SxS cache backward, whether a third-party "cleaner" tool removed files under WinSxS (looking at you, every "PC speedup" application from 2011 to 2017), or whether a manual installation of a single missing DLL into System32 bypassed the SxS publishing step entirely.
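One way to stop trusting the Uninstall key is to compare its claim with what is actually on disk in the mounted image. The sketch below assumes the WinSxS directory naming quoted earlier in this report (architecture, assembly name, public key token, version, culture, hash, joined by underscores) and an image mounted at some root such as C:\mnt. The function names are hypothetical, and this only inspects directory names; it does not verify the files inside them.

```python
from pathlib import Path


def crt_versions_on_disk(image_root, arch="x86", assembly="microsoft.vc120.crt"):
    """Return the set of CRT assembly versions found under Windows/WinSxS.

    Assumes directory names of the form
    <arch>_<name>_<publicKeyToken>_<version>_<culture>_<hash>.
    """
    winsxs = Path(image_root) / "Windows" / "WinSxS"
    versions = set()
    if not winsxs.is_dir():
        return versions
    prefix = f"{arch}_{assembly}_"
    for entry in winsxs.iterdir():
        name = entry.name.lower()
        if not (entry.is_dir() and name.startswith(prefix)):
            continue
        # Remaining fields: publicKeyToken, version, culture, hash.
        fields = name[len(prefix):].split("_")
        if len(fields) >= 2:
            versions.add(fields[1])
    return versions


def check_claim(display_version, requested_version, image_root):
    """Compare the registry's claim and the manifest's request with the disk."""
    on_disk = crt_versions_on_disk(image_root)
    return {
        "registry_claim_backed_by_files": display_version in on_disk,
        "manifest_request_satisfiable": requested_version in on_disk,
    }
```

For the case in this report, `check_claim("12.0.40664.0", "12.0.21005.1", image_root)` would be expected to report the first field true and the second false: the registry and the loader are both right, about different versions.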
The COMPONENTS hive, and why we left it alone
Late Tuesday, around 18:40, Theo proposed mounting the COMPONENTS hive directly. The COMPONENTS hive — C:\Windows\System32\config\COMPONENTS — is the database the Component Based Servicing (CBS) stack uses to track every package, payload, and assembly Windows knows about. On a healthy Windows 7 SP1 machine it has on the order of 180,000 keys and weighs roughly 220 megabytes. On the customer's image it was 312 megabytes, which is a soft indication that something has gone wrong with servicing at some point in the machine's life.
We loaded it. We regretted loading it. Within ten minutes the workstation was paging hard and RegEdit was showing the spinning circle. In another window Margit ran DISM /Image:C:\mnt /Get-Packages, which took eleven minutes to enumerate. The output was useful, but a long way from worth eleven minutes: it confirmed that KB2999226 (the Universal C Runtime update for Windows 7 SP1, dated June 2015) was not installed on the customer's machine, and that the Convenience Rollup KB3125574 from May 2016 was also not installed.
That is what we should have asked first. Most of our customers in Brazil and Vietnam were running Windows 7 SP1 in a state we politely call "as Microsoft published it in 2011, plus security-only updates." They had never received the Convenience Rollup. They had never received the various servicing stack updates that backported pieces of the SxS resolver. They were running, in 2026, a fifteen-year-old version of the loader.
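Asking that question first can be reduced to a small filter over saved DISM /Get-Packages output. The sketch below assumes the output consists of "Package Identity : ..." lines followed by "State : ..." lines, and that the KB number appears inside the package identity. That format is an assumption here, so check it against real output before relying on it. The function names and the default KB list are illustrative.

```python
import re

KB = re.compile(r"KB\d+", re.IGNORECASE)


def installed_kbs(dism_output):
    """Collect KB numbers from packages whose State line reads 'Installed'."""
    kbs = set()
    current = None
    for line in dism_output.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip().lower(), value.strip()
        if key == "package identity":
            # Remember the KB named by this package, if any.
            match = KB.search(value)
            current = match.group(0).upper() if match else None
        elif key == "state" and current and value.lower() == "installed":
            kbs.add(current)
    return kbs


def missing_kbs(dism_output, required=("KB2999226", "KB3125574")):
    """Return the required KBs that are not reported as installed."""
    present = installed_kbs(dism_output)
    return [kb for kb in required if kb not in present]
```

Run against a saved copy of the output, `missing_kbs(text)` returns the updates the host never received, in the order they were asked for.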
A timeline of the incident, as it actually unfolded
First P1 page from PagerDuty.
Customer in São Paulo opens a ticket at 07:14:32 EST. Title: "Software won't open after update." Screenshot attached. The screenshot will, in retrospect, contain the full answer. Three engineers are paged.
The antivirus theory.
Bitdefender Endpoint Security 7.2.1.96 is named as the suspect. We ask three customers to whitelist our install path. Two reply within an hour. The DLL still does not appear. We move on.
MSI theory dies on a clean VM.
Margit images Windows 7 SP1 with no rollups, runs our installer end-to-end, sees the same failure. The MSI is fine. The merge module is fine. The cabinet is fine. The DLL is in C:\Windows\WinSxS\x86_microsoft.vc120.crt_*_12.0.40664.0_* after install — but not where the loader wants it.
First look at the manifest.
Theo runs mt.exe -inputresource:posclient.exe;#1 -out:dumped.manifest and opens the file. The dependentAssembly entry shows version="12.0.21005.1". Nobody has changed this manifest since 2019. Theo posts a screenshot. Margit goes quiet for nine minutes.
The SxS cache hypothesis.
Margit, working out of Oslo at 04:08 local, posts a single message: "The loader is asking for an assembly version that was deprecated by VS2013 Update 5 in 2014. We have been shipping a manifest for a runtime that has not existed in the wild for eleven years."
Confirming with a kernel debugger.
We attach WinDbg 10.0.19041.1 to a frozen process on the test VM, set a breakpoint on ntdll!LdrpResolveDllName, and watch it walk the activation context. The resolver fails with STATUS_SXS_ASSEMBLY_NOT_FOUND (0xC0150004). Confirmation.
Hotfix candidate built.
We strip the dependentAssembly block from the manifest, embed the manifest with mt.exe, and ship msvcr120.dll and msvcp120.dll at version 12.0.40664.0 next to posclient.exe. Now the loader, finding no SxS request, falls through to the application directory and resolves locally.
Hotfix complete and signed.
Build 6.1.7601.24559 ships through the auto-updater. Open tickets drop from 1,284 to 71 in six hours. The remaining 71 are mostly customers whose machines have not phoned home since Monday morning.
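The hotfix step in the timeline, stripping the dependentAssembly block before re-embedding the manifest, can be sketched as follows. It assumes the usual manifest layout in which each dependentAssembly sits inside its own dependency element under the urn:schemas-microsoft-com:asm.v1 namespace, and it removes the whole dependency wrapper so that no empty element is left behind. `strip_crt_dependency` and `local` are hypothetical helpers, not the script the team used.

```python
import xml.etree.ElementTree as ET

ASM_NS = "urn:schemas-microsoft-com:asm.v1"


def local(tag):
    """Drop any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def strip_crt_dependency(manifest_xml, assembly_name="Microsoft.VC120.CRT"):
    """Return (new_xml, removed_count) with matching <dependency> blocks removed."""
    # Keep the asm.v1 namespace as the default one when serializing.
    ET.register_namespace("", ASM_NS)
    root = ET.fromstring(manifest_xml)
    removed = 0
    for parent in list(root.iter()):
        for child in list(parent):
            if local(child.tag) != "dependency":
                continue
            names = [
                e.get("name")
                for e in child.iter()
                if local(e.tag) == "assemblyIdentity"
            ]
            if assembly_name in names:
                parent.remove(child)
                removed += 1
    return ET.tostring(root, encoding="unicode"), removed
```

Other dependency entries are left untouched. Elements in other namespaces are serialized with generated prefixes, which is still well-formed XML but worth diffing against the original before the manifest is embedded again.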
Manifest hell, by the numbers. A small, opinionated tour of the wreckage.
The frozen manifest.
A dependentAssembly entry for a runtime that has not existed in the wild since the third quarter of 2014. The application builds. The application runs in QA. The application fails the moment it meets a machine that does not happen to have a Visual Studio installation in its history.
The unsigned redirect.
A team ships an app.exe.local file or a .config with probing privatePath entries pointing inside a subfolder the antivirus sometimes quarantines. Resolves on six machines, fails on the seventh. Survives until a new endpoint protection vendor is rolled out.
The architecture mismatch.
A 32-bit binary loaded inside a 64-bit host (Excel calling our COM add-in is the classic) finds the wrong CRT because the host's manifest wins. We have seen this exactly twice in production, both times during the first hour after a customer migrated from Office 2016 to Office 2019.
Three days after the hotfix shipped we sat down and tried to write a one-page document explaining what we had learned. The page kept growing. By Friday it was four pages, and by the following Tuesday it had become this article. The compressed version, for any engineer who is going to ship native code on Windows in the next decade, fits in three observations.
First, the manifest is part of the binary. Treat it that way. Embed it with mt.exe as a build step, version it in source control, run it through CI as a separate verification artifact. We had been treating ours as a file that lived next to the project, edited by hand once in 2019, and never reviewed. Nobody on the team in 2026 had been in the room when those two lines were written.
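A minimal sketch of what "run it through CI" could look like: a check over the manifest dumped from the built executable that fails when any dependentAssembly still names a VC++ CRT assembly. The name pattern, the function names, and the exit codes are assumptions for illustration; the input is taken to be the file produced by the mt.exe dump command shown in the timeline.

```python
import re
import sys
import xml.etree.ElementTree as ET

# Matches names such as Microsoft.VC120.CRT or Microsoft.VC90.DebugCRT.
CRT_NAME = re.compile(r"^Microsoft\.VC\d+\.(Debug)?CRT$", re.IGNORECASE)


def local(tag):
    """Drop any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def crt_requests(manifest_xml):
    """Return (name, version) for each CRT identity inside a dependentAssembly."""
    root = ET.fromstring(manifest_xml)
    found = []
    for dep in root.iter():
        if local(dep.tag) != "dependentAssembly":
            continue
        for ident in dep.iter():
            if local(ident.tag) != "assemblyIdentity":
                continue
            name = ident.get("name", "")
            if CRT_NAME.match(name):
                found.append((name, ident.get("version", "")))
    return found


def main(argv):
    """Exit status: 0 if clean, 1 if a CRT request is present, 2 on bad usage."""
    if len(argv) != 2:
        print("usage: check_manifest.py dumped.manifest", file=sys.stderr)
        return 2
    with open(argv[1], "rb") as fh:
        requests = crt_requests(fh.read())
    for name, version in requests:
        print(f"manifest requests SxS assembly {name} {version}")
    return 1 if requests else 0
```

Wired into a build as `sys.exit(main(sys.argv))`, a check like this would have flagged the 12.0.21005.1 entry on the first build after it was added, years before any customer saw it.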
Second, ship your runtime. The advice from Microsoft for nearly two decades has been to ship the official redistributable as a chained installer and let SxS do the work. The advice was good in a world where every customer machine received Windows Update reliably. In 2026, with a long tail of point-of-sale terminals, kiosks, embedded systems, and air-gapped LAN segments, the advice is at best half-right. The pragmatic answer is to embed a manifest that does not name an SxS assembly for your CRT, and to drop the DLLs next to your executable. The loader will find them. Customers who have a working SxS cache will not be harmed. Customers who do not will suddenly be unblocked.
Third, test on a machine that has never been a developer machine. We have, since this incident, kept three Hyper-V images permanently. One is Windows 7 SP1 x86 with security-only updates through April 2019, no Convenience Rollup. One is Windows 10 21H2 LTSC with all updates. One is Windows 11 24H2 freshly imaged. Every release runs end-to-end on all three before it is signed. The cost is forty minutes of CI per build. The cost of not doing it was sixty-three hours of incident response and a deeply embarrassing apology email to a pharmacy chain in São Paulo.
The whole story collapses into one sentence: we shipped a manifest that asked for an assembly version that had not existed in any redistributable for eleven years, and discovered the bug the day our customers did.
— Incident summary, line 1, paragraph 1
Questions readers always ask, with answers we have stopped equivocating about.
Q.01 Should I just static-link the CRT and stop thinking about this?
Q.02 Is shipping the DLL next to the executable really endorsed?
Q.03 What is the difference between vcruntime120.dll and msvcr120.dll?
Q.04 Could we have caught this with anything in CI?
Q.05 Is KB2999226 still relevant in 2026?
Q.06 Why not drop Windows 7 support entirely?
Q.07 What is the cheapest tool that would have shown us this in five minutes?
A checklist for shipping native dependencies, written in the kitchen at 03:00.
Most incidents end with a fix. The good ones end with a checklist. The best ones end with somebody volunteering to write the checklist down for the next team.
— Closing line, Bergen Desktop Platform meetup, 28 February 2026