Posted by Mateusz Jurczyk, Google Project Zero
When tackling a new vulnerability research target, especially a closed-source one, I
prioritize gathering as much information about it as possible. This gets especially interesting when
it’s a subsystem as old and fundamental as the Windows registry. In that case, tidbits of valuable data
can lurk in forgotten documentation, out-of-print books, and dusty open-source code – each potentially
offering a critical piece of the puzzle. Uncovering them takes some effort, but the payoff is often immense.
Scraps of information can contain hints as to how certain parts of the software are implemented, as well as
why – which design decisions led to certain outcomes, and so on. When seeing the big picture, it becomes much easier
to reason about the software, understand the intentions of the original developers, and think of the
possible corner cases. At other times, it simply speeds up the process of reverse engineering and saves the
time spent on deducing certain parts of the logic, if someone else has already put in the time and
effort.
One great explanation for how to go beyond the
binary and utilize all available sources of information was presented by Alex Ionescu in
the keynote of OffensiveCon 2019 titled “Reversing
Without Reversing”. My registry security audit did involve a lot of hands-on
reverse engineering too, but it was heavily supplemented with information not coming directly from
ntoskrnl.exe. And while Alex’s talk discussed researching Windows as a whole, this blog post provides a
concrete case study of how to apply these ideas in practice. The second goal of the post is to consolidate
all collected materials into a single, comprehensive summary that can be easily accessed by future
researchers on this topic. The full list may seem overwhelming as it includes some references to overlapping
information, so the ones I find key have been marked with the 🔑 symbol. I highly recommend reviewing
these resources, as they provide context that will be helpful for understanding future posts.
Microsoft Learn
Official documentation is probably the first and
most intuitive thing to study when dealing with a new API. For Microsoft, this means Microsoft Learn
(formerly MSDN Library), a vast body of technical information maintained for the benefit of Windows software
developers. It is wholly available online, and includes the following sections and articles devoted to the
registry:
- 🔑 Registry – the main page about all registry-related subjects. It contains a wealth of knowledge, and is a must read for anyone deeply interested in this system mechanism. It is divided into three sections:
  - About the Registry – provides an introduction to the registry and many of its fundamental concepts.
  - Using the Registry – provides several examples of how to perform certain common tasks using the Registry API in C++.
  - Registry Reference – includes complete documentation of all functions that make up the Registry API (see Registry Functions), and specifies the Registry element size limits.
- Windows registry information for advanced users – a separate article that discusses the principles of the registry. It appears to be somewhat outdated (the latest version mentioned is Windows Vista), and based on an old KB256986 article that can be traced back to at least 2004.
- Inside the Windows NT Registry and Inside the Registry – two articles published by Mark Russinovich in the Windows NT Magazine in 1997 and 1999, respectively.
- Windows 2000 Registry Reference – a web mirror of regentry.chm, an official help file bundled with the Windows 2000 Resource Kit. It includes a brief introduction to the registry followed by detailed descriptions of the standard registry content, i.e. keys and values used for advanced configuration of the system and applications.
- Windows Server 2003 Resource Kit Registry Reference – a similar, but more recent reference for Windows Server 2003.
- Using the Registry Functions to Consume Counter Data – information about collecting performance data through the registry pseudo-keys: HKEY_PERFORMANCE_DATA, HKEY_PERFORMANCE_TEXT and HKEY_PERFORMANCE_NLSTEXT.
- Offline Registry Library – complete documentation of the built-in Windows offreg.dll library, which can be used to inspect / operate on registry hives without loading them in the operating system.
- Registry system call documentation, e.g. ZwCreateKey – a reference guide to the kernel-mode support of the registry, which reveals numerous details about how it works internally and how the high-level API functions are implemented under the hood.
- Filtering Registry Calls – a set of eight articles detailing how to correctly implement registry callbacks as a kernel driver developer.
- CmRegisterCallbackEx function (wdm.h) – documentation of the CmRegisterCallbackEx routine used for registering callbacks. From there, one can browse to other relevant pages, such as the documentation of the callback function prototype, and further to the documentation of all of the operation-specific structures (such as REG_CREATE_KEY_INFORMATION).
- [MS-RRP]: Windows Remote Registry Protocol – technical specification of the RPC protocol used by the Remote Registry feature.
Blogs and online resources
Because the registry stores a substantial number of traces of user activity, it is a popular source of
information in forensic investigations. As a result, a number of articles and blog posts have been published
throughout the years, focusing on the internal hive structure, registry-related kernel objects, and recovering
deleted data. Below is the list of non-official registry resources I have managed to find online, from earliest
to latest:
- WinReg.txt, author unknown (signed as B.D.) – documentation of the hive binary formats in Windows 3.x (SHCC3.10), Windows 95 (CREG) and Windows NT (regf) based on reverse engineering. It was likely the first public write-up outlining the undocumented structure of the hives.
- Security Accounts Manager, author unknown (signed only with an email address) – a comprehensive article primarily focused on the user management internals in Windows 2000 and XP, dissecting a number of binary structures used by the SAM component. Since user and credential management is closely tied to the registry (all of the authentication data is stored there), the article also includes a “Registry Structure” section that explains the encoding of regf hive files.
- 🔑 Windows registry file format specification, Maxim Suhanov – a high-quality and relatively up-to-date specification of the regf format versions 1.3 to 1.6, with extra bits of information regarding the ancient versions 1.1 and 1.2.
- Windows NT Registry File (REGF) format specification, Joachim Metz – another independently developed specification of the regf format associated with the libregf library.
- Push the Red Button, Brendan Dolan-Gavitt (moyix) – a personal blog focused on security, reverse engineering and forensics. It contains a number of interesting registry-related posts dating back to 2007-2009.
- Windows Incident Response, Harlan Carvey – a technical blog dedicated to incident response and digital analysis of Windows, with a variety of posts dealing with the registry published between 2006 and 2022.
- My DFIR Blog, Maxim Suhanov – another blog concentrating on digital forensics with many mentions of the Windows registry. It provides some original information that’s hard to find elsewhere, see e.g. Containerized registry hives in Windows.
- Digging Up the Past: Windows Registry Forensics Revisited, David Via – a blog post by Mandiant discussing the recovery of data from registry hives and transactional logs.
- Creating Registry Links and Mysteries of the Registry, Pavel Yosifovich – two blog posts covering the creation of symbolic links in the registry, and its overall internal structure.
- Windows Registry, Wikipedia contributors – as usual, Wikipedia doesn’t disappoint, and even though the article includes few deeply technical details, it features extensive sections on the history of the registry, its high level design and role in the system.
Furthermore, The
Old New Thing is a fantastic, technical blog exploring the quirks, design
decisions, and historical context behind Windows features. It is written by a Microsoft employee of over 30
years, Raymond Chen, with an astounding consistency of one post per day. While the blog posts are not
technically documentation, they are very highly regarded in the community and
can be considered a de-facto Microsoft knowledge resource – only more fun than Microsoft Learn. Over
the course of the last 20+ years, Raymond would sometimes write about the registry, sharing interesting
behind-the-scenes stories and anecdotes concerning this feature. I have tried to find and compile all of the
relevant registry-related posts in the single list below:
- Why is a registry file called a “hive”? (August 8th, 2003)
- The long and sad story of the Shell Folders key (November 3rd, 2003)
- Beware of non-null-terminated registry strings (August 24th, 2004)
- The performance cost of reading a registry key (February 22nd, 2006)
- The .Default user is not the default user (March 2nd, 2007)
- Why are INI files deprecated in favor of the registry? (November 26th, 2007)
- How did registry keys work in 16-bit Windows? (January 17th, 2008)
- Why do registry keys have a default value? (January 18th, 2008)
- Why can’t you apply ACLs to registry values? (January 23rd, 2009)
- What is the terminology for describing the various parts of the registry? (February 4th, 2009)
- What the various registry data types mean is different from how they are handled (February 5th, 2009)
- The inability to lock someone out of the registry is a feature, not a bug (March 26th, 2009)
- Why is there the message ‘!Do not use this registry key’ in the registry? (March 22nd, 2011)
- Why is the registry a hierarchical database instead of a relational one? (September 7th, 2011)
- Cheap amusement: Searching for spelling errors in the registry (May 10th, 2012)
- What was the registry like in 16-bit Windows? (May 21st, 2012)
- Why does RegOpenKey sometimes (but not always) fail if I use two backslashes instead of one? (October 4th, 2012)
- Why do I get notified for changes to HKEY_CLASSES_ROOT when nobody is writing to HKEY_CLASSES_ROOT? (December 5th, 2012)
- RegNotifyChangeKeyValue sucks less (February 26th, 2015)
- So how bad is it that I’m calling RegOpenKey instead of RegOpenKeyEx? (January 20th, 2016)
- How can I change a registry key from within the debugger? (September 8th, 2016)
- If I simply want to create a registry key but don’t intend to do anything else with it, what security access mask should I ask for? (November 28th, 2016)
- Diagnosing why you cannot create a stable subkey under a volatile parent key (May 25th, 2017)
- How can I programmatically inspect and manipulate a registry hive file without mounting it? (October 15th, 2018)
- It rather involved being on the other side of this airtight hatchway: Messing with somebody’s registry (January 9th, 2019)
- Why doesn’t RegSetKeySecurity propagate inheritable ACEs, but SetSecurityInfo does? (January 2nd, 2020)
- The sad but short story of the SM_AccessoriesName registry value (March 10th, 2020)
- Why does RegNotifyChangeKeyValue stop notifying once the key is deleted? (May 7th, 2020)
- How can I emulate the REG_NOTIFY_THREAD_AGNOSTIC flag on systems that don’t support it? part 1 (December 21st, 2020)
- How can I emulate the REG_NOTIFY_THREAD_AGNOSTIC flag on systems that don’t support it? part 2 (December 22nd, 2020)
- How can I emulate the REG_NOTIFY_THREAD_AGNOSTIC flag on systems that don’t support it? part 3 (December 23rd, 2020)
- How can I emulate the REG_NOTIFY_THREAD_AGNOSTIC flag on systems that don’t support it? part 4 (December 24th, 2020)
- How can I emulate the REG_NOTIFY_THREAD_AGNOSTIC flag on systems that don’t support it? part 5 (December 25th, 2020)
- The history of passing a null pointer as the key name to RegOpenKeyEx (July 23rd, 2021)
- On the failed unrealized promise of RegOverridePredefKey (October 20th, 2023)
Academic papers and presentations
Recovering meaningful artifacts from the registry during digital forensics is also a
problem known in academia. To find relevant works, I often begin by typing the titles of a few known papers
in Google
Scholar, and then delve into a breadth-first search
of their bibliographies. Here’s what I managed to find pertaining to the registry:
- Forensic analysis of the Windows registry in memory (2008), Brendan Dolan-Gavitt – a paper detailing techniques for extracting and analyzing Windows registry data from physical memory dumps.
- Recovering deleted data from the Windows registry (2008), Timothy D. Morgan – a paper and accompanying slide deck that examine deleted registry data structures in NT-based Windows systems, propose an algorithm for their recovery, and introduce the RegLookup tool to implement this recovery process.
- Forensic analysis of unallocated space in Windows registry hive files (2008), Jolanta Thomassen – a 63-page MSc dissertation demonstrating the feasibility of recovering deleted or updated Windows registry keys from unallocated space within hive files.
- The internal structure of the Windows Registry (2009), Peter Norris – a 144-page MSc thesis focusing on the reconstruction of damaged registry files, analysis of historical states, and the extraction of standalone forensic evidence from dispersed fragments.
- The Windows NT Registry File Format (2009), Timothy D. Morgan – a concise paper providing a comprehensive description of the regf format data structures.
- Windows Kernel Internals: NT Registry Implementation (2009), David B. Probert – a slide deck discussing the registry internals through the lens of the Windows kernel, offering a unique perspective of a Windows kernel developer.
Open-source software
To paraphrase a famous saying, source code is
worth a thousand words. Sometimes it is far easier to grasp
a concept or design by looking straight at code instead of reading an English explanation. And while the
canonical implementation of the registry is the Windows kernel, a number of open-source projects have been
developed over the years to operate on registry hives. They are typically either based on regf format
analysis performed by the developers themselves, or on existing documentation and other open-source tools. The
three main reasons for their existence are a) computer forensics, b) emulating Windows behavior on other
host platforms, and c) directly accessing the SAM hive to change/reset local user credentials. Whatever the
reason, such projects may prove useful in better understanding the internal hive format, and help in
building proof-of-concept hives if necessary. A list of all the relevant open-source libraries and utilities
I have found is shown below:
- libregf – a library written in C with Python bindings,
- hivex – a library written in C as part of the libguestfs project, with bindings for OCaml, Perl, Python and Ruby,
- cmlib – a module implemented in C as part of ReactOS, which closely resembles the Windows implementation,
- chntpw (The Offline Windows Password Editor) – a tool developed in C between 1997 and 2014 to manage Windows user passwords offline directly in the SAM hive. The registry-related code is located in ntreg.c (regf parser) and reged.c (a basic registry editor),
- Samba – the Samba project includes yet another implementation of the Windows registry (under source3/registry and source4/lib/registry),
- regipy – a Python registry hive parsing library and accompanying tools,
- yarp – literally yet another registry parser (in Python),
- Registry – a hive parser written in C#,
- nt-hive – a hive parser written in Rust (with read-only capabilities),
- Notatin – another hive parser written in Rust, including Python bindings and helper binaries.
Lastly, at the time of this writing, simply searching for some internal kernel
function names on GitHub might
reveal how certain functionality was implemented in Windows itself 20+ years ago.
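To give a flavor of what the hive parsers listed above do at the lowest level, here is a minimal Python sketch that builds and then parses a regf base block (the hive header). The field offsets and the checksum rules reflect my reading of the public format specifications, so treat them as assumptions. The helper names (`build_base_block`, `parse_base_block`, `xor32_checksum`) are hypothetical and do not come from any of the listed projects.

```python
import struct

BASE_BLOCK_SIZE = 4096   # assumed: the base block occupies the first 4 KiB of a hive
CHECKSUM_OFFSET = 508    # assumed: XOR-32 checksum of the preceding 508 bytes
HEADER_FORMAT = "<IIQIIIIIII"  # fields located at offsets 4..47, little-endian


def xor32_checksum(data: bytes) -> int:
    """XOR the first 508 bytes together as 127 little-endian dwords."""
    checksum = 0
    for (dword,) in struct.iter_unpack("<I", data[:CHECKSUM_OFFSET]):
        checksum ^= dword
    # Two special cases described in the format specifications (assumption).
    if checksum == 0xFFFFFFFF:
        return 0xFFFFFFFE
    if checksum == 0:
        return 1
    return checksum


def build_base_block(major: int = 1, minor: int = 5, root_cell: int = 0x20) -> bytes:
    """Create a synthetic base block, for demonstration purposes only."""
    block = bytearray(BASE_BLOCK_SIZE)
    block[0:4] = b"regf"
    # seq1, seq2, timestamp, major, minor, type, format, root cell, bins size, clustering
    struct.pack_into(HEADER_FORMAT, block, 4, 1, 1, 0, major, minor, 0, 1, root_cell, 0x1000, 1)
    struct.pack_into("<I", block, CHECKSUM_OFFSET, xor32_checksum(bytes(block)))
    return bytes(block)


def parse_base_block(data: bytes) -> dict:
    """Decode the leading fields of a base block and verify its checksum."""
    if len(data) < 512 or data[:4] != b"regf":
        raise ValueError("not a regf base block")
    (seq1, seq2, _timestamp, major, minor, _file_type, _file_format,
     root_cell, bins_size, _clustering) = struct.unpack_from(HEADER_FORMAT, data, 4)
    (stored_checksum,) = struct.unpack_from("<I", data, CHECKSUM_OFFSET)
    return {
        "version": (major, minor),
        "root_cell_offset": root_cell,
        "hive_bins_size": bins_size,
        # Matching sequence numbers are generally taken to mean the last write completed.
        "clean": seq1 == seq2,
        "checksum_ok": stored_checksum == xor32_checksum(data),
    }


print(parse_base_block(build_base_block()))
```

To inspect a real hive, the same `parse_base_block` function could be given the first 4096 bytes of a hive file. For anything beyond the header, the libraries listed above are a far better starting point than this sketch.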
SDK Headers
Header files distributed with Software Development Kits are an interesting case, because on one hand
they are an official resource with information that Microsoft intends the developers to use, but on the
other – they are a bit more concealed, as online documentation isn’t always kept up to date with
regard to their contents. We can thus explore their local copies on disk and sometimes find artifacts
(function declarations, structure definitions, comments) that are not publicly documented online. Some of
the headers most relevant to the registry are:
- winreg.h (user-mode) – the primary registry header on the list, containing the prototypes of functions and structures from the official Registry API.
- wdm.h (kernel-mode) – specifies a number of interesting constants/flags and types used by the system call interface of the registry, for example hive load flags (third argument of NtLoadKey2, such as REG_LOAD_HIVE_OPEN_HANDLE etc.) or key/value query structures (KEY_TRUST_INFORMATION, KEY_VALUE_LAYER_INFORMATION, etc.).
- ntddk.h (kernel-mode) – contains some types not found elsewhere, e.g. KEY_LAYER_INFORMATION.
- winnt.h (user-mode) – mostly equivalent to wdm.h.
- winternl.h (user-mode) – contains the declarations of some registry-related system calls (NtRenameKey, NtSetInformationKey).
Security research
Learning about prior security research can be
especially useful when starting a new project yourself. Not only does it often reveal deep technical details
about the target, but it also comes from like-minded professionals who look at the code through a security
lens, and may inspire ideas of further weaknesses or areas that require more attention. When it comes to the
registry, I think that relatively little work has been done in the public space compared to its high
complexity and the pivotal role it plays in the Windows operating system. Nevertheless, there were some
materials that I found extremely insightful, especially those by my colleague James Forshaw from Project
Zero. The full list of security-relevant resources I have managed to gather on this topic is shown below
(including some references to my own publications from the past):
- Case study of recent Windows vulnerabilities (2010), Gynvael Coldwind, Mateusz Jurczyk – a presentation on several security bugs Gynvael and I found during our brief registry research in 2009/2010.
- Microsoft Kernel Integer Overflow Vulnerability (2016), Honggang Ren – a write-up on CVE-2016-0070, a Windows kernel vulnerability in the loading of malformed hive files.
- Project Zero bug tracker (2016), James Forshaw, Mateusz Jurczyk – four bug reports submitted to Microsoft as a result of naive registry hive fuzzing.
- Project Zero bug tracker (2014-2020), James Forshaw – 17 vulnerabilities related to the registry discovered by James, many of which are logic issues at the intersection of registry and other system mechanisms (security impersonation, file system).
Books
For a 20+ year old codebase such as the registry, it is expected that some resources
covering it in the early days were published on paper rather than on the Internet. For this reason, part of
my standard routine is to search Google
Books for various technical terms and keywords
related to the specific technology and see what pops up. For the registry, these could be e.g.
“regedit”, “regf”, “hbin”, “LOG1”, “RegCreateKey”,
“NtCreateKey”, “HvAllocateCell”, “\Registry\Machine”, “key control
block” and so on. In some cases this yields books with unique, strictly technical information, while in
others the most insightful part is the historical perspective and being able to see how the given technology
was perceived soon after it first came out. And sometimes the value of the book is a complete surprise until
it arrives in the mailbox, as it is neither offered for sale as an ebook nor has a preview available in Google
Books, and so a hard copy is required.
The books that I found which are either fully or
partially dedicated to the Windows registry are (latest to oldest):
- 🔑 Windows Internals (Part 2, 7th Edition) by Andrea Allievi, Alex Ionescu, Mark E. Russinovich, David A. Solomon – the Windows Internals series is an in-depth technical guide that delves into the architecture, components, and underlying mechanisms of the operating system. The latest edition, covering Windows 10, features a dedicated 35-page chapter on the registry, and explores many technical details that are difficult to find elsewhere. Notably, the registry has been covered in the book since Windows Internals 4 (corresponding to Windows XP/Server 2003), with explanations progressively expanding in subsequent editions. Comparing these chapters could be an interesting exercise to observe how the registry has evolved throughout the years.
- Windows Registry Forensics by Harlan Carvey
- Microsoft Windows Registry Guide by Jerry Honeycutt
- Managing the Windows NT Registry and Managing the Windows 2000 Registry by Paul E. Robichaux
- Inside the Registry for Microsoft Windows 95 by Günter Born
- Inside the Windows 95 Registry by Ron Petrusha
Patents
Another useful source of information that may be otherwise difficult to find are
patents, indexed by Google
Patents. A particularly valuable result that I found this way is 🔑 Containerized Configuration (US20170279678A1), Microsoft’s patent from 2016 that thoroughly
explains the core concepts behind differencing hives and layered keys in the registry. These mechanisms are part
of a new feature introduced in Windows 10 Anniversary Update to better support containerization, but
official documentation of how it works is nowhere to be found. The patent is thus a great aid in
understanding the intricate aspects of this new registry functionality, adding the necessary context and
helping to make sense of otherwise highly cryptic kernel functions like
CmpPrepareDiscardAndReplaceKcbAndUnbackedHigherLayers.
Manual analysis
So far, all of the resources we’ve discussed
were accessible through a web browser, a text editor, or in physical form. But there is another type of
information source that is equally, if not more, important, and that requires more specialized tooling to
make sense of it. What I mean by that is the knowledge we can extract from the executable images in Windows
responsible for handling the registry, both in terms of the “standard” reverse-engineering and
also fully taking advantage of any helpful artifacts in or around them. I’ll write more about the
hands-on reversing process in upcoming posts, and now we will turn our attention to those artifacts that
present us with clear-cut information without the need for deduction.
On a side note, by far the most essential file to be looking at is ntoskrnl.exe, the core NT kernel
image. It contains the entirety of the kernel-space registry implementation and is of interest both from the
security and functional perspective. I have personally spent 99% of my manual analysis time looking at that
particular binary, but it’s worth noting that there are a few other executables and libraries related to
the registry as well:
- winload.exe – the Windows Boot Loader, which executes before the Windows kernel. One of its responsibilities is to load the SYSTEM hive into memory and read some configuration from it, so it includes a partial copy of the registry code from ntoskrnl.exe.
- offreg.dll – the Offline Registry Library, which also shares some registry code with the kernel (but executes in user-mode).
- kernelbase.dll – one of the primary WinAPI libraries, implementing a majority of the user-space Registry API.
- ntdll.dll – another core user-mode library which provides a bridge between the Registry API and the kernel registry implementation.
- regsvc.dll – a DLL implementing the Remote Registry Service.
Let’s investigate what types of information about the
registry are readily available to us by
running a disassembler/decompiler. I personally use IDA Pro + Hex-Rays and so the examples below are based
on them.
🔑 Public symbols (PDB)
Microsoft makes public symbols available for a majority of executable images found in
C:\Windows, for the benefit of developers and security researchers. By “public” symbols I mean PDB
files that mainly contain the names of functions found in the binaries, which help in symbolizing system
stack traces during debugging or in the event of a crash. In the past, the symbols used to be bundled with
the system installation media or on a separate Resource Kit disc, and later they were available for download
in the form of pre-packaged archives from the Microsoft website. Both of these channels have been
deprecated, and currently the only supported way to obtain the symbols is on a per-file basis from the
Microsoft
Symbol Server. The PDB files can be downloaded directly with the official SymChk tool, or indirectly through software that supports the symbol
server (e.g. IDA Pro, WinDbg).
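For readers curious about what "per-file" retrieval looks like, the sketch below builds the download URL for a single PDB. It assumes the commonly observed symbol server path layout: the PDB name, then the GUID as 32 uppercase hex digits immediately followed by the age in hex, then the PDB name again. The `pdb_download_url` helper is hypothetical, and the GUID and age in the example are placeholders rather than values from a real kernel build. In practice both come from the CodeView (RSDS) debug record of the executable.

```python
import uuid

SYMBOL_SERVER = "https://msdl.microsoft.com/download/symbols"


def pdb_download_url(pdb_name: str, guid: uuid.UUID, age: int) -> str:
    """Build a symbol server URL from a PDB name, GUID and age (assumed layout)."""
    # The GUID is written without dashes in upper case, followed by the age in hex.
    signature = f"{guid.hex.upper()}{age:X}"
    return f"{SYMBOL_SERVER}/{pdb_name}/{signature}/{pdb_name}"


# Placeholder identifiers for illustration only.
example_guid = uuid.UUID("0123456789abcdef0123456789abcdef")
print(pdb_download_url("ntkrnlmp.pdb", example_guid, 1))
```

This only constructs the URL and downloads nothing. When reading the GUID from a PE file in its on-disk byte order, `uuid.UUID(bytes_le=...)` should produce the canonical form expected here. Tools such as SymChk, WinDbg and IDA Pro perform these steps automatically.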
As for ntoskrnl.exe specifically, its accompanying symbols are one of the most
valuable sources of information. As mentioned in an earlier post, the Windows kernel follows a consistent
naming convention, so we can immediately see which internal routines are related to the registry, and locate
the entry points (registry-related system call handlers) from which we might start our analysis. It shows
us the extent of the code we are dealing with (1000+ registry functions) and makes it possible to perform
analysis such as the one shown in blog
post #2 (counting lines of code per system version) out of the box, without doing
any reverse engineering work. And perhaps most importantly, the function names make it substantially easier
to reason about the code while doing the actual reversing, especially for functions with very descriptive
names, like CmpCheckAndFixSecurityCellsRefcount or
CmpPromoteSingleKeyFromParentKcbAndChildKeyNode.
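As a rough illustration of how the naming convention helps with triage, the following Python sketch groups a list of symbol names by prefix. The prefix meanings are my assumptions and not official documentation: Cm/Cmp for the configuration manager, Hv/Hvp for hive management, a trailing "p" for private helpers, and Nt...Key for system calls. The `group_by_prefix` helper is hypothetical. The sample names are taken from this post, whereas a real run would use the full list exported from the PDB.

```python
import re
from collections import defaultdict

# Assumed prefix conventions; order matters, so "Cmp" is tested before "Cm".
CATEGORIES = [
    ("Cmp (configuration manager, private)", re.compile(r"^Cmp[A-Z]")),
    ("Cm (configuration manager)", re.compile(r"^Cm[A-Z]")),
    ("Hvp (hive management, private)", re.compile(r"^Hvp[A-Z]")),
    ("Hv (hive management)", re.compile(r"^Hv[A-Z]")),
    ("Nt*Key (system call handlers)", re.compile(r"^Nt\w*Key\w*$")),
]


def group_by_prefix(names):
    """Bucket symbol names by the first matching pattern; the rest go to 'other'."""
    groups = defaultdict(list)
    for name in names:
        for label, pattern in CATEGORIES:
            if pattern.search(name):
                groups[label].append(name)
                break
        else:
            groups["other"].append(name)
    return dict(groups)


symbols = [
    "CmpCheckKey",
    "CmpCheckAndFixSecurityCellsRefcount",
    "CmRegisterCallbackEx",
    "HvAllocateCell",
    "NtCreateKey",
    "NtRenameKey",
    "RtlAssert",
]

for label, members in group_by_prefix(symbols).items():
    print(f"{label}: {len(members)}")
```

Counting the members of each bucket across several kernel versions is one cheap way to see where the registry code base has grown over time.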
The other type of information we can find in the
kernel debug symbols are types: enums, structures and unions. However, there are two caveats. First, only
some types are included in the PDBs, and it’s not clear what criteria Microsoft uses to decide whether
to publish them or not. My rough estimate is that ~50% of the registry types can be found there, mostly the
fundamental ones. Secondly, even though the prototypes of some types are in the symbols, neither the
function arguments nor local variables are annotated with their types, so it is still necessary to determine
the corresponding types and manually annotate the variables for the decompiled output to make any sense.
Nevertheless, having access to this information is still a huge help both in understanding code on a local
level and also grasping the bigger picture.
The structures that can be found in the public
symbols are:
- Hive descriptors (HHIVE, CMHIVE) and related structures
- Hive bin and cell structures (HBIN, CM_KEY_NODE, CM_KEY_VALUE, CM_KEY_SECURITY, …)
- Key object related structures (CM_KEY_BODY, CM_KEY_CONTROL_BLOCK, …)
- Some transaction related structures (CM_TRANS, CM_KCB_UOW, …)
- Some layered-key related structures (CM_KCB_LAYER_INFO, …)
Meanwhile, the ones that are missing and need to
be manually reconstructed are:
- The parse context and path information structures (as used by CmpParseKey)
- Some transaction related structures (on-disk transaction log records, lightweight transaction object descriptors, …)
- Virtualization-related structures
- Most layered-key related structures
Most of the relevant type names start with
“CM”, so it’s easy to find them in the Local Types window in IDA.
I would like to take this opportunity to thank Microsoft for making
the symbols available for download, and encourage other vendors to do the same for their products. 🙂
Debug/Checked builds of Windows
Microsoft used to publish debug/checked builds of Windows (in addition to
“free” builds) from Windows NT to early Windows 10. The difference between
them was that the debug/checked builds had some compiler optimizations disabled, and they enabled extra
debugging checks to identify internal system state inconsistencies as early as possible. The developers of
kernel-mode drivers were encouraged to test them on debug/checked Windows builds before considering them as
stable and shipping them to customers. Unfortunately, these special builds have been
discontinued and are no longer available for the latest versions of Windows 10 and 11.
These old builds can be quite valuable in the context of reverse engineering, because
the extra checks may reveal some invariants and assumptions that the code makes, but which are not obvious
when looking at retail builds. What is more, the checks are often verbose and include calls to functions
like RtlAssert, DbgPrint,
DbgPrintEx etc., passing a textual
representation of the failed assertion, the source code file name and/or the line number. These may disclose
the names of variables, structure members, enums, constants and other types of information. Let’s see
some examples:
DbgPrintEx(DPFLTR_CONFIG_ID, 24u, "\tImplausible size %lx\n", v13);
DbgPrintEx(DPFLTR_CONFIG_ID, 24u, "\tKey is bigger than containing cell.\n");
DbgPrintEx(DPFLTR_CONFIG_ID, 0, "invalid name starting with NULL on key %08lx\n", a3);
DbgPrintEx(DPFLTR_CONFIG_ID, 0, "invalid (ODD) name length on key %08lx\n", a3);
DbgPrintEx(DPFLTR_CONFIG_ID, 24u, "\tNo key signature\n");
DbgPrintEx(DPFLTR_CONFIG_ID, 0, "\tData:%08lx - unallocated Data\n", v20);
DbgPrintEx(DPFLTR_CONFIG_ID, 24u, "Class:%08lx - Implausible size \n", v20);
DbgPrintEx(DPFLTR_CONFIG_ID, 24u, "SecurityCell is HCELL_NIL for (%p,%08lx) !!!\n", a1, v67);
DbgPrintEx(DPFLTR_CONFIG_ID, 24u, "SecurityCell %08lx bad security for (%p,%08lx) !!!\n", v86, a1, v73);
DbgPrintEx(DPFLTR_CONFIG_ID, 0, "Root cell cannot be a symlink or predefined handle\n");
DbgPrintEx(DPFLTR_CONFIG_ID, 0, "invalid flags on root key %lx\n", v31);
DbgPrintEx(DPFLTR_CONFIG_ID, 24u, "\tWrong parent value.\n");
The CmpCheckKey function is responsible for verifying the structural correctness
of every key in a newly loaded hive, and for every problem it encounters, it prints a more or less verbose
message. This can help us better understand what each of these checks is intended to accomplish.
DbgPrintEx(DPFLTR_CONFIG_ID, 0, "CmKCBToVirtualPath ==> Could not get name even from parent KCB = %p!!!!\n", a1);
This message can be interpreted as some kind of a fallback mechanism failing when
converting a registry path. It could indicate an interesting/brittle code construct, and indeed, the
surrounding code did turn out to be affected by a 16-bit integer overflow and a resulting pool memory
corruption (reported in Project Zero issue
#2341). In consequence, the entire block of code (including the vulnerability) was
removed, as it was functionally redundant and didn’t serve any practical purpose.
RtlAssert("(*VirtContext) & CMP_VIRT_IDENTITY_RESTRICTED", "minkernel\\ntos\\config\\cmveng.c", 3554u, 0i64);
This single line of code in CmpIsSystemEntity reveals
a few pieces of information: the name of the function argument (VirtContext), an
internal name of a flag that is not documented in any other resources
(CMP_VIRT_IDENTITY_RESTRICTED), and the source file name and line number of the
expression (minkernel\ntos\config\cmveng.c:3554). Such information can be ported into our main disassembler database
(such as an .idb) and later help us better understand other areas of code that use the same
object/flags.
DbgPrintEx(DPFLTR_CONFIG_ID, 22u, "Error[1] %lx while processing CmLogRecActionDeleteKey\n", v12);
This and similar calls in CmpDoReDoRecord inform us of
the internal names of the transaction record types (CmLogRecActionCreateKey,
CmLogRecActionDeleteKey etc.), which again
are not publicly mentioned anywhere else.
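Harvesting such names by hand gets tedious when a checked build contains hundreds of these calls, so part of the work can be scripted. The Python sketch below takes decompiler output lines like the ones quoted above and extracts the assertion expression, source file and line number from RtlAssert calls, plus identifier-like tokens found inside string literals. The regular expressions and the `harvest` helper are my own assumptions about what is worth collecting, and they would likely need tuning for a real database.

```python
import re

# RtlAssert("<expression>", "<file>", <line>u, ...) as rendered by the decompiler.
ASSERT_RE = re.compile(
    r'RtlAssert\("(?P<expr>[^"]*)",\s*"(?P<file>[^"]*)",\s*(?P<line>\d+)u?'
)
# Assumed shapes of interesting names: CM_/CMP_ constants and CamelCase Cm* identifiers.
IDENT_RE = re.compile(r"\b(?:CMP?_[A-Z0-9_]+|Cm[A-Z][A-Za-z0-9]+)\b")
STRING_RE = re.compile(r'"([^"]*)"')


def harvest(lines):
    """Return (assertions, identifiers) recovered from decompiled call lines."""
    assertions, identifiers = [], set()
    for line in lines:
        match = ASSERT_RE.search(line)
        if match:
            assertions.append({
                "expression": match["expr"],
                # Collapse the escaped backslashes of the C string literal.
                "file": match["file"].replace("\\\\", "\\"),
                "line": int(match["line"]),
            })
        # Only look inside string literals, so decompiler variable names are ignored.
        for literal in STRING_RE.findall(line):
            identifiers.update(IDENT_RE.findall(literal))
    return assertions, sorted(identifiers)


SAMPLE = [
    r'RtlAssert("(*VirtContext) & CMP_VIRT_IDENTITY_RESTRICTED", '
    r'"minkernel\\ntos\\config\\cmveng.c", 3554u, 0i64);',
    r'DbgPrintEx(DPFLTR_CONFIG_ID, 22u, '
    r'"Error[1] %lx while processing CmLogRecActionDeleteKey\n", v12);',
]

found_asserts, found_names = harvest(SAMPLE)
print(found_asserts)
print(found_names)
```

The recovered names can then be applied as comments, enum members or argument names in the database of the corresponding retail build.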
Debugging and experimentation
Poking and prodding the registry of a running
Windows system is the last way of learning about it that comes to my mind. In some sense it is a required
step, because we can only get so far by reading static documentation and code. At some point, we will be
forced to investigate the real memory mappings corresponding to the hives, explore the contents of in-memory
registry objects, or verify that a specific function behaves the way we think it does. Thankfully, there are
some tools that make it possible to peek into the internal registry state beyond what the standard utilities
like Regedit allow. They are briefly described in the sections below.
Extended Regedit alternatives
The built-in Regedit.exe utility offers quite basic functionality, and while it is
adequate for most tinkering and system administration purposes, some third party developers have created
custom registry editors with an extended set of options. I haven’t personally used them so
I cannot attest to their quality, but they may offer some benefits to other
researchers. One example is Total
Registry, whose main advantage is being able to
browse the internal registry tree structure (rooted in \Registry) in addition to the standard high-level
view with the five HKEY_* root keys.
Process Monitor
Process
Monitor is a part of the Sysinternals suite of
utilities, and is a widely known program for monitoring all file system, registry and process/thread
activity in Windows in real time. Of course in this case, we are specifically interested in registry
monitoring. For every operation taking place, we can see a corresponding line in the output window, which
specifies the time, type of operation, originating process, registry key path, result of the operation and
other details (all of this is highly configurable).
ProcMon is a great tool for exploring what the registry is like as an interface, and
how applications in the system use it. It is the most helpful when dealing with logical bugs, and attacking
more privileged processes through the registry rather
than attacking the registry implementation itself. For example, I used it to find a suitable exploitation
primitive for Project Zero issue
#2492, which allowed me to demonstrate that predefined keys were inherently insecure,
leading to their deprecation. One of its advantages is that it works out-of-the-box without any special
system configuration (other than the admin rights required to load a driver), and it’s certainly a
must-have in a researcher’s toolbox.
🔑 WinDbg and the !reg extension
WinDbg attached as a kernel debugger to a test (virtual) machine is the ultimate tool
to explore the inner workings of the Windows kernel. I have used it extensively at every step of my
research, to analyze how the registry works, reproduce any bugs that I found, and develop reliable
proof-of-concept exploits. While its standard debugger functionality is powerful enough for most tasks, it
also comes with a dedicated !reg extension that automates the process of traversing registry-specific
structures and presents them in an accessible way. The full list of its options is shown below:
reg <command>       <params>                       - Registry extensions
    querykey|q      <FullKeyPath>                  - Dump subkeys and values
    keyinfo         <HiveAddr> <KnodeAddr>         - Dump subkeys and values, given knode
    kcb             <Address>                      - Dump registry key-control-blocks
    knode           <Address>                      - Dump registry key-node struct
    kbody           <Address>                      - Dump registry key-body struct
    kvalue          <Address>                      - Dump registry key-value struct
    valuelist       <HiveAddr> <KnodeAddr>         - Dumps list of values for a particular knode
    subkeylist      <HiveAddr> <KnodeAddr>         - Dumps list of subkeys for a particular knode
    baseblock       <HiveAddr>                     - Dump the baseblock for the specified hive
    seccache        <HiveAddr>                     - Dump the security cache for the specified hive
    hashindex       <HiveAddr> <conv_key>          - Find the hash entry given a Kcb ConvKey
    openkeys        <HiveAddr|0>                   - Dump the keys opened inside the specified hive
    openhandles     <HiveAddr|0>                   - Dump the handles opened inside the specified hive
    findkcb         <FullKeyPath>                  - Find the kcb for the corresponding path
    hivelist                                       - Displays the list of the hives in the system
    viewlist        <HiveAddr>                     - Dump the pinned/mapped view list for the specified hive
    freebins        <HiveAddr>                     - Dump the free bins for the specified hive
    freecells       <BinAddr>                      - Dump the free cells in the specified bin
    dirtyvector     <HiveAddr>                     - Dump the dirty vector for the specified hive
    cellindex       <HiveAddr> <cellindex>         - Finds the VA for a specified cell index
    freehints       <HiveAddr> <Storage> <Display> - Dumps freehint info
    translist       <RmAddr|0>                     - Displays the list of active transactions in this RM
    uowlist         <TransAddr>                    - Displays the list of UoW attached to this transaction
    locktable       <KcbAddr|ThreadAddr>           - Displays relevant LOCK table content
    convkey         <KeyPath>                      - Displays hash keys for a key path input
    postblocklist                                  - Displays the list of threads which have 1 or more postblocks posted
    notifylist                                     - Displays the list of notify blocks in the system
    ixlock          <LockAddr>                     - Dumps ownership of an intent lock
    finalize        <conv_key>                     - Finalizes the specified path or component hash
    dumppool        [s|r]                          - Dump registry allocated paged pool
                    s - Save list of registry pages to temporary file
                    r - Restore list of registry pages from temp. file
As we can see, the extension offers a wide selection of commands related to various
components of the registry: hives, keys, values, security descriptors, transactions, notifications and so
on. I have found many of them to be immensely useful, either on a regular basis (e.g. querykey, kcb,
hivelist), or for more specialized tasks when experimenting with
a particular feature (e.g. translist, uowlist for
transactions).
The best way to discover its potential is to see it in action on a specific example.
I used a Windows 11 guest system for this purpose. Let’s query an existing
HKEY_LOCAL_MACHINE\Software\DefaultUserEnvironment key to find out more about it:
kd> !reg querykey \Registry\Machine\Software\DefaultUserEnvironment
Found KCB = ffff888788731ad0 :: \REGISTRY\MACHINE\SOFTWARE\DEFAULTUSERENVIRONMENT
Hive         ffff88877af5c000
KeyNode      000001e6ed0334b4
[ValueType]         [ValueName]         [ValueData]
REG_EXPAND_SZ       Path                %USERPROFILE%\AppData\Local\Microsoft\WindowsApps;
REG_EXPAND_SZ       TEMP                %USERPROFILE%\AppData\Local\Temp
REG_EXPAND_SZ       TMP                 %USERPROFILE%\AppData\Local\Temp
Here, we have referenced the key by its internal NT object manager registry path
starting with \Registry. The relation between the high-level paths known from
Regedit / the Registry API and the internal paths used by the kernel will be detailed in a future post
– for now, we just need to know that these paths are equivalent. We can learn a few things from the
command output: the key is cached in memory and the KCB (Key Control
Block, represented by the CM_KEY_CONTROL_BLOCK structure) is
located at address 0xffff888788731ad0. The address of the SOFTWARE hive descriptor
is 0xffff88877af5c000, and that’s where the
HHIVE / CMHIVE structures are stored.
HHIVE is the first member of CMHIVE at offset 0,
hence why their addresses line up, similar to how the KPROCESS /
EPROCESS structures work. Furthermore, the key node
(CM_KEY_NODE), the definitive representation of a key within the hive file, is
mapped at address 0x1e6ed0334b4. You may
notice that this is a user-mode address, and that’s because in modern versions of Windows, hive files
are generally operated on via section-based mappings within the user address space of a thin
“Registry” process (you can find it in Task Manager). Lastly, we can see that the key has three
values and we are provided with their types, names and data.
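The "first member at offset 0" relationship between HHIVE and CMHIVE can be demonstrated outside the kernel. In the sketch below, the field lists of both structures are placeholders made up purely for illustration (the real layouts are much larger and are not reproduced here); only the nesting of HHIVE as the first member of CMHIVE comes from the description above.

```python
import ctypes


class HHIVE(ctypes.Structure):
    # Placeholder fields: the real HHIVE layout is much larger and differs.
    _fields_ = [
        ("Signature", ctypes.c_uint32),
        ("Placeholder", ctypes.c_uint64),
    ]


class CMHIVE(ctypes.Structure):
    # Only the position of Hive (first member) reflects the text above;
    # the trailing field is a made-up stand-in for the rest of the structure.
    _fields_ = [
        ("Hive", HHIVE),
        ("Placeholder", ctypes.c_uint64),
    ]


cmhive = CMHIVE()

# The embedded HHIVE starts at offset 0 of the enclosing CMHIVE...
hive_offset = CMHIVE.Hive.offset

# ...so a pointer to one is also a valid pointer to the other.
same_address = ctypes.addressof(cmhive) == ctypes.addressof(cmhive.Hive)

print(hive_offset, same_address)
```

This is why a single "hive address" printed by the debugger can be passed to commands that expect either structure.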
Next, we can use !reg kcb to learn more about the
key based on its cached KCB data:
kd> !reg kcb ffff888788731ad0
Key              : \REGISTRY\MACHINE\SOFTWARE\DEFAULTUSERENVIRONMENT
RefCount         : 0x0000000000000001
Flags            : CompressedName,
ExtFlags         :
Parent           : 0xffff88877ab517e0
KeyHive          : 0xffff88877af5c000
KeyCell          : 0xe824b0 [cell index]
TotalLevels      : 4
LayerHeight      : 0
MaxNameLen       : 0x0
MaxValueNameLen  : 0x8
MaxValueDataLen  : 0x66
LastWriteTime    : 0x 1d861d2:0xdb7718d1
KeyBodyListHead  : 0xffff888788731b48 0xffff888788731b48
SubKeyCount      : 0
Owner            : 0x0000000000000000
KCBLock          : 0xffff888788731bc8
KeyLock          : 0xffff888788731bd8
This is a summary of some of the KCB components that the author of the extension
deemed the most important. We can see the value of the reference count, flags shown in textual form, the KCB
address of the key’s parent, the address of the hive, etc. Let’s resolve the virtual address of the
key node by using !reg cellindex:
kd> !reg cellindex 0xffff88877af5c000 0xe824b0
Map = ffff88877ec20000 Type = 0 Table = 7 Block = 82 Offset = 4b0
MapTable     = ffff88877ec37000
MapEntry     = ffff88877ec37c30
BinAddress = 000001e6ed033001, BlockOffset = 0000000000000000
BlockAddress = 000001e6ed033000
pcell:  000001e6ed0334b4
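The Type, Table, Block and Offset values in the first line of this output are bit fields of the cell index itself. A minimal sketch of the decoding follows, assuming the commonly documented layout (1 bit of storage type, 10 bits of table index, 9 bits of block index, 12 bits of offset) and a 4-byte size field preceding the cell data. Neither assumption is spelled out in this post, but both are consistent with the numbers printed by the debugger. The decode_cell_index helper is hypothetical.

```python
from typing import NamedTuple


class CellIndex(NamedTuple):
    storage_type: int  # 0 = stable, 1 = volatile
    table: int         # index into the hive's map directory
    block: int         # index into the selected map table
    offset: int        # byte offset within the 4 KiB block


def decode_cell_index(cell_index: int) -> CellIndex:
    """Split a 32-bit cell index into its four bit fields (assumed layout)."""
    return CellIndex(
        storage_type=(cell_index >> 31) & 0x1,
        table=(cell_index >> 21) & 0x3FF,
        block=(cell_index >> 12) & 0x1FF,
        offset=cell_index & 0xFFF,
    )


# Values taken from the !reg cellindex output above.
KEY_CELL = 0xE824B0
BLOCK_ADDRESS = 0x000001E6ED033000
CELL_HEADER_SIZE = 4  # assumption: a 4-byte size field precedes the cell data

fields = decode_cell_index(KEY_CELL)
cell_address = BLOCK_ADDRESS + fields.offset + CELL_HEADER_SIZE

print(fields)
print(hex(cell_address))
```

For the key cell above, the sketch yields table 7, block 0x82 and offset 0x4b0, matching the fields printed by the debugger.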
The result is 0x1e6ed0334b4, the same value
that !reg
querykey returned to us earlier. In order to inspect the contents of the key node, we can
use !reg knode:
kd> !reg knode 1e6ed0334b4
Signature: CM_KEY_NODE_SIGNATURE (kn)
Name                 : DefaultUserEnvironment
ParentCell           : 0x20
Security             : 0x98f300 [cell index]
Class                : 0xffffffff [cell index]
Flags                : 0x20
MaxNameLen           : 0x0
MaxClassLen          : 0x0
MaxValueNameLen      : 0x8
MaxValueDataLen      : 0x66
LastWriteTime        : 0x 1d861d2:0xdb7718d1
SubKeyCount[Stable  ]: 0x0
SubKeyLists[Stable  ]: 0xffffffff
SubKeyCount[Volatile]: 0x0
SubKeyLists[Volatile]: 0xffffffff
ValueList.Count      : 0x3
ValueList.List       : 0xe825a8
A very similar effect can be achieved by finding the Registry process, switching to
its context, and inspecting the memory directly by overlaying it onto the
CM_KEY_NODE structure layout:
kd> !process 0 0
**** NT ACTIVE PROCESS DUMP ****
PROCESS ffffe30198ef5040
    SessionId: none  Cid: 0004    Peb: 00000000  ParentCid: 0000
    DirBase: 001ae002  ObjectTable: ffff88877a285f00  HandleCount: 3302.
    Image: System
PROCESS ffffe30198fe1080
    SessionId: none  Cid: 0040    Peb: 00000000  ParentCid: 0004
    DirBase: 1002c002  ObjectTable: ffff88877a277b40  HandleCount:   0.
    Image: Registry
[…]
kd> .process ffffe30198fe1080
Implicit process is now ffffe301`98fe1080
WARNING: .cache forcedecodeuser is not enabled
kd> dt _CM_KEY_NODE 1e6ed0334b4
nt!_CM_KEY_NODE
   +0x000 Signature        : 0x6b6e
   +0x002 Flags            : 0x20
   +0x004 LastWriteTime    : _LARGE_INTEGER 0x01d861d2`db7718d1
   +0x00c AccessBits       : 0x3 ''
   +0x00d LayerSemantics   : 0y00
   +0x00d Spare1           : 0y00000 (0)
   +0x00d InheritClass     : 0y0
   +0x00e Spare2           : 0
   +0x010 Parent           : 0x20
   +0x014 SubKeyCounts     : [2] 0
   +0x01c SubKeyLists      : [2] 0xffffffff
   +0x024 ValueList        : _CHILD_LIST
   +0x01c ChildHiveReference : _CM_KEY_REFERENCE
   +0x02c Security         : 0x98f300
   +0x030 Class            : 0xffffffff
   +0x034 MaxNameLen       : 0y0000000000000000 (0)
   +0x034 UserFlags        : 0y0000
   +0x034 VirtControlFlags : 0y0000
   +0x034 Debug            : 0y00000000 (0)
   +0x038 MaxClassLen      : 0
   +0x03c MaxValueNameLen  : 8
   +0x040 MaxValueDataLen  : 0x66
   +0x044 WorkVar          : 0
   +0x048 NameLength       : 0x16
   +0x04a ClassLength      : 0
   +0x04c Name             : [1]  "敄"
In the listing above, we can see the full extent of information stored in the hive
for each key. The name in the last line is incorrectly displayed as 敄, because formally the type of
CM_KEY_NODE.Name is wchar_t[1], but since the name
consists of ASCII-only characters, it is compressed down so that each
wchar_t element stores two characters of the name (as indicated by the flag 0x20
translated by WinDbg as CompressedName). So 敄 is in
fact the first two letters of the name, “De”, interpreted as a single UTF-16 code unit (U+6544).
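The effect is easy to reproduce. The sketch below builds a synthetic buffer that mimics only three fields of the key node (Flags, NameLength and Name, at the offsets shown in the dt listing) and decodes the name depending on the 0x20 flag. The buffer contents, the read_key_name helper and the choice of Latin-1 for compressed names are assumptions made for illustration.

```python
import struct

# Offsets taken from the dt _CM_KEY_NODE listing above.
OFFSET_FLAGS = 0x02
OFFSET_NAME_LENGTH = 0x48
OFFSET_NAME = 0x4C
FLAG_COMPRESSED_NAME = 0x20  # shown by WinDbg as CompressedName


def read_key_name(node: bytes) -> str:
    """Decode the key name stored at the end of a raw key node."""
    (flags,) = struct.unpack_from("<H", node, OFFSET_FLAGS)
    (name_length,) = struct.unpack_from("<H", node, OFFSET_NAME_LENGTH)
    raw = node[OFFSET_NAME:OFFSET_NAME + name_length]
    if flags & FLAG_COMPRESSED_NAME:
        # Compressed: one byte per character (Latin-1 is an assumption here).
        return raw.decode("latin-1")
    # Uncompressed: regular little-endian UTF-16.
    return raw.decode("utf-16-le")


# Synthetic key node: everything except the fields written below is zeroed.
name = b"DefaultUserEnvironment"
node = bytearray(OFFSET_NAME + len(name))
struct.pack_into("<H", node, 0x00, 0x6B6E)  # Signature
struct.pack_into("<H", node, OFFSET_FLAGS, FLAG_COMPRESSED_NAME)
struct.pack_into("<H", node, OFFSET_NAME_LENGTH, len(name))
node[OFFSET_NAME:] = name

decoded = read_key_name(bytes(node))

# What a debugger sees when it treats the first two bytes as one wchar_t.
first_wchar = bytes(node[OFFSET_NAME:OFFSET_NAME + 2]).decode("utf-16-le")

print(decoded, hex(ord(first_wchar)))
```

Here first_wchar is the single code unit U+6544 built from the bytes 0x44 ("D") and 0x65 ("e"), while read_key_name recovers the full 22-character name, in line with the NameLength of 0x16 in the listing.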
This is only a glimpse of what is possible with WinDbg and the
!reg extension. I highly encourage you to
experiment with other options if you’re curious about the mechanics of the registry and want to explore
further.
Conclusion
In this post, I have aimed to share my
methodology for gathering information and learning about new vulnerability research targets. I hope that you
find some of it useful, either as a generalized approach that applies to other software, or as a
comprehensive knowledge base for the registry itself. Also, if you think I’ve missed any resources,
I’ll be more than happy to learn about them. See you in the next post!