The Windows Registry Adventure #3: Learning resources

Posted by Mateusz Jurczyk, Google Project Zero

When tackling a new vulnerability research target, especially a closed-source one, I
prioritize gathering as much information about it as possible. This gets especially interesting when
it’s a subsystem as old and fundamental as the Windows registry. In that case, tidbits of valuable data
can lurk in forgotten documentation, out-of-print books, and dusty open-source code – each potentially
offering a critical piece of the puzzle. Uncovering them takes some effort, but the payoff is often immense.
Scraps of information can contain hints as to how certain parts of the software are implemented, as well as
why: what design decisions led to certain outcomes, and so on. When seeing the big picture, it becomes much
easier to reason about the software, understand the intentions of the original developers, and think of the
possible corner cases. At other times, it simply speeds up the process of reverse engineering and saves the
time spent on deducing certain parts of the logic, if someone else has already put in the time and
effort.

One great explanation of how to go beyond the binary and utilize all available sources of information was
presented by Alex Ionescu in the keynote of OffensiveCon 2019 titled “Reversing Without Reversing”. My
registry security audit did involve a lot of hands-on
reverse engineering too, but it was heavily supplemented with information not coming directly from
ntoskrnl.exe. And while Alex’s talk discussed researching Windows as a whole, this blog post provides a
concrete case study of how to apply these ideas in practice. The second goal of the post is to consolidate
all collected materials into a single, comprehensive summary that can be easily accessed by future
researchers on this topic. The full list may seem overwhelming as it includes some references to overlapping
information, so the ones I find key have been marked with the 🔑 symbol. I highly recommend reviewing
these resources, as they provide context that will be helpful for understanding future posts.

Microsoft Learn

Official documentation is probably the first and most intuitive thing to study when dealing with a new API.
For Microsoft, this means Microsoft Learn
(formerly MSDN Library), a vast body of technical information maintained for the benefit of Windows software
developers. It is wholly available online, and includes the following sections and articles devoted to the
registry:

  • 🔑 Registry – the main page about all registry-related subjects. It contains a wealth of knowledge, and
    is a must-read for anyone deeply interested in this system mechanism. It is divided into three sections:
    About the Registry, Using the Registry, and Registry Reference.

  • Windows registry information for advanced users – a separate article that discusses the principles of
    the registry. It appears to be somewhat outdated (the latest version mentioned is Windows Vista), and is
    based on an old KB256986 article that can be traced back to at least 2004.
  • Inside the Windows NT Registry and Inside the Registry – two articles published by Mark Russinovich in
    the Windows NT Magazine in 1997 and 1999, respectively.
  • Windows 2000 Registry Reference – a web mirror of regentry.chm, an official help file bundled with the
    Windows 2000 Resource Kit. It includes a brief introduction to the registry followed by detailed
    descriptions of the standard registry content, i.e. keys and values used for advanced configuration of
    the system and applications.
  • Using the Registry Functions to Consume Counter Data – information about collecting performance data
    through the registry pseudo-keys: HKEY_PERFORMANCE_DATA, HKEY_PERFORMANCE_TEXT and
    HKEY_PERFORMANCE_NLSTEXT.
  • Offline Registry Library – complete documentation of the built-in Windows offreg.dll library, which can
    be used to inspect and operate on registry hives without loading them in the operating system.
  • Registry system call documentation, e.g. ZwCreateKey – a reference guide to the kernel-mode support of
    the registry, which reveals numerous details about how it works internally and how the high-level API
    functions are implemented under the hood.
  • Filtering Registry Calls – a set of eight articles detailing how to correctly implement registry
    callbacks as a kernel driver developer.

Blogs and online resources

Because the registry stores many traces of user activity, it is a popular source of information in forensic
investigations. As a result, a number of articles and blog posts have been published over the years,
focusing on the internal hive structure, registry-related kernel objects, and recovering deleted data. Below
is the list of unofficial registry resources I have managed to find online, from earliest to latest:

  • WinReg.txt, author unknown (signed as B.D.) – documentation of the hive binary formats in Windows 3.x
    (SHCC3.10), Windows 95 (CREG) and Windows NT (regf) based on reverse engineering. It was likely the
    first public write-up outlining the undocumented structure of the hives.
  • Security Accounts Manager, author unknown (signed as [email protected]) – a comprehensive article
    primarily focused on the user management internals in Windows 2000 and XP, dissecting a number of binary
    structures used by the SAM component. Since user and credential management is highly tied to the
    registry (all of the authentication data is stored there), the article also includes a “Registry
    Structure” section that explains the encoding of regf hive files.
  • 🔑 Windows registry file format specification, Maxim Suhanov – a high-quality and relatively up-to-date
    specification of the regf format versions 1.3 to 1.6, with extra bits of information regarding the
    ancient versions 1.1 and 1.2.
  • Windows NT Registry File (REGF) format specification, Joachim Metz – another independently developed
    specification of the regf format associated with the libregf library.
  • Push the Red Button, Brendan Dolan-Gavitt (moyix) – a personal blog focused on security, reverse
    engineering and forensics. It contains a number of interesting registry-related posts dating back to
    2007-2009.
  • Windows Incident Response, Harlan Carvey – a technical blog dedicated to incident response and digital
    analysis of Windows, with a variety of posts dealing with the registry published between 2006 and 2022.
  • My DFIR Blog, Maxim Suhanov – another blog concentrating on digital forensics with many mentions of the
    Windows registry. It provides some original information that’s hard to find elsewhere, see e.g.
    Containerized registry hives in Windows.
  • Digging Up the Past: Windows Registry Forensics Revisited, David Via – a blog post by Mandiant
    discussing the recovery of data from registry hives and transactional logs.
  • Creating Registry Links and Mysteries of the Registry, Pavel Yosifovich – two blog posts covering the
    creation of symbolic links in the registry, and its overall internal structure.
  • Windows Registry, Wikipedia contributors – as usual, Wikipedia doesn’t disappoint, and even though the
    article includes few deeply technical details, it features extensive sections on the history of the
    registry, its high level design and role in the system.
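
To give a feel for the kind of detail the regf specifications listed above capture, here is a minimal Python sketch that decodes a few fields of the hive base block and verifies its XOR-32 checksum. The field offsets and checksum rules follow my reading of the public specifications and the ReactOS sources, so treat them as assumptions rather than authoritative definitions. The helper names and the synthetic header are purely illustrative.

```python
import struct

CHECKSUM_OFFSET = 508  # assumed: the checksum covers the first 508 bytes (127 dwords)


def xor32_checksum(header: bytes) -> int:
    """XOR together the first 127 little-endian dwords of the base block."""
    checksum = 0
    for (dword,) in struct.iter_unpack("<I", header[:CHECKSUM_OFFSET]):
        checksum ^= dword
    # Two results are remapped so that they are never stored as-is.
    if checksum == 0xFFFFFFFF:
        return 0xFFFFFFFE
    if checksum == 0:
        return 1
    return checksum


def parse_base_block(data: bytes) -> dict:
    """Decode a handful of base block fields from the start of a hive file."""
    if len(data) < 512 or data[:4] != b"regf":
        raise ValueError("not a regf hive")
    primary_seq, secondary_seq = struct.unpack_from("<II", data, 4)
    major, minor = struct.unpack_from("<II", data, 20)
    root_cell, bins_size = struct.unpack_from("<II", data, 36)
    (stored,) = struct.unpack_from("<I", data, CHECKSUM_OFFSET)
    return {
        "version": (major, minor),
        "root_cell_offset": root_cell,
        "hive_bins_size": bins_size,
        # Differing sequence numbers suggest an interrupted write.
        "dirty": primary_seq != secondary_seq,
        "checksum_ok": stored == xor32_checksum(data),
    }


def build_demo_header() -> bytes:
    """Build a synthetic 512-byte header (hypothetical values, not a real hive)."""
    header = bytearray(512)
    header[0:4] = b"regf"
    struct.pack_into("<II", header, 4, 7, 7)            # sequence numbers
    struct.pack_into("<II", header, 20, 1, 5)           # format version 1.5
    struct.pack_into("<II", header, 36, 0x20, 0x1000)   # root cell, bins size
    struct.pack_into("<I", header, CHECKSUM_OFFSET, xor32_checksum(bytes(header)))
    return bytes(header)


if __name__ == "__main__":
    print(parse_base_block(build_demo_header()))
```

To try it on a real hive, pass the first 512 bytes of an exported hive file to parse_base_block instead of the synthetic header.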

Furthermore, The Old New Thing is a fantastic technical blog exploring the quirks, design
decisions, and historical context behind Windows features. It is written by a Microsoft employee of over 30
years, Raymond Chen, with an astounding consistency of one post per day. While the blog posts are not
technically documentation, they are very highly regarded in the community and
can be considered a de-facto Microsoft knowledge resource – only more fun than Microsoft Learn. Over
the course of the last 20+ years, Raymond would sometimes write about the registry, sharing interesting
behind-the-scenes stories and anecdotes concerning this feature. I have tried to find and compile all of the
relevant registry-related posts in the single list below:

Academic papers and presentations

Recovering meaningful artifacts from the registry during digital forensics is also a problem known in
academia. To find relevant works, I often begin by typing the titles of a few known papers into Google
Scholar, and then delve into a breadth-first search of their bibliographies. Here’s what I managed to find
pertaining to the registry:

Open-source software

To paraphrase a famous saying, source code is worth a thousand words. Sometimes it is far easier to grasp
a concept or design by looking straight at code instead of reading an English explanation. And while the
canonical implementation of the registry is the Windows kernel, a number of open-source projects have been
developed over the years to operate on registry hives. They are typically based either on regf format
analysis performed by the developers themselves, or on existing documentation and other open-source tools.
The three main reasons for their existence are a) computer forensics, b) emulating Windows behavior on other
host platforms, and c) directly accessing the SAM hive to change/reset local user credentials. Whatever the
reason, such projects may prove useful in better understanding the internal hive format, and help in
building proof-of-concept hives if necessary. A list of all the relevant open-source libraries and utilities
I have found is shown below:

  • libregf – a library written in C with Python bindings,
  • hivex – a library written in C as part of the libguestfs project, with bindings for OCaml, Perl, Python
    and Ruby,
  • cmlib – a module implemented in C as part of ReactOS, which closely resembles the Windows
    implementation,
  • chntpw (The Offline Windows Password Editor) – a tool developed in C between 1997 and 2014 to manage
    Windows user passwords offline directly in the SAM hive. The registry-related code is located in ntreg.c
    (regf parser) and reged.c (a basic registry editor),
  • Samba – the Samba project includes yet another implementation of the Windows registry (under
    source3/registry and source4/lib/registry),
  • regipy – a Python registry hive parsing library and accompanying tools,
  • yarp – literally yet another registry parser (in Python),
  • Registry – a hive parser written in C#,
  • nt-hive – a hive parser written in Rust (with read-only capabilities),
  • Notatin – another hive parser written in Rust, including Python bindings and helper binaries.

Lastly, at the time of this writing, simply searching for some internal kernel function names on GitHub
might reveal how certain functionality was implemented in Windows itself 20+ years ago.

SDK Headers

Header files distributed with Software Development Kits are an interesting case, because on one hand
they are an official resource with information that Microsoft intends the developers to use, but on the
other – they are a bit more concealed, as online documentation isn’t always kept up to date with
regards to their contents. We can thus explore their local copies on disk and sometimes find artifacts
(function declarations, structure definitions, comments) that are not publicly documented online. Some of
the headers most relevant to the registry are:

  • winreg.h (user-mode) – the primary registry header on the list,
    containing the prototypes of functions and structures from the official Registry API.
  • wdm.h (kernel-mode) – specifies a number of interesting constants/flags and types used by the system
    call interface of the registry, for example hive load flags (third argument of NtLoadKey2, such as
    REG_LOAD_HIVE_OPEN_HANDLE etc.) or key/value query structures (KEY_TRUST_INFORMATION,
    KEY_VALUE_LAYER_INFORMATION, etc.).
  • ntddk.h (kernel-mode) – contains some types not found elsewhere, e.g. KEY_LAYER_INFORMATION.
  • winnt.h (user-mode) – mostly equivalent to wdm.h.
  • winternl.h (user-mode) – contains the declarations of some registry-related system calls (NtRenameKey,
    NtSetInformationKey).

Security research

Learning about prior security research can be
especially useful when starting a new project yourself. Not only does it often reveal deep technical details
about the target, but it also comes from like-minded professionals who look at the code through a security
lens, and may inspire ideas of further weaknesses or areas that require more attention. When it comes to the
registry, I think that relatively little work has been done in the public space compared to its high
complexity and the pivotal role it plays in the Windows operating system. Nevertheless, there were some
materials that I found extremely insightful, especially those by my colleague James Forshaw from Project
Zero. The full list of security-relevant resources I have managed to gather on this topic is shown below
(including some references to my own publications from the past):

  • Case study of recent Windows vulnerabilities (2010), Gynvael Coldwind, Mateusz Jurczyk – a presentation
    on several security bugs Gynvael and I found during our brief registry research in 2009/2010.
  • Microsoft Kernel Integer Overflow Vulnerability (2016), Honggang Ren – a write-up on CVE-2016-0070, a
    Windows kernel vulnerability in the loading of malformed hive files.
  • Project Zero bug tracker (2016), James Forshaw, Mateusz Jurczyk – four bug reports submitted to
    Microsoft as a result of naive registry hive fuzzing.
  • Project Zero bug tracker (2014-2020), James Forshaw – 17 vulnerabilities related to the registry
    discovered by James, many of which are logic issues at the intersection of the registry and other system
    mechanisms (security impersonation, file system).

Books

For a 20+ year old codebase such as the registry, it is expected that some resources covering it in the
early days were published on paper rather than on the Internet. For this reason, part of my standard routine
is to search Google Books for various technical terms and keywords related to the specific technology and
see what pops up. For the registry, these could be e.g. “regedit”, “regf”, “hbin”, “LOG1”, “RegCreateKey”,
“NtCreateKey”, “HvAllocateCell”, “RegistryMachine”, “key control block” and so on. In some cases this yields
books with unique, strictly technical information, while in others the most insightful part is the
historical perspective and being able to see how the given technology was perceived soon after it first came
out. And sometimes the value of the book is a complete surprise until it arrives in the mailbox, as it is
neither offered for sale as an ebook nor has a preview available in Google Books, and so a hard copy is
required.

The books that I found which are either fully or
partially dedicated to the Windows registry are (latest to oldest):

Patents

Another useful source of information that may otherwise be difficult to find is patents, indexed by Google
Patents. A particularly valuable result that I found this way is 🔑 Containerized Configuration
(US20170279678A1), Microsoft’s patent from 2016 that thoroughly explains the core concepts behind
differencing hives and layered keys in the registry. These mechanisms are part of a new feature introduced
in Windows 10 Anniversary Update to better support containerization, but official documentation of how it
works is nowhere to be found. The patent is thus a great aid in understanding the intricate aspects of this
new registry functionality, adding the necessary context and helping to make sense of otherwise highly
cryptic kernel functions like CmpPrepareDiscardAndReplaceKcbAndUnbackedHigherLayers.

Manual analysis

So far, all of the resources we’ve discussed
were accessible through a web browser, a text editor, or in physical form. But there is another type of
information source that is equally, if not more, important, and that requires more specialized tooling to
make sense of it. What I mean by that is the knowledge we can extract from the executable images in Windows
responsible for handling the registry, both in terms of the “standard” reverse-engineering and
also fully taking advantage of any helpful artifacts in or around them. I’ll write more about the
hands-on reversing process in upcoming posts, and now we will turn our attention to those artifacts that
present us with clear-cut information without the need for deduction.

On a side note, by far the most essential file to be looking at is ntoskrnl.exe, the core NT kernel
image. It contains the entirety of the kernel-space registry implementation and is of interest both from the
security and functional perspective. I have personally spent 99% of my manual analysis time looking at that
particular binary, but it’s worth noting that there are a few other executables and libraries related to
the registry as well:

  • winload.exe – the Windows Boot Loader, which executes before the Windows kernel. One of its
    responsibilities is to load the SYSTEM hive into memory and read some configuration from it, so it
    includes a partial copy of the registry code from ntoskrnl.exe.
  • offreg.dll – the Offline Registry Library, which also shares some registry code with the kernel (but
    executes in user-mode).
  • kernelbase.dll – one of the primary WinAPI libraries, implementing a majority of the user-space Registry
    API.
  • ntdll.dll – another core user-mode library which provides a bridge between the Registry API and the
    kernel registry implementation.
  • regsvc.dll – a DLL implementing the Remote Registry Service.

Let’s investigate what types of information about the
registry are readily available to us by
running a disassembler/decompiler. I personally use IDA Pro + Hex-Rays and so the examples below are based
on them.

🔑 Public symbols (PDB)

Microsoft makes public symbols available for a majority of executable images found in C:\Windows, for the
benefit of developers and security researchers. By “public” symbols I mean PDB files that mainly contain the
names of functions found in the binaries, which help in symbolizing system stack traces during debugging or
in the event of a crash. In the past, the symbols used to be bundled with the system installation media or
on a separate Resource Kit disc, and later they were available for download in the form of pre-packaged
archives from the Microsoft website. Both of these channels have been deprecated, and currently the only
supported way to obtain the symbols is on a per-file basis from the Microsoft Symbol Server. The PDB files
can be downloaded directly with the official SymChk tool, or indirectly through software that supports the
symbol server (e.g. IDA Pro, WinDbg).
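
For illustration, the per-file lookup performed by symbol-server-aware tools can be sketched in a few lines of Python. The URL layout below (PDB name, then the GUID and age from the image's debug directory concatenated in hexadecimal, then the PDB name again) is the commonly observed convention and should be read as an assumption, not an official specification. The GUID used in the example is made up.

```python
import uuid

# Public endpoint of the Microsoft Symbol Server.
SYMBOL_SERVER = "https://msdl.microsoft.com/download/symbols"


def pdb_download_url(pdb_name: str, guid: str, age: int) -> str:
    """Build the download URL for a PDB identified by its GUID and age.

    Assumed layout: <server>/<pdb>/<GUID as 32 hex digits><age in hex>/<pdb>
    """
    signature = uuid.UUID(guid).hex.upper()
    return f"{SYMBOL_SERVER}/{pdb_name}/{signature}{age:X}/{pdb_name}"


if __name__ == "__main__":
    # Hypothetical GUID and age; real values come from the CodeView debug
    # record of the specific ntoskrnl.exe build being analyzed.
    print(pdb_download_url("ntkrnlmp.pdb", "01234567-89ab-cdef-0123-456789abcdef", 1))
```

In practice, the GUID and age have to be read from the image being analyzed, which is exactly what SymChk and the debuggers do behind the scenes.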

As for ntoskrnl.exe specifically, its accompanying symbols are one of the most
invaluable sources of information. As mentioned in an earlier post, the Windows kernel follows a consistent
naming convention, so we can immediately see which internal routines are related to the registry, and where
the entry points (registry-related system call handlers) that we might start our analysis from are. It shows
us the extent of the code we are dealing with (1000+ registry functions) and makes it possible to perform
analysis such as the one shown in blog post #2 (counting lines of code per system version) out-of-the-box,
without doing any reverse engineering work. And perhaps most importantly, the function names make it
substantially easier to reason about the code while doing the actual reversing, especially for functions
with very descriptive names, like CmpCheckAndFixSecurityCellsRefcount or
CmpPromoteSingleKeyFromParentKcbAndChildKeyNode.
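
As a rough illustration of how the naming convention helps, the following Python sketch buckets a list of symbol names by prefix. The prefix table and its labels (including reading the trailing "p" as "private") are assumptions made for this example, and the system call pattern is a simple heuristic rather than a complete rule.

```python
import re
from collections import Counter

# Assumed prefix-to-component mapping; longer prefixes are listed first so
# that "Cmp" is tried before "Cm".
PREFIXES = [
    ("Cmp", "Configuration Manager (private)"),
    ("Cm", "Configuration Manager (public)"),
    ("Hvp", "Hive management (private)"),
    ("Hv", "Hive management (public)"),
]

# Heuristic: Nt*/Zw* names containing "Key" not followed by a lowercase letter,
# which avoids false positives such as NtOpenKeyedEvent.
SYSCALL = re.compile(r"^(?:Nt|Zw)[A-Za-z0-9]*Key(?![a-z])")


def classify(name: str):
    """Return a coarse component label for a kernel symbol, or None."""
    if SYSCALL.match(name):
        return "Registry system call"
    for prefix, label in PREFIXES:
        rest = name[len(prefix):]
        # Require an uppercase letter right after the prefix.
        if name.startswith(prefix) and rest[:1].isupper():
            return label
    return None


if __name__ == "__main__":
    names = ["CmpCheckKey", "CmpParseKey", "CmKCBToVirtualPath", "HvAllocateCell",
             "NtCreateKey", "NtLoadKey2", "NtOpenKeyedEvent", "KeBugCheckEx"]
    for label, count in Counter(classify(n) for n in names).items():
        print(label, count)
```

Run against a full symbol listing exported from a disassembler, a script like this gives a quick estimate of how much code belongs to each part of the registry implementation.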

Screenshot showing names and addresses

The other type of information we can find in the
kernel debug symbols are types: enums, structures and unions. However, there are two caveats. First, only
some types are included in the PDBs, and it’s not clear what criteria Microsoft uses to decide whether
to publish them or not. My rough estimate is that ~50% of the registry types can be found there, mostly the
fundamental ones. Secondly, even though the prototypes of some types are in the symbols, neither the
function arguments nor local variables are annotated with their types, so it is still necessary to determine
the corresponding types and manually annotate the variables for the decompiled output to make any sense.
Nevertheless, having access to this information is still a huge help both in understanding code on a local
level and also grasping the bigger picture.

The structures that can be found in the public
symbols are:

  • Hive descriptors (HHIVE, CMHIVE) and related structures
  • Hive bin and cell structures (HBIN, CM_KEY_NODE, CM_KEY_VALUE, CM_KEY_SECURITY, …)
  • Key object related structures (CM_KEY_BODY, CM_KEY_CONTROL_BLOCK, …)
  • Some transaction related structures (CM_TRANS, CM_KCB_UOW, …)
  • Some layered-key related structures (CM_KCB_LAYER_INFO, …)
Meanwhile, the ones that are missing and need to
be manually reconstructed are:

  • The parse context and path information structures (as used by CmpParseKey)
  • Some transaction related structures (on-disk transaction log records, lightweight transaction object
    descriptors, …)
  • Virtualization-related structures
  • Most layered-key related structures

Most of the relevant type names start with
“CM”, so it’s easy to find them in the Local Types window in IDA:

Screenshot from IDA showing Local Types that start with CM


I would like to take this opportunity to thank Microsoft for making
the symbols available for download, and encourage other vendors to do the same for their products. 🙂

Debug/Checked builds of Windows

Microsoft used to publish debug/checked builds of Windows (in addition to “free” builds) from Windows NT to
early Windows 10. The difference between them was that the debug/checked builds had some compiler
optimizations disabled, and they enabled extra debugging checks to identify internal system state
inconsistencies as early as possible. The developers of kernel-mode drivers were encouraged to test them on
debug/checked Windows builds before considering them stable and shipping them to customers. Unfortunately,
these special builds have been discontinued and are no longer available for the latest versions of Windows
10 and 11.

These old builds can be quite valuable in the context of reverse engineering, because
the extra checks may reveal some invariants and assumptions that the code makes, but which are not obvious
when looking at retail builds. What is more, the checks are often verbose and include calls to functions
like RtlAssert, DbgPrint, DbgPrintEx etc., passing a textual
representation of the failed assertion, the source code file name and/or the line number. These may disclose
the names of variables, structure members, enums, constants and other types of information. Let’s see
some examples:

DbgPrintEx(DPFLTR_CONFIG_ID, 24u, "\tImplausible size %lx\n", v13);

DbgPrintEx(DPFLTR_CONFIG_ID, 24u, "\tKey is bigger than containing cell.\n");

DbgPrintEx(DPFLTR_CONFIG_ID, 0, "invalid name starting with NULL on key %08lx\n", a3);

DbgPrintEx(DPFLTR_CONFIG_ID, 0, "invalid (ODD) name length on key %08lx\n", a3);

DbgPrintEx(DPFLTR_CONFIG_ID, 24u, "\tNo key signature\n");

DbgPrintEx(DPFLTR_CONFIG_ID, 0, "\tData:%08lx - unallocated Data\n", v20);

DbgPrintEx(DPFLTR_CONFIG_ID, 24u, "Class:%08lx - Implausible size \n", v20);

DbgPrintEx(DPFLTR_CONFIG_ID, 24u, "SecurityCell is HCELL_NIL for (%p,%08lx) !!!\n", a1, v67);

DbgPrintEx(DPFLTR_CONFIG_ID, 24u, "SecurityCell %08lx bad security for (%p,%08lx) !!!\n", v86, a1, v73);

DbgPrintEx(DPFLTR_CONFIG_ID, 0, "Root cell cannot be a symlink or predefined handle\n");

DbgPrintEx(DPFLTR_CONFIG_ID, 0, "invalid flags on root key %lx\n", v31);

DbgPrintEx(DPFLTR_CONFIG_ID, 24u, "\tWrong parent value.\n");

The CmpCheckKey function is responsible for verifying the structural correctness
of every key in a newly loaded hive, and for every problem it encounters, it prints a more or less verbose
message. This can help us better understand what each of these checks is intended to accomplish.
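
To make a few of these messages more tangible, here is a small Python sketch that applies similar sanity checks to a raw key node cell. It is loosely modeled on the strings above and on the key node layout described in the public regf specifications. It is not a reconstruction of the actual CmpCheckKey logic, and the offsets, the KEY_COMP_NAME flag value, and the make_key_cell helper are assumptions made for the example.

```python
import struct

KEY_COMP_NAME = 0x0020    # assumed flag: name stored as one byte per character
NAME_LENGTH_OFFSET = 72   # offsets are relative to the start of the key node
NAME_OFFSET = 76


def check_key_cell(cell: bytes) -> list:
    """Return a list of structural problems found in a raw key node cell."""
    if len(cell) < 4 + NAME_OFFSET:
        return ["cell too small to hold a key node"]
    (size,) = struct.unpack_from("<i", cell, 0)
    problems = []
    if size >= 0:
        # Allocated cells store their size as a negative number.
        problems.append("cell is not allocated")
    if abs(size) > len(cell):
        return problems + ["implausible size"]
    node = cell[4:]
    if node[:2] != b"nk":
        return problems + ["no key signature"]
    (flags,) = struct.unpack_from("<H", node, 2)
    (name_length,) = struct.unpack_from("<H", node, NAME_LENGTH_OFFSET)
    if NAME_OFFSET + name_length > abs(size) - 4:
        return problems + ["key is bigger than containing cell"]
    name = node[NAME_OFFSET:NAME_OFFSET + name_length]
    compressed = bool(flags & KEY_COMP_NAME)
    if not compressed and name_length % 2:
        problems.append("invalid (odd) name length")
    first_char = name[:1] if compressed else name[:2]
    if first_char and not any(first_char):
        problems.append("invalid name starting with NULL")
    return problems


def make_key_cell(name: bytes, flags: int = KEY_COMP_NAME) -> bytes:
    """Build a synthetic allocated key cell (hypothetical helper for testing)."""
    node = bytearray(NAME_OFFSET) + name
    node[0:2] = b"nk"
    struct.pack_into("<H", node, 2, flags)
    struct.pack_into("<H", node, NAME_LENGTH_OFFSET, len(name))
    size = (4 + len(node) + 7) & ~7   # cells are 8-byte aligned
    return struct.pack("<i", -size) + bytes(node).ljust(size - 4, b"\x00")


if __name__ == "__main__":
    print(check_key_cell(make_key_cell(b"Software")))
    print(check_key_cell(make_key_cell(b"\x00bc")))
```

The checks return early on fatal problems, since the remaining fields cannot be trusted once the size or signature is wrong.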

DbgPrintEx(DPFLTR_CONFIG_ID, 0, "CmKCBToVirtualPath ==> Could not get name even from parent KCB = %p!!!!\n", a1);

This message can be interpreted as some kind of a fallback mechanism failing when
converting a registry path. It could indicate an interesting/brittle code construct, and indeed, the
surrounding code did turn out to be affected by a 16-bit integer overflow and a resulting pool memory
corruption (reported in Project Zero issue #2341). In consequence, the entire block of code (including the
vulnerability) was
removed, as it was functionally redundant and didn’t serve any practical purpose.

RtlAssert("(*VirtContext) & CMP_VIRT_IDENTITY_RESTRICTED", "minkernel\\ntos\\config\\cmveng.c", 3554u, 0i64);

This single line of code in CmpIsSystemEntity reveals a few pieces of information: the name of the function
argument (VirtContext), an internal name of a flag that is not documented in any other resources
(CMP_VIRT_IDENTITY_RESTRICTED), and the source file name and line number of the expression
(minkernel\ntos\config\cmveng.c:3554). Such information can be ported into our main disassembler database
(such as an .idb) and later help us better understand other areas of code that use the same
object/flags.
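
Harvesting these details by hand gets tedious, so one option is to scan the decompiler output for assertion calls automatically. The Python sketch below assumes Hex-Rays-style text like the line above. The regular expressions and the returned field names are my own choices for illustration, and importing the results into a disassembler database is left out.

```python
import re

# Matches RtlAssert("<expression>", "<file>", <line>...) in decompiler output;
# string bodies may contain escaped characters such as \\ or \".
ASSERT_CALL = re.compile(
    r'RtlAssert\(\s*"((?:[^"\\]|\\.)*)"\s*,\s*"((?:[^"\\]|\\.)*)"\s*,\s*(\d+)'
)
# All-uppercase identifiers are likely to be internal constants or flags.
CONSTANT = re.compile(r"\b[A-Z][A-Z0-9_]{2,}\b")


def parse_assert(line: str):
    """Extract the expression, source file, line and constants from one call."""
    match = ASSERT_CALL.search(line)
    if match is None:
        return None
    expression, source_file, line_number = match.groups()
    return {
        "expression": expression,
        # Undo the C string escaping of backslashes in the path.
        "file": source_file.replace("\\\\", "\\"),
        "line": int(line_number),
        "constants": CONSTANT.findall(expression),
    }


SAMPLE = (r'RtlAssert("(*VirtContext) & CMP_VIRT_IDENTITY_RESTRICTED", '
          r'"minkernel\\ntos\\config\\cmveng.c", 3554u, 0i64);')

if __name__ == "__main__":
    print(parse_assert(SAMPLE))
```

Collecting the results per source file also gives a rough map of which functions live in which original .c file.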

DbgPrintEx(DPFLTR_CONFIG_ID, 22u, "Error[1] %lx while processing CmLogRecActionDeleteKey\n", v12);

This and similar calls in CmpDoReDoRecord inform us of the internal names of the transaction record types
(CmLogRecActionCreateKey, CmLogRecActionDeleteKey etc.), which again are not publicly mentioned anywhere
else.

Debugging and experimentation

Poking and prodding the registry of a running
Windows system is the last way of learning about it that comes to my mind. In some sense it is a required
step, because we can only get so far by reading static documentation and code. At some point, we will be
forced to investigate the real memory mappings corresponding to the hives, explore the contents of in-memory
registry objects, or verify that a specific function behaves the way we think it does. Thankfully, there are
some tools that make it possible to peek into the internal registry state beyond what the standard utilities
like Regedit allow. They are briefly described in the sections below.

Extended Regedit alternatives

The built-in Regedit.exe utility offers quite basic functionality, and while it is adequate for most
tinkering and system administration purposes, some third party developers have created custom registry
editors with an extended set of options. I haven’t personally used them so I cannot attest to their
quality, but they may offer some benefits to other researchers. One example is Total Registry, whose main
advantage is being able to browse the internal registry tree structure (rooted in \Registry) in addition to
the standard high-level view with the five HKEY_* root keys.

Process Monitor

Process Monitor is a part of the Sysinternals suite of
utilities, and is a widely known program for monitoring all file system, registry and process/thread
activity in Windows in real time. Of course in this case, we are specifically interested in registry
monitoring. For every operation taking place, we can see a corresponding line in the output window, which
specifies the time, type of operation, originating process, registry key path, result of the operation and
other details (all of this is highly configurable):

Process monitor screenshot as described above

ProcMon is a great tool for exploring what the registry is like as an interface, and how applications in
the system use it. It is most helpful when dealing with logical bugs, and when attacking more privileged
processes through the registry rather than attacking the registry implementation itself. For example, I
used it to find a suitable exploitation primitive for Project Zero issue #2492, which allowed me to
demonstrate that predefined keys were inherently insecure, leading to their deprecation. One of its
advantages is that it works out-of-the-box without any special system configuration (other than the admin
rights required to load a driver), and it’s certainly a must-have in a researcher’s toolbox.

🔑 WinDbg and the !reg extension

WinDbg attached as a kernel debugger to a test (virtual) machine is the ultimate tool
to explore the inner workings of the Windows kernel. I have used it extensively at every step of my
research, to analyze how the registry works, reproduce any bugs that I found, and develop reliable
proof-of-concept exploits. While its standard debugger functionality is powerful enough for most tasks, it
also comes with a dedicated !reg extension that automates the process of traversing registry-specific
structures and presents them in an accessible way. The full list of its options is shown below:

reg <command>      <params>        Registry extensions
    querykey|q     <FullKeyPath>   Dump subkeys and values
    keyinfo        <HiveAddr> <KnodeAddr>  Dump subkeys and values, given knode
    kcb            <Address>       Dump registry key-control-blocks
    knode          <Address>       Dump registry key-node struct
    kbody          <Address>       Dump registry key-body struct
    kvalue         <Address>       Dump registry key-value struct
    valuelist      <HiveAddr> <KnodeAddr>  Dumps list of values for a particular knode
    subkeylist     <HiveAddr> <KnodeAddr>  Dumps list of subkeys for a particular knode
    baseblock      <HiveAddr>      Dump the baseblock for the specified hive
    seccache       <HiveAddr>      Dump the security cache for the specified hive
    hashindex      <HiveAddr> <conv_key>   Find the hash entry given a Kcb ConvKey
    openkeys       <HiveAddr|0>    Dump the keys opened inside the specified hive
    openhandles    <HiveAddr|0>    Dump the handles opened inside the specified hive
    findkcb        <FullKeyPath>   Find the kcb for the corresponding path
    hivelist                       Displays the list of the hives in the system
    viewlist       <HiveAddr>      Dump the pinned/mapped view list for the specified hive
    freebins       <HiveAddr>      Dump the free bins for the specified hive
    freecells      <BinAddr>       Dump the free cells in the specified bin
    dirtyvector    <HiveAddr>      Dump the dirty vector for the specified hive
    cellindex      <HiveAddr> <cellindex>  Finds the VA for a specified cell index
    freehints      <HiveAddr> <Storage> <Display>  Dumps freehint info
    translist      <RmAddr|0>      Displays the list of active transactions in this RM
    uowlist        <TransAddr>     Displays the list of UoW attached to this transaction
    locktable      <KcbAddr|ThreadAddr>  Displays relevant LOCK table content
    convkey        <KeyPath>       Displays hash keys for a key path input
    postblocklist                  Displays the list of threads which have 1 or more postblocks posted
    notifylist                     Displays the list of notify blocks in the system
    ixlock         <LockAddr>      Dumps ownership of an intent lock
    finalize       <conv_key>      Finalizes the specified path or component hash
    dumppool       [s|r]           Dump registry allocated paged pool
       s  Save list of registry pages to temporary file
       r  Restore list of registry pages from temp. file

As we can see, the extension offers a wide selection of commands related to various
components of the registry: hives, keys, values, security descriptors, transactions, notifications and so
on. I have found many of them to be immensely useful, either on a regular basis (e.g. querykey, kcb,
hivelist), or for more specialized tasks when experimenting with a particular feature (e.g. translist,
uowlist for transactions).

The best way to discover its potential is to see it in action on a specific example.
I used a Windows 11 guest system for this purpose. Let’s query an existing
HKEY_LOCAL_MACHINE\Software\DefaultUserEnvironment key to find out more about it:

kd> !reg querykey \Registry\Machine\Software\DefaultUserEnvironment

Found KCB = ffff888788731ad0 :: \REGISTRY\MACHINE\SOFTWARE\DEFAULTUSERENVIRONMENT

Hive         ffff88877af5c000

KeyNode      000001e6ed0334b4

[ValueType]         [ValueName]         [ValueData]

REG_EXPAND_SZ       Path                %USERPROFILE%\AppData\Local\Microsoft\WindowsApps;

REG_EXPAND_SZ       TEMP                %USERPROFILE%\AppData\Local\Temp

REG_EXPAND_SZ       TMP                 %USERPROFILE%\AppData\Local\Temp

Here, we have referenced the key by its internal NT object manager registry path
starting with \Registry. The relation between the high-level paths known from
Regedit / the Registry API and the internal paths used by the kernel will be detailed in a future post;
for now, we just need to know that these paths are equivalent. We can learn a few things from the
command output: the key is cached in memory and the KCB (Key Control Block, represented by the
CM_KEY_CONTROL_BLOCK structure) is located at address 0xffff888788731ad0. The address of the SOFTWARE
hive descriptor is 0xffff88877af5c000, and that’s where the HHIVE / CMHIVE structures are stored.
HHIVE is the first member of CMHIVE at offset 0, which is why their addresses line up, similar to how
the KPROCESS / EPROCESS structures work. Furthermore, the key node (CM_KEY_NODE), the definitive
representation of a key within the hive file, is mapped at address 0x1e6ed0334b4. You may
notice that this is a user-mode address, and that’s because in modern versions of Windows, hive files
are generally operated on via section-based mappings within the user address space of a thin
“Registry” process (you can find it in Task Manager). Lastly, we can see that the key has three
values and we are provided with their types, names and data.
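To illustrate why the HHIVE and CMHIVE addresses coincide, here is a minimal Python sketch using ctypes. The two structures below are heavily simplified stand-ins: apart from the fact that HHIVE sits at offset 0 of CMHIVE, the field names (Signature, Placeholder) and sizes are assumptions made for illustration and do not reflect the real kernel layouts.

```python
import ctypes


class HHIVE(ctypes.Structure):
    # Simplified stand-in; the real structure has many more members.
    _fields_ = [("Signature", ctypes.c_uint32)]


class CMHIVE(ctypes.Structure):
    # The embedded HHIVE is the first member, so it lives at offset 0.
    # "Placeholder" is a hypothetical field standing in for everything else.
    _fields_ = [("Hive", HHIVE), ("Placeholder", ctypes.c_uint64)]


cmhive = CMHIVE()
cmhive.Hive.Signature = 0xBEE0BEE0  # illustrative value only

outer_address = ctypes.addressof(cmhive)
inner_address = ctypes.addressof(cmhive.Hive)

# A pointer to the outer structure can be reinterpreted as a pointer to the
# embedded one, because both start at the same address.
as_hhive = ctypes.cast(ctypes.pointer(cmhive), ctypes.POINTER(HHIVE)).contents

print(hex(outer_address), hex(inner_address), hex(as_hhive.Signature))
```

This is the same reason a single hive address can be handed to commands that expect either structure.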

Next, we can use !reg kcb to learn more about the
key based on its cached KCB data:

kd> !reg kcb ffff888788731ad0

Key              : \REGISTRY\MACHINE\SOFTWARE\DEFAULTUSERENVIRONMENT

RefCount         : 0x0000000000000001

Flags            : CompressedName,

ExtFlags         :

Parent           : 0xffff88877ab517e0

KeyHive          : 0xffff88877af5c000

KeyCell          : 0xe824b0 [cell index]

TotalLevels      : 4

LayerHeight      : 0

MaxNameLen       : 0x0

MaxValueNameLen  : 0x8

MaxValueDataLen  : 0x66

LastWriteTime    : 0x 1d861d2:0xdb7718d1

KeyBodyListHead  : 0xffff888788731b48 0xffff888788731b48

SubKeyCount      : 0

Owner            : 0x0000000000000000

KCBLock          : 0xffff888788731bc8

KeyLock          : 0xffff888788731bd8

This is a summary of some of the KCB components that the author of the extension
deemed the most important. We can see the value of the reference count, flags shown in textual form, the KCB
address of the key’s parent, the address of the hive, etc. Let’s resolve the virtual address of the
key node by using !reg cellindex:

kd> !reg cellindex 0xffff88877af5c000 0xe824b0

Map = ffff88877ec20000 Type = 0 Table = 7 Block = 82 Offset = 4b0

MapTable     = ffff88877ec37000

MapEntry     = ffff88877ec37c30

BinAddress = 000001e6ed033001, BlockOffset = 0000000000000000

BlockAddress = 000001e6ed033000 

pcell:  000001e6ed0334b4
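The numbers in this output can be reproduced by hand. The following Python sketch splits the cell index into the Type / Table / Block / Offset fields printed above and recomputes pcell. The bit widths (1 + 10 + 9 + 12) and the 4-byte cell size header are assumptions based on public descriptions of the hive format, and the 0x18-byte map entry size is merely inferred from the MapTable and MapEntry addresses in this particular output.

```python
CELL_HEADER_SIZE = 4    # assumed: each cell starts with a 32-bit size field
MAP_ENTRY_SIZE = 0x18   # inferred from the MapTable/MapEntry values above


def split_cell_index(cell_index: int) -> dict:
    """Split a 32-bit cell index into its assumed bit fields."""
    return {
        "type": (cell_index >> 31) & 0x1,     # storage type (0 in the output)
        "table": (cell_index >> 21) & 0x3FF,  # index into the map directory
        "block": (cell_index >> 12) & 0x1FF,  # index into the map table
        "offset": cell_index & 0xFFF,         # byte offset within the block
    }


parts = split_cell_index(0xE824B0)

# Values copied from the !reg cellindex output.
map_table = 0xFFFF88877EC37000
block_address = 0x000001E6ED033000

map_entry = map_table + parts["block"] * MAP_ENTRY_SIZE
pcell = block_address + parts["offset"] + CELL_HEADER_SIZE

print({name: hex(value) for name, value in parts.items()})
print(hex(map_entry), hex(pcell))
```

Under these assumptions, the computed fields match the Type, Table, Block and Offset values in the first line of the output, and the derived addresses agree with the MapEntry and pcell lines.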

The result is 0x1e6ed0334b4, the same value that !reg querykey returned to us earlier. In order to
inspect the contents of the key node, we can use !reg knode:

kd> !reg knode 1e6ed0334b4

Signature: CM_KEY_NODE_SIGNATURE (kn)

Name                 : DefaultUserEnvironment

ParentCell           : 0x20

Security             : 0x98f300 [cell index]

Class                : 0xffffffff [cell index]

Flags                : 0x20

MaxNameLen           : 0x0

MaxClassLen          : 0x0

MaxValueNameLen      : 0x8

MaxValueDataLen      : 0x66

LastWriteTime        : 0x 1d861d2:0xdb7718d1

SubKeyCount[Stable  ]: 0x0

SubKeyLists[Stable  ]: 0xffffffff

SubKeyCount[Volatile]: 0x0

SubKeyLists[Volatile]: 0xffffffff

ValueList.Count      : 0x3

ValueList.List       : 0xe825a8
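As a side note, the LastWriteTime field shown by !reg kcb and !reg knode is printed as two raw 32-bit halves. Assuming it follows the standard Windows FILETIME convention (a count of 100-nanosecond intervals since January 1, 1601 UTC), it can be converted to a readable timestamp with a few lines of Python. The helper name filetime_to_datetime is made up for this sketch.

```python
from datetime import datetime, timedelta, timezone

# FILETIME counts 100-nanosecond ticks since this epoch.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(high: int, low: int) -> datetime:
    """Combine the two 32-bit halves and convert them to a UTC datetime."""
    ticks = (high << 32) | low
    # 10 ticks make one microsecond; sub-microsecond precision is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


# Halves copied from the "LastWriteTime : 0x 1d861d2:0xdb7718d1" line.
last_write = filetime_to_datetime(0x01D861D2, 0xDB7718D1)
print(last_write.isoformat())
```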

The same key node can also be examined by finding the Registry process, switching to
its context, and inspecting the memory directly by overlaying it onto the
CM_KEY_NODE structure layout:

kd> !process 0 0

**** NT ACTIVE PROCESS DUMP ****

PROCESS ffffe30198ef5040

    SessionId: none  Cid: 0004    Peb: 00000000  ParentCid: 0000

    DirBase: 001ae002  ObjectTable: ffff88877a285f00  HandleCount: 3302.

    Image: System

PROCESS ffffe30198fe1080

    SessionId: none  Cid: 0040    Peb: 00000000  ParentCid: 0004

    DirBase: 1002c002  ObjectTable: ffff88877a277b40  HandleCount:   0.

    Image: Registry

[…]

kd> .process ffffe30198fe1080

Implicit process is now ffffe301`98fe1080

WARNING: .cache forcedecodeuser is not enabled

kd> dt _CM_KEY_NODE 1e6ed0334b4

nt!_CM_KEY_NODE

   +0x000 Signature        : 0x6b6e

   +0x002 Flags            : 0x20

   +0x004 LastWriteTime    : _LARGE_INTEGER 0x01d861d2`db7718d1

   +0x00c AccessBits       : 0x3 ''

   +0x00d LayerSemantics   : 0y00

   +0x00d Spare1           : 0y00000 (0)

   +0x00d InheritClass     : 0y0

   +0x00e Spare2           : 0

   +0x010 Parent           : 0x20

   +0x014 SubKeyCounts     : [2] 0

   +0x01c SubKeyLists      : [2] 0xffffffff

   +0x024 ValueList        : _CHILD_LIST

   +0x01c ChildHiveReference : _CM_KEY_REFERENCE

   +0x02c Security         : 0x98f300

   +0x030 Class            : 0xffffffff

   +0x034 MaxNameLen       : 0y0000000000000000 (0)

   +0x034 UserFlags        : 0y0000

   +0x034 VirtControlFlags : 0y0000

   +0x034 Debug            : 0y00000000 (0)

   +0x038 MaxClassLen      : 0

   +0x03c MaxValueNameLen  : 8

   +0x040 MaxValueDataLen  : 0x66

   +0x044 WorkVar          : 0

   +0x048 NameLength       : 0x16

   +0x04a ClassLength      : 0

   +0x04c Name             : [1]  "敄"

In the listing above, we can see the full extent of information stored in the hive
for each key. The name in the last line is incorrectly displayed as 敄, because formally the type of
CM_KEY_NODE.Name is wchar_t[1], but since the name consists of ASCII-only characters, it is compressed
down so that each wchar_t element stores two characters of the name (as indicated by the flag 0x20
translated by WinDbg as CompressedName). So 敄 is in fact the first two letters of the name, “De”,
interpreted as a single UTF-16 code unit.
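The effect is easy to reproduce outside the debugger. The short Python sketch below takes the 0x16 bytes of the compressed name, shows how the first two of them read as one little-endian 16-bit value, and then decodes the name according to the 0x20 flag. The constant name KEY_COMP_NAME and the helper decode_key_name are labels chosen for this sketch, and decoding compressed names as one byte per character (Latin-1) is an assumption that holds for ASCII-only names such as this one.

```python
KEY_COMP_NAME = 0x20  # the flag WinDbg prints as CompressedName


def decode_key_name(raw: bytes, flags: int) -> str:
    """Decode a key node name, honoring the compressed-name flag."""
    if flags & KEY_COMP_NAME:
        # Compressed: one byte per character (assumed Latin-1 / ASCII).
        return raw.decode("latin-1")
    # Uncompressed: regular little-endian UTF-16.
    return raw.decode("utf-16-le")


raw_name = b"DefaultUserEnvironment"  # 0x16 bytes, matching NameLength

# What a debugger sees when it treats the first two bytes as one wchar_t.
first_wchar = int.from_bytes(raw_name[:2], "little")
misread = chr(first_wchar)

print(hex(first_wchar), misread)
print(decode_key_name(raw_name, 0x20))
```

Here 'D' is 0x44 and 'e' is 0x65, so the little-endian 16-bit value is 0x6544, which is the code point of the character that WinDbg displayed.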

This is only a glimpse of what is possible with WinDbg and the !reg extension. I highly encourage you
to experiment with other options if you’re curious about the mechanics of the registry and want to
explore further.

Conclusion

In this post, I have aimed to share my
methodology for gathering information and learning about new vulnerability research targets. I hope that you
find some of it useful, either as a generalized approach that applies to other software, or as a
comprehensive knowledge base for the registry itself. Also, if you think I’ve missed any resources,
I’ll be more than happy to learn about them. See you in the next post!