Registry Analysis (Windows Forensic Analysis) Part 1

Introduction

To most administrators and forensic analysts, the Registry probably looks like the entrance to a dark, forbidding cave on the landscape of the Windows operating system. Others might see the Registry as a dark door at the end of a long hallway, with the words "Abandon hope, all ye who enter here" scrawled on it. The truth is that the Registry is a veritable gold mine of information for both the administrator and the forensic investigator. In many cases, software used by attackers will create a footprint within the Registry, leaving the investigator clues about the incident. Knowing where to look within the Registry, and how to interpret what you find, will go a long way toward giving you valuable insight into activity that occurred on the system.

The purpose of this topic is to provide you with a deeper understanding of the Registry and the wealth of information it holds. Besides configuration information, the Windows Registry holds information regarding recently accessed files and considerable information about user activities. All of this can be extremely valuable, depending on the nature of the case you’re working on and the questions you need to answer. Most of the Registry analysis that I’ll address in this topic will be postmortem—in other words, after you’ve acquired an image of the system. However, in some instances I will describe analysis from a live system as well as provide examples of what the keys and values look like on a live system. There are a few minor considerations to keep in mind when you’re performing live versus postmortem analysis; I will point those out when I discuss the subject.


Throughout this topic, we will discuss various Registry keys and values that can be of interest during an investigation. However, you should not consider this topic to be a comprehensive listing of all possible Registry values that might be of interest. Although Registry keys associated directly with the operating system could be fairly stagnant once the system has been installed, they can change as Service Packs and patches are installed, as well as across versions of the Windows operating system itself. Also, a great many applications are available, all with their own unique Registry entries, and they can also change between application versions. Think of this topic as a guide and a reference listing, but also keep in mind that it is by no means complete. My goal in this topic is to describe the format of Registry files, how the Registry can be examined, and some important keys within the Registry and how to analyze them. By the time you reach the end of this topic, you should understand what the Registry holds and be adept at retrieving and analyzing information from within the Registry.

Inside the Registry

So, what is the Registry? If you remember back to DOS and early versions of Windows (3.1, 3.11, etc.), configuration information (drivers, settings) for the system was largely managed by several files, specifically autoexec.bat, config.sys, win.ini (on Windows), and system.ini. Various settings within these files determined what programs were loaded and how the system looked and responded to user input.

Later versions of Windows replaced these files with the Registry, a central hierarchical database (http://support.microsoft.com/kb/256986) that maintains configuration settings for applications, hardware devices, and users. This "database" (I’m using the term loosely) replaces the text-based configuration files used by the previous versions of Microsoft operating systems.

For the most part, administrators interact with the Registry through some intermediary application, the most common of which are the graphical user interface (GUI) Registry editors that ship with Windows: regedit.exe and regedt32.exe. In many cases, the extent of a user’s or administrator’s interaction with the Registry is through an installation program (software applications, patches), which does not permit the user to directly interact with specific keys and values within the Registry. Windows XP and 2003 distributions include reg.exe, a command-line interface (CLI) tool that can be used from the command prompt or in scripts. For more information on the GUI Registry editing utilities, see the "RegEdit and RegEdt32" sidebar.

Tools & Traps…

RegEdit and RegEdt32

Interestingly, the two Registry editing utilities provided with Windows operating systems are not equivalent on all versions of Windows (http://support.microsoft.com/kb/141377).

For example, on Windows NT and 2000, you would use regedit.exe primarily for its search capabilities but not to modify access control lists (ACLs) on Registry keys. Regedit.exe has an added limitation of not "understanding" the REG_EXPAND_SZ (a variable-length string) or REG_MULTI_SZ (a multiple-string) data type. If values with either of these data types are edited, the data is saved as a REG_SZ (a fixed-length string) data type and the functionality of the key is lost. Regedt32.exe, on the other hand, does not allow you to import or export hive (.reg) files.

On Windows XP and 2003, regedit.exe allows you to do all of these things, and regedt32.exe is nothing more than a small program that runs regedit.exe.

When the administrator opens regedit.exe, he sees a treelike structure with five root folders, or "hives," in the navigation area of the GUI, as Figure 4.1 illustrates. This folderlike structure allows the administrator to navigate easily through the Registry, much like a file system.

Figure 4.1 Regedit.exe View Showing Five Root Hives


Each of these hives plays an important role in the function of the system. The HKEY_USERS hive contains all the actively loaded user profiles for that system. HKEY_CURRENT_USER is the active, loaded user profile for the currently logged-on user. The HKEY_LOCAL_MACHINE hive contains a vast array of configuration information for the system, including hardware settings and software settings. The HKEY_CURRENT_CONFIG hive contains the hardware profile the system uses at startup. Finally, the HKEY_CLASSES_ROOT hive contains configuration information relating to which application is used to open various files on the system. This hive is subclassed to both HKEY_CURRENT_USER\Software\Classes (user-specific settings) and HKEY_LOCAL_MACHINE\Software\Classes (systemwide settings).

All of this is good and fine, but it helps to know where the hives come from and where they exist on the hard drive within the file system. The contents of much of the Registry visible in the Registry Editor are available in several files, as listed in Table 4.1.

Table 4.1 Registry Paths and Corresponding Files

Registry Path                     File Path
--------------------------------  ------------------------------------------------
HKEY_LOCAL_MACHINE\System         %WINDIR%\system32\config\System
HKEY_LOCAL_MACHINE\SAM            %WINDIR%\system32\config\Sam
HKEY_LOCAL_MACHINE\Security       %WINDIR%\system32\config\Security
HKEY_LOCAL_MACHINE\Software       %WINDIR%\system32\config\Software
HKEY_LOCAL_MACHINE\Hardware       Volatile hive
HKEY_LOCAL_MACHINE\System\Clone   Volatile hive
HKEY_USERS\User SID               User profile (NTUSER.DAT); "Documents and Settings\User" (changed to "Users\User" on Vista)
HKEY_USERS\Default                %WINDIR%\system32\config\default

Tip::

Vista and Windows 7 include additional Registry hive files, specifically the Components hive file (found in the system32\config directory) and the usrclass.dat file, which is located in the C:\Users\username\AppData\Local\Microsoft\Windows directory. We will discuss these hives, particularly the usrclass.dat hive, later in this topic.

You’ll notice that several of the Registry paths are volatile and do not exist in files on the hard drive. These hives are created during system startup and are not available when the system shuts down. This is important to remember when you’re performing postmortem forensic analysis as well as live response on a running system. If data in volatile hives is important to troubleshooting a system or conducting incident response activities, you should consider either exporting the entire volatile hive to a .reg file via regedit.exe, or employing some other mechanism to collect specific data from the volatile hive before shutting the system down.

In addition to the different sections or hives, the Registry supports several different data types for the various values that it contains. Table 4.2 lists the various data types and their descriptions.

Table 4.2 Registry Data Types and Descriptions

Data Type                        Description
-------------------------------  ------------------------------------------------
REG_BINARY                       Raw binary data
REG_DWORD                        Data represented as a 32-bit (4-byte) integer
REG_SZ                           A fixed-length text string
REG_EXPAND_SZ                    A variable-length data string
REG_MULTI_SZ                     Multiple strings, separated by a space, comma, or other delimiter
REG_NONE                         No data type
REG_QWORD                        Data represented by a 64-bit (8-byte) integer
REG_LINK                         A Unicode string naming a symbolic link
REG_RESOURCE_LIST                A series of nested arrays designed to store a resource list
REG_RESOURCE_REQUIREMENTS_LIST   A series of nested arrays designed to store a device driver’s list of possible hardware resources
REG_FULL_RESOURCE_DESCRIPTOR     A series of nested arrays designed to store a resource list used by a physical hardware device

As you can see, a variety of data types are found in the Registry. There don’t seem to be any rules or consistency between values found in different keys; values that serve similar purposes may have different data types, allowing their data to be formatted and stored differently. This can become an issue when you’re performing text searches for data within the Registry. Where one application might store a list of recently accessed documents as ASCII text strings, another might store a similar list as Unicode strings in a binary data type, in which case an ASCII text search would miss that data. In fact, a Microsoft Knowledge Base article (http://support.microsoft.com/kb/161678) specifically states that you can use the Find tool in RegEdit only to find ASCII string data and not DWORD or binary data.
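The search problem described above can be illustrated with a short Python sketch. Python is used here purely for illustration; the sample buffer and the find_string helper are hypothetical and do not come from any tool discussed in this topic. The sketch shows that a name stored as a UTF-16LE Unicode string is invisible to a plain ASCII search of the same bytes:

```python
def find_string(buf: bytes, text: str) -> dict:
    """Return the offset of `text` in `buf` under each encoding (-1 if absent)."""
    return {
        "ascii": buf.find(text.encode("ascii")),
        "utf-16le": buf.find(text.encode("utf-16-le")),
    }

# Hypothetical value data: one file name stored as ASCII text,
# a second stored as UTF-16LE (as it might be inside a binary value).
sample = b"\x00\x01report.doc\x00" + "secret.xls".encode("utf-16-le") + b"\x00\x00"

print(find_string(sample, "report.doc"))  # found only by the ASCII search
print(find_string(sample, "secret.xls"))  # found only by the UTF-16LE search
```

When searching raw hive files or exported data, it is therefore worth running each keyword in both encodings.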

Registry Structure within a Hive File

Now that you’ve seen where the Registry hive files are located, let’s take a look inside those files and see the structure of the Registry itself, at a much lower level. You’re probably wondering at this point why we would want to do this. Well, by understanding the basic components of the Registry, we might be able to glean some extra information through keyword searches of other locations and sources, such as the page file, physical memory, or even unallocated space. If we know what to look for or what we’re looking at, we might be able to extract an extra bit of information. Also, by knowing more about the information that is available within the Registry, we will have a better understanding of what is possible and what to look for.

Mark Russinovich wrote an excellent article for Windows NT Magazine called "Inside the Registry" (http://technet.microsoft.com/en-us/library/cc750583.aspx), which describes the different components, or cells, of the Registry. This same information is covered in Windows Internals, which Mark co-wrote with David Solomon.

Each type of cell has a specific structure and contains specific types of information. The various types of cells are:

■ Key cell This cell contains Registry key information and includes offsets to other cells as well as the LastWrite time for the key (signature: kn).

■ Value cell This cell holds a value and its data (signature: kv).

■ Subkey list cell This is a cell made up of a series of indexes (or offsets) pointing to key cells; these are all subkeys to the parent key cell.

■ Value list cell This is a cell made up of a series of indexes (or offsets) pointing to value cells; these are all of the values of a common key cell.

■ Security descriptor cell This is a cell that contains security descriptor information for a key cell (signature: ks).

Figure 4.2 illustrates the various types of cells (with the exception of the security descriptor cell) as they appear in the Registry Editor.

Figure 4.2 Excerpt of Windows Registry Showing Keys, Values, and Data


Figure 4.3 illustrates an excerpt of a Registry file opened in a hex editor. The figure shows the signatures for a key cell as well as for a value cell. Note that due to endian issues, the signature for a key cell (i.e., kn) appears as nk and for a value cell (i.e., kv) appears as vk.

Figure 4.3 Excerpt of a Raw Registry File Showing Key and Value Cell Signatures


As mentioned earlier, these signatures can provide us with extremely valuable information during an investigation. Using these signatures, we can potentially carve Registry key and value information out of the unallocated clusters of an acquired image or even out of RAM dumps.
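As a rough illustration of signature-based carving, the following Python sketch scans a byte buffer for the two-byte key and value cell signatures. The find_signatures helper and the sample fragment are hypothetical; a real search over unallocated clusters or a RAM dump would produce many false positives, so each hit is only a candidate that still needs to be validated against the cell structure.

```python
import struct

NK_SIG = b"nk"  # key cell signature as it appears on disk
VK_SIG = b"vk"  # value cell signature as it appears on disk

def find_signatures(buf: bytes, sig: bytes) -> list:
    """Return every offset in `buf` at which `sig` occurs."""
    hits = []
    pos = buf.find(sig)
    while pos != -1:
        hits.append(pos)
        pos = buf.find(sig, pos + 1)
    return hits

# The on-disk bytes "nk", read as a little-endian 16-bit word, give 0x6B6E.
assert struct.unpack("<H", NK_SIG)[0] == 0x6B6E

# Hypothetical fragment of unallocated space holding two candidate cells.
fragment = b"\x00" * 8 + b"nk" + b"\x20\x00" + b"\x00" * 20 + b"vk" + b"\x04\x00"

print(find_signatures(fragment, NK_SIG))
print(find_signatures(fragment, VK_SIG))
```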

Tip::

Understanding the binary structure of Registry keys and values can be extremely useful during an investigation. Most investigations that involve some degree of Registry analysis will focus on the Registry files themselves, using tools presented in this topic. However, in examining unallocated space or the contents of a RAM dump, you could locate signatures such as those illustrated in Figure 4.3. Knowing the format of the Registry key and value structures allows you to, if necessary, extract and parse the data into something understandable. You might find a Registry key in memory, for example, and be able to extract the LastWrite time. Understanding the structure of Registry keys and values provides context to information retrieved from a memory dump or from unallocated space within an acquired image, and it allows you to parse deleted keys from the "unallocated" space within hive files themselves. We will address the topic of locating Registry keys within the unallocated space of hive files later in this topic.

The best reference for the actual programmatic binary structure of the various cells within Registry hive files isn’t available from the vendor; oddly enough, it’s available from a guy who wrote a Linux utility. Peter Nordahl-Hagen created the Offline Registry and Password Editor (http://home.eunet.no/pnordahl/ntpasswd/), a utility that allows you to boot a Windows system (assuming you have physical access to the system) to a Linux disk and edit the Registry, including changing passwords. Peter provides the source for his utility, and if you choose to download it, you’ll be most interested in the file named ntreg.h. This is a C header file that defines the structures for the different types of cells that make up the Registry.

Tip::

Tim Morgan wrote an excellent paper describing the structure of the Windows Registry, describing the various cell types and how they are interconnected. Tim makes the paper available at the Sentinelchicken.com Web site (www.sentinelchicken.com/data/TheWindowsNTRegistryFileFormat.pdf).

From Peter’s source code, we can see that the key cell is 76 bytes long, plus a variable-length name. Therefore, if we locate the signature for a key cell (as illustrated in Figure 4.3) in a RAM dump or in unallocated clusters, we might very well have the contents of a key cell. From there, we can parse the next 74 bytes (the signature is two bytes long—in little endian hex format, 0x6B6E) and see such things as the name of the key and its LastWrite time.

We can see values that give us the LastWrite time as well as the numbers of subkeys and values associated with the key cell. We can also get offsets to additional data structures, such as lists of other key cells (i.e., the subkeys), as well as to the value list (i.e., values for this key).
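The key cell fields just described can be pulled out with a few fixed-offset reads. The following is a minimal Python sketch (Python rather than the Perl used by the tools in this topic), and it rests on assumptions: it takes the 76-byte layout in which, counting from the signature, the LastWrite time sits at offset 0x04, the subkey count at 0x14, the value count at 0x24, the name length at 0x48, and the name itself at 0x4C. Those offsets, the parse_key_cell helper, and the synthetic test cell are illustrative and should be checked against ntreg.h before being relied upon. The name is also assumed to be stored as ASCII.

```python
import struct

KEY_HEADER_LEN = 76  # fixed part of the key cell, including the 2-byte signature

def parse_key_cell(buf: bytes, offset: int) -> dict:
    """Parse a key cell whose 'nk' signature starts at `offset` (assumed layout)."""
    if buf[offset:offset + 2] != b"nk":
        raise ValueError("no key cell signature at this offset")
    (lastwrite,) = struct.unpack_from("<Q", buf, offset + 0x04)  # raw FILETIME
    (n_subkeys,) = struct.unpack_from("<I", buf, offset + 0x14)
    (n_values,) = struct.unpack_from("<I", buf, offset + 0x24)
    (name_len,) = struct.unpack_from("<H", buf, offset + 0x48)
    start = offset + KEY_HEADER_LEN
    name = buf[start:start + name_len].decode("ascii", errors="replace")
    return {"name": name, "lastwrite": lastwrite,
            "subkeys": n_subkeys, "values": n_values}

# Build a synthetic key cell (made-up contents) to exercise the parser.
header = bytearray(KEY_HEADER_LEN)
header[0:2] = b"nk"
struct.pack_into("<Q", header, 0x04, 128166372000000000)
struct.pack_into("<I", header, 0x14, 3)
struct.pack_into("<I", header, 0x24, 2)
struct.pack_into("<H", header, 0x48, 8)
cell = bytes(header) + b"Services"

print(parse_key_cell(cell, 0))
```

Reading at explicit offsets, rather than unpacking the whole header in one format string, keeps each assumption about the layout visible and easy to correct.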

Tip::

The $offset value (the position at which the key cell’s signature is found) is important when working with Registry keys, particularly those within Registry hive files. The 4-byte (DWORD) value immediately preceding the key signature found at $offset is the size of the key. When read as a signed long value, in many instances this value will be a negative number. Researchers have found that a negative size value indicates that the key itself is in use; a positive size value indicates that the key has been deleted. We will discuss this at greater length later in the topic.
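The sign of the size value only shows up if the DWORD is interpreted as a signed 32-bit integer. A minimal Python sketch of that check follows; the cell_size helper and the two sample buffers are hypothetical and exist only to show the difference between the signed and unsigned readings.

```python
import struct

def cell_size(buf: bytes, sig_offset: int) -> tuple:
    """Return (size, in_use) from the DWORD preceding the signature at `sig_offset`."""
    (raw,) = struct.unpack_from("<i", buf, sig_offset - 4)  # signed 32-bit read
    return abs(raw), raw < 0

# Hypothetical cells: a size DWORD followed by the key signature.
allocated = struct.pack("<i", -88) + b"nk"  # negative size: key in use
deleted = struct.pack("<i", 88) + b"nk"     # positive size: key deleted

print(cell_size(allocated, 4))
print(cell_size(deleted, 4))

# The same four bytes read as unsigned hide the sign entirely.
print(struct.unpack_from("<I", allocated, 0)[0])
```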

The value cells are only 18 bytes long with a variable-length name and variable-length data. The data type is stored in a 4-byte (DWORD) value located within the value cell itself. Using the same technique as with the key cell, we can parse information about the value cell from whatever source we’re looking at and see the actual values. Perl code to parse the value cell looks like this:

[Perl code listing for parsing the value cell; not reproduced here]

As with the key cell, the value cell identifier (i.e., $vk{id}) was illustrated in Figure 4.3.
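For readers without the Perl listing to hand, here is a rough Python equivalent of value cell parsing. It is a sketch under stated assumptions, not the author’s code: it assumes that the 18 bytes beginning with the signature hold the name length, data length, data offset, data type, and a flag word, that a 2-byte unused field follows, and that the name therefore starts at offset 0x14 from the signature. The parse_value_cell helper and the sample cell are hypothetical; the numeric type codes are the standard Windows values for the data types in Table 4.2.

```python
import struct

REG_TYPES = {
    0: "REG_NONE", 1: "REG_SZ", 2: "REG_EXPAND_SZ", 3: "REG_BINARY",
    4: "REG_DWORD", 6: "REG_LINK", 7: "REG_MULTI_SZ", 8: "REG_RESOURCE_LIST",
    9: "REG_FULL_RESOURCE_DESCRIPTOR", 10: "REG_RESOURCE_REQUIREMENTS_LIST",
    11: "REG_QWORD",
}

NAME_OFFSET = 0x14  # assumed start of the value name, counted from the signature

def parse_value_cell(buf: bytes, offset: int) -> dict:
    """Parse a value cell whose 'vk' signature starts at `offset` (assumed layout)."""
    if buf[offset:offset + 2] != b"vk":
        raise ValueError("no value cell signature at this offset")
    name_len, data_len, data_ofs, val_type, flag = struct.unpack_from(
        "<HIIIH", buf, offset + 2)
    start = offset + NAME_OFFSET
    name = buf[start:start + name_len].decode("ascii", errors="replace")
    return {"name": name, "data_len": data_len, "data_ofs": data_ofs,
            "type": REG_TYPES.get(val_type, "unknown (%d)" % val_type),
            "flag": flag}

# Hypothetical value cell: a 4-byte REG_DWORD value named "Start".
cell = b"vk" + struct.pack("<HIIIH", 5, 4, 0x1234, 4, 1) + b"\x00\x00" + b"Start"

print(parse_value_cell(cell, 0))
```

Note that the sketch returns only the offset to the data, not the data itself; following that offset requires the rest of the hive file, which is often unavailable when a cell is recovered from memory or unallocated space.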

Although we can parse Registry value data from various sources, there could be insufficient information to add context to what we discover, such as which Registry key the value originally belonged to, when it was created, or how it was used. Remember, Registry keys have a time stamp associated with them (i.e., the LastWrite time); Registry values do not. Extracting information about keys and values from RAM dumps or unallocated space might be of limited value, depending on the needs of the investigation.

A while ago, I wrote a Perl script that I call the Offline Registry Parser, or regp.pl for short, based on the structures that Peter provided. It incorporates the Perl code you just saw. The script enables you to parse a raw Registry file, extracting the key and value cell information and printing the information to the console in ASCII format. The output can be redirected to a file for archiving, searching, or comparing with other files. There are several advantages to using this script, and code like it. First, it provides an open source example for how to parse through the raw Registry files; someone with an understanding of Perl programming can easily extend the capabilities of the script, such as recording the output in a spreadsheet or database for easier processing. Second, the script relies on only basic Perl functionality; it does not use any fancy platform-specific modules, nor does it rely on the Microsoft application program interface (API). This means you can run the script on any system that supports Perl, including Linux. If an investigator is using The Sleuth Kit (www.sleuthkit.org) or PyFlag (www.pyflag.net/cgi-bin/moin.cgi) on a Linux system as an analysis platform, she is not prevented from parsing, viewing, and analyzing the contents of the Registry files. Although this doesn’t add any new functionality to any of the available forensic analysis tools, it does provide options for data reduction and faster, more efficient processing. Right about the time I was considering extending this script into something more functional and flexible, James MacFarlane released his Parse::Win32Registry Perl module, which provides object-oriented access to Registry hive files, including for Windows 98 and ME systems. As with regp.pl, James’ module code is based in part on Peter’s descriptions of the various components of the Registry.

Tip::

Both PlainSight (www.plainsight.info/) and the SANS Investigative Forensic Toolkit (SIFT) Workstation (available from http://forensics.sans.org/community/downloads/) are Linux-based and include RegRipper (www.regripper.net), a Perl-based tool for parsing Registry hive file data during postmortem analysis. RegRipper is based on the Parse::Win32Registry module from James MacFarlane, and installs easily on Windows, Linux, and Mac OS X.

The regp.pl Perl script (as well as a stand-alone Windows executable "compiled" from the script using Perl2Exe) is available on the media that accompanies this topic.

Another interesting file Peter provides in the source code for his utility is sam.h. This file provides some very valuable information regarding structures within the SAM portion of the Registry. We will discuss these structures and how you can use them later in this topic, in the "Finding Users" section.

The Registry As a Log File

The key cells within the Registry constitute the keys or folders you see when you open the Registry Editor. This is the only one of the structures that contains a time value, called the LastWrite time. The LastWrite time is a 64-bit FILETIME object, which is analogous to the last modification time on a file (see the "Registry Key LastWrite Time" sidebar for more information). Not only does this information provide a time frame reference for certain user activities on the system, but in several cases it will also tell you when a specific value was added to a key or was modified.

Notes from the Underground…

Registry Key LastWrite Time

Registry keys have several properties associated with them, one of which is the LastWrite time, which is similar to the last modification time associated with files and directories. The LastWrite time is stored as a FILETIME object (or structure) that represents the number of 100-nanosecond intervals since January 1, 1601 (based on the Gregorian calendar). To convert this 64-bit value into something legible, you must translate (http://msdn.microsoft.com/en-us/library/ms724280.aspx) it to a SYSTEMTIME structure. Translating the FILETIME structure directly to a SYSTEMTIME structure using the FileTimeToSystemTime() API will allow you to display the date in UTC format, which is loosely defined as Greenwich Mean Time (GMT). To translate these values into something that accurately represents the current system time, you must first pass the FILETIME structure through the FileTimeToLocalFileTime() API. This API takes into account daylight saving time and the time zone information of the local system. The FILETIME object can be read into Perl as two 32-bit numbers and then translated into a 32-bit UNIX time value (http://en.wikipedia.org/wiki/Unix_time), and displayed using built-in Perl functions.
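The conversion described in this sidebar can be sketched in a few lines of Python, following the same approach of reading the FILETIME as two 32-bit numbers. The filetime_to_unix helper and the sample value are illustrative; the constant 11644473600 is the number of seconds between January 1, 1601 and January 1, 1970.

```python
import struct
from datetime import datetime, timezone

EPOCH_DIFF = 11644473600  # seconds from 1601-01-01 to 1970-01-01 (UTC)

def filetime_to_unix(raw8: bytes) -> int:
    """Convert 8 bytes of little-endian FILETIME data to UNIX time in seconds."""
    lo, hi = struct.unpack("<II", raw8)    # low DWORD is stored first
    filetime = (hi << 32) | lo             # 100-nanosecond intervals since 1601
    return filetime // 10_000_000 - EPOCH_DIFF

# Sample FILETIME chosen for illustration.
raw = struct.pack("<Q", 128166372000000000)
ts = filetime_to_unix(raw)

print(ts)
print(datetime.fromtimestamp(ts, tz=timezone.utc).isoformat())
```

The result is in UTC; applying the examiner’s or the subject system’s time zone is a separate step, analogous to the FileTimeToLocalFileTime() call described above.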

One particular instance where I’ve found this to be useful is during postmortem intrusion investigations. I like to open the Registry Viewer in ProDiscover and locate the area where Registry keys specific to Windows services are maintained. From there, I will sort the first level of keys based on their LastWrite times, and invariably I’ve been able to easily locate services that were installed during the intrusion, such as remote access backdoors or even rootkits.

Tip::

We’ll discuss rootkits in greater detail in the next topic. However, the W32/Opanki.worm!MS06-040, as identified by Network Associates, serves as a useful example (http://vil.nai.com/vil/Content/v_140546.htm) for our discussion here. The worm installs a rootkit component, creating two Windows services (in addition to a number of other artifacts). Because most compromises occur sometime after the operating system is installed, sorting the Services keys based on their LastWrite times will often quickly reveal the issue. This is a useful technique for locating any of the myriad bits of malware that create Windows Services when they infect a system.
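The sorting technique is simple enough to show in a few lines of Python. This sketch assumes you have already extracted each service key’s name and LastWrite time by some other means; the newest_first helper, the service names, and the dates are all made up for illustration.

```python
from datetime import datetime, timezone

def newest_first(keys):
    """Sort (key name, LastWrite time) pairs with the most recent write first."""
    return sorted(keys, key=lambda item: item[1], reverse=True)

# Hypothetical subkeys of the Services key with their LastWrite times (UTC).
services = [
    ("Dhcp", datetime(2005, 3, 14, 9, 30, tzinfo=timezone.utc)),
    ("hypothetical_backdoor_svc", datetime(2006, 8, 21, 2, 17, tzinfo=timezone.utc)),
    ("Eventlog", datetime(2005, 3, 14, 9, 31, tzinfo=timezone.utc)),
]

for name, lastwrite in newest_first(services):
    print(lastwrite.isoformat(), name)
```

Keys written long after the bulk of the others, which tend to cluster around the installation date, are the ones worth a closer look.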

In addition to the LastWrite times, the Registry maintains time-stamp information in some of the data associated with specific values. We will discuss some of these values later in this topic, but for now, it’s enough to know that there are Registry keys whose values contain eight bytes of binary data that comprise a FILETIME object.

Warning::

The one thing that is consistent about the Windows Registry is its lack of consistency. I know you’re probably reading this and thinking that maybe I’ve been up too late, or that I’ve had too much coffee. Well, if you’ve ever just spent time browsing through the Registry, you know what I mean. Some values in the Registry contain binary data, which, when parsed or translated, can be read as a 64-bit FILETIME object. Two examples of this are within some UserAssist key values and the ShutdownTime value. However, in other instances a time-stamp value is maintained as a 32-bit UNIX time.
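Because both formats turn up, code that reads time stamps from value data has to decide which one it is looking at. One simple heuristic, sketched below in Python, is to go by the length of the data: eight bytes are treated as a FILETIME and four bytes as a 32-bit UNIX time. The decode_timestamp helper is hypothetical, and the length test is an assumption that will not hold for every value; always confirm how a particular value is documented to store its time.

```python
import struct
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

def decode_timestamp(data: bytes) -> datetime:
    """Decode little-endian time-stamp data, guessing the format from its length."""
    if len(data) == 8:   # 64-bit FILETIME: 100-ns intervals since 1601
        (filetime,) = struct.unpack("<Q", data)
        return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)
    if len(data) == 4:   # 32-bit UNIX time: seconds since 1970
        (seconds,) = struct.unpack("<I", data)
        return datetime.fromtimestamp(seconds, tz=timezone.utc)
    raise ValueError("unsupported time-stamp length: %d" % len(data))

# The same moment in time, stored in each of the two formats.
as_filetime = struct.pack("<Q", 128166372000000000)
as_unixtime = struct.pack("<I", 1172163600)

print(decode_timestamp(as_filetime))
print(decode_timestamp(as_unixtime))
```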

So, why do I refer to the Registry as a log file? It’s because log files generally have some sort of action or event associated with a time, and in many cases, this is true for the Registry. Although the Registry can hold literally thousands of entries, not all of them are of interest during an investigation. You will be interested in specific keys during different types of investigations, and if you understand how the system creates, modifies, and uses those keys and values, you will begin to see how the Registry actually records the occurrence of different events, along with an associated time stamp.
