Firmware Extraction Series: Reading Flash with flashrom

Introduction to FlashROM

It’s been over six months since my last post. The firmware extraction series has now reached Part 11. In my opinion, this topic isn’t particularly sensitive, so I’m sharing it openly.

Flashrom is an open-source project designed for extracting (and programming) flash firmware. It supports multiple hardware platforms, as well as SPI and parallel flash interfaces.

I recently encountered a NAND SPI flash chip: the IS38SML01G1, an automotive-grade storage device. Initially, I assumed “SPI flash” implied NOR flash, so I attempted to wire it up and read it with a standard programmer without checking the datasheet. Most standard programmers were unable to read it; even the RT809H failed. After reviewing the datasheet, I realized it was NAND flash. At the time, only the REVELPROG-IS supported reading and writing for this chip.

I initially intended to use an FT2232HL, but due to poor official documentation and support, I opted for a Raspberry Pi 3B instead.

The flashchips.c file stores configuration information for various chips—it is well-designed and highly extensible.

{
    .vendor		= Vendor name
    .name		= Chip name
    .bustype		= Supported flash bus types (Parallel, LPC...)
    .manufacture_id	= Manufacturer chip ID
    .model_id		= Model chip ID
    .total_size		= Total size in (binary) kbytes
    .page_size		= Page or eraseblock(?) size in bytes
    .tested		= Test status
    .probe		= Probe function
    .probe_timing	= Probe function delay
    .block_erasers[]	= Array of erase layouts and erase functions
    {
        .eraseblocks[]	= Array of { blocksize, blockcount }
        .block_erase	= Block erase function
    }
    .printlock		= Chip lock status function
    .unlock		= Chip unlock function
    .write		= Chip write function
    .read		= Chip read function
    .voltage		= Voltage range in millivolt
}

Based on the datasheet, I added the configuration for the 38SM device. According to the datasheet, this NAND SPI flash has 1024 blocks, with each block containing 64 pages. Each page consists of 2K + 64 bytes, where the 64 bytes are reserved for the spare (OOB) area. total_size is specified in KB, so the spare area is excluded from this value. The voltage range is set to 2.7V–3.6V, as per the datasheet.

{
	.vendor		= "ISSI",
	.name		= "IS38SML01G1",
	.bustype	= BUS_SPI,
	.manufacture_id	= ISSI_NAND_ID,
	.model_id	= ISSI_NAND_ID_SPI,
	.total_size	= 131072, /* kb */
	.page_size	= 2048, /* bytes of data per page; each page also has a 64-byte spare area */
	.tested		= {.probe = OK, .read = OK, .erase = NA, .write = NA},
	.probe		= probe_spi_rdid5,
	.probe_timing	= TIMING_ZERO,
	.block_erasers	=
	{
		{
			.eraseblocks = { {64 * 2048, 1024} },
			.block_erase = spi_block_erase_d8,
		}
	},
	.write		= NULL,
	.read		= spi_read_issi,
	.voltage	= {2700, 3600},
},
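As a cross-check of the numbers in that entry, the geometry can be computed instead of typed in by hand. This is a minimal sketch; the macro and function names are invented for illustration and are not part of flashrom.

```c
#include <stdint.h>

/* Geometry as summarized from the datasheet above. */
#define NAND_BLOCKS          1024u
#define NAND_PAGES_PER_BLOCK 64u
#define NAND_PAGE_DATA_BYTES 2048u

/* Total number of pages on the device. */
static uint32_t nand_total_pages(void)
{
	return NAND_BLOCKS * NAND_PAGES_PER_BLOCK;
}

/* Value for .total_size: data area only, in KiB (spare area excluded). */
static uint32_t nand_total_size_kib(void)
{
	return nand_total_pages() * (NAND_PAGE_DATA_BYTES / 1024u);
}

/* Size of one erase block in bytes, as used in .eraseblocks. */
static uint32_t nand_block_bytes(void)
{
	return NAND_PAGES_PER_BLOCK * NAND_PAGE_DATA_BYTES;
}
```

The results line up with the entry: 65536 pages, a total_size of 131072 KiB (128 MiB), and 64 * 2048 bytes per erase block.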

Reading flash contents requires implementing chip initialization and read functionality. Therefore, we only need to define the probe and read function pointers. The figure below illustrates the command definitions, including the opcode byte, address bytes, dummy bytes, and the data bytes returned by the device. Data is transferred MSB-first.

command_set

Initialization requires reading the chip ID, so we must first define the IDs. The Mark Code and Device Code are useful identifiers. I also added the Communication Code 0x7F7F7F for completeness.

jedec_id

#define ISSI_NAND_ID		0xC8
#define ISSI_NAND_ID_SPI	0x21
#define ISSI_38SML01G1		0x7F7F7F

Flashrom’s built-in probe_spi_rdid and probe_spi_rdid4 read the JEDEC ID by sending 0x9F. However, this chip expects a single dummy byte after the Read ID opcode. Consequently, with the standard probe functions the first byte read back is 0x00. The timing diagram provided in the ISSI datasheet is of extremely low quality.

read_id_timing

Thus, a new function is required, which I named probe_spi_rdid5. I also added a read function, spi_read_issi. Both must be declared in chipdrivers.h.

int probe_spi_rdid5(struct flashctx *flash);
int spi_read_issi(struct flashctx *flash, uint8_t *buf, unsigned int start, unsigned int len);

For the Read ID function: if the first byte is 0x00, it should be skipped. Alternatively, a dummy byte can be included when sending the ID command, which avoids the need to check the MISO data for padding.

int probe_spi_rdid5(struct flashctx *flash)
{
	const struct flashchip *chip = flash->chip;
	unsigned char readarr[6];
	uint32_t id1;
	uint32_t id2;
	uint32_t bytes = 6;

	if (spi_rdid(flash, readarr, bytes)) {
		return 0;
	}

	if (!oddparity(readarr[0]))
		msg_cdbg("RDID byte 0 parity violation. ");

	/* This chip returns a 0x00 dummy byte before the ID.
	 * If it is present, skip it; otherwise decode as a normal 3-byte RDID.
	 */
	
	if (readarr[0] == 0x00) {
		if (!oddparity(readarr[1]))
			msg_cdbg("RDID byte 1 parity violation. ");
		id1 = (readarr[0] << 8) | readarr[1];
		id2 = readarr[2];
	} else {
		id1 = readarr[0];
		id2 = (readarr[1] << 8) | readarr[2];
	}

	msg_cdbg("%s: id1 0x%02x, id2 0x%02x\n", __func__, id1, id2);

	if (id1 == chip->manufacture_id && id2 == chip->model_id)
		return 1;

	/* Test if this is a pure vendor match. */
	if (id1 == chip->manufacture_id && GENERIC_DEVICE_ID == chip->model_id)
		return 1;

	/* Test if there is any vendor ID. */
	if (GENERIC_MANUF_ID == chip->manufacture_id && id1 != 0xff && id1 != 0x00)
		return 1;

	return 0;
}
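The alternative mentioned earlier, clocking out one dummy byte after the opcode so that the response starts directly at the ID, could look like the sketch below. To keep it self-contained, the SPI transfer is passed in as a function pointer. `spi_xfer_fn`, `rdid_with_dummy`, and `fake_issi_xfer` are hypothetical names, and the fake device only imitates the behaviour described in this post (0x00, then 0xC8 0x21).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical transport: write wlen bytes, then read rlen bytes. 0 on success. */
typedef int (*spi_xfer_fn)(const uint8_t *wr, size_t wlen, uint8_t *rd, size_t rlen);

/* Send 0x9F plus one dummy byte, then read the manufacturer and device IDs. */
static int rdid_with_dummy(spi_xfer_fn xfer, uint8_t *manufacturer, uint8_t *device)
{
	const uint8_t cmd[2] = { 0x9F, 0x00 }; /* opcode, dummy byte */
	uint8_t id[2] = { 0, 0 };

	if (xfer(cmd, sizeof(cmd), id, sizeof(id)) != 0)
		return -1;
	*manufacturer = id[0];
	*device = id[1];
	return 0;
}

/* Fake device for illustration only: the ID appears one byte after the opcode. */
static int fake_issi_xfer(const uint8_t *wr, size_t wlen, uint8_t *rd, size_t rlen)
{
	static const uint8_t stream[] = { 0x00, 0xC8, 0x21 }; /* bytes following 0x9F */
	size_t skip, i;

	if (wlen < 1 || wr[0] != 0x9F)
		return -1;
	skip = wlen - 1; /* bytes already clocked past the opcode */
	for (i = 0; i < rlen; i++)
		rd[i] = (skip + i < sizeof(stream)) ? stream[skip + i] : 0xFF;
	return 0;
}
```

With the dummy byte in the command, the two bytes read back can be compared against ISSI_NAND_ID and ISSI_NAND_ID_SPI directly, without inspecting the first byte for padding.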

Next is the read function. It is important to first understand the chip’s read flow: the NAND controller loads one page of NAND data into its cache memory, and the host then reads the data out of that cache.

blockdiagram

Therefore, a page-read command must first be sent to instruct the controller which page to read. While data is being transferred into the cache, no other read/write operations should be performed. During this process, the status register will indicate a busy state (i.e., OIP == 1).

After sending the page-read command, the status must be polled by repeatedly sending 0x0F 0xC0 until OIP == 0.

status_register

The complete read sequence is as follows:

0x13 page read
0x0F 0xC0 status polling
0x03 cache read
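Put together, reading one page follows the three steps listed above. The sketch below assumes OIP is bit 0 of the status register and uses a hypothetical `spi_xfer_fn` callback in place of flashrom's transfer function. The three bytes after the 0x13 opcode are left to the caller, and the fake device exists only so the flow can be exercised without hardware.

```c
#include <stddef.h>
#include <stdint.h>

typedef int (*spi_xfer_fn)(const uint8_t *wr, size_t wlen, uint8_t *rd, size_t rlen);

#define STATUS_OIP 0x01u /* assumption: operation-in-progress flag is bit 0 */

/* Read the start of one page. max_polls bounds the busy-wait so a stuck
 * chip cannot hang the caller. Returns 0 on success, -1 on error/timeout. */
static int nand_read_page(spi_xfer_fn xfer, const uint8_t addr3[3],
			  uint8_t *out, size_t len, unsigned int max_polls)
{
	const uint8_t page_read[4] = { 0x13, addr3[0], addr3[1], addr3[2] };
	const uint8_t get_status[2] = { 0x0F, 0xC0 };
	const uint8_t cache_read[4] = { 0x03, 0x00, 0x00, 0x00 }; /* column 0 + dummy */
	uint8_t status = STATUS_OIP;
	unsigned int i;

	/* Step 1: ask the controller to load the page into its cache. */
	if (xfer(page_read, sizeof(page_read), NULL, 0) != 0)
		return -1;
	/* Step 2: poll the status register until OIP clears. */
	for (i = 0; i < max_polls; i++) {
		if (xfer(get_status, sizeof(get_status), &status, 1) != 0)
			return -1;
		if ((status & STATUS_OIP) == 0)
			break;
	}
	if (status & STATUS_OIP)
		return -1;
	/* Step 3: stream the data out of the cache. */
	return xfer(cache_read, sizeof(cache_read), out, len);
}

/* Fake device for illustration: busy for two status polls, then ready. */
static unsigned int fake_busy_polls;

static int fake_nand_xfer(const uint8_t *wr, size_t wlen, uint8_t *rd, size_t rlen)
{
	size_t i;

	if (wlen == 0)
		return -1;
	switch (wr[0]) {
	case 0x13:
		fake_busy_polls = 2;
		return 0;
	case 0x0F:
		if (rlen < 1)
			return -1;
		rd[0] = fake_busy_polls ? STATUS_OIP : 0x00;
		if (fake_busy_polls)
			fake_busy_polls--;
		return 0;
	case 0x03:
		for (i = 0; i < rlen; i++)
			rd[i] = 0xA5; /* arbitrary filler data */
		return 0;
	default:
		return -1;
	}
}
```

Bounding the poll loop is a design choice of this sketch: it turns a chip that never leaves the busy state into an error return instead of an endless loop.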

According to the command definition, the page-read command carries 3 bytes after the opcode, one of which is a dummy byte. The remaining 16 bits give an address range of 0x0000 to 0xFFFF, i.e. 65536 addresses, which matches the page count: 1024 blocks * 64 pages equals 65536 pages. Here, I temporarily interpret the dummy byte as [7:0] and the address as [23:8].

page_read

For the cache read, the opcode is followed by 2 address bytes and 1 dummy byte. Of the 16 address bits, 4 are dummy bits as well, which leaves a cache addressing range of 12 bits (4096). The datasheet states the range is 0–2112, corresponding to 2048 bytes (data) + 64 bytes (OOB area).

page_cache_read
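Under one reading of that layout, the cache-read command could be built as in the sketch below. The bit placement is an assumption: 4 dummy bits followed by a 12-bit column address in the two address bytes, then one dummy byte. `build_cache_read` is a hypothetical helper, not a flashrom function.

```c
#include <stdint.h>

#define NAND_PAGE_TOTAL_BYTES 2112u /* 2048 data + 64 spare */

/* Fill cmd[4] with a 0x03 cache read starting at byte offset `column`.
 * Returns 0 on success, -1 if the column lies outside the page. */
static int build_cache_read(uint16_t column, uint8_t cmd[4])
{
	if (column >= NAND_PAGE_TOTAL_BYTES)
		return -1;
	cmd[0] = 0x03;                            /* cache read opcode */
	cmd[1] = (uint8_t)((column >> 8) & 0x0F); /* 4 dummy bits + column[11:8] */
	cmd[2] = (uint8_t)(column & 0xFF);        /* column[7:0] */
	cmd[3] = 0x00;                            /* dummy byte */
	return 0;
}
```

Column 0 yields 0x03 0x00 0x00 0x00, which reads from the start of the data area. Under the same assumption, column 2048 would start at the first byte of the spare area.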

The implementation of spi_read_issi is provided below:

int spi_read_issi(struct flashctx *flash, uint8_t *buf, unsigned int start, unsigned int len)
{
	uint8_t cmd[4];
	uint8_t page_read_resp[1];
	unsigned int ret = 0;
	unsigned int buf_off = 0;
	uint8_t cache_read_cmd[4];
	uint8_t get_feature_cmd[2] = {0x0f, 0xc0};

	for (unsigned int address_h = 0; address_h < 256; address_h++)
	{
		for (unsigned int address_l = 0; address_l < 256; address_l++)
		{
			cmd[0] = 0x13; /* page read cmd */
			cmd[1] = 0x00; /* dummy byte */
			cmd[3] = (uint8_t)address_h;
			cmd[2] = (uint8_t)address_l;
			ret = spi_send_command(flash, sizeof(cmd), 1, cmd, page_read_resp);
			/* 7-0 bits: ECC_S1, ECC_S0, P_Fail, E_Fail, WEL3, OIP */
			uint8_t status[1] = {0};
			int get_feature_ret = 1;
			
			/* Issue Get Feature (0x0F 0xC0), retrying until the transfer itself succeeds. */
			do {
				internal_sleep(10);
				get_feature_ret = spi_send_command(flash, sizeof(get_feature_cmd), sizeof(status), get_feature_cmd, status);
			} while (get_feature_ret);
			/* printf("\nStatus: 0x%X, get_feature_ret:%d\n", (unsigned int)status[0], get_feature_ret); */

			cache_read_cmd[0] = 0x03; /* cache read cmd */
			cache_read_cmd[1] = 0x00; /* cache address, high part */
			cache_read_cmd[2] = 0x00; /* cache address, low part */
			cache_read_cmd[3] = 0x00; /* dummy byte */
			
			if (status[0] == 0)
			{
				int cache_read_ret = spi_send_command(flash, sizeof(cache_read_cmd), 2048, cache_read_cmd, buf + 2048 * buf_off);
				ret = cache_read_ret;
			} else {
				printf("device busy. timeout\n");
				ret = spi_send_command(flash, sizeof(get_feature_cmd), sizeof(status), get_feature_cmd, status);
			}
			/* Debug: print any page whose first word is not erased (0xFFFFFFFF). */
			unsigned int *buf_addr = (unsigned int *)(buf + 2048 * buf_off);
			
			if (buf_addr[0] != 0xffffffff){
				printf("buf_off:%u, address: 0x%02x%02x\nbuf_addr: %p\ndata:\n", buf_off, (unsigned int)cmd[3], (unsigned int)cmd[2], (void *)buf_addr);
				/* 2048 bytes = 512 32-bit words */
				for (int b = 0; b < 512; b++)
				{
					printf("%08x", buf_addr[b]);
				}
				printf("\n");
			}
			if (ret){
				printf("reading err\n");
				return ret; /* a plain break would only leave the inner loop */
			}

			buf_off++;
		}
	}
	return ret;
}

Fly-wiring

First, I secured the chip using a clip fixture and applied heat with a hot-air gun at 400°C, preheating from the bottom for 12 seconds.

target_device

I then soldered the chip onto an adapter board.

sop8

Initially, I overlooked the potential for shorts on the underside of the WSON package, which required me to rework the fly-wires.

wson8

jumping1

jumping2

I connected it to a Raspberry Pi 3B.

reading_by_rpi

The wiring configuration is as follows. Note that HOLD should be tied to VCC.

RPi header	SPI flash
25	GND
24	/CS
23	SCK
21	DO
19	DI
17	VCC 3.3V (+ /HOLD, /WP)

Enable SPI.

vi /boot/config.txt
dtparam=spi=on

Load kernel modules.

# If that fails you may wanna try the older spi_bcm2708 module instead
sudo modprobe spi_bcm2835
sudo modprobe spidev

flashrom -p linux_spi:dev=/dev/spidev0.0,spispeed=10000 -c IS38SML01G1 -V -r /tmp/is38_nooob.bin

reading

Target Device Initialization Analysis

The extracted 128MB dump consisted almost entirely of 0xFF. However, the vendor confirmed that the chip should contain software and configuration data. The ISSI flash datasheet lacked clear documentation regarding the address format for page reads. Since the address consists of 3 bytes, and the high/low bytes are consecutive, there are two permutations. Combined with the dummy byte, this results in four possible address modes. I was unable to determine the correct one. I dumped all four variants; while the data distribution changed, I could not confirm the correct mapping.

flashdump1 flashdump2 flashdump3 flashdump4
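The four variants can be expressed as a small command builder. This is a sketch under my own labelling: the enum names are invented here and do not correspond to the numbering of the dumps above.

```c
#include <stdint.h>

/* Position of the dummy byte and order of the two address bytes. */
enum addr_mode {
	DUMMY_FIRST_HI_LO, /* 0x13, dummy, high, low */
	DUMMY_FIRST_LO_HI, /* 0x13, dummy, low, high */
	DUMMY_LAST_HI_LO,  /* 0x13, high, low, dummy */
	DUMMY_LAST_LO_HI   /* 0x13, low, high, dummy */
};

/* Fill cmd[4] with a 0x13 page read for a 16-bit page number. */
static void build_page_read(enum addr_mode mode, uint16_t page, uint8_t cmd[4])
{
	const uint8_t hi = (uint8_t)(page >> 8);
	const uint8_t lo = (uint8_t)(page & 0xFF);
	const int dummy_first = (mode == DUMMY_FIRST_HI_LO || mode == DUMMY_FIRST_LO_HI);
	const int hi_first = (mode == DUMMY_FIRST_HI_LO || mode == DUMMY_LAST_HI_LO);
	uint8_t *addr = dummy_first ? &cmd[2] : &cmd[1];

	cmd[0] = 0x13;                   /* page read opcode */
	cmd[dummy_first ? 1 : 3] = 0x00; /* dummy byte */
	addr[0] = hi_first ? hi : lo;
	addr[1] = hi_first ? lo : hi;
}
```

The spi_read_issi listing shown earlier (cmd[1] dummy, cmd[2] = address_l, cmd[3] = address_h) corresponds to DUMMY_FIRST_LO_HI in this labelling.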

Even with a driver implemented strictly according to the datasheet, the dump was predominantly 0xFF. Suspecting an issue, I used a logic analyzer to investigate. You only need to capture three channels: MOSI, MISO, and CLK. In the decoder, select MSB-first bit order and set CPOL and CPHA to 0 (SPI mode 0).

spi_mode kingst_spiconf

Triggering on the rising edge with a sampling rate of 200MHz, I first captured the SPI traffic during the Raspberry Pi’s read operation. It matched the datasheet.

Read JEDEC ID:

rpi_spi_init

Read status:

rpi_status

When reading the cache, after sending 4 bytes, MISO remained high throughout. This behavior seemed anomalous.

rpi_spi_read_cache

Then, I captured traffic from the target device.

logical_analyzer_probe

The JEDEC ID read was normal. Unlike the Raspberry Pi, the target device only read the first two bytes of the ID.

target_init

The sequence from page read to cache read involves sending the page-read command, reading the status, waiting for the controller to return 0, and then sending the cache-read command. This is where the discrepancy occurred. MISO still output 0xFF, but in 4-byte chunks. After each 4-byte output, the master would “receive” 4 bytes of unknown data, alternating in a loop. Since 0x03 confirms a single-lane transfer, and the Raspberry Pi’s MOSI line was idle during this phase, these bytes were likely not originating from the slave.

target_reading

This suggested that my code wasn’t the issue; rather, the device wasn’t actively using this storage chip at that moment.

Consequences of Reading the OOB

Initially, I attempted to dump the spare area as well by setting the response buffer to 2112 bytes. In the resulting dump file, I observed an ELF header, which led me to believe the target device ran ELF binaries. However, this seemed unusual—why would an automotive gateway run Linux?

dump1

Later, after checking the memory address layout, I realized I had exceeded the heap size and performed an out-of-bounds read into adjacent library data.

maps

The math confirms this: the first segment is the actual heap size, the second is the actual read size, and the third is the normal read size. Therefore, one should not attempt to read the spare area in this context.

address_calc
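The same arithmetic as a sketch. Two assumptions here are not taken from the figure: that the read buffer holds exactly total_size KiB, and that the oversized read stored 2112 bytes for each of the 65536 pages back to back.

```c
#include <stdint.h>

#define DUMP_PAGES           65536u
#define DUMP_PAGE_DATA_BYTES 2048u
#define DUMP_PAGE_OOB_BYTES  64u

/* Bytes needed when only the data area of every page is stored. */
static uint64_t normal_read_bytes(void)
{
	return (uint64_t)DUMP_PAGES * DUMP_PAGE_DATA_BYTES;
}

/* Bytes needed when the data and spare area of every page are stored. */
static uint64_t oob_read_bytes(void)
{
	return (uint64_t)DUMP_PAGES * (DUMP_PAGE_DATA_BYTES + DUMP_PAGE_OOB_BYTES);
}

/* How far a read of `needed` bytes runs past a buffer of `buf_bytes`. */
static uint64_t overrun_bytes(uint64_t buf_bytes, uint64_t needed)
{
	return needed > buf_bytes ? needed - buf_bytes : 0;
}
```

Under those assumptions the normal read needs 134217728 bytes, the read with spare area needs 138412032 bytes, and the difference is 4194304 bytes (4 MiB) past the end of the buffer.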

REVELPROG-IS Unboxing

Over a month later, I purchased a REVELPROG-IS to verify the accuracy of flashrom’s read results.

package

Made in Poland, the packaging and programmer appear to be of good quality.

revelprog

The circuit design is simple, featuring a single STM32F103. While it is somewhat expensive, it at least hasn’t been aggressively cloned.

STM32F103

Verifying the Dump

Since the WSON8 socket hadn’t arrived, I temporarily used fly-wires for the connection.

reading_with_prog

The read speed was extremely slow, taking several minutes, and the software offers no speed adjustment. Flashrom was significantly faster.

reading_with_prog2

The data read was consistent with the earlier dump.

result

The address format matches the third variant. Although I’ve forgotten the exact ordering, I plan to use this programmer for future dumps.

flashdump5

Errata

In subsequent research, I identified a few pitfalls in this post. address_h and address_l were assumptions made without knowledge of the actual addressing rules, as they were undocumented in the datasheet. I incorrectly assumed both were 8-bit. After reviewing datasheets for several other chips, I confirmed that this assumption was wrong.

readpage_address

In reality, these two parameters represent the block address and page address. The device has 1024 blocks (10 bits) and 64 pages per block (6 bits), totaling exactly 16 bits. Different chip capacities follow different addressing rules.
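A sketch of that layout for this 1024-block, 64-page device. The 10 + 6 bit split comes from the paragraph above; sending the dummy byte first and the row address high byte first is an assumption of this sketch, and both helper names are hypothetical.

```c
#include <stdint.h>

#define ROW_PAGE_BITS  6u  /* 64 pages per block */
#define ROW_BLOCK_BITS 10u /* 1024 blocks */

/* Combine block and page into the 16-bit row address: block[9:0] | page[5:0]. */
static int nand_row_address(unsigned int block, unsigned int page, uint16_t *row)
{
	if (block >= (1u << ROW_BLOCK_BITS) || page >= (1u << ROW_PAGE_BITS))
		return -1;
	*row = (uint16_t)((block << ROW_PAGE_BITS) | page);
	return 0;
}

/* Build a 0x13 page read: opcode, dummy byte, row high byte, row low byte. */
static int build_page_read_cmd(unsigned int block, unsigned int page, uint8_t cmd[4])
{
	uint16_t row;

	if (nand_row_address(block, page, &row) != 0)
		return -1;
	cmd[0] = 0x13;
	cmd[1] = 0x00;                 /* dummy byte (assumed to come first) */
	cmd[2] = (uint8_t)(row >> 8);  /* block[9:2] */
	cmd[3] = (uint8_t)(row & 0xFF); /* block[1:0], page[5:0] */
	return 0;
}
```

Iterating block from 0 to 1023 and page from 0 to 63 visits every row address exactly once, which replaces the two 0 to 255 loops over address_h and address_l.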

Consequently, the read results can contain duplicated data. Furthermore, since the valid data was located near the beginning of the storage and the rest was irrelevant, the final result appeared consistent with the programmer’s dump—despite the incorrect address interpretation.

Additionally, the status check in the sample code is not strictly correct: the status doesn’t necessarily have to be 0 for reads to proceed. BBM LUT FULL (Look-Up Table) may also be 1, and ECC Err Status can be 0x20.

status
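A less strict readiness check would look only at the busy bit and leave the other flags to be handled separately. The sketch assumes OIP is bit 0; the positions of the other flags are deliberately not modelled, and `nand_is_ready` is a hypothetical helper.

```c
#include <stdint.h>

#define STATUS_OIP 0x01u /* assumption: operation-in-progress flag is bit 0 */

/* Non-zero when the cache may be read: only the busy bit is considered,
 * so ECC or look-up-table flags in other bits do not block the read. */
static int nand_is_ready(uint8_t status)
{
	return (status & STATUS_OIP) == 0;
}
```

Replacing the `status[0] == 0` test in spi_read_issi with `nand_is_ready(status[0])` would let a read proceed when the status is, for example, 0x20.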