

# **Technical Note**

**Detecting and Managing Soft Error Rate in Managed NAND Controllers** 

## Introduction

All managed NAND controllers use static random access memory (SRAM) for storing code, data and user buffers.

SRAM, like all random access memory technologies, can be subject to bit flips, typically caused by naturally occurring phenomena such as high-energy events (for example, neutron or alpha particle strikes). When encountered during WRITE and READ operations on the managed NAND, bit flips can lead to errors or "stuck" conditions. The frequency of these bit flips is referred to as the soft error rate (SER) of the controller.

The purpose of this tech note is to outline a method of detecting and handling these bit flips (the SER) in a manner consistent with JEDEC specifications for operation of managed NAND devices.



### **Overview**

Detecting and managing SER is important to the life of the managed NAND controller. Studies of the frequency and effect of the SER in a managed NAND controller show that approximately 0.05% (500 DPPM) of single-bit errors lead to corruption of code, tables, or user data.

Additionally, there are particular bit flips that the controller should not attempt to recover from, such as a bit flip in the instruction SRAM for the processor. In this scenario, running interrupt handler code may not be safe. For this reason, a simple parity check can be the best solution for SRAM bit flips. Because the data in SRAM is only a copy of main memory, corrupted data can be discarded and re-fetched.

#### **Parity Check Solutions**

During normal operation, SRAM parity is calculated every time data is written to SRAM, then checked every time SRAM is read. Automatic checking by hardware ensures that the parity code is created and checked for every operation, with no impact on system performance.
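The generate-on-write, check-on-read behavior described above can be modeled in a few lines of software. The following is a minimal sketch only: in a real controller this logic is implemented in hardware, and the names `ParitySram`, `ParityError`, and `inject_bit_flip` are hypothetical, introduced here for illustration. Even parity over the whole word is also an assumption; the note does not state the parity polarity or granularity.

```python
class ParityError(Exception):
    """Raised when a word read from SRAM fails its parity check."""


def even_parity(word: int) -> int:
    """Return the parity bit that makes the total number of 1s even."""
    return bin(word).count("1") & 1


class ParitySram:
    """Software model of an SRAM that stores one parity bit per word."""

    def __init__(self, num_words: int) -> None:
        self.data = [0] * num_words
        self.parity = [0] * num_words

    def write(self, addr: int, word: int) -> None:
        # Parity is generated on every write.
        self.data[addr] = word
        self.parity[addr] = even_parity(word)

    def read(self, addr: int) -> int:
        # Parity is checked on every read.
        word = self.data[addr]
        if even_parity(word) != self.parity[addr]:
            raise ParityError(f"parity mismatch at address {addr}")
        return word

    def inject_bit_flip(self, addr: int, bit: int) -> None:
        # Test hook standing in for a particle strike (hypothetical,
        # not a real hardware feature).
        self.data[addr] ^= 1 << bit
```

In this model a single flipped bit in a stored word causes the next `read` to raise `ParityError`. Two flips in the same word cancel out in the parity calculation and go undetected, which is the inherent limit of single-bit parity.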

If a single-bit error occurs, the last data written to the NAND will be valid, and firmware can recover following a power cycle or hardware reset.

If, however, corrupted data is written to the NAND, it will lead to a stuck condition because rebooting reloads the corrupted log or table data from the NAND.

Possible parity check solutions include:

- Single Error Detection (SED): Detect single-bit errors (no correction capability).
- Single Error Correction (SEC): Detect and correct single-bit errors.
- Single Error Correction Double Error Detection (SECDED): Detect up to 2-bit errors, correct single-bit errors.
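To make the difference between detection and correction concrete, the sketch below implements SECDED on a 4-bit value using the textbook extended Hamming (8,4) code. This is an illustration under stated assumptions, not the code used in any particular controller: a real design would protect full SRAM words with a wider code, and the function names `secded_encode` and `secded_decode` are hypothetical.

```python
from typing import Optional, Tuple


def secded_encode(nibble: int) -> int:
    """Encode 4 data bits into an 8-bit extended Hamming codeword."""
    d = [(nibble >> i) & 1 for i in range(4)]  # d[0] is the LSB
    # Positions 1..7 use the classic Hamming layout (p1 p2 d0 p4 d1 d2 d3);
    # position 0 holds the overall parity bit.
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) & 1
    return sum(bit << pos for pos, bit in enumerate(code))


def secded_decode(codeword: int) -> Tuple[str, Optional[int]]:
    """Return (status, data); data is None when the error is uncorrectable."""
    code = [(codeword >> pos) & 1 for pos in range(8)]
    # The syndrome is the XOR of the positions of all set bits in 1..7.
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    overall = sum(code) & 1  # 0 when the total parity is consistent

    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flips: treat as a single error and correct it.
        # A syndrome of 0 means the overall parity bit itself flipped.
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with consistent overall parity: two bits flipped.
        return "double_error", None

    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble
```

Dropping the overall parity bit at position 0 leaves a plain SEC code, which corrects single-bit errors but miscorrects double-bit errors instead of flagging them. An SED scheme keeps only a single parity bit and can report an error but not locate it.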

For controllers with small amounts of SRAM cache, the simple SED implementation can be effective because of the relatively low frequency of errors. The limited overhead keeps the performance penalty negligible, and parity checks prevent corrupt data from being written to NAND.
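The storage overhead of each scheme can be estimated from the number of check bits it adds per word. The sketch below uses the standard Hamming bound for SEC (the smallest r with 2^r >= m + r + 1 for m data bits) and adds one overall parity bit for SECDED. The word widths passed in are assumptions for illustration; this note does not state the SRAM word width, and `check_bits` is a hypothetical helper name.

```python
def sec_check_bits(data_bits: int) -> int:
    """Smallest r satisfying 2**r >= data_bits + r + 1 (Hamming bound)."""
    r = 1
    while (1 << r) < data_bits + r + 1:
        r += 1
    return r


def check_bits(data_bits: int, scheme: str) -> int:
    """Check bits stored per word for SED, SEC, or SECDED."""
    if scheme == "SED":
        return 1  # a single parity bit
    if scheme == "SEC":
        return sec_check_bits(data_bits)
    if scheme == "SECDED":
        return sec_check_bits(data_bits) + 1  # SEC plus overall parity
    raise ValueError(f"unknown scheme: {scheme}")


if __name__ == "__main__":
    # Assumed word width of 32 bits, chosen only as an example.
    for scheme in ("SED", "SEC", "SECDED"):
        print(scheme, check_bits(32, scheme))
```

For an assumed 32-bit word this gives 1, 6, and 7 check bits for SED, SEC, and SECDED respectively, which shows why SED is the lightest option in terms of storage.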



#### **Recommendation**

The managed NAND controller must reinitialize following a detected bit flip, slowing down device operation in the event of an error. A Single Error Correction (SEC) or Single Error Correction Double Error Detection (SECDED) solution could be implemented, which would reduce the frequency of hardware resets but also increase energy consumption of the controller.

For current generation managed NAND products, Micron believes the best way to mitigate the effects of SER while balancing reliability, performance and energy usage is to:

1. Implement parity check within the controller using SED.
2. Perform a controller reset when an error is detected to avoid the potential for permanent timeout.
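The two recommended steps can be sketched as a toy software model. It assumes SRAM holds a working copy of a table whose master copy lives in NAND, that every word is parity-checked before being committed, and that a reset reloads the last valid copy. The `Controller` class and its methods are hypothetical and are not Micron firmware interfaces; actual parity checking and reset are hardware functions.

```python
class ParityError(Exception):
    """Raised when an SRAM word fails its parity check."""


def parity(word: int) -> int:
    """Even-parity bit for a word."""
    return bin(word).count("1") & 1


class Controller:
    """Toy model: SRAM working copy backed by a master copy in NAND."""

    def __init__(self, nand_table):
        self.nand_table = list(nand_table)  # persistent copy
        self.reset_count = 0
        self._boot()

    def _boot(self):
        # On reset, reload the working copy from NAND and regenerate parity.
        self.sram = list(self.nand_table)
        self.sram_parity = [parity(w) for w in self.sram]

    def update(self, index: int, value: int) -> None:
        # Step 1: parity is generated whenever SRAM is written.
        self.sram[index] = value
        self.sram_parity[index] = parity(value)

    def _read_checked(self, index: int) -> int:
        # Step 1: parity is checked whenever SRAM is read.
        word = self.sram[index]
        if parity(word) != self.sram_parity[index]:
            raise ParityError(index)
        return word

    def flush_to_nand(self) -> bool:
        """Commit the table to NAND; reset instead if any word is suspect."""
        try:
            checked = [self._read_checked(i) for i in range(len(self.sram))]
        except ParityError:
            # Step 2: never commit suspect data. Reset and reload the
            # last valid copy from NAND.
            self.reset_count += 1
            self._boot()
            return False
        self.nand_table = checked
        return True
```

In this model a detected bit flip costs a reset and any updates not yet committed, but the copy in NAND stays valid, so a reboot cannot reload corrupted tables and enter a stuck condition.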

These simple implementations do not require any error correction, making them more energy efficient and faster to implement than either the SEC or SECDED solution.

Future solutions may require a greater amount of SRAM, which has an inherently higher SER. In this instance, Micron may choose to implement an SEC or SECDED solution to control DPPM.

#### **Figure 1: SER Hardware Error Protection**





## **Revision History**

**Rev. A – 1/2018** 

• Initial release

8000 S. Federal Way, P.O. Box 6, Boise, ID 83707-0006, Tel: 208-368-4000 www.micron.com/products/support Sales inquiries: 800-932-4992 Micron and the Micron logo are trademarks of Micron Technology, Inc. All other trademarks are the property of their respective owners.