Improving Device Aging Reliability with Approximate Computing

Table of contents
  1. Brief of Device Aging Effects
  2. Device Aging Effects Mitigation
    1. Summarized Innovations
  3. Applications
    1. On Neural Network Accelerator, Device Aging Effects Mitigation and Power-Accuracy Trade-off
    2. On Image Processing Accelerator, Device Aging Effects Mitigation and Low-Energy Design

Brief of Device Aging Effects

In Very-Large-Scale Integration (VLSI) design, Bias Temperature Instability (BTI) and Hot Carrier Injection (HCI) are significant reliability concerns that affect the performance and longevity of semiconductor devices.

  1. Bias Temperature Instability (BTI): BTI is a degradation mechanism in MOSFET devices that shifts the threshold voltage over time. It is caused by the combined effect of bias voltage and temperature stress, which changes the device’s electrical characteristics. BTI is classified into two types:
    1. Negative Bias Temperature Instability (NBTI)
      • Occurs in p-channel MOSFETs (PMOS).
      • When a negative bias (Vgs < 0) is applied, as is typical during normal operation, the magnitude of the PMOS threshold voltage increases over time.
      • This results in slower switching speeds and increased power consumption.
      • NBTI is more pronounced at higher temperatures and voltages.
    2. Positive Bias Temperature Instability (PBTI)
      • Occurs in n-channel MOSFETs (NMOS).
      • When a positive bias (Vgs > 0) is applied, the threshold voltage of the NMOS transistor increases over time.
      • While less studied than NBTI, PBTI can also lead to performance degradation, especially in advanced technology nodes with high-k metal gate stacks.
  2. Hot Carrier Injection (HCI)

    HCI is another degradation mechanism in MOSFET devices, caused by high-energy carriers (electrons or holes) that gain enough kinetic energy to overcome the energy barrier at the silicon/silicon-dioxide interface. The phenomenon is more pronounced in n-channel MOSFETs (NMOS). Under high electric fields, carriers in the channel gain enough energy from the field to become “hot” and are injected into the gate oxide, where they create traps and interface states that cause permanent damage.

    The effects of HCI include:

    • Increase in the threshold voltage.
    • Decrease in carrier mobility.
    • Increase in subthreshold leakage.
    • Degradation of device performance and reliability over time.

    HCI can be mitigated with design techniques such as lightly doped drain (LDD) structures, halo implants, and lower supply voltages.

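Both BTI and HCI increase circuit delay, because a higher threshold voltage and lower carrier mobility reduce the drive current of the affected transistors.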
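
To give a feel for how a threshold-voltage shift turns into extra delay, the sketch below uses the alpha-power-law delay model, delay ∝ Vdd / (Vdd − Vth)^α. The model choice, the parameter values (`vdd`, `vth`, `alpha`), and the 50 mV shift are assumptions chosen for illustration. None of them come from this page or from measured data.

```python
def gate_delay(vdd, vth, alpha=1.3, k=1.0):
    """Alpha-power-law delay estimate: k * vdd / (vdd - vth) ** alpha.

    alpha and k are technology-dependent fitting parameters; the defaults
    here are placeholders, not values for any real process.
    """
    if vdd <= vth:
        raise ValueError("vdd must be greater than vth")
    return k * vdd / (vdd - vth) ** alpha


# Hypothetical fresh and aged devices: aging raises vth by 50 mV.
fresh = gate_delay(vdd=0.9, vth=0.30)
aged = gate_delay(vdd=0.9, vth=0.35)

print(f"relative delay increase: {aged / fresh - 1:.1%}")
```

Under this model the same shift costs more delay at lower supply voltages, since the overdrive (Vdd − Vth) shrinks.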

Device Aging Effects Mitigation with Stochastic Computing

Many applications are error tolerant, such as image processing and machine learning workloads. We can sacrifice some computing precision to compensate for aging effects while keeping the accuracy of the result acceptable.

  1. Stochastic Computing (SC), an important branch of approximate computing, operates with Stochastic Numbers (SNs). These numbers represent values by the probability of a bit being ‘1’ in a binary stream. Unlike binary encoding, where each bit has different significance, such as the most significant bit (MSB) and the least significant bit (LSB), each bit of an SN is equally important.

    SC is introduced to mitigate the effects of device aging from two perspectives:

    • Error Tolerance: A single bit error (which may occur due to device failure) won’t cause a significant impact.
    • Latency Reduction: Fewer computation clock cycles are needed, which can compensate for the decreased clock frequency caused by increased delays due to device aging.
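
The error-tolerance point can be illustrated with a small sketch. It encodes a value as a random bit stream, flips one bit, and compares the resulting error with a most-significant-bit flip in an 8-bit binary word. The helper names (`to_sn`, `from_sn`, `flip`), the stream length of 256, and the use of Python's `random` module as the number source are all assumptions made for illustration.

```python
import random


def to_sn(value, length, rng):
    """Encode value in [0, 1] as a bit stream where each bit is 1 with probability value."""
    return [1 if rng.random() < value else 0 for _ in range(length)]


def from_sn(bits):
    """Decode a stochastic number: the fraction of bits that are 1."""
    return sum(bits) / len(bits)


def flip(bits, index):
    """Return a copy of bits with one bit inverted (a single-bit fault)."""
    out = list(bits)
    out[index] ^= 1
    return out


rng = random.Random(0)
N = 256
sn = to_sn(0.75, N, rng)

# Any single-bit fault in the stream changes the decoded value by exactly 1/N.
sn_error = abs(from_sn(flip(sn, 0)) - from_sn(sn))

# In 8-bit binary, a fault in the MSB changes the value by half the full range.
binary = 192  # 0.75 * 256
binary_error = abs((binary ^ 0x80) - binary) / 256

print(sn_error, binary_error)
```

Because every bit of the stream carries the same weight, the worst-case impact of one faulty bit is bounded by 1/N, whereas in binary it depends on which bit fails.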
Figure: Classical SC designs.

Stochastic Computing: Innovations in Two Stages

  1. Stochastic Number Generation:
    • Traditional Method: Based on stochastic number generators (SNGs), which can lead to result fluctuations.
    • Our Innovation: We use a deterministic encoding method based on finite state machines to avoid these fluctuations.
  2. Computation on Stochastic Numbers:
    • Traditional Method: Requires generating a complete stochastic number (SN) stream as the result, leading to significant latency.
    • Our Innovation: We use a counting-based method that directly counts the input SNs. This allows us to obtain results via counter counts without generating the resultant SN stream.
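
The two ideas above can be pictured with a minimal sketch. It is an illustration only and not the design described on this page: the accumulator-based encoder `spread_encode` stands in for a finite-state-machine encoder, and `counting_multiply` assumes one operand is a unary code, so that a counter can replace the output stream. Both function names and both encodings are hypothetical.

```python
def spread_encode(numerator, length):
    """Deterministically emit `numerator` ones spread evenly over `length` cycles.

    An accumulator acts as a small state machine, so the same input always
    yields the same stream (no random fluctuation).
    """
    bits, acc = [], 0
    for _ in range(length):
        acc += numerator
        if acc >= length:
            acc -= length
            bits.append(1)
        else:
            bits.append(0)
    return bits


def counting_multiply(x_num, y_num, length):
    """Approximate (x_num / length) * (y_num / length) with a counter.

    y is assumed to be a unary code whose ones occupy the first y_num cycles,
    so AND-ing the streams reduces to counting the ones of x in those cycles.
    No output stream is generated.
    """
    x_bits = spread_encode(x_num, length)
    count = sum(x_bits[:y_num])  # the counter runs for y_num cycles only
    return count / length


print(counting_multiply(128, 64, 256))  # approximates 0.5 * 0.25
```

In this sketch the counter stops after `y_num` cycles instead of running for the full stream length, which is one way a counting-based scheme can shorten latency.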

We have also developed other techniques. For example, our proposed stochastic computing divider, shown below, achieves accuracy close to that of a fixed-point divider (state of the art among stochastic dividers) with lower area and power consumption:

Figure: Fast and scaled counting-based stochastic computing divider.
Figure: Hardware performance and accuracy.

Applications

  1. Device Aging Effects Mitigation and Power-Accuracy Trade-off on Neural Network Accelerator with Proposed Approximate Multiplier

  2. Device Aging Effects Mitigation and Low-Energy Design on Image Processing Accelerator with Proposed Approximate Divider