#### Heavy-Ion Testing of the Freescale/NXP Qorivva 32-bit Automotive-Grade MCU

Ted Wilcox<sup>1</sup>, Christina Seidleck<sup>1</sup>, Megan Casey<sup>2</sup>, Ken LaBel<sup>2</sup> <sup>1</sup>AS&D, <sup>2</sup>NASA-GSFC

## Acronyms

- ADC Analog to Digital Converter
- COTS Commercial Off The Shelf
- CPU Central Processing Unit
- DMA Direct Memory Access
- DRESET Destructive Reset
- DSPI Deserial Serial Peripheral Interface
- ECC Error Correcting Code
- EDAC Error Detection and Correction
- eDMA Enhanced Direct Memory Access
- GPIO General Purpose Input Output
- HW Hardware
- ISR Interrupt Service Routine
- JTAG Joint Test Action Group
- LBNL Lawrence Berkeley National Laboratory
- LET Linear Energy Transfer

- LQFP Low Profile Quad Flat Pack
- MCU Microcontroller Unit
- POR Power on Reset
- RTC Real Time Clock
- SECDED Single error correct, double error detect
- SEE Single Event Effects
- SEFI Single Event Functional Interrupt
- SEL Single Event Latchup
- SEU Single Event Upset
- SRAM Static Random Access Memory
- SW Software

# Objective

- Evaluate single-event effect radiation response of inexpensive COTS automotive-grade parts, specifically a 32-bit microcontroller
  - Wider temperature range (-40°C to +125°C)
  - Guaranteed product longevity/availability
  - Built-in hw/sw safety features (ECC SRAM & Flash, Clock monitors, low-voltage detection, fault collection and reporting)
- Develop/improve internal SEE test flow for inexpensive automotive-grade microcontrollers (lessons learned?)
- Determine feasibility of test vehicle for low-cost, low-reliability apps (CubeSats, etc.) → low-power



# **Key Questions**

- Does this device perform "well-enough" under heavy-ion irradiation to be recommended for ultra-low cost missions or instruments (CubeSATs and the like)?
  - Not *qualification*, but a starting-point for system design
  - It's going to upset, but what are the common error signatures?
- What results can we derive from heavy-ion testing using a commercially-available evaluation system for this microcontroller?
- What considerations do we need for testing (especially with limited time and money)?

## **Device Under Test**

- Freescale MPC56xx Family
  - 32-bit Power Architecture MCU
  - Automotive/Industrial applications
  - Specific Part: **SPC5606B** (S prefix = "automotive-qualified")
    - 90nm Power Architecture e200z0 core
    - 1 MB ECC Flash memory
    - 80 KB ECC SRAM
    - 64 MHz Processor Core
    - 144 LQFP (plastic)
    - Temperature Range -40°C to +125°C
  - Commercial Evaluation Board





## **Test Considerations**

- Assume choice of automotive microcontroller driven largely by cost
  - Limits ability to test in application-specific manner (testing isn't cheap...)
- But, any microprocessor or microcontroller is a complex part...
- Large number of complex elements each capable of affecting radiation response
- How to identify error type?
  - Almost like testing SOC/board

Image source: http://www.nxp.com/products/microcontrollers-andprocessors/power-architecture-processors/mpc5xxx-5xxx-32-bitmcus



#### Complex device $\rightarrow$ Complex test $\rightarrow$ Complex data



## Test Software Components

- **DSPI** (Deserial Serial Peripheral Interface): Loopback test of communications peripheral.
- **eDMA** (Enhanced Direct Memory Access): Moves block of data in SRAM using software handshaking and interrupt-driven error detection.
- **Memory EDAC**: Uses internal EDAC module (SECDED) to report bit upsets in SRAM and flash memory, as well as bus access issues (flash stall/abort).
- **Math**: Prime number testing to load-test arithmetic units.
- **ADC**: Interrupt-driven test of ADC peripheral reports if analog value departs expected range. Reports back number of successful conversions.
- **Software Watchdog**: Serviced by Periodic Interval Timer (ISR). If the service routine fails to reset the watchdog (CPU is locked up or program execution stops), the board is automatically reset. Software will report any resets.
- **Reset Generation Module**: Board will be functionally reset by software if other test modules' counters fail to increment for some reason.
- Individual test programs to characterize **low-power sleep mode** performance and to focus on individual items from the above list as necessary.
- Additionally, there is built-in **low-voltage detection** for the 1.2V core supply that will reset the device. Any POR, whether intentional or due to a low-voltage event, will be logged.
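The SECDED behavior that the Memory EDAC test relies on can be illustrated with a toy extended Hamming code. This is a minimal sketch only: the 4-bit data word, the bit layout, and the `encode`/`decode` names are assumptions for illustration, and the code actually implemented in the MCU's EDAC hardware is not described here.

```python
# Toy SECDED (single error correct, double error detect) using an
# extended Hamming(8,4) code. Illustrative assumption: real EDAC hardware
# protects wider words with a different code.

def encode(nibble):
    """Encode 4 data bits into an 8-bit codeword (list of 0/1)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8                       # index 0: overall parity, 1..7: Hamming(7,4)
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2          # makes total parity even
    return code

def decode(code):
    """Return (status, nibble); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)                    # do not modify the caller's word
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos              # XOR of set-bit positions; 0 if valid
    parity_bad = sum(code) % 2 == 1
    if syndrome == 0 and not parity_bad:
        status = "ok"
    elif parity_bad:
        code[syndrome] ^= 1              # single flip; syndrome 0 means bit 0 itself
        status = "corrected"
    else:
        return "uncorrectable", None     # two flips: detected, not correctable
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble

word = encode(0b1011)
word[5] ^= 1                             # one upset bit
print(decode(word))                      # ('corrected', 11)
word[2] ^= 1                             # a second upset in the same word
print(decode(word))                      # ('uncorrectable', None)
```

A single upset in a word is repaired transparently, while a second upset in the same word can only be flagged, which is why double-bit errors are reported as a separate event type.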

## **Radiation Test Conditions**

- Testing at LBNL 88" Cyclotron
  - Ion Species Used: B, O, Ne, Si, Ar, Cu, Kr, Ag @ 10MeV/AMU tune
  - Nominal LET Range: 0.89 to 48.15 MeV\*cm<sup>2</sup>/mg
  - Angular Testing up to 45 degrees (effective LET 68.09 MeV\*cm<sup>2</sup>/mg)
  - Room temperature exposures in vacuum

BASE Ion List:

| Ion | Cocktail (AMeV) | Energy (MeV) | Z  | A   | Chg. State | % Nat. Abund. | LET 0° (MeV\*cm<sup>2</sup>/mg) | LET 60° (MeV\*cm<sup>2</sup>/mg) | Range (Max) (µm) |
|-----|-----------------|--------------|----|-----|------------|---------------|------------|-------------|-------|
| B   | 10              | 108.01       | 5  | 11  | +3         | 80.1          | 0.89       | 1.78        | 305.7 |
| O   | 10              | 183.47       | 8  | 18  | +5         | 0.2           | 2.19       | 4.38        | 226.4 |
| Ne  | 10              | 216.28       | 10 | 22  | +6         | 9.25          | 3.49       | 6.98        | 174.6 |
| Si  | 10              | 291.77       | 14 | 29  | +8         | 4.67          | 6.09       | 12.18       | 141.7 |
| Ar  | 10              | 400.00       | 18 | 40  | +11        | 99.6          | 9.74       | 19.48       | 130.1 |
| V   | 10              | 508.27       | 23 | 51  | +14        | 99.75         | 14.59      | 29.18       | 113.4 |
| Cu  | 10              | 659.19       | 29 | 65  | +18        | 30.83         | 21.17      | 42.34       | 108.0 |
| Kr  | 10              | 885.59       | 36 | 86  | +24        | 17.3          | 30.86      | 61.72       | 109.9 |
| Y   | 10              | 928.49       | 39 | 89  | +25        | 100           | 34.73      | 69.46       | 102.2 |
| Ag  | 10              | 1039.42      | 47 | 107 | +29        | 51.839        | 48.15      | 96.30       | 90.0  |
| Xe  | 10              | 1232.55      | 54 | 124 | +34        | 0.1           | 58.78      | 117.56      | 90.0  |
| Au* | 10              | 1955.87      | 79 | 197 | +54        | 100           | 85.76      | 171.52      | 105.9 |
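The effective LET values quoted for tilted exposures follow the usual inverse-cosine scaling of the normal-incidence LET. The sketch below assumes that simple geometric model (a thin sensitive volume); the function name `effective_let` is illustrative.

```python
import math

def effective_let(let_normal, tilt_deg):
    """Effective LET (MeV*cm^2/mg) at a given tilt, assuming 1/cos scaling."""
    if not 0 <= tilt_deg < 90:
        raise ValueError("tilt angle must be in [0, 90) degrees")
    return let_normal / math.cos(math.radians(tilt_deg))

# Ag at normal incidence is 48.15 MeV*cm^2/mg in the ion list.
print(f"{effective_let(48.15, 45):.2f}")   # 68.09, the 45-degree value quoted above
print(f"{effective_let(48.15, 60):.2f}")   # 96.30, the LET 60° column entry
```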

#### Test Setup

- Customized self-test software (C code) running on target MCU during irradiation
- Data output via RS232 to PC (one-way monitoring only)
- Power supply logs of main +12V power to motherboard



Images sourced: Dell.com, Keithley.com, NXP.com

## Summary of Results

- Testing started with lowest LET available and gradually increased
- At low LET (<1 MeV\*cm<sup>2</sup>/mg) we only recorded:
  - Single-bit SRAM upsets (low cross-section 3.3x10<sup>-7</sup> cm<sup>2</sup>, automatically corrected by EDAC)
  - Rare CPU reset events not associated with increased power consumption (we'll call these SEFIs; ~3x10<sup>-7</sup>cm<sup>2</sup>)
- As LET increased we saw increasing single-bit upset cross-section (and eventually disabled logging of that event), occasional double-bit errors, and rare peripheral errors, but reset events (now associated with high current) began to dominate the test
- No parts were rendered inoperable during testing. Processor lockups were often self-recovered when high current caused an undervoltage condition.
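For reference, a cross-section is the number of observed events divided by the particle fluence delivered to the part. The sketch below shows that arithmetic, plus a common Poisson upper limit for runs with zero events. The event count and fluence in the example are hypothetical values chosen only to land on the same order of magnitude as the results above; they are not the recorded run data. The cosine correction assumes fluence is measured in the plane perpendicular to the beam.

```python
import math

def cross_section(events, fluence, tilt_deg=0.0):
    """Cross-section in cm^2: events / effective fluence (ions/cm^2).

    Assumes fluence is measured normal to the beam, so a tilted part
    sees fluence * cos(tilt) through its surface.
    """
    effective_fluence = fluence * math.cos(math.radians(tilt_deg))
    return events / effective_fluence

def upper_limit_zero_events(fluence, confidence=0.95):
    """Poisson upper limit on cross-section when no events were observed."""
    return -math.log(1.0 - confidence) / fluence

# Hypothetical run: 33 events in 1e8 ions/cm^2 at normal incidence.
print(f"{cross_section(33, 1e8):.2e} cm^2")
# Hypothetical null run: no events in 1e7 ions/cm^2.
print(f"< {upper_limit_zero_events(1e7):.2e} cm^2 (95% confidence)")
```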

## **SEL & SEFI Cross-Section**

- DRESET ("destructive" reset) is an error flagged by the MCU as it comes out of an unexpected reset.
- At higher LET these were associated with recovery from a high-current state (SEL?).
- But at low LET (<~8 MeV\*cm<sup>2</sup>/mg) no high-current events were noted (SEFIs?)



#### **Cross-Section of DRESET Events**

# Breakdown of Reset Events (SEL/SEFI)



- Two distinct responses: low-current resets at low LET, high-current resets at high LET
  - No high-current events below an LET of 8 MeV\*cm<sup>2</sup>/mg, but above 20 MeV\*cm<sup>2</sup>/mg they dominate all other events.
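One way to separate the two reset populations in post-processing is to check the logged supply current in a short window before each reported reset. The following is a sketch under stated assumptions: the log format, the 0.2 A threshold, the 2 s window, and the function name `classify_resets` are all hypothetical and would need to be matched to the actual power supply logs.

```python
from bisect import bisect_left, bisect_right

def classify_resets(reset_times, current_log, threshold_a, window_s=2.0):
    """Label each reset by the peak current seen shortly before it.

    current_log: list of (time_s, amps) tuples sorted by time.
    Returns one label per reset: 'high-current' or 'low-current'.
    """
    times = [t for t, _ in current_log]
    labels = []
    for t_reset in reset_times:
        lo = bisect_left(times, t_reset - window_s)
        hi = bisect_right(times, t_reset)
        peak = max((amps for _, amps in current_log[lo:hi]), default=0.0)
        labels.append("high-current" if peak >= threshold_a else "low-current")
    return labels

# Hypothetical 1 Hz current log with a spike around t = 3..4 s.
log = [(0, 0.10), (1, 0.10), (2, 0.10), (3, 0.25), (4, 0.26),
       (5, 0.10), (6, 0.10), (7, 0.10), (8, 0.10)]
print(classify_resets([4.5, 8.0], log, threshold_a=0.2))
# ['high-current', 'low-current']
```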

# High-Current Events (SEL)

- Supply current to motherboard is a single +12V line
  - Current limited, but typical "high-current" event did not reach supply's limit
- But, +12V supply is internally regulated on-board *and* on-chip:
  - V<sub>DD LV</sub> is 1.2V core voltage, with maximum specified output current of 150 mA
    - This closely relates to the peak seen during high-current events



#### Low-LET SEFI Events

- Runs at lower LET (<~8 MeV\*cm<sup>2</sup>/mg) showed processor resets that were not associated with high-current spikes.
- Critically, these events were NOT sufficient to induce an automatic overcurrent-driven reset, and the internal watchdog timer did not always function → an external watchdog is recommended to reliably initiate a POR
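The logic of such an external watchdog is simple: if no heartbeat arrives from the MCU within a timeout, power-cycle it. The model below is a minimal sketch of that decision logic only; the class name, the 3 s timeout, and the simulated hang are hypothetical, and a flight design would implement this in independent hardware rather than software.

```python
class ExternalWatchdog:
    """Model of a supervisor that power-cycles the MCU on a missed heartbeat."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_kick = 0.0
        self.power_cycles = 0

    def kick(self, now):
        """Record a heartbeat received from the MCU."""
        self.last_kick = now

    def poll(self, now):
        """Return True (and count a power cycle) if the heartbeat is overdue."""
        if now - self.last_kick > self.timeout_s:
            self.power_cycles += 1
            self.last_kick = now         # restart timing after the power cycle
            return True
        return False

# Simulated run: the MCU hangs at t = 3 s (hypothetical SEFI) and stops
# sending heartbeats until the watchdog forces a power-on reset.
wd = ExternalWatchdog(timeout_s=3.0)
hung = False
cycle_times = []
for t in range(11):
    if t == 3:
        hung = True
    if not hung:
        wd.kick(float(t))
    if wd.poll(float(t)):
        hung = False                     # POR clears the hang
        cycle_times.append(t)
print(cycle_times)                       # [6]
```

Because the supervisor depends only on the absence of a heartbeat, it covers the low-current case where neither the undervoltage path nor the internal watchdog triggers.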



### Low Power Standby Mode

- Microcontroller can be put into a powered-down standby mode to conserve power (<100 µA) while monitoring for an external interrupt signal
- Two tests performed:
  1. Device in standby with periodic wake-up from internal RTC
  2. Device in standby with external interrupt wake-up
- Even at low LET (3.49 MeV\*cm<sup>2</sup>/mg) the device could get stuck in a sleep mode from which it could not wake itself; occasionally this was accompanied by a power increase
- External interrupt-driven wakeup was more reliable.

## Low Power Standby Mode Examples



## **Other Error Events**

- Events noted at low LET where resets were far enough apart to allow full self-test loops to consistently complete:
  - Single bit errors in SRAM common, threshold < 0.89 MeV\*cm<sup>2</sup>/mg
  - Double bit errors in SRAM, threshold between 1.78 and 3.10 MeV\*cm<sup>2</sup>/mg
  - Single bit flash errors, threshold between 1.78 and 3.10 MeV\*cm<sup>2</sup>/mg
    - Note: no uncorrectable flash errors
  - One ADC readback error, threshold between 1.78 and 3.10 MeV\*cm<sup>2</sup>/mg
- Other events noted at higher LET (threshold > 3.10 MeV\*cm<sup>2</sup>/mg, but frequency of resets made it difficult to complete self-test loops):
  - DMA transfer errors
  - Possible UART hits (corrupted datastream)
  - DSPI halt

- Insufficient data on any of these events to provide individual cross-sections.
- Total of all errors (except single-bit SRAM) consistently < 1/10th of the SEL/SEFI count

# Summary

- Internal ECC functionality helps reduce (but not eliminate) soft-errors in SRAM (ECC fully effective with program flash)
- As expected, SEFI and SEL dominate the device response
  - May recover on its own due to on-chip low-voltage detection/POR circuitry
  - But not always: SEFIs with no current increase are not generally internally recoverable (even with internal watchdog enabled)
  - Not immune from events during low-power sleep modes
- May be useful as an inexpensive off-the-shelf part, but not a "rad-hard" part
  - Could use app-specific testing to better define expected performance
  - On-board detection circuitry needed to recover from certain events
  - Need to combine with TID and other reliability data to get full picture...

## Thank You

- Questions?