NVMe Queue Timeout

While doing some initial testing, I came across this during boot on a brand-new WD Blue SN550, but the NVMe error log is empty (and the SMART data is similarly empty).

[   35.807925] nvme nvme0: I/O 640 QID 8 timeout, aborting
[   35.813368] nvme nvme0: Abort status: 0x0
[   62.461515] random: crng init done
[   66.527910] nvme nvme0: I/O 640 QID 8 timeout, reset controller
[   66.564218] blk_update_request: I/O error, dev nvme0n1, sector 2072 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
[   66.581299] nvme nvme0: 8/0/0 default/read/poll queues
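
In case it helps anyone watching for a repeat, here is a minimal Python sketch that pulls the timeout lines out of kernel log text like the excerpt above. The regex is based only on the two timeout lines shown here, and the function name `parse_nvme_timeouts` and the field names are my own, so treat it as an assumption-laden starting point rather than a general dmesg parser.

```python
import re

# Pattern modelled on the lines in this post, e.g.
#   [   35.807925] nvme nvme0: I/O 640 QID 8 timeout, aborting
# Other kernel versions may word the message differently (assumption).
TIMEOUT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+nvme (?P<ctrl>nvme\d+): "
    r"I/O (?P<io>\d+) QID (?P<qid>\d+) timeout, (?P<action>.+)$"
)


def parse_nvme_timeouts(log_text):
    """Return one dict per 'I/O ... QID ... timeout' line found in log_text."""
    events = []
    for line in log_text.splitlines():
        m = TIMEOUT_RE.search(line.strip())
        if m:
            events.append({
                "time": float(m.group("ts")),      # seconds since boot
                "controller": m.group("ctrl"),
                "io": int(m.group("io")),          # number printed after "I/O"
                "qid": int(m.group("qid")),        # queue ID
                "action": m.group("action"),       # what the driver did next
            })
    return events


# The first four lines of the log from this post, used as sample input.
SAMPLE_LOG = """\
[   35.807925] nvme nvme0: I/O 640 QID 8 timeout, aborting
[   35.813368] nvme nvme0: Abort status: 0x0
[   62.461515] random: crng init done
[   66.527910] nvme nvme0: I/O 640 QID 8 timeout, reset controller
"""

if __name__ == "__main__":
    for event in parse_nvme_timeouts(SAMPLE_LOG):
        print(event)
```

Feeding it the output of `dmesg` saved to a file should work the same way, as long as the timestamps are in the default seconds-since-boot format.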

Hmm… bit weird. WD/SanDisk drives have been the least buggy in my experience so far. I have an SN550 in one of my dev systems here and I haven’t seen any timeouts with it.

What kernel are you running, and can you provide a full lspci -vv listing for the device?

Kernel is the one from Debian stable; the lspci output follows. I’ll monitor it and see if it happens again (hopefully not).

0002:01:00.0 Non-Volatile memory controller: Sandisk Corp WD Blue SN550 NVMe SSD (rev 01) (prog-if 02 [NVM Express])
Subsystem: Sandisk Corp WD Blue SN550 NVMe SSD
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 32 bytes
Interrupt: pin A routed to IRQ 368
NUMA node: 0
IOMMU group: 3
Region 0: Memory at 3040000000 (64-bit, non-prefetchable) [size=16K]
Region 4: Memory at 3040004000 (64-bit, non-prefetchable) [size=256]
Capabilities: [80] Power Management version 3
	Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
	Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [90] MSI: Enable- Count=1/32 Maskable- 64bit+
	Address: 0000000000000000  Data: 0000
Capabilities: [b0] MSI-X: Enable+ Count=17 Masked-
	Vector table: BAR=0 offset=00002000
	PBA: BAR=4 offset=00000000
Capabilities: [c0] Express (v2) Endpoint, MSI 00
	DevCap:	MaxPayload 512 bytes, PhantFunc 0, Latency L0s <1us, L1 unlimited
		ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
	DevCtl:	CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
		RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
		MaxPayload 128 bytes, MaxReadReq 512 bytes
	DevSta:	CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
	LnkCap:	Port #0, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 <8us
		ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
	LnkCtl:	ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
		ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
	LnkSta:	Speed 8GT/s (ok), Width x2 (downgraded)
		TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
	DevCap2: Completion Timeout: Range B, TimeoutDis+ NROPrPrP- LTR+
		 10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt+ EETLPPrefix-
		 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
		 FRS- TPHComp- ExtTPHComp-
		 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
	DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,
		 AtomicOpsCtl: ReqEn-
	LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer- 2Retimers- DRS-
	LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
		 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
		 Compliance De-emphasis: -6dB
	LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+ EqualizationPhase1+
		 EqualizationPhase2+ EqualizationPhase3- LinkEqualizationRequest-
		 Retimer- 2Retimers- CrosslinkRes: unsupported
Capabilities: [100 v2] Advanced Error Reporting
	UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
	UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
	UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
	CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
	CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
	AERCap:	First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
		MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
	HeaderLog: 00000000 00000000 00000000 00000000
Capabilities: [150 v1] Device Serial Number 00-00-00-00-00-00-00-00
Capabilities: [1b8 v1] Latency Tolerance Reporting
	Max snoop latency: 0ns
	Max no snoop latency: 0ns
Capabilities: [300 v1] Secondary PCI Express
	LnkCtl3: LnkEquIntrruptEn- PerformEqu-
	LaneErrStat: 0
Capabilities: [900 v1] L1 PM Substates
	L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
		  PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
	L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
		   T_CommonMode=0us LTR1.2_Threshold=0ns
	L1SubCtl2: T_PwrOn=10us
Kernel driver in use: nvme
Kernel modules: nvme
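
For comparing listings like this one between systems, here is a rough Python sketch that extracts the advertised link (LnkCap) and the negotiated link (LnkSta) from `lspci -vv` text and flags a width mismatch. In the listing above, LnkCap reports x4 while LnkSta reports x2 (downgraded); whether that has anything to do with the timeout is not established here. The function name `link_summary` and the regexes are my own assumptions based on this output format only.

```python
import re

# "LnkCap:" and "LnkSta:" with the colon, so LnkCap2/LnkSta2 lines are skipped.
LNKCAP_RE = re.compile(
    r"LnkCap:.*?Speed (?P<speed>[\d.]+GT/s), Width x(?P<width>\d+)"
)
LNKSTA_RE = re.compile(
    r"LnkSta:\s*Speed (?P<speed>[\d.]+GT/s)[^,]*, Width x(?P<width>\d+)"
)


def link_summary(lspci_text):
    """Return capability vs. negotiated link speed/width, or None if not found."""
    cap = LNKCAP_RE.search(lspci_text)
    sta = LNKSTA_RE.search(lspci_text)
    if cap is None or sta is None:
        return None
    cap_width = int(cap.group("width"))
    sta_width = int(sta.group("width"))
    return {
        "cap_speed": cap.group("speed"),
        "cap_width": cap_width,
        "sta_speed": sta.group("speed"),
        "sta_width": sta_width,
        # True when the link trained narrower than the device advertises.
        "width_downgraded": sta_width < cap_width,
    }


# Two lines copied from the listing in this post.
SAMPLE_LSPCI = (
    "\tLnkCap:\tPort #0, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 <8us\n"
    "\tLnkSta:\tSpeed 8GT/s (ok), Width x2 (downgraded)\n"
)

if __name__ == "__main__":
    print(link_summary(SAMPLE_LSPCI))
```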