Discussion:
[E1000-devel] AMD-Vi: Event logged IO_PAGE_FAULT - ixgbe Detected Tx Unit Hang - Reset adapter - master disable timed out
Alexander Duyck
2016-06-09 16:03:40 UTC
Jun 9 14:40:09 computer kernel: AMD-Vi: Event logged [IO_PAGE_FAULT device=04:00.0 domain=0x000e address=0x00000000000178c0 flags=0x0050]
Jun 9 14:40:09 computer kernel: AMD-Vi: Event logged [IO_PAGE_FAULT device=04:00.0 domain=0x000e address=0x0000000000017900 flags=0x0050]
Jun 9 14:40:13 computer kernel: ixgbe 0000:04:00.0 enp4s0: Detected Tx Unit Hang
  Tx Queue             <2>
  TDH, TDT             <186>, <194>
  next_to_use          <194>
  next_to_clean        <186>
tx_buffer_info[next_to_clean]
  time_stamp           <11df79bf7>
  jiffies              <11df7aac8>
Jun 9 14:40:13 computer kernel: ixgbe 0000:04:00.0 enp4s0: Detected Tx Unit Hang
  Tx Queue             <3>
  TDH, TDT             <1e4>, <2>
  next_to_use          <2>
  next_to_clean        <1e4>
tx_buffer_info[next_to_clean]
  time_stamp           <11df79a0f>
  jiffies              <11df7aac8>
Jun 9 14:40:13 computer kernel: ixgbe 0000:04:00.0 enp4s0: tx hang 1 detected on queue 3, resetting adapter
Jun 9 14:40:13 computer kernel: ixgbe 0000:04:00.0 enp4s0: Detected Tx Unit Hang
  Tx Queue             <24>
  TDH, TDT             <1ec>, <2>
  next_to_use          <2>
  next_to_clean        <1ec>
tx_buffer_info[next_to_clean]
  time_stamp           <11df79a0f>
  jiffies              <11df7aac8>
Jun 9 14:40:13 computer kernel: ixgbe 0000:04:00.0 enp4s0: initiating reset due to tx timeout
Jun 9 14:40:13 computer kernel: ixgbe 0000:04:00.0 enp4s0: tx hang 1 detected on queue 24, resetting adapter
Jun 9 14:40:13 computer kernel: ixgbe 0000:04:00.0 enp4s0: initiating reset due to tx timeout
Jun 9 14:40:13 computer kernel: ixgbe 0000:04:00.0 enp4s0: Reset adapter
Jun 9 14:40:13 computer kernel: ixgbe 0000:04:00.0 enp4s0: tx hang 2 detected on queue 2, resetting adapter
Jun 9 14:40:14 computer kernel: ixgbe 0000:04:00.0: master disable timed out
...
And today, no other NIC connected to the same switch saw any "glitch".
I got you an "lspci -vvv" output; note, however, the interesting error it
emitted:
"pcilib: sysfs_read_vpd: read failed: Input/output error"
04:00.0 Ethernet controller: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01)
Subsystem: Intel Corporation Ethernet Converged Network Adapter X540-T1
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort+ <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 59
Region 0: Memory at dce00000 (64-bit, prefetchable) [size=2M]
Region 4: Memory at dcdfc000 (64-bit, prefetchable) [size=16K]
Expansion ROM at dfd80000 [disabled] [size=512K]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
Address: 0000000000000000 Data: 0000
Masking: 00000000 Pending: 00000000
Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
Vector table: BAR=4 offset=00000000
PBA: BAR=4 offset=00002000
Capabilities: [a0] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1, Exit Latency L0s <1us, L1 <8us
ClockPM- Surprise- LLActRep- BwNot-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [100 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
pcilib: sysfs_read_vpd: read failed: Input/output error
Capabilities: [140 v1] Device Serial Number a0-36-9f-ff-ff-80-xx-xx
Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
ARICap: MFVC- ACS-, Next Function: 0
ARICtl: MFVC- ACS-, Function Group: 0
Capabilities: [160 v1] Single Root I/O Virtualization (SR-IOV)
IOVCap: Migration-, Interrupt Message Number: 000
IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy+
IOVSta: Migration-
Initial VFs: 64, Total VFs: 64, Number of VFs: 0, Function Dependency Link: 00
VF offset: 128, stride: 2, Device ID: 1515
Supported Page Size: 00000553, System Page Size: 00000001
Region 0: Memory at 0000000000000000 (64-bit, non-prefetchable)
Region 3: Memory at 0000000000000000 (64-bit, non-prefetchable)
VF Migration: offset: 00000000, BIR: 0
Capabilities: [1d0 v1] Access Control Services
ACSCap: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
Kernel driver in use: ixgbe
This time I'll reboot the machine, and also try "iommu=pt" as suggested
in different places for use with 10G NICs.
That might be a good place to start.
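For reference, on a GRUB2-based distro that usually means appending the
option to the kernel command line, roughly like this (file locations and
the mkconfig invocation vary by distribution, so treat this as a sketch):

  # /etc/default/grub
  GRUB_CMDLINE_LINUX="... iommu=pt"
  # then regenerate the config and reboot:
  grub2-mkconfig -o /boot/grub2/grub.cfg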

I'm adding (or at least attempting to add) the mailing list and maintainer
for the IOMMU code. You might want to check with the AMD-Vi IOMMU
maintainers to see if they have any other advice, as this seems like
something that may have been introduced by changes to the IOMMU code:
the ixgbe driver hasn't had any updates to its DMA mapping/unmapping
code in some time, it was working in the 4.4 kernel series, and it
still works on my system, which uses an Intel IOMMU. So I am wondering
if this is something specifically related to changes in the AMD IOMMU code.
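For context: the Tx path hands packet buffers to the device through the
generic DMA API, and the IOMMU driver implements what happens underneath;
an IO_PAGE_FAULT means the device issued DMA to an address the IOMMU had
no (or no longer a) valid mapping for. A minimal sketch of that pattern,
illustrative only and not the actual ixgbe code:

  #include <linux/dma-mapping.h>
  #include <linux/errno.h>

  static int example_map_tx_buffer(struct device *dev, void *data,
                                   size_t len, dma_addr_t *dma)
  {
          *dma = dma_map_single(dev, data, len, DMA_TO_DEVICE);
          if (dma_mapping_error(dev, *dma))
                  return -ENOMEM;    /* IOMMU could not map the buffer */
          /* ... post *dma to the Tx descriptor ring ... */
          return 0;
  }

  /* After Tx completion the buffer is unmapped exactly once; a device
   * still DMA-ing to an already-unmapped address is exactly what shows
   * up as an IO_PAGE_FAULT event. */
  static void example_unmap_tx_buffer(struct device *dev, dma_addr_t dma,
                                      size_t len)
  {
          dma_unmap_single(dev, dma, len, DMA_TO_DEVICE);
  }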

- Alex
Lutz Vieweg
2016-06-09 16:57:45 UTC
Post by Alexander Duyck
This time I'll reboot the machine, and also try "iommu=pt" as suggested
in different places for use with 10G NICs.
That might be a good place to start.
I'm adding (or at least attempting to add) the mailing list and maintainer
for the IOMMU code. You might want to check with the AMD-Vi IOMMU
maintainers to see if they have any other advice, as this seems like
something that may have been introduced by changes to the IOMMU code:
the ixgbe driver hasn't had any updates to its DMA mapping/unmapping
code in some time, it was working in the 4.4 kernel series, and it
still works on my system, which uses an Intel IOMMU. So I am wondering
if this is something specifically related to changes in the AMD IOMMU code.
After having rebooted the system with "iommu=pt", the following changed in the kernel boot messages:
Post by Alexander Duyck
[ 4.869591] iommu: Adding device 0000:04:00.0 to group 13
...
Post by Alexander Duyck
[ 4.873105] AMD-Vi: Found IOMMU at 0000:00:00.2 cap 0x40
[ 4.873347] AMD-Vi: Found IOMMU at 0000:40:00.2 cap 0x40
[ 4.873586] AMD-Vi: Interrupt remapping enabled
[ 4.874108] AMD-Vi: Lazy IO/TLB flushing enabled
[ 4.832580] iommu: Adding device 0000:04:00.0 to group 13
[ 4.832838] iommu: Using direct mapping for device 0000:04:00.0
...
Post by Alexander Duyck
[ 4.837074] AMD-Vi: Found IOMMU at 0000:00:00.2 cap 0x40
[ 4.837305] AMD-Vi: Found IOMMU at 0000:40:00.2 cap 0x40
[ 4.837535] AMD-Vi: Interrupt remapping enabled
[ 4.838062] AMD-Vi: Lazy IO/TLB flushing enabled
[ 4.838291] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[ 4.838533] software IO TLB [mem 0xd3e80000-0xd7e80000] (64MB) mapped at [ffff8800d3e80000-ffff8800d7e7ffff]
I hope that doesn't mean all my network data is now passing through
an additional copy-by-CPU... that would be kind of the opposite of what
"iommu=pt" seemed to promise :-)
Post by Alexander Duyck
[ 0.000000] AGP: Checking aperture...
[ 0.000000] AGP: No AGP bridge found
[ 0.000000] AGP: Node 0: aperture [bus addr 0x00000000-0x01ffffff] (32MB)
[ 0.000000] AGP: Your BIOS doesn't leave an aperture memory hole
[ 0.000000] AGP: Please enable the IOMMU option in the BIOS setup
[ 0.000000] AGP: This costs you 64MB of RAM
[ 0.000000] AGP: Mapping aperture over RAM [mem 0xcc000000-0xcfffffff] (65536KB)
I checked and the IOMMU-option is definitely enabled in the BIOS setup.
So am I right to assume that these messages are irrelevant (since AGP as a
whole is irrelevant on this server)?

Regards,

Lutz Vieweg
Wan ZongShun
2016-06-13 02:46:39 UTC
Post by Lutz Vieweg
Post by Alexander Duyck
This time I'll reboot the machine, and also try "iommu=pt" as suggested
in different places for use with 10G NICs.
That might be a good place to start.
I'm adding (or at least attempting to add) the mailing list and maintainer
for the IOMMU code. You might want to check with the AMD-Vi IOMMU
maintainers to see if they have any other advice, as this seems like
something that may have been introduced by changes to the IOMMU code:
the ixgbe driver hasn't had any updates to its DMA mapping/unmapping
code in some time, it was working in the 4.4 kernel series, and it
still works on my system, which uses an Intel IOMMU. So I am wondering
if this is something specifically related to changes in the AMD IOMMU code.
After having rebooted the system with "iommu=pt", the following change
Post by Alexander Duyck
[ 4.869591] iommu: Adding device 0000:04:00.0 to group 13
...
Post by Alexander Duyck
[ 4.873105] AMD-Vi: Found IOMMU at 0000:00:00.2 cap 0x40
[ 4.873347] AMD-Vi: Found IOMMU at 0000:40:00.2 cap 0x40
[ 4.873586] AMD-Vi: Interrupt remapping enabled
[ 4.874108] AMD-Vi: Lazy IO/TLB flushing enabled
OK, so there are two IOMMU controllers in your system.
Post by Lutz Vieweg
Post by Alexander Duyck
[ 4.832580] iommu: Adding device 0000:04:00.0 to group 13
[ 4.832838] iommu: Using direct mapping for device 0000:04:00.0
That is right: with iommu=pt set, the device passes through the AMD IOMMU (direct mapping).
Post by Lutz Vieweg
...
Post by Alexander Duyck
[ 4.837074] AMD-Vi: Found IOMMU at 0000:00:00.2 cap 0x40
[ 4.837305] AMD-Vi: Found IOMMU at 0000:40:00.2 cap 0x40
[ 4.837535] AMD-Vi: Interrupt remapping enabled
[ 4.838062] AMD-Vi: Lazy IO/TLB flushing enabled
[ 4.838291] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[ 4.838533] software IO TLB [mem 0xd3e80000-0xd7e80000] (64MB) mapped at [ffff8800d3e80000-ffff8800d7e7ffff]
I hope that doesn't mean all my network data is now passing through
an additional copy-by-CPU... that would be kind of the opposite of what
"iommu=pt" seemed to promise :-)
It depends.

Firstly, I need to know whether your Ethernet card works well now that
you have set iommu=pt.

If your Ethernet card has 64-bit (not 32-bit) DMA addressing capability,
that is OK: you will not be impacted by the bounce buffer. But iommu=pt
is a terrible option; it makes all devices bypass the IOMMU.
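To illustrate the 64-bit-DMA point: a PCI driver declares its addressing
capability at probe time roughly as in the sketch below; only devices
limited to a 32-bit mask need buffers above 4 GB bounced through SWIOTLB.
This is a generic sketch, not the actual ixgbe probe code:

  #include <linux/dma-mapping.h>
  #include <linux/pci.h>

  static int example_dma_setup(struct pci_dev *pdev)
  {
          /* try full 64-bit DMA addressing first */
          int err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
          if (err)
                  /* 32-bit fallback: buffers above 4 GB must then be
                   * bounced through SWIOTLB */
                  err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
          return err;
  }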

If you want to get further help, please try the following (sketched below):

(1) Add the 'amd_iommu_dump' option to your kernel boot options, and
send your full kernel log and lspci info; do not add iommu=pt.
(2) Add the amd_iommu=fullflush option to your kernel boot options, and just try it.
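Concretely, a sketch of what that amounts to (your other boot options
will of course differ):

  # (1) boot with "... amd_iommu_dump" on the kernel command line
  #     (and without iommu=pt), then collect:
  dmesg > kernel.log
  lspci -vvv > lspci.txt
  # (2) alternatively, boot with "... amd_iommu=fullflush"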
Post by Lutz Vieweg
Post by Alexander Duyck
[ 0.000000] AGP: Checking aperture...
[ 0.000000] AGP: No AGP bridge found
[ 0.000000] AGP: Node 0: aperture [bus addr 0x00000000-0x01ffffff] (32MB)
[ 0.000000] AGP: Your BIOS doesn't leave an aperture memory hole
[ 0.000000] AGP: Please enable the IOMMU option in the BIOS setup
[ 0.000000] AGP: This costs you 64MB of RAM
[ 0.000000] AGP: Mapping aperture over RAM [mem 0xcc000000-0xcfffffff] (65536KB)
I checked and the IOMMU-option is definitely enabled in the BIOS setup.
So I assume right that these message are irrelevant (since AGP as a whole
is irrelevant on this server)?
Please run 'cat /proc/iomem' and send the output.
Post by Lutz Vieweg
Regards,
Lutz Vieweg
_______________________________________________
iommu mailing list
https://lists.linuxfoundation.org/mailman/listinfo/iommu
--
---
Vincent Wan(Zongshun)
www.mcuos.com
Lutz Vieweg
2016-06-13 17:40:11 UTC
Post by Wan ZongShun
Post by Lutz Vieweg
Post by Alexander Duyck
[ 4.832580] iommu: Adding device 0000:04:00.0 to group 13
[ 4.832838] iommu: Using direct mapping for device 0000:04:00.0
That is right: with iommu=pt set, the device passes through the AMD IOMMU (direct mapping).
Post by Lutz Vieweg
...
Post by Alexander Duyck
[ 4.837074] AMD-Vi: Found IOMMU at 0000:00:00.2 cap 0x40
[ 4.837305] AMD-Vi: Found IOMMU at 0000:40:00.2 cap 0x40
[ 4.837535] AMD-Vi: Interrupt remapping enabled
[ 4.838062] AMD-Vi: Lazy IO/TLB flushing enabled
[ 4.838291] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[ 4.838533] software IO TLB [mem 0xd3e80000-0xd7e80000] (64MB) mapped at [ffff8800d3e80000-ffff8800d7e7ffff]
I hope that doesn't mean all my network data is now passing through
an additional copy-by-CPU... that would be kind of the opposite of what
"iommu=pt" seemed to promise :-)
It depends.
Firstly, I need to know whether your Ethernet card works well now that
you have set iommu=pt.
Too early to tell - the NIC has now worked for 4 days without failing;
however, that is only about as long as it took for the bug to first show
up after the upgrade to linux-4.6.1.

I'd say celebrating "works with iommu=pt" has to wait for at least two
weeks or so before it is reasonably probable that it works for that reason.
Post by Wan ZongShun
If your Ethernet card has 64-bit (not 32-bit) DMA addressing capability,
that is OK: you will not be impacted by the bounce buffer.
But iommu=pt is a terrible option; it makes all devices bypass the IOMMU.
Why is that terrible? The documentation I found on what iommu=pt actually
means was pretty scarce, but I noticed that many places recommend using
this option for 10G NICs.
Post by Wan ZongShun
(1) Add the 'amd_iommu_dump' option to your kernel boot options, and
send your full kernel log and lspci info; do not add iommu=pt.
(2) Add the amd_iommu=fullflush option to your kernel boot options, and just try it.
Will try that when the NIC becomes unavailable again.
Post by Wan ZongShun
Post by Lutz Vieweg
Post by Alexander Duyck
[ 0.000000] AGP: Checking aperture...
[ 0.000000] AGP: No AGP bridge found
[ 0.000000] AGP: Node 0: aperture [bus addr 0x00000000-0x01ffffff] (32MB)
[ 0.000000] AGP: Your BIOS doesn't leave an aperture memory hole
[ 0.000000] AGP: Please enable the IOMMU option in the BIOS setup
[ 0.000000] AGP: This costs you 64MB of RAM
[ 0.000000] AGP: Mapping aperture over RAM [mem 0xcc000000-0xcfffffff] (65536KB)
I checked and the IOMMU-option is definitely enabled in the BIOS setup.
So am I right to assume that these messages are irrelevant (since AGP as a
whole is irrelevant on this server)?
Please run 'cat /proc/iomem' and send the output.
00000000-00000fff : reserved
00001000-00097bff : System RAM
00097c00-0009ffff : reserved
000a0000-000bffff : PCI Bus 0000:00
000c0000-000c7fff : Video ROM
000ce800-000d43ff : Adapter ROM
000d4800-000d57ff : Adapter ROM
000e6000-000fffff : reserved
000f0000-000fffff : System ROM
00100000-d7e7ffff : System RAM
01000000-01688c05 : Kernel code
01688c06-01d4f53f : Kernel data
01eea000-02174fff : Kernel bss
d7e80000-d7e8dfff : RAM buffer
d7e8e000-d7e8ffff : reserved
d7e90000-d7eb3fff : ACPI Tables
d7eb4000-d7edffff : ACPI Non-volatile Storage
d7ee0000-d7ffffff : reserved
d9000000-daffffff : PCI Bus 0000:40
d9000000-d90003ff : IOAPIC 2
d9010000-d9013fff : amd_iommu
db000000-dcffffff : PCI Bus 0000:00
db000000-dbffffff : PCI Bus 0000:01
db000000-dbffffff : 0000:01:04.0
db000000-dbffffff : mgadrmfb_vram
dcd00000-dcffffff : PCI Bus 0000:04
dcdfc000-dcdfffff : 0000:04:00.0
dcdfc000-dcdfffff : ixgbe
dce00000-dcffffff : 0000:04:00.0
dce00000-dcffffff : ixgbe
dd000000-dfffffff : PCI Bus 0000:00
def00000-df7fffff : PCI Bus 0000:01
deffc000-deffffff : 0000:01:04.0
deffc000-deffffff : mgadrmfb_mmio
df000000-df7fffff : 0000:01:04.0
dfaf6000-dfaf6fff : 0000:00:12.1
dfaf6000-dfaf6fff : ohci_hcd
dfaf7000-dfaf7fff : 0000:00:12.0
dfaf7000-dfaf7fff : ohci_hcd
dfaf8400-dfaf87ff : 0000:00:11.0
dfaf8400-dfaf87ff : ahci
dfaf8800-dfaf88ff : 0000:00:12.2
dfaf8800-dfaf88ff : ehci_hcd
dfaf8c00-dfaf8cff : 0000:00:13.2
dfaf8c00-dfaf8cff : ehci_hcd
dfaf9000-dfaf9fff : 0000:00:13.1
dfaf9000-dfaf9fff : ohci_hcd
dfafa000-dfafafff : 0000:00:13.0
dfafa000-dfafafff : ohci_hcd
dfafb000-dfafbfff : 0000:00:14.5
dfafb000-dfafbfff : ohci_hcd
dfb00000-dfbfffff : PCI Bus 0000:02
dfb1c000-dfb1ffff : 0000:02:00.1
dfb1c000-dfb1ffff : igb
dfb20000-dfb3ffff : 0000:02:00.1
dfb40000-dfb5ffff : 0000:02:00.1
dfb40000-dfb5ffff : igb
dfb60000-dfb7ffff : 0000:02:00.1
dfb60000-dfb7ffff : igb
dfb9c000-dfb9ffff : 0000:02:00.0
dfb9c000-dfb9ffff : igb
dfba0000-dfbbffff : 0000:02:00.0
dfbc0000-dfbdffff : 0000:02:00.0
dfbc0000-dfbdffff : igb
dfbe0000-dfbfffff : 0000:02:00.0
dfbe0000-dfbfffff : igb
dfc00000-dfcfffff : PCI Bus 0000:03
dfc3c000-dfc3ffff : 0000:03:00.0
dfc3c000-dfc3ffff : mpt2sas
dfc40000-dfc7ffff : 0000:03:00.0
dfc40000-dfc7ffff : mpt2sas
dfc80000-dfcfffff : 0000:03:00.0
dfd00000-dfdfffff : PCI Bus 0000:04
dfd80000-dfdfffff : 0000:04:00.0
dfe00000-dfffffff : PCI Bus 0000:05
dfeb0000-dfebffff : 0000:05:00.0
dfeb0000-dfebffff : mpt2sas
dfec0000-dfefffff : 0000:05:00.0
dfec0000-dfefffff : mpt2sas
dff00000-dfffffff : 0000:05:00.0
e0000000-efffffff : PCI MMCONFIG 0000 [bus 00-ff]
e0000000-efffffff : reserved
e0000000-efffffff : pnp 00:0a
f6000000-f6003fff : amd_iommu
fec00000-fec003ff : IOAPIC 0
fec10000-fec1001f : pnp 00:04
fec20000-fec203ff : IOAPIC 1
fed00000-fed003ff : HPET 2
fed00000-fed003ff : PNP0103:00
fed40000-fed44fff : PCI Bus 0000:00
fee00000-fee00fff : Local APIC
fee00000-fee00fff : pnp 00:03
ffb80000-ffbfffff : pnp 00:04
ffe00000-ffffffff : reserved
ffe50000-ffe5e05f : pnp 00:04
100000000-2026ffffff : System RAM
2027000000-2027ffffff : RAM buffer
Regards,

Lutz Vieweg
Wan ZongShun
2016-06-14 03:01:33 UTC
Post by Lutz Vieweg
Post by Wan ZongShun
Post by Lutz Vieweg
Post by Alexander Duyck
[ 4.832580] iommu: Adding device 0000:04:00.0 to group 13
[ 4.832838] iommu: Using direct mapping for device 0000:04:00.0
That is right: with iommu=pt set, the device passes through the AMD IOMMU (direct mapping).
Post by Lutz Vieweg
...
Post by Alexander Duyck
[ 4.837074] AMD-Vi: Found IOMMU at 0000:00:00.2 cap 0x40
[ 4.837305] AMD-Vi: Found IOMMU at 0000:40:00.2 cap 0x40
[ 4.837535] AMD-Vi: Interrupt remapping enabled
[ 4.838062] AMD-Vi: Lazy IO/TLB flushing enabled
[ 4.838291] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[ 4.838533] software IO TLB [mem 0xd3e80000-0xd7e80000] (64MB) mapped at [ffff8800d3e80000-ffff8800d7e7ffff]
I hope that doesn't mean all my network data is now passing through
an additional copy-by-CPU... that would be kind of the opposite of what
"iommu=pt" seemed to promise :-)
It depends.
Firstly, I need to know whether your Ethernet card works well now that
you have set iommu=pt.
Too early to tell - the NIC has now worked for 4 days without failing;
however, that is only about as long as it took for the bug to first show
up after the upgrade to linux-4.6.1.
I'd say celebrating "works with iommu=pt" has to wait for at least two
weeks or so before it is reasonably probable that it works for that reason.
Post by Wan ZongShun
If your Ethernet card has 64-bit (not 32-bit) DMA addressing capability,
that is OK: you will not be impacted by the bounce buffer.
But iommu=pt is a terrible option; it makes all devices bypass the IOMMU.
Why is that terrible? The documentation I found on what iommu=pt actually
means was pretty scarce, but I noticed that many places recommend using
this option for 10G NICs.
I suppose it will work well for your card after setting iommu=pt, but
that is not the root cause of your issue.
iommu=pt just lets all devices in your system bypass the IOMMU; if there
are devices with only 32-bit DMA addressing capability in your system,
they will be impacted by the bounce buffer, which is bad for performance.
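(A quick way to check whether the kernel set up SWIOTLB bounce buffering
on a given boot is to grep the boot log, e.g.:

  dmesg | grep -i -e swiotlb -e 'bounce buffering'

which on this machine matches the "PCI-DMA: Using software bounce
buffering for IO (SWIOTLB)" line quoted above.)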


Wan Zongshun.
Post by Lutz Vieweg
Post by Wan ZongShun
(1) Add the 'amd_iommu_dump' option to your kernel boot options, and
send your full kernel log and lspci info; do not add iommu=pt.
(2) Add the amd_iommu=fullflush option to your kernel boot options, and just try it.
Will try that when the NIC becomes unavailable again.
Post by Wan ZongShun
Post by Lutz Vieweg
Post by Alexander Duyck
[ 0.000000] AGP: Checking aperture...
[ 0.000000] AGP: No AGP bridge found
[ 0.000000] AGP: Node 0: aperture [bus addr 0x00000000-0x01ffffff] (32MB)
[ 0.000000] AGP: Your BIOS doesn't leave an aperture memory hole
[ 0.000000] AGP: Please enable the IOMMU option in the BIOS setup
[ 0.000000] AGP: This costs you 64MB of RAM
[ 0.000000] AGP: Mapping aperture over RAM [mem 0xcc000000-0xcfffffff] (65536KB)
I checked and the IOMMU-option is definitely enabled in the BIOS setup.
So am I right to assume that these messages are irrelevant (since AGP as a
whole is irrelevant on this server)?
Please run 'cat /proc/iomem' and send the output.
This AGP aperture is only used by old GPUs, so I don't think it has any
impact on your issue.
Post by Lutz Vieweg
Post by Wan ZongShun
00000000-00000fff : reserved
00001000-00097bff : System RAM
00097c00-0009ffff : reserved
000a0000-000bffff : PCI Bus 0000:00
000c0000-000c7fff : Video ROM
000ce800-000d43ff : Adapter ROM
000d4800-000d57ff : Adapter ROM
000e6000-000fffff : reserved
000f0000-000fffff : System ROM
00100000-d7e7ffff : System RAM
01000000-01688c05 : Kernel code
01688c06-01d4f53f : Kernel data
01eea000-02174fff : Kernel bss
d7e80000-d7e8dfff : RAM buffer
d7e8e000-d7e8ffff : reserved
d7e90000-d7eb3fff : ACPI Tables
d7eb4000-d7edffff : ACPI Non-volatile Storage
d7ee0000-d7ffffff : reserved
d9000000-daffffff : PCI Bus 0000:40
d9000000-d90003ff : IOAPIC 2
d9010000-d9013fff : amd_iommu
db000000-dcffffff : PCI Bus 0000:00
db000000-dbffffff : PCI Bus 0000:01
db000000-dbffffff : 0000:01:04.0
db000000-dbffffff : mgadrmfb_vram
dcd00000-dcffffff : PCI Bus 0000:04
dcdfc000-dcdfffff : 0000:04:00.0
dcdfc000-dcdfffff : ixgbe
dce00000-dcffffff : 0000:04:00.0
dce00000-dcffffff : ixgbe
dd000000-dfffffff : PCI Bus 0000:00
def00000-df7fffff : PCI Bus 0000:01
deffc000-deffffff : 0000:01:04.0
deffc000-deffffff : mgadrmfb_mmio
df000000-df7fffff : 0000:01:04.0
dfaf6000-dfaf6fff : 0000:00:12.1
dfaf6000-dfaf6fff : ohci_hcd
dfaf7000-dfaf7fff : 0000:00:12.0
dfaf7000-dfaf7fff : ohci_hcd
dfaf8400-dfaf87ff : 0000:00:11.0
dfaf8400-dfaf87ff : ahci
dfaf8800-dfaf88ff : 0000:00:12.2
dfaf8800-dfaf88ff : ehci_hcd
dfaf8c00-dfaf8cff : 0000:00:13.2
dfaf8c00-dfaf8cff : ehci_hcd
dfaf9000-dfaf9fff : 0000:00:13.1
dfaf9000-dfaf9fff : ohci_hcd
dfafa000-dfafafff : 0000:00:13.0
dfafa000-dfafafff : ohci_hcd
dfafb000-dfafbfff : 0000:00:14.5
dfafb000-dfafbfff : ohci_hcd
dfb00000-dfbfffff : PCI Bus 0000:02
dfb1c000-dfb1ffff : 0000:02:00.1
dfb1c000-dfb1ffff : igb
dfb20000-dfb3ffff : 0000:02:00.1
dfb40000-dfb5ffff : 0000:02:00.1
dfb40000-dfb5ffff : igb
dfb60000-dfb7ffff : 0000:02:00.1
dfb60000-dfb7ffff : igb
dfb9c000-dfb9ffff : 0000:02:00.0
dfb9c000-dfb9ffff : igb
dfba0000-dfbbffff : 0000:02:00.0
dfbc0000-dfbdffff : 0000:02:00.0
dfbc0000-dfbdffff : igb
dfbe0000-dfbfffff : 0000:02:00.0
dfbe0000-dfbfffff : igb
dfc00000-dfcfffff : PCI Bus 0000:03
dfc3c000-dfc3ffff : 0000:03:00.0
dfc3c000-dfc3ffff : mpt2sas
dfc40000-dfc7ffff : 0000:03:00.0
dfc40000-dfc7ffff : mpt2sas
dfc80000-dfcfffff : 0000:03:00.0
dfd00000-dfdfffff : PCI Bus 0000:04
dfd80000-dfdfffff : 0000:04:00.0
dfe00000-dfffffff : PCI Bus 0000:05
dfeb0000-dfebffff : 0000:05:00.0
dfeb0000-dfebffff : mpt2sas
dfec0000-dfefffff : 0000:05:00.0
dfec0000-dfefffff : mpt2sas
dff00000-dfffffff : 0000:05:00.0
e0000000-efffffff : PCI MMCONFIG 0000 [bus 00-ff]
e0000000-efffffff : reserved
e0000000-efffffff : pnp 00:0a
f6000000-f6003fff : amd_iommu
fec00000-fec003ff : IOAPIC 0
fec10000-fec1001f : pnp 00:04
fec20000-fec203ff : IOAPIC 1
fed00000-fed003ff : HPET 2
fed00000-fed003ff : PNP0103:00
fed40000-fed44fff : PCI Bus 0000:00
fee00000-fee00fff : Local APIC
fee00000-fee00fff : pnp 00:03
ffb80000-ffbfffff : pnp 00:04
ffe00000-ffffffff : reserved
ffe50000-ffe5e05f : pnp 00:04
100000000-2026ffffff : System RAM
2027000000-2027ffffff : RAM buffer
Regards,
Lutz Vieweg
_______________________________________________
iommu mailing list
https://lists.linuxfoundation.org/mailman/listinfo/iommu
--
---
Vincent Wan(Zongshun)
www.mcuos.com
Lutz Vieweg
2016-08-29 12:30:16 UTC
Post by Lutz Vieweg
Post by Wan ZongShun
Firstly, I need to know whether your Ethernet card works well now that
you have set iommu=pt.
Too early to tell - the NIC has now worked for 4 days without failing;
however, that is only about as long as it took for the bug to first show
up after the upgrade to linux-4.6.1.
I can now say that after using the option iommu=pt with linux-4.6.1,
the machine ran for > 2 months without problems.

For other reasons (btrfs-stuff) I had to upgrade the machine to
linux-4.7.2 last week, and the "iommu=pt" option wasn't active
after this upgrade.
It only took 4 days until the
"AMD-Vi: Event logged IO_PAGE_FAULT... ixgbe Detected Tx Unit Hang"
issue occurred again.

So this evening, I'll reboot linux-4.7.2 with "iommu=pt" again,
as that really seemed to help.
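(A quick sanity check after rebooting, since the option silently dropped
out during the last upgrade:

  grep -o 'iommu=pt' /proc/cmdline

prints a match only if the running kernel was actually booted with the
option.)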

Regards,

Lutz Vieweg
Post by Lutz Vieweg
Post by Wan ZongShun
If your Ethernet card has 64-bit (not 32-bit) DMA addressing capability,
that is OK: you will not be impacted by the bounce buffer.
But iommu=pt is a terrible option; it makes all devices bypass the IOMMU.
Why is that terrible? The documentation I found on what iommu=pt actually
means was pretty scarce, but I noticed that many places recommend using
this option for 10G NICs.
Post by Wan ZongShun
(1) Add the 'amd_iommu_dump' option to your kernel boot options, and
send your full kernel log and lspci info; do not add iommu=pt.
(2) Add the amd_iommu=fullflush option to your kernel boot options, and just try it.
Will try that when the NIC becomes unavailable again.
Post by Wan ZongShun
Post by Lutz Vieweg
Post by Alexander Duyck
[ 0.000000] AGP: Checking aperture...
[ 0.000000] AGP: No AGP bridge found
[ 0.000000] AGP: Node 0: aperture [bus addr 0x00000000-0x01ffffff] (32MB)
[ 0.000000] AGP: Your BIOS doesn't leave an aperture memory hole
[ 0.000000] AGP: Please enable the IOMMU option in the BIOS setup
[ 0.000000] AGP: This costs you 64MB of RAM
[ 0.000000] AGP: Mapping aperture over RAM [mem 0xcc000000-0xcfffffff] (65536KB)
I checked and the IOMMU-option is definitely enabled in the BIOS setup.
So am I right to assume that these messages are irrelevant (since AGP as a
whole is irrelevant on this server)?
Please run 'cat /proc/iomem' and send the output.
00000000-00000fff : reserved
00001000-00097bff : System RAM
00097c00-0009ffff : reserved
000a0000-000bffff : PCI Bus 0000:00
000c0000-000c7fff : Video ROM
000ce800-000d43ff : Adapter ROM
000d4800-000d57ff : Adapter ROM
000e6000-000fffff : reserved
000f0000-000fffff : System ROM
00100000-d7e7ffff : System RAM
01000000-01688c05 : Kernel code
01688c06-01d4f53f : Kernel data
01eea000-02174fff : Kernel bss
d7e80000-d7e8dfff : RAM buffer
d7e8e000-d7e8ffff : reserved
d7e90000-d7eb3fff : ACPI Tables
d7eb4000-d7edffff : ACPI Non-volatile Storage
d7ee0000-d7ffffff : reserved
d9000000-daffffff : PCI Bus 0000:40
d9000000-d90003ff : IOAPIC 2
d9010000-d9013fff : amd_iommu
db000000-dcffffff : PCI Bus 0000:00
db000000-dbffffff : PCI Bus 0000:01
db000000-dbffffff : 0000:01:04.0
db000000-dbffffff : mgadrmfb_vram
dcd00000-dcffffff : PCI Bus 0000:04
dcdfc000-dcdfffff : 0000:04:00.0
dcdfc000-dcdfffff : ixgbe
dce00000-dcffffff : 0000:04:00.0
dce00000-dcffffff : ixgbe
dd000000-dfffffff : PCI Bus 0000:00
def00000-df7fffff : PCI Bus 0000:01
deffc000-deffffff : 0000:01:04.0
deffc000-deffffff : mgadrmfb_mmio
df000000-df7fffff : 0000:01:04.0
dfaf6000-dfaf6fff : 0000:00:12.1
dfaf6000-dfaf6fff : ohci_hcd
dfaf7000-dfaf7fff : 0000:00:12.0
dfaf7000-dfaf7fff : ohci_hcd
dfaf8400-dfaf87ff : 0000:00:11.0
dfaf8400-dfaf87ff : ahci
dfaf8800-dfaf88ff : 0000:00:12.2
dfaf8800-dfaf88ff : ehci_hcd
dfaf8c00-dfaf8cff : 0000:00:13.2
dfaf8c00-dfaf8cff : ehci_hcd
dfaf9000-dfaf9fff : 0000:00:13.1
dfaf9000-dfaf9fff : ohci_hcd
dfafa000-dfafafff : 0000:00:13.0
dfafa000-dfafafff : ohci_hcd
dfafb000-dfafbfff : 0000:00:14.5
dfafb000-dfafbfff : ohci_hcd
dfb00000-dfbfffff : PCI Bus 0000:02
dfb1c000-dfb1ffff : 0000:02:00.1
dfb1c000-dfb1ffff : igb
dfb20000-dfb3ffff : 0000:02:00.1
dfb40000-dfb5ffff : 0000:02:00.1
dfb40000-dfb5ffff : igb
dfb60000-dfb7ffff : 0000:02:00.1
dfb60000-dfb7ffff : igb
dfb9c000-dfb9ffff : 0000:02:00.0
dfb9c000-dfb9ffff : igb
dfba0000-dfbbffff : 0000:02:00.0
dfbc0000-dfbdffff : 0000:02:00.0
dfbc0000-dfbdffff : igb
dfbe0000-dfbfffff : 0000:02:00.0
dfbe0000-dfbfffff : igb
dfc00000-dfcfffff : PCI Bus 0000:03
dfc3c000-dfc3ffff : 0000:03:00.0
dfc3c000-dfc3ffff : mpt2sas
dfc40000-dfc7ffff : 0000:03:00.0
dfc40000-dfc7ffff : mpt2sas
dfc80000-dfcfffff : 0000:03:00.0
dfd00000-dfdfffff : PCI Bus 0000:04
dfd80000-dfdfffff : 0000:04:00.0
dfe00000-dfffffff : PCI Bus 0000:05
dfeb0000-dfebffff : 0000:05:00.0
dfeb0000-dfebffff : mpt2sas
dfec0000-dfefffff : 0000:05:00.0
dfec0000-dfefffff : mpt2sas
dff00000-dfffffff : 0000:05:00.0
e0000000-efffffff : PCI MMCONFIG 0000 [bus 00-ff]
e0000000-efffffff : reserved
e0000000-efffffff : pnp 00:0a
f6000000-f6003fff : amd_iommu
fec00000-fec003ff : IOAPIC 0
fec10000-fec1001f : pnp 00:04
fec20000-fec203ff : IOAPIC 1
fed00000-fed003ff : HPET 2
fed00000-fed003ff : PNP0103:00
fed40000-fed44fff : PCI Bus 0000:00
fee00000-fee00fff : Local APIC
fee00000-fee00fff : pnp 00:03
ffb80000-ffbfffff : pnp 00:04
ffe00000-ffffffff : reserved
ffe50000-ffe5e05f : pnp 00:04
100000000-2026ffffff : System RAM
2027000000-2027ffffff : RAM buffer
Regards,
Lutz Vieweg
Joerg Roedel
2016-06-13 09:08:03 UTC
Jun 9 14:40:09 computer kernel: AMD-Vi: Event logged [IO_PAGE_FAULT device=04:00.0 domain=0x000e address=0x00000000000178c0 flags=0x0050]
Jun 9 14:40:09 computer kernel: AMD-Vi: Event logged [IO_PAGE_FAULT device=04:00.0 domain=0x000e address=0x0000000000017900 flags=0x0050]
Some more context would be helpful. Which kernel version was the last
one that worked, and with which version did you start seeing these messages?


Joerg
Lutz Vieweg
2016-06-13 17:46:37 UTC
Post by Joerg Roedel
Jun 9 14:40:09 computer kernel: AMD-Vi: Event logged [IO_PAGE_FAULT device=04:00.0 domain=0x000e address=0x00000000000178c0 flags=0x0050]
Jun 9 14:40:09 computer kernel: AMD-Vi: Event logged [IO_PAGE_FAULT device=04:00.0 domain=0x000e address=0x0000000000017900 flags=0x0050]
Some more context would be helpful. Which kernel version was the last
one that worked, and with which version did you start seeing these messages?
Two servers were running linux-4.4.2 for many months,
both with 10Gbase-T NICs connected to the same switch, without
any such outage.

Both servers were recently upgraded to linux-4.6.1, and one
of the servers so far twice showed this "IO_PAGE_FAULT" symptom
within a period of ~ 7 days.

The hardware of the two servers is the same except for the
model of the Intel 10Gbase-T NIC: the server with the two failures
runs a fairly new
Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01)
while the other server (without symptoms so far) runs a much older
Intel Corporation 82598EB 10-Gigabit AT Network Connection (rev 01)
both using the same ixgbe driver module.

(Since both servers operate as a shared-nothing cluster, they do
pretty much the same work.)

Regards,

Lutz Vieweg