[PATCH 0/4] iommu/arm-smmu: Fixes for 4.8

Discussion:

Will Deacon

2016-08-18 13:05:38 UTC

Hi all,

Here's the current pile of ARM SMMU fixes I have queued for 4.8. I plan
to send a pull to Joerg tomorrow afternoon.

Cheers,

Will

--->8

Robin Murphy (1):
iommu/io-pgtable-arm-v7s: Fix attributes when splitting blocks

Will Deacon (3):
iommu/arm-smmu: Fix CMDQ error handling
iommu/arm-smmu: Disable stalling faults for all endpoints
iommu/arm-smmu: Don't BUG() if we find aborting STEs with
disable_bypass

drivers/iommu/arm-smmu-v3.c | 7 +++++--
drivers/iommu/arm-smmu.c | 34 +++++++---------------------------
drivers/iommu/io-pgtable-arm-v7s.c | 4 +++-
3 files changed, 15 insertions(+), 30 deletions(-)

--
2.1.4

Will Deacon

2016-08-18 13:05:39 UTC

Permalink

From: Robin Murphy <***@arm.com>

Due to the attribute bits being all over the place in the different
types of short-descriptor PTEs, when remapping an existing entry, e.g.
splitting a section into pages, we take the approach of decomposing
the PTE attributes back to the IOMMU API flags to start from scratch.

On inspection, though, the existing code seems to have got the read-only
bit backwards and ignored the XN bit. How embarrassing...

Fortunately the primary user so far, the Mediatek IOMMU, both never
splits blocks (because it only serves non-overlapping DMA API calls) and
also ignores permissions anyway, but let's put things right before any
future users trip up.

Cc: <***@vger.kernel.org>
Signed-off-by: Robin Murphy <***@arm.com>
Signed-off-by: Will Deacon <***@arm.com>
---
drivers/iommu/io-pgtable-arm-v7s.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
index 8c6139986d7d..def8ca1c982d 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -286,12 +286,14 @@ static int arm_v7s_pte_to_prot(arm_v7s_iopte pte, int lvl)
int prot = IOMMU_READ;
arm_v7s_iopte attr = pte >> ARM_V7S_ATTR_SHIFT(lvl);

- if (attr & ARM_V7S_PTE_AP_RDONLY)
+ if (!(attr & ARM_V7S_PTE_AP_RDONLY))
prot |= IOMMU_WRITE;
if ((attr & (ARM_V7S_TEX_MASK << ARM_V7S_TEX_SHIFT)) == 0)
prot |= IOMMU_MMIO;
else if (pte & ARM_V7S_ATTR_C)
prot |= IOMMU_CACHE;
+ if (pte & ARM_V7S_ATTR_XN(lvl))
+ prot |= IOMMU_NOEXEC;

return prot;
}

--
2.1.4

Will Deacon

2016-08-18 13:05:42 UTC

Permalink

The disable_bypass cmdline option changes the SMMUv3 driver to put down
faulting stream table entries by default, as opposed to bypassing
transactions from unconfigured devices.

In this mode of operation, it is entirely expected to see aborting
entries in the stream table if and when we come to installing a valid
translation, so don't trigger a BUG() as a result of misdiagnosing these
entries as stream table corruption.

Tested-by: Robin Murphy <***@arm.com>
Reported-by: Robin Murphy <***@arm.com>
Reviewed-by: Robin Murphy <***@arm.com>
Signed-off-by: Will Deacon <***@arm.com>
---
drivers/iommu/arm-smmu-v3.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 330623f8e344..641e88761319 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1034,6 +1034,9 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid,
case STRTAB_STE_0_CFG_S2_TRANS:
ste_live = true;
break;
+ case STRTAB_STE_0_CFG_ABORT:
+ if (disable_bypass)
+ break;
default:
BUG(); /* STE corruption */
}

--
2.1.4

Will Deacon

2016-08-18 13:05:40 UTC

Permalink

In the unlikely event of a global command queue error, the ARM SMMUv3
driver attempts to convert the problematic command into a CMD_SYNC and
resume the command queue. Unfortunately, this code is pretty badly
broken:

1. It uses the index into the error string table as the CMDQ index,
so we probably read the wrong entry out of the queue

2. The arguments to queue_write are the wrong way round, so we end up
writing from the queue onto the stack.

These happily cancel out, so the kernel is likely to stay alive, but
the command queue will probably fault again when we resume.

This patch fixes the error handling code to use the correct queue index
and write back the CMD_SYNC to the faulting entry.

Cc: <***@vger.kernel.org>
Reported-by: Diwakar Subraveti <***@arm.com>
Signed-off-by: Will Deacon <***@arm.com>
---
drivers/iommu/arm-smmu-v3.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index ce801170d5f2..330623f8e344 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -879,7 +879,7 @@ static void arm_smmu_cmdq_skip_err(struct arm_smmu_device *smmu)
* We may have concurrent producers, so we need to be careful
* not to touch any of the shadow cmdq state.
*/
- queue_read(cmd, Q_ENT(q, idx), q->ent_dwords);
+ queue_read(cmd, Q_ENT(q, cons), q->ent_dwords);
dev_err(smmu->dev, "skipping command in error state:\n");
for (i = 0; i < ARRAY_SIZE(cmd); ++i)
dev_err(smmu->dev, "\t0x%016llx\n", (unsigned long long)cmd[i]);
@@ -890,7 +890,7 @@ static void arm_smmu_cmdq_skip_err(struct arm_smmu_device *smmu)
return;
}

- queue_write(cmd, Q_ENT(q, idx), q->ent_dwords);
+ queue_write(Q_ENT(q, cons), cmd, q->ent_dwords);
}

static void arm_smmu_cmdq_issue_cmd(struct arm_smmu_device *smmu,

--
2.1.4

Will Deacon

2016-08-18 13:05:41 UTC

Permalink

Enabling stalling faults can result in hardware deadlock on poorly
designed systems, particularly those with a PCI root complex upstream of
the SMMU.

Although it's not really Linux's job to save hardware integrators from
their own misfortune, it *is* our job to stop userspace (e.g. VFIO
clients) from hosing the system for everybody else, even if they might
already be required to have elevated privileges.

Given that the fault handling code currently executes entirely in IRQ
context, there is nothing that can sensibly be done to recover from
things like page faults anyway, so let's rip this code out for now and
avoid the potential for deadlock.

Cc: <***@vger.kernel.org>
Reported-by: Matt Evans <***@arm.com>
Signed-off-by: Will Deacon <***@arm.com>
---
drivers/iommu/arm-smmu.c | 34 +++++++---------------------------
1 file changed, 7 insertions(+), 27 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 4f49fe29f202..2db74ebc3240 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -686,8 +686,7 @@ static struct iommu_gather_ops arm_smmu_gather_ops = {

static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
{
- int flags, ret;
- u32 fsr, fsynr, resume;
+ u32 fsr, fsynr;
unsigned long iova;
struct iommu_domain *domain = dev;
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
@@ -701,34 +700,15 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
if (!(fsr & FSR_FAULT))
return IRQ_NONE;

- if (fsr & FSR_IGN)
- dev_err_ratelimited(smmu->dev,
- "Unexpected context fault (fsr 0x%x)\n",
- fsr);
-
fsynr = readl_relaxed(cb_base + ARM_SMMU_CB_FSYNR0);
- flags = fsynr & FSYNR0_WNR ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ;
-
iova = readq_relaxed(cb_base + ARM_SMMU_CB_FAR);
- if (!report_iommu_fault(domain, smmu->dev, iova, flags)) {
- ret = IRQ_HANDLED;
- resume = RESUME_RETRY;
- } else {
- dev_err_ratelimited(smmu->dev,
- "Unhandled context fault: iova=0x%08lx, fsynr=0x%x, cb=%d\n",
- iova, fsynr, cfg->cbndx);
- ret = IRQ_NONE;
- resume = RESUME_TERMINATE;
- }
-
- /* Clear the faulting FSR */
- writel(fsr, cb_base + ARM_SMMU_CB_FSR);

- /* Retry or terminate any stalled transactions */
- if (fsr & FSR_SS)
- writel_relaxed(resume, cb_base + ARM_SMMU_CB_RESUME);
+ dev_err_ratelimited(smmu->dev,
+ "Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cb=%d\n",
+ fsr, iova, fsynr, cfg->cbndx);

- return ret;
+ writel(fsr, cb_base + ARM_SMMU_CB_FSR);
+ return IRQ_HANDLED;
}

static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
@@ -837,7 +817,7 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
}

/* SCTLR */
- reg = SCTLR_CFCFG | SCTLR_CFIE | SCTLR_CFRE | SCTLR_M | SCTLR_EAE_SBOP;
+ reg = SCTLR_CFIE | SCTLR_CFRE | SCTLR_M | SCTLR_EAE_SBOP;
if (stage1)
reg |= SCTLR_S1_ASIDPNE;
#ifdef __BIG_ENDIAN

--
2.1.4

Rob Clark

2016-12-06 23:30:21 UTC

Permalink

Post by Will Deacon
Enabling stalling faults can result in hardware deadlock on poorly
designed systems, particularly those with a PCI root complex upstream of
the SMMU.
Although it's not really Linux's job to save hardware integrators from
their own misfortune, it *is* our job to stop userspace (e.g. VFIO
clients) from hosing the system for everybody else, even if they might
already be required to have elevated privileges.
Given that the fault handling code currently executes entirely in IRQ
context, there is nothing that can sensibly be done to recover from
things like page faults anyway, so let's rip this code out for now and
avoid the potential for deadlock.

Hi Will,

so, I'd like to re-introduce this feature, I *guess* as some sort of
opt-in quirk (ie. disabled by default unless something in DT tells you
otherwise?? But I'm open to suggestions. I'm not entirely sure what
hw was having problems due to this feature.)

On newer snapdragon devices we are using arm-smmu for the GPU, and
halting the GPU so the driver's fault handler can dump some GPU state
on faults is enormously helpful for debugging and tracking down where
in the gpu cmdstream the fault was triggered. In addition, we will
eventually want the ability to update pagetables from fault handler
and resuming the faulting transition.

Some additional comments below..

Post by Will Deacon
---
drivers/iommu/arm-smmu.c | 34 +++++++---------------------------
1 file changed, 7 insertions(+), 27 deletions(-)
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 4f49fe29f202..2db74ebc3240 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -686,8 +686,7 @@ static struct iommu_gather_ops arm_smmu_gather_ops = {
static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
{
- int flags, ret;
- u32 fsr, fsynr, resume;
+ u32 fsr, fsynr;
unsigned long iova;
struct iommu_domain *domain = dev;
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
@@ -701,34 +700,15 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
if (!(fsr & FSR_FAULT))
return IRQ_NONE;
- if (fsr & FSR_IGN)
- dev_err_ratelimited(smmu->dev,
- "Unexpected context fault (fsr 0x%x)\n",
- fsr);
-
fsynr = readl_relaxed(cb_base + ARM_SMMU_CB_FSYNR0);
- flags = fsynr & FSYNR0_WNR ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ;
-
iova = readq_relaxed(cb_base + ARM_SMMU_CB_FAR);
- if (!report_iommu_fault(domain, smmu->dev, iova, flags)) {
- ret = IRQ_HANDLED;
- resume = RESUME_RETRY;
- } else {
- dev_err_ratelimited(smmu->dev,
- "Unhandled context fault: iova=0x%08lx, fsynr=0x%x, cb=%d\n",
- iova, fsynr, cfg->cbndx);

I would like to decouple this dev_err_ratelimit() print from the
RESUME_RETRY vs RESUME_TERMINATE behaviour. I need the ability to
indicate by return from my fault handler whether to resume or
terminate. But I already have my own ratelimted prints and would
prefer not to spam dmesg twice.

I'm thinking about report_iommu_fault() returning:

0 => RESUME_RETRY
-EFAULT => RESUME_TERMINATE but don't print
anything else (or specifically -ENOSYS?) => RESUME_TERMINATE and print

thoughts?

Post by Will Deacon
- ret = IRQ_NONE;
- resume = RESUME_TERMINATE;
- }
-
- /* Clear the faulting FSR */
- writel(fsr, cb_base + ARM_SMMU_CB_FSR);
- /* Retry or terminate any stalled transactions */
- if (fsr & FSR_SS)
- writel_relaxed(resume, cb_base + ARM_SMMU_CB_RESUME);

This might be a bug in qcom's implementation of the smmu spec, but
seems like we don't have SS bit set, yet we still require RESUME reg
to be written, otherwise gpu is perma-wedged. Maybe topic for a
separate quirk? I'm not sure if writing RESUME reg on other hw when
SS bit is not set is likely to cause problems? If not I suppose we
could just unconditionally write it.

Anyways, I'm not super-familiar w/ arm-smmu so suggestions welcome..
in between debugging freedreno I'll try to put together some patches.

BR,
-R

Post by Will Deacon
+ dev_err_ratelimited(smmu->dev,
+ "Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cb=%d\n",
+ fsr, iova, fsynr, cfg->cbndx);
- return ret;
+ writel(fsr, cb_base + ARM_SMMU_CB_FSR);
+ return IRQ_HANDLED;
}
static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
@@ -837,7 +817,7 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
}
/* SCTLR */
- reg = SCTLR_CFCFG | SCTLR_CFIE | SCTLR_CFRE | SCTLR_M | SCTLR_EAE_SBOP;
+ reg = SCTLR_CFIE | SCTLR_CFRE | SCTLR_M | SCTLR_EAE_SBOP;
if (stage1)
reg |= SCTLR_S1_ASIDPNE;
#ifdef __BIG_ENDIAN
--
2.1.4
_______________________________________________
iommu mailing list
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Jordan Crouse

2016-12-07 00:00:25 UTC

Permalink

Post by Joerg Roedel

Hi Will,
so, I'd like to re-introduce this feature, I *guess* as some sort of
opt-in quirk (ie. disabled by default unless something in DT tells you
otherwise?? But I'm open to suggestions. I'm not entirely sure what
hw was having problems due to this feature.)
On newer snapdragon devices we are using arm-smmu for the GPU, and
halting the GPU so the driver's fault handler can dump some GPU state
on faults is enormously helpful for debugging and tracking down where
in the gpu cmdstream the fault was triggered. In addition, we will
eventually want the ability to update pagetables from fault handler
and resuming the faulting transition.
Some additional comments below..

I would like to decouple this dev_err_ratelimit() print from the
RESUME_RETRY vs RESUME_TERMINATE behaviour. I need the ability to
indicate by return from my fault handler whether to resume or
terminate. But I already have my own ratelimted prints and would
prefer not to spam dmesg twice.
0 => RESUME_RETRY
-EFAULT => RESUME_TERMINATE but don't print
anything else (or specifically -ENOSYS?) => RESUME_TERMINATE and print
thoughts?

This might be a bug in qcom's implementation of the smmu spec, but
seems like we don't have SS bit set, yet we still require RESUME reg
to be written, otherwise gpu is perma-wedged. Maybe topic for a
separate quirk? I'm not sure if writing RESUME reg on other hw when
SS bit is not set is likely to cause problems? If not I suppose we
could just unconditionally write it.
Anyways, I'm not super-familiar w/ arm-smmu so suggestions welcome..
in between debugging freedreno I'll try to put together some patches.

From what I can tell we need SCTLR_CFCFG to make the stall happen otherwise
the operation just gets terminated immediately and *then* we get notification
but by then the system keeps going.

I think SCTLR_HUPCF helps control that behavior (i.e. we don't go off faulting
through eternity) but I don't know how it works.

From my very unlearned understanding I think we do want to set CFCFG and then
stall and let the interrupt handler decide to retry/terminate.

Jordan

--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

Sricharan

2016-12-10 15:44:42 UTC

Permalink

Hi Rob,

Post by Jordan Crouse

Post by Joerg Roedel

Post by Will Deacon
Although it's not really Linux's job to save hardware integrators from
their own misfortune, it *is* our job to stop userspace (e.g. VFIO
clients) from hosing the system for everybody else, even if they might
already be required to have elevated privileges.
Given that the fault handling code currently executes entirely in IRQ
context, there is nothing that can sensibly be done to recover from
things like page faults anyway, so let's rip this code out for now and
avoid the potential for deadlock.

Hi Will,
so, I'd like to re-introduce this feature, I *guess* as some sort of
opt-in quirk (ie. disabled by default unless something in DT tells you
otherwise?? But I'm open to suggestions. I'm not entirely sure what
hw was having problems due to this feature.)
On newer snapdragon devices we are using arm-smmu for the GPU, and
halting the GPU so the driver's fault handler can dump some GPU state
on faults is enormously helpful for debugging and tracking down where
in the gpu cmdstream the fault was triggered. In addition, we will
eventually want the ability to update pagetables from fault handler
and resuming the faulting transition.
Some additional comments below..

I would like to decouple this dev_err_ratelimit() print from the
RESUME_RETRY vs RESUME_TERMINATE behaviour. I need the ability to
indicate by return from my fault handler whether to resume or
terminate. But I already have my own ratelimted prints and would
prefer not to spam dmesg twice.
0 => RESUME_RETRY
-EFAULT => RESUME_TERMINATE but don't print
anything else (or specifically -ENOSYS?) => RESUME_TERMINATE and print
thoughts?

This might be a bug in qcom's implementation of the smmu spec, but
seems like we don't have SS bit set, yet we still require RESUME reg
to be written, otherwise gpu is perma-wedged. Maybe topic for a
separate quirk? I'm not sure if writing RESUME reg on other hw when
SS bit is not set is likely to cause problems? If not I suppose we
could just unconditionally write it.
Anyways, I'm not super-familiar w/ arm-smmu so suggestions welcome..
in between debugging freedreno I'll try to put together some patches.

From what I can tell we need SCTLR_CFCFG to make the stall happen otherwise
the operation just gets terminated immediately and *then* we get notification
but by then the system keeps going.

Yes thats right, SCTLR_CFCFG was removed as a part of this patch.

Post by Jordan Crouse
I think SCTLR_HUPCF helps control that behavior (i.e. we don't go off faulting
through eternity) but I don't know how it works.
From my very unlearned understanding I think we do want to set CFCFG and then
stall and let the interrupt handler decide to retry/terminate.

Yes, infact i was thinking of adding this as a new patch, but now that this was
a reverted one. As you said, it would be good to know the hw which was having
problem with this and probably having this has a quirk otherwise.

Also i see that FSR_SS is implemented by the qcom and the
downstream code uses it.

Regards,
Sricharan

Will Deacon

2016-12-16 11:54:13 UTC

Permalink

Hi Rob,

Post by Rob Clark

so, I'd like to re-introduce this feature, I *guess* as some sort of
opt-in quirk (ie. disabled by default unless something in DT tells you
otherwise?? But I'm open to suggestions. I'm not entirely sure what
hw was having problems due to this feature.)
On newer snapdragon devices we are using arm-smmu for the GPU, and
halting the GPU so the driver's fault handler can dump some GPU state
on faults is enormously helpful for debugging and tracking down where
in the gpu cmdstream the fault was triggered. In addition, we will
eventually want the ability to update pagetables from fault handler
and resuming the faulting transition.

I'm not against reintroducing this, but it would certainly need to be
opt-in, as you suggest. If we want to try to use stall faults to enable
demand paging for DMA, then that means running core mm code to resolve
the fault and we can't do that in irq context. Consequently, we have to
hand this off to a thread, which means the hardware must allow the SS
bit to remain set without immediately reasserting the interrupt line.
Furthermore, we can't handle multiple faults on a context-bank, so we'd
need to restrict ourselves to one device (i.e. faulting stream) per
domain (CB).

I think that means we want both specific compatible strings to identify
the SS bit behaviour, but also a way to opt-in for the stall model as a
separate property to indicate that the SoC integration can support this
without e.g. deadlocking.

Will

Joerg Roedel

2016-08-18 16:52:29 UTC

Permalink

Hi Will,

Post by Will Deacon
Here's the current pile of ARM SMMU fixes I have queued for 4.8. I plan
to send a pull to Joerg tomorrow afternoon.

Please also add 'Fixes:' tags to the patches before sending the
pull-request.

Thanks,

Joerg