Post by David WoodhouseYes, that looks correct. I think we may also need to limit it, because
full 20-bit PASID support means we'll attempt an order 11 allocation.
But that's certainly correct for now
Actually, not quite correct. You fixed the allocation but not the free.
And Mika had reported that even the *correct* allocation was failing
because it was too large. So I've done it differently (untested)...
-----
Subject: [PATCH] iommu/vt-d: Fix PASID table allocation
Somehow I ended up with an off-by-three error in calculating the size of
the PASID and PASID State tables, which triggers allocations failures as
those tables unfortunately have to be physically contiguous.
In fact, even the *correct* maximum size of 8MiB is problematic and is
wont to lead to allocation failures. Since I have extracted a promise
that this *will* be fixed in hardware, I'm happy to limit it on the
current hardware to a maximum of 0x20000 PASIDs, which gives us 1MiB
tables â still not ideal, but better than before.
Reported by Mika Kuoppala <***@linux.intel.com> and also by
Xunlei Pang <***@redhat.com> who submitted a simpler patch to fix
only the allocation (and not the free) to the "correct" limit... which
was still problematic.
Signed-off-by: David Woodhouse <***@infradead.org>
---
 drivers/iommu/intel-svm.c   | 28 +++++++++++++++++-----------
 include/linux/intel-iommu.h |  1 +
 2 files changed, 18 insertions(+), 11 deletions(-)
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 8ebb353..cb72e00 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -39,10 +39,18 @@ int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
 struct page *pages;
 int order;
Â
- order = ecap_pss(iommu->ecap) + 7 - PAGE_SHIFT;
- if (order < 0)
- order = 0;
-
+ /* Start at 2 because it's defined as 1^(1+PSS) */
+ iommu->pasid_max = 2 << ecap_pss(iommu->ecap);
+
+ /* Eventually I'm promised we will get a multi-level PASID table
+ Â * and it won't have to be physically contiguous. Until then,
+ Â * limit the size because 8MiB contiguous allocations can be hard
+ Â * to come by. The limit of 0x20000, which is 1MiB for each of
+ Â * the PASID and PASID-state tables, is somewhat arbitrary. */
+ if (iommu->pasid_max > 0x20000)
+ iommu->pasid_max = 0x20000;
+
+ order = get_order(sizeof(struct pasid_entry) * iommu->pasid_max);
 pages = alloc_pages(GFP_KERNEL | __GFP_ZERO, order);
 if (!pages) {
 pr_warn("IOMMU: %s: Failed to allocate PASID table\n",
@@ -53,6 +61,8 @@ int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
 pr_info("%s: Allocated order %d PASID table.\n", iommu->name, order);
Â
 if (ecap_dis(iommu->ecap)) {
+ /* Just making it explicit... */
+ BUILD_BUG_ON(sizeof(struct pasid_entry) != sizeof(struct pasid_state_entry));
 pages = alloc_pages(GFP_KERNEL | __GFP_ZERO, order);
 if (pages)
 iommu->pasid_state_table = page_address(pages);
@@ -68,11 +78,7 @@ int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
Â
 int intel_svm_free_pasid_tables(struct intel_iommu *iommu)
 {
- int order;
-
- order = ecap_pss(iommu->ecap) + 7 - PAGE_SHIFT;
- if (order < 0)
- order = 0;
+ int order = get_order(sizeof(struct pasid_entry) * iommu->pasid_max);
Â
 if (iommu->pasid_table) {
 free_pages((unsigned long)iommu->pasid_table, order);
@@ -371,8 +377,8 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
 }
 svm->iommu = iommu;
Â
- if (pasid_max > 2 << ecap_pss(iommu->ecap))
- pasid_max = 2 << ecap_pss(iommu->ecap);
+ if (pasid_max > iommu->pasid_max)
+ pasid_max = iommu->pasid_max;
Â
 /* Do not use PASID 0 in caching mode (virtualised IOMMU) */
 ret = idr_alloc(&iommu->pasid_idr, svm,
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 2d9b6500..d49e26c 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -429,6 +429,7 @@ struct intel_iommu {
 struct page_req_dsc *prq;
 unsigned char prq_name[16];    /* Name for PRQ interrupt */
 struct idr pasid_idr;
+ u32 pasid_max;
 #endif
 struct q_inval  *qi;            /* Queued invalidation info */
 u32 *iommu_state; /* Store iommu states between suspend and resume.*/
--Â
2.5.5
--
dwmw2