Discussion:
[PATCH] iommu/vt-d: Fix the size calculation of pasid table
Xunlei Pang
2016-09-12 02:49:11 UTC
Permalink
According to the vt-d spec, the size of pasid (state) entry is 8B
which equals 3 in power of 2, the number of pasid (state) entries
is (ecap_pss + 1) in power of 2.

Thus the right size of pasid (state) table in power of 2 should be
ecap_pss(iommu->ecap) plus "1+3=4" other than 7.

Signed-off-by: Xunlei Pang <***@redhat.com>
---
drivers/iommu/intel-svm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 8ebb353..cfa75c2 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -39,7 +39,7 @@ int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
struct page *pages;
int order;

- order = ecap_pss(iommu->ecap) + 7 - PAGE_SHIFT;
+ order = ecap_pss(iommu->ecap) + 4 - PAGE_SHIFT;
if (order < 0)
order = 0;
--
1.8.3.1
Joerg Roedel
2016-09-19 12:18:27 UTC
Permalink
[Cc'ing David]
Post by Xunlei Pang
According to the vt-d spec, the size of pasid (state) entry is 8B
which equals 3 in power of 2, the number of pasid (state) entries
is (ecap_pss + 1) in power of 2.
Thus the right size of pasid (state) table in power of 2 should be
ecap_pss(iommu->ecap) plus "1+3=4" other than 7.
---
drivers/iommu/intel-svm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 8ebb353..cfa75c2 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -39,7 +39,7 @@ int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
struct page *pages;
int order;
- order = ecap_pss(iommu->ecap) + 7 - PAGE_SHIFT;
+ order = ecap_pss(iommu->ecap) + 4 - PAGE_SHIFT;
if (order < 0)
order = 0;
The patch seems to be correct, but I'll let David comment on it first.



Joerg
Xunlei Pang
2016-10-10 13:00:14 UTC
Permalink
Ping David for confirmation
Post by Joerg Roedel
[Cc'ing David]
Post by Xunlei Pang
According to the vt-d spec, the size of pasid (state) entry is 8B
which equals 3 in power of 2, the number of pasid (state) entries
is (ecap_pss + 1) in power of 2.
Thus the right size of pasid (state) table in power of 2 should be
ecap_pss(iommu->ecap) plus "1+3=4" other than 7.
---
drivers/iommu/intel-svm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 8ebb353..cfa75c2 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -39,7 +39,7 @@ int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
struct page *pages;
int order;
- order = ecap_pss(iommu->ecap) + 7 - PAGE_SHIFT;
+ order = ecap_pss(iommu->ecap) + 4 - PAGE_SHIFT;
if (order < 0)
order = 0;
The patch seems to be correct, but I'll let David comment on it first.
Joerg
David Woodhouse
2016-10-12 12:17:54 UTC
Permalink
Post by Joerg Roedel
[Cc'ing David]
Post by Xunlei Pang
According to the vt-d spec, the size of pasid (state) entry is 8B
which equals 3 in power of 2, the number of pasid (state) entries
is (ecap_pss + 1) in power of 2.
Thus the right size of pasid (state) table in power of 2 should be
ecap_pss(iommu->ecap) plus "1+3=4" other than 7.
---
drivers/iommu/intel-svm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 8ebb353..cfa75c2 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -39,7 +39,7 @@ int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
struct page *pages;
int order;
- order = ecap_pss(iommu->ecap) + 7 - PAGE_SHIFT;
+ order = ecap_pss(iommu->ecap) + 4 - PAGE_SHIFT;
if (order < 0)
order = 0;
The patch seems to be correct, but I'll let David comment on it first.
Yes, that looks correct. I think we may also need to limit it, because
full 20-bit PASID support means we'll attempt an order 11 allocation.
But that's certainly correct for now

Acked-by: David Woodhouse <***@infradead.org>
--
dwmw2
David Woodhouse
2016-10-30 12:18:22 UTC
Permalink
Post by David Woodhouse
Yes, that looks correct. I think we may also need to limit it, because
full 20-bit PASID support means we'll attempt an order 11 allocation.
But that's certainly correct for now
Actually, not quite correct. You fixed the allocation but not the free.
And Mika had reported that even the *correct* allocation was failing
because it was too large. So I've done it differently (untested)...

-----
Subject: [PATCH] iommu/vt-d: Fix PASID table allocation

Somehow I ended up with an off-by-three error in calculating the size of
the PASID and PASID State tables, which triggers allocations failures as
those tables unfortunately have to be physically contiguous.

In fact, even the *correct* maximum size of 8MiB is problematic and is
wont to lead to allocation failures. Since I have extracted a promise
that this *will* be fixed in hardware, I'm happy to limit it on the
current hardware to a maximum of 0x20000 PASIDs, which gives us 1MiB
tables — still not ideal, but better than before.

Reported by Mika Kuoppala <***@linux.intel.com> and also by
Xunlei Pang <***@redhat.com> who submitted a simpler patch to fix
only the allocation (and not the free) to the "correct" limit... which
was still problematic.

Signed-off-by: David Woodhouse <***@infradead.org>
---
 drivers/iommu/intel-svm.c   | 28 +++++++++++++++++-----------
 include/linux/intel-iommu.h |  1 +
 2 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 8ebb353..cb72e00 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -39,10 +39,18 @@ int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
  struct page *pages;
  int order;
 
- order = ecap_pss(iommu->ecap) + 7 - PAGE_SHIFT;
- if (order < 0)
- order = 0;
-
+ /* Start at 2 because it's defined as 1^(1+PSS) */
+ iommu->pasid_max = 2 << ecap_pss(iommu->ecap);
+
+ /* Eventually I'm promised we will get a multi-level PASID table
+  * and it won't have to be physically contiguous. Until then,
+  * limit the size because 8MiB contiguous allocations can be hard
+  * to come by. The limit of 0x20000, which is 1MiB for each of
+  * the PASID and PASID-state tables, is somewhat arbitrary. */
+ if (iommu->pasid_max > 0x20000)
+ iommu->pasid_max = 0x20000;
+
+ order = get_order(sizeof(struct pasid_entry) * iommu->pasid_max);
  pages = alloc_pages(GFP_KERNEL | __GFP_ZERO, order);
  if (!pages) {
  pr_warn("IOMMU: %s: Failed to allocate PASID table\n",
@@ -53,6 +61,8 @@ int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
  pr_info("%s: Allocated order %d PASID table.\n", iommu->name, order);
 
  if (ecap_dis(iommu->ecap)) {
+ /* Just making it explicit... */
+ BUILD_BUG_ON(sizeof(struct pasid_entry) != sizeof(struct pasid_state_entry));
  pages = alloc_pages(GFP_KERNEL | __GFP_ZERO, order);
  if (pages)
  iommu->pasid_state_table = page_address(pages);
@@ -68,11 +78,7 @@ int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
 
 int intel_svm_free_pasid_tables(struct intel_iommu *iommu)
 {
- int order;
-
- order = ecap_pss(iommu->ecap) + 7 - PAGE_SHIFT;
- if (order < 0)
- order = 0;
+ int order = get_order(sizeof(struct pasid_entry) * iommu->pasid_max);
 
  if (iommu->pasid_table) {
  free_pages((unsigned long)iommu->pasid_table, order);
@@ -371,8 +377,8 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
  }
  svm->iommu = iommu;
 
- if (pasid_max > 2 << ecap_pss(iommu->ecap))
- pasid_max = 2 << ecap_pss(iommu->ecap);
+ if (pasid_max > iommu->pasid_max)
+ pasid_max = iommu->pasid_max;
 
  /* Do not use PASID 0 in caching mode (virtualised IOMMU) */
  ret = idr_alloc(&iommu->pasid_idr, svm,
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 2d9b6500..d49e26c 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -429,6 +429,7 @@ struct intel_iommu {
  struct page_req_dsc *prq;
  unsigned char prq_name[16];    /* Name for PRQ interrupt */
  struct idr pasid_idr;
+ u32 pasid_max;
 #endif
  struct q_inval  *qi;            /* Queued invalidation info */
  u32 *iommu_state; /* Store iommu states between suspend and resume.*/
-- 
2.5.5
--
dwmw2
Xunlei Pang
2016-10-31 11:30:32 UTC
Permalink
Post by David Woodhouse
Post by David Woodhouse
Yes, that looks correct. I think we may also need to limit it, because
full 20-bit PASID support means we'll attempt an order 11 allocation.
But that's certainly correct for now
Actually, not quite correct. You fixed the allocation but not the free.
And Mika had reported that even the *correct* allocation was failing
because it was too large. So I've done it differently (untested)...
Yes, your fix looks correct.

Thanks,
Xunlei
Post by David Woodhouse
-----
Subject: [PATCH] iommu/vt-d: Fix PASID table allocation
Somehow I ended up with an off-by-three error in calculating the size of
the PASID and PASID State tables, which triggers allocations failures as
those tables unfortunately have to be physically contiguous.
In fact, even the *correct* maximum size of 8MiB is problematic and is
wont to lead to allocation failures. Since I have extracted a promise
that this *will* be fixed in hardware, I'm happy to limit it on the
current hardware to a maximum of 0x20000 PASIDs, which gives us 1MiB
tables — still not ideal, but better than before.
only the allocation (and not the free) to the "correct" limit... which
was still problematic.
---
drivers/iommu/intel-svm.c | 28 +++++++++++++++++-----------
include/linux/intel-iommu.h | 1 +
2 files changed, 18 insertions(+), 11 deletions(-)
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index 8ebb353..cb72e00 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -39,10 +39,18 @@ int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
struct page *pages;
int order;
- order = ecap_pss(iommu->ecap) + 7 - PAGE_SHIFT;
- if (order < 0)
- order = 0;
-
+ /* Start at 2 because it's defined as 1^(1+PSS) */
+ iommu->pasid_max = 2 << ecap_pss(iommu->ecap);
+
+ /* Eventually I'm promised we will get a multi-level PASID table
+ * and it won't have to be physically contiguous. Until then,
+ * limit the size because 8MiB contiguous allocations can be hard
+ * to come by. The limit of 0x20000, which is 1MiB for each of
+ * the PASID and PASID-state tables, is somewhat arbitrary. */
+ if (iommu->pasid_max > 0x20000)
+ iommu->pasid_max = 0x20000;
+
+ order = get_order(sizeof(struct pasid_entry) * iommu->pasid_max);
pages = alloc_pages(GFP_KERNEL | __GFP_ZERO, order);
if (!pages) {
pr_warn("IOMMU: %s: Failed to allocate PASID table\n",
@@ -53,6 +61,8 @@ int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
pr_info("%s: Allocated order %d PASID table.\n", iommu->name, order);
if (ecap_dis(iommu->ecap)) {
+ /* Just making it explicit... */
+ BUILD_BUG_ON(sizeof(struct pasid_entry) != sizeof(struct pasid_state_entry));
pages = alloc_pages(GFP_KERNEL | __GFP_ZERO, order);
if (pages)
iommu->pasid_state_table = page_address(pages);
@@ -68,11 +78,7 @@ int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
int intel_svm_free_pasid_tables(struct intel_iommu *iommu)
{
- int order;
-
- order = ecap_pss(iommu->ecap) + 7 - PAGE_SHIFT;
- if (order < 0)
- order = 0;
+ int order = get_order(sizeof(struct pasid_entry) * iommu->pasid_max);
if (iommu->pasid_table) {
free_pages((unsigned long)iommu->pasid_table, order);
@@ -371,8 +377,8 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
}
svm->iommu = iommu;
- if (pasid_max > 2 << ecap_pss(iommu->ecap))
- pasid_max = 2 << ecap_pss(iommu->ecap);
+ if (pasid_max > iommu->pasid_max)
+ pasid_max = iommu->pasid_max;
/* Do not use PASID 0 in caching mode (virtualised IOMMU) */
ret = idr_alloc(&iommu->pasid_idr, svm,
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 2d9b6500..d49e26c 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -429,6 +429,7 @@ struct intel_iommu {
struct page_req_dsc *prq;
unsigned char prq_name[16]; /* Name for PRQ interrupt */
struct idr pasid_idr;
+ u32 pasid_max;
#endif
struct q_inval *qi; /* Queued invalidation info */
u32 *iommu_state; /* Store iommu states between suspend and resume.*/
--
2.5.5
Joerg Roedel
2016-11-10 10:45:44 UTC
Permalink
Hi David,
Post by David Woodhouse
+ /* Start at 2 because it's defined as 1^(1+PSS) */
This probably means 2^(1+PSS), right?
Post by David Woodhouse
+ iommu->pasid_max = 2 << ecap_pss(iommu->ecap);
Otherwise the patch looks good. Do you want to send it upstream yourself
or should I pick it up?



Joerg
David Woodhouse
2016-11-10 11:02:52 UTC
Permalink
Post by Joerg Roedel
Hi David,
+     /* Start at 2 because it's defined as 1^(1+PSS) */
This probably means 2^(1+PSS), right?
Or 1 << (1+PSS). Yeah.
Post by Joerg Roedel
+     iommu->pasid_max = 2 << ecap_pss(iommu->ecap);
Otherwise the patch looks good. Do you want to send it upstream yourself
or should I pick it up?
I'll let it sit in -next for a day or two more once I've fixed the
above, then ask Linus to pull it. Thanks.
--
dwmw2
Loading...