Discussion:
SMR masking and PCI
Stuart Yoder
2016-10-27 17:10:00 UTC
Permalink
Hi Robin,

A question about how the SMR masking defined in the arm,smmu binding
relates to the PCI iommu-map.

The #iommu-cells property defines the number of cells an "IOMMU specifier"
takes and 2 is specified to be:

SMMUs with stream matching support and complex masters
may use a value of 2, where the second cell represents
an SMR mask to combine with the ID in the first cell.

An iommu-map entry is defined as:

(rid-base,iommu,iommu-base,length)

What seems to be currently missing in the iommu-map support is
the possibility the case where #iommu-cells=<2>.

In this case iommu-base which is an IOMMU specifier should
occupy 2 cells. For example on an ls2085a we would want:

iommu-map = <0x0 0x6 0x7 0x3ff 0x1
0x100 0x6 0x8 0x3ff 0x1>;

...to mask our stream IDs to 10 bits.

This should work in theory and comply with the bindings, no?

of_pci_map_rid() seems to have a hardcoded assumption that
each field in the map is 4 bytes.

(Also, I guess that msi-map is not affected by this since it
is not related to the IOMMU...but we do have common code
handling both maps.)

Thanks,
Stuart
Robin Murphy
2016-10-28 16:16:37 UTC
Permalink
Hi Stuart,
Post by Stuart Yoder
Hi Robin,
A question about how the SMR masking defined in the arm,smmu binding
relates to the PCI iommu-map.
The #iommu-cells property defines the number of cells an "IOMMU specifier"
SMMUs with stream matching support and complex masters
may use a value of 2, where the second cell represents
an SMR mask to combine with the ID in the first cell.
(rid-base,iommu,iommu-base,length)
What seems to be currently missing in the iommu-map support is
the possibility the case where #iommu-cells=<2>.
Indeed. The bindings have so far rather implicitly assumed the case of
#{msi,iommu}-cells = 1, and the code has followed suit.
Post by Stuart Yoder
In this case iommu-base which is an IOMMU specifier should
iommu-map = <0x0 0x6 0x7 0x3ff 0x1
0x100 0x6 0x8 0x3ff 0x1>;
...to mask our stream IDs to 10 bits.
This should work in theory and comply with the bindings, no?
In theory, but now consider:

iommu-map = <0x0 0x6 0x7 0x3ff 0x2>;

faced with ID 1. The input base is 0, the output base is the 2-cell
value 0x7000003ff, so the final ID value should be 0x700000400, right?
Post by Stuart Yoder
of_pci_map_rid() seems to have a hardcoded assumption that
each field in the map is 4 bytes.
It does. I guess we should explicitly check that #{msi,iommu}-cells = 1
on the target node, and maybe clarify in the binding that that cell
should represent a plain linear ID value (although that's pretty obvious
from context IMO).
Post by Stuart Yoder
(Also, I guess that msi-map is not affected by this since it
is not related to the IOMMU...but we do have common code
handling both maps.)
I'd say it's definitely affected, since #msi-cells can equally be more
than 1, and encodes an equally opaque value.

It seems pretty reasonable to me that we could extend the binding to
accommodate #cells > 1 iff length == 1. Mark?

That said, is there a concrete need for this, i.e. do you actually have
one device with a single requester ID, which maps to multiple output IDs
(which differ only in the upper bits) in a non-predictable manner?

Robin.
Post by Stuart Yoder
Thanks,
Stuart
Stuart Yoder
2016-10-28 17:12:10 UTC
Permalink
-----Original Message-----
Sent: Friday, October 28, 2016 11:17 AM
Subject: Re: SMR masking and PCI
Hi Stuart,
Post by Stuart Yoder
Hi Robin,
A question about how the SMR masking defined in the arm,smmu binding
relates to the PCI iommu-map.
The #iommu-cells property defines the number of cells an "IOMMU specifier"
SMMUs with stream matching support and complex masters
may use a value of 2, where the second cell represents
an SMR mask to combine with the ID in the first cell.
(rid-base,iommu,iommu-base,length)
What seems to be currently missing in the iommu-map support is
the possibility the case where #iommu-cells=<2>.
Indeed. The bindings have so far rather implicitly assumed the case of
#{msi,iommu}-cells = 1, and the code has followed suit.
Post by Stuart Yoder
In this case iommu-base which is an IOMMU specifier should
iommu-map = <0x0 0x6 0x7 0x3ff 0x1
0x100 0x6 0x8 0x3ff 0x1>;
...to mask our stream IDs to 10 bits.
This should work in theory and comply with the bindings, no?
iommu-map = <0x0 0x6 0x7 0x3ff 0x2>;
faced with ID 1. The input base is 0, the output base is the 2-cell
value 0x7000003ff, so the final ID value should be 0x700000400, right?
No. The second cell as per the SMMU binding is the SMR mask...applied
by the SMMU before matching stream IDs.

In our case we want to mask off the upper TBU ID bits that the SMMU tags
onto the stream ID in our RID->SID LUT table.

RID = 0
SID in LUT and seen by SMMU = 7
SMMU-500 TBU appends bits, making SID something like: 0xC07
SMR mask of 0x3ff should be applied making the SID: 0x7
Post by Stuart Yoder
of_pci_map_rid() seems to have a hardcoded assumption that
each field in the map is 4 bytes.
It does. I guess we should explicitly check that #{msi,iommu}-cells = 1
on the target node, and maybe clarify in the binding that that cell
should represent a plain linear ID value (although that's pretty obvious
from context IMO).
Post by Stuart Yoder
(Also, I guess that msi-map is not affected by this since it
is not related to the IOMMU...but we do have common code
handling both maps.)
I'd say it's definitely affected, since #msi-cells can equally be more
than 1, and encodes an equally opaque value.
It seems pretty reasonable to me that we could extend the binding to
accommodate #cells > 1 iff length == 1. Mark?
I'm not following why the length matters.
That said, is there a concrete need for this, i.e. do you actually have
one device with a single requester ID, which maps to multiple output IDs
(which differ only in the upper bits) in a non-predictable manner?
Devices behind an fsl-mc bus instance share the same stream ID (and GIC
ITS device ID) but may be behind different TBUs.

(see drivers/staging/fsl-mc/README.txt for overview info on fsl-mc)

So we could have a bus instance with all devices having a stream ID
of 0x5. But the TBU ID appending causes the TCU to see incoming
stream IDs from different devices with values of 0x105 and 0x205.
We need to get those upper bits masked off.

I don't see the above being an issue for PCI and platform devices. But
we don't want to expose TBU topography and IDs. Those are treated as
a microarchitecture detail in our SoC and are not even documented.

The SMMU programming model does not expose the TCU/TBU split.
Just treating the stream ID namespace as a clean linear space is
much easier to understand without having to grasp the existence
of TBUs at all. Masking SMRs keeps things clean.

As far as msi-map goes, I don't think there is an issue there. I
don't think the TBU appended bits are propagated to the GIC ITS
(as the device ID) so I don't think any masking is needed. Perhaps
you or Marc know.

It seems like perhaps what we need is a new argument passed to
of_pci_map_rid() that tells the function how many cells the
base iommu/msi identifier is. The caller supplies it.

Thanks,
Stuart
Mark Rutland
2016-10-28 17:43:09 UTC
Permalink
Hi,
Post by Robin Murphy
Post by Stuart Yoder
A question about how the SMR masking defined in the arm,smmu binding
relates to the PCI iommu-map.
The #iommu-cells property defines the number of cells an "IOMMU specifier"
SMMUs with stream matching support and complex masters
may use a value of 2, where the second cell represents
an SMR mask to combine with the ID in the first cell.
(rid-base,iommu,iommu-base,length)
What seems to be currently missing in the iommu-map support is
the possibility the case where #iommu-cells=<2>.
Indeed. The bindings have so far rather implicitly assumed the case of
#{msi,iommu}-cells = 1, and the code has followed suit.
The intention was that neither the iommu or msi bindings had such a
requirement, but evidently I did not specify the intended behaviour
thoroughly enough.

I had intended that the offset was added to the final cell of the
iommu-specifier (i.e. that the iommu-specifier was treated as a large
number).

You can handle this case by adding additional entries in the map table,
with a single entry length.
Post by Robin Murphy
Post by Stuart Yoder
In this case iommu-base which is an IOMMU specifier should
iommu-map = <0x0 0x6 0x7 0x3ff 0x1
0x100 0x6 0x8 0x3ff 0x1>;
...to mask our stream IDs to 10 bits.
This should work in theory and comply with the bindings, no?
iommu-map = <0x0 0x6 0x7 0x3ff 0x2>;
faced with ID 1. The input base is 0, the output base is the 2-cell
value 0x7000003ff, so the final ID value should be 0x700000400, right?
That was the intended behaviour, yes.
Post by Robin Murphy
Post by Stuart Yoder
(Also, I guess that msi-map is not affected by this since it
is not related to the IOMMU...but we do have common code
handling both maps.)
I'd say it's definitely affected, since #msi-cells can equally be more
than 1, and encodes an equally opaque value.
Yes.
Post by Robin Murphy
It seems pretty reasonable to me that we could extend the binding to
accommodate #cells > 1 iff length == 1. Mark?
I will try to come up with the wording to make the above explicit, for
both bindings.

Thanks,
Mark.
Stuart Yoder
2016-10-28 18:38:29 UTC
Permalink
-----Original Message-----
From: Stuart Yoder
Sent: Friday, October 28, 2016 12:12 PM
Subject: RE: SMR masking and PCI
-----Original Message-----
Sent: Friday, October 28, 2016 11:17 AM
Subject: Re: SMR masking and PCI
Hi Stuart,
Post by Stuart Yoder
Hi Robin,
A question about how the SMR masking defined in the arm,smmu binding
relates to the PCI iommu-map.
The #iommu-cells property defines the number of cells an "IOMMU specifier"
SMMUs with stream matching support and complex masters
may use a value of 2, where the second cell represents
an SMR mask to combine with the ID in the first cell.
(rid-base,iommu,iommu-base,length)
What seems to be currently missing in the iommu-map support is
the possibility the case where #iommu-cells=<2>.
Indeed. The bindings have so far rather implicitly assumed the case of
#{msi,iommu}-cells = 1, and the code has followed suit.
Post by Stuart Yoder
In this case iommu-base which is an IOMMU specifier should
iommu-map = <0x0 0x6 0x7 0x3ff 0x1
0x100 0x6 0x8 0x3ff 0x1>;
...to mask our stream IDs to 10 bits.
This should work in theory and comply with the bindings, no?
iommu-map = <0x0 0x6 0x7 0x3ff 0x2>;
faced with ID 1. The input base is 0, the output base is the 2-cell
value 0x7000003ff, so the final ID value should be 0x700000400, right?
No. The second cell as per the SMMU binding is the SMR mask...applied
by the SMMU before matching stream IDs.
I think I now understand what you mean. I missed that you envisioned the
ID and mask being returned as a single unit and concatenated together...and
are split apart later by the SMMU driver.
In our case we want to mask off the upper TBU ID bits that the SMMU tags
onto the stream ID in our RID->SID LUT table.
RID = 0
SID in LUT and seen by SMMU = 7
SMMU-500 TBU appends bits, making SID something like: 0xC07
SMR mask of 0x3ff should be applied making the SID: 0x7
Post by Stuart Yoder
of_pci_map_rid() seems to have a hardcoded assumption that
each field in the map is 4 bytes.
It does. I guess we should explicitly check that #{msi,iommu}-cells = 1
on the target node, and maybe clarify in the binding that that cell
should represent a plain linear ID value (although that's pretty obvious
from context IMO).
Post by Stuart Yoder
(Also, I guess that msi-map is not affected by this since it
is not related to the IOMMU...but we do have common code
handling both maps.)
I'd say it's definitely affected, since #msi-cells can equally be more
than 1, and encodes an equally opaque value.
It seems pretty reasonable to me that we could extend the binding to
accommodate #cells > 1 iff length == 1. Mark?
I'm not following why the length matters.
Never mind the comment...think I follow now. Supporting #cells > 1 if
length == 1 sounds good.

Stuart
Diana Madalina Craciun
2016-10-31 15:57:46 UTC
Permalink
Hi Robin,
Post by Robin Murphy
Hi Stuart,
Post by Stuart Yoder
Hi Robin,
A question about how the SMR masking defined in the arm,smmu binding
relates to the PCI iommu-map.
The #iommu-cells property defines the number of cells an "IOMMU specifier"
SMMUs with stream matching support and complex masters
may use a value of 2, where the second cell represents
an SMR mask to combine with the ID in the first cell.
(rid-base,iommu,iommu-base,length)
What seems to be currently missing in the iommu-map support is
the possibility the case where #iommu-cells=<2>.
Indeed. The bindings have so far rather implicitly assumed the case of
#{msi,iommu}-cells = 1, and the code has followed suit.
Post by Stuart Yoder
In this case iommu-base which is an IOMMU specifier should
iommu-map = <0x0 0x6 0x7 0x3ff 0x1
0x100 0x6 0x8 0x3ff 0x1>;
...to mask our stream IDs to 10 bits.
This should work in theory and comply with the bindings, no?
iommu-map = <0x0 0x6 0x7 0x3ff 0x2>;
faced with ID 1. The input base is 0, the output base is the 2-cell
value 0x7000003ff, so the final ID value should be 0x700000400, right?
Post by Stuart Yoder
of_pci_map_rid() seems to have a hardcoded assumption that
each field in the map is 4 bytes.
It does. I guess we should explicitly check that #{msi,iommu}-cells = 1
on the target node, and maybe clarify in the binding that that cell
should represent a plain linear ID value (although that's pretty obvious
from context IMO).
Post by Stuart Yoder
(Also, I guess that msi-map is not affected by this since it
is not related to the IOMMU...but we do have common code
handling both maps.)
I'd say it's definitely affected, since #msi-cells can equally be more
than 1, and encodes an equally opaque value.
It seems pretty reasonable to me that we could extend the binding to
accommodate #cells > 1 iff length == 1. Mark?
That said, is there a concrete need for this, i.e. do you actually have
one device with a single requester ID, which maps to multiple output IDs
(which differ only in the upper bits) in a non-predictable manner?
Actually in the example presented by Stuart, the SMR mask should be
0x7C00 (as 0 means that the bit is relevant for matching). So, we have
the stream ID 7, but the SMMU 500 is appending the TBU bits which makes
the stream ID look like 0x1407 (TBU 5). In our platform the relationship
device-TBU is not exposed and documented. The SMMU-500 ref manual
describes this case:

"If the Stream ID presented to each TBU is already unique, and the TBU
ID addition is not required, then you must ensure that the TBU ID field
is masked in the SMR."

This is the reason that we need the SMR mask, to mask the TBU bits in
the SMR.

Thank you,

Diana
Robin Murphy
2016-10-31 18:36:43 UTC
Permalink
Post by Stuart Yoder
Hi Robin,
Post by Robin Murphy
Hi Stuart,
Post by Stuart Yoder
Hi Robin,
A question about how the SMR masking defined in the arm,smmu binding
relates to the PCI iommu-map.
The #iommu-cells property defines the number of cells an "IOMMU specifier"
SMMUs with stream matching support and complex masters
may use a value of 2, where the second cell represents
an SMR mask to combine with the ID in the first cell.
(rid-base,iommu,iommu-base,length)
What seems to be currently missing in the iommu-map support is
the possibility the case where #iommu-cells=<2>.
Indeed. The bindings have so far rather implicitly assumed the case of
#{msi,iommu}-cells = 1, and the code has followed suit.
Post by Stuart Yoder
In this case iommu-base which is an IOMMU specifier should
iommu-map = <0x0 0x6 0x7 0x3ff 0x1
0x100 0x6 0x8 0x3ff 0x1>;
...to mask our stream IDs to 10 bits.
This should work in theory and comply with the bindings, no?
iommu-map = <0x0 0x6 0x7 0x3ff 0x2>;
faced with ID 1. The input base is 0, the output base is the 2-cell
value 0x7000003ff, so the final ID value should be 0x700000400, right?
Post by Stuart Yoder
of_pci_map_rid() seems to have a hardcoded assumption that
each field in the map is 4 bytes.
It does. I guess we should explicitly check that #{msi,iommu}-cells = 1
on the target node, and maybe clarify in the binding that that cell
should represent a plain linear ID value (although that's pretty obvious
from context IMO).
Post by Stuart Yoder
(Also, I guess that msi-map is not affected by this since it
is not related to the IOMMU...but we do have common code
handling both maps.)
I'd say it's definitely affected, since #msi-cells can equally be more
than 1, and encodes an equally opaque value.
It seems pretty reasonable to me that we could extend the binding to
accommodate #cells > 1 iff length == 1. Mark?
That said, is there a concrete need for this, i.e. do you actually have
one device with a single requester ID, which maps to multiple output IDs
(which differ only in the upper bits) in a non-predictable manner?
Actually in the example presented by Stuart, the SMR mask should be
0x7C00 (as 0 means that the bit is relevant for matching). So, we have
the stream ID 7, but the SMMU 500 is appending the TBU bits which makes
the stream ID look like 0x1407 (TBU 5). In our platform the relationship
device-TBU is not exposed and documented.
To be fair, that's only the fault of the folks who neglected to document
it (and if this really was the anticipated use-case, possibly also the
integration folks for not simply using the 15-bit Stream ID
configuration and tying any extra bits off). The TBU IDs are still an
undeniable property of the hardware, and not exactly difficult to
discover either. For instance, an LS2085a has been running with the
following map for PCIe for quite some time:

/* Squash 8:5:3 BDF down to 2:2:3 */
iommu-map-mask = <0x031f>;
iommu-map = <0x000 &smmu 0x1400 0x20>,
<0x100 &smmu 0x1420 0x20>,
<0x200 &smmu 0x1440 0x20>,
<0x300 &smmu 0x1460 0x20>;

(with the obligatory hacks to program the PEX LUT entries to match, and
the SATA ICIDs not to alias as they apparently go through the same TBU).
Post by Stuart Yoder
The SMMU-500 ref manual
"If the Stream ID presented to each TBU is already unique, and the TBU
ID addition is not required, then you must ensure that the TBU ID field
is masked in the SMR."
This is the reason that we need the SMR mask, to mask the TBU bits in
the SMR.
The PCIe example might be a reason to *want* masking, in order to work
around inadequate documentation, but it certainly isn't a *need*.

Fortunately, Stuart's description of the DPAA complex mastering through
multiple TBUs such that a single device's ICID may map to multiple
stream IDs *is* a compelling justification, because iommu-map isn't
designed for one-to-many mappings. You just need to be very careful the
ICIDs really are completely unique system-wide once you start masking,
or the aliasing will result in weird, and possibly impractical, group
assignment.

Anyway, from the SMMU driver perspective SMR masking does work nicely
now, so I'm happy to review patches to of_pci_map_rid() ;)

Robin.
Post by Stuart Yoder
Thank you,
Diana
Loading...