Discussion:
[RFT] iommu/amd: use subsys_initcall() on amdv2 iommu
Oded Gabbay
2016-03-16 07:02:43 UTC
Permalink
We need to ensure amd iommu v2 initializes before
driver uses such as drivers/gpu/drm/amd/amdkfd/kfd_module.c,
to do this make its init routine a subsys_initcall() which
ensures its load init is called first than modules when
built-in.
---
Can someone test if this patch enables both CONFIG_AMD_IOMMU_V2 and
CONFIG_HSA_AMD to be =y (built-in) without any conflicts ?
drivers/iommu/amd_iommu_v2.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/amd_iommu_v2.c b/drivers/iommu/amd_iommu_v2.c
index 56999d2fac07..60df645b9927 100644
--- a/drivers/iommu/amd_iommu_v2.c
+++ b/drivers/iommu/amd_iommu_v2.c
@@ -1004,5 +1004,5 @@ static void __exit amd_iommu_v2_exit(void)
destroy_workqueue(iommu_wq);
}
-module_init(amd_iommu_v2_init);
+subsys_initcall(amd_iommu_v2_init);
module_exit(amd_iommu_v2_exit);
--
2.7.2
fwiw, we currently have this covered by the ugly hack of putting iommu
subsystem in front of gpu subsystem in the main drivers makefile (See
1bacc894c227fad8a727eb99728df708eba57654)

Oded
Joerg Roedel
2016-03-16 10:14:57 UTC
Permalink
Post by Oded Gabbay
fwiw, we currently have this covered by the ugly hack of putting iommu
subsystem in front of gpu subsystem in the main drivers makefile (See
1bacc894c227fad8a727eb99728df708eba57654)
Sure, but the above is a less ugly hack to solve the problem. So the
question is, would that work too?


Joerg
Oded Gabbay
2016-03-16 10:16:57 UTC
Permalink
Post by Joerg Roedel
Post by Oded Gabbay
fwiw, we currently have this covered by the ugly hack of putting iommu
subsystem in front of gpu subsystem in the main drivers makefile (See
1bacc894c227fad8a727eb99728df708eba57654)
Sure, but the above is a less ugly hack to solve the problem. So the
question is, would that work too?
Joerg
In theory it should, but I would prefer that it would be tested on
actual hardware.

Oded
Joerg Roedel
2016-03-16 16:39:00 UTC
Permalink
Post by Oded Gabbay
In theory it should, but I would prefer that it would be tested on
actual hardware.
Hence, RFT. Anyone have hardware to test ? And would the other hack
mentioned need to be reverted first ?
Yes, the other hack needs to be reverted first to see if this one helps.


Joerg
Joerg Roedel
2016-03-16 17:17:19 UTC
Permalink
I'm afraid I am not sure where that hack is, can someone construct a patch to
revert that so this is a proper RFT series ?
Oded mentioned 1bacc894c227fad8a727eb99728df708eba57654, which reverts
fine here on v4.5.


Joerg
Luis R. Rodriguez
2016-03-29 17:41:53 UTC
Permalink
We need to ensure amd iommu v2 initializes before
driver uses such as drivers/gpu/drm/amd/amdkfd/kfd_module.c,
to do this make its init routine a subsys_initcall() which
ensures its load init is called first than modules when
built-in.

This reverts the old work around implemented through commit
1bacc894c227fad8a7 ("drivers: Move iommu/ before gpu/ in Makefile"),
instead of making the dependency implicit by linker order this
makes the ordering requirement explicit through proper kernel
APIs.

Cc: Oded Gabbay <***@amd.com>
Cc: Christian König <***@amd.com>
Signed-off-by: Luis R. Rodriguez <***@kernel.org>
---

This goes along with the revert included. Can someone please test?

drivers/Makefile | 6 ++----
drivers/iommu/amd_iommu_v2.c | 2 +-
2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/Makefile b/drivers/Makefile
index 8f5d076baeb0..cc3cfbddc376 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -50,10 +50,7 @@ obj-$(CONFIG_RESET_CONTROLLER) += reset/
obj-y += tty/
obj-y += char/

-# iommu/ comes before gpu as gpu are using iommu controllers
-obj-$(CONFIG_IOMMU_SUPPORT) += iommu/
-
-# gpu/ comes after char for AGP vs DRM startup and after iommu
+# gpu/ comes after char for AGP vs DRM startup
obj-y += gpu/

obj-$(CONFIG_CONNECTOR) += connector/
@@ -146,6 +143,7 @@ obj-y += clk/

obj-$(CONFIG_MAILBOX) += mailbox/
obj-$(CONFIG_HWSPINLOCK) += hwspinlock/
+obj-$(CONFIG_IOMMU_SUPPORT) += iommu/
obj-$(CONFIG_REMOTEPROC) += remoteproc/
obj-$(CONFIG_RPMSG) += rpmsg/

diff --git a/drivers/iommu/amd_iommu_v2.c b/drivers/iommu/amd_iommu_v2.c
index 56999d2fac07..60df645b9927 100644
--- a/drivers/iommu/amd_iommu_v2.c
+++ b/drivers/iommu/amd_iommu_v2.c
@@ -1004,5 +1004,5 @@ static void __exit amd_iommu_v2_exit(void)
destroy_workqueue(iommu_wq);
}

-module_init(amd_iommu_v2_init);
+subsys_initcall(amd_iommu_v2_init);
module_exit(amd_iommu_v2_exit);
--
2.7.2
Luis R. Rodriguez
2016-04-09 00:25:49 UTC
Permalink
We need to ensure amd iommu v2 initializes before
driver uses such as drivers/gpu/drm/amd/amdkfd/kfd_module.c,
to do this make its init routine a subsys_initcall() which
ensures its load init is called first than modules when
built-in.
This reverts the old work around implemented through commit
1bacc894c227fad8a7 ("drivers: Move iommu/ before gpu/ in Makefile"),
instead of making the dependency implicit by linker order this
makes the ordering requirement explicit through proper kernel
APIs.
*poke*

Luis
Christian König
2016-04-11 13:28:52 UTC
Permalink
Post by Luis R. Rodriguez
We need to ensure amd iommu v2 initializes before
driver uses such as drivers/gpu/drm/amd/amdkfd/kfd_module.c,
to do this make its init routine a subsys_initcall() which
ensures its load init is called first than modules when
built-in.
This reverts the old work around implemented through commit
1bacc894c227fad8a7 ("drivers: Move iommu/ before gpu/ in Makefile"),
instead of making the dependency implicit by linker order this
makes the ordering requirement explicit through proper kernel
APIs.
Sorry for not responding earlier. Just coming back to all the stuff on
my TODO list.

Patch is Acked-by: Christian König <***@amd.com>

Regards,
Christian.
Post by Luis R. Rodriguez
*poke*
Luis
Oded Gabbay
2016-04-11 13:39:43 UTC
Permalink
On Mon, Apr 11, 2016 at 4:28 PM, Christian König
Post by Luis R. Rodriguez
We need to ensure amd iommu v2 initializes before
driver uses such as drivers/gpu/drm/amd/amdkfd/kfd_module.c,
to do this make its init routine a subsys_initcall() which
ensures its load init is called first than modules when
built-in.
This reverts the old work around implemented through commit
1bacc894c227fad8a7 ("drivers: Move iommu/ before gpu/ in Makefile"),
instead of making the dependency implicit by linker order this
makes the ordering requirement explicit through proper kernel
APIs.
Sorry for not responding earlier. Just coming back to all the stuff on my TODO list.
Christian,
Just wanted to be sure if you tested this patch-set or not.

I don't think it should be merged without testing. If you already
tested it than fine. If not, I think I can do it in the next week or
so (just came back from PTO).

Oded
Regards,
Christian.
Post by Luis R. Rodriguez
*poke*
Luis
Christian König
2016-04-11 13:52:43 UTC
Permalink
Post by Oded Gabbay
On Mon, Apr 11, 2016 at 4:28 PM, Christian König
Post by Luis R. Rodriguez
We need to ensure amd iommu v2 initializes before
driver uses such as drivers/gpu/drm/amd/amdkfd/kfd_module.c,
to do this make its init routine a subsys_initcall() which
ensures its load init is called first than modules when
built-in.
This reverts the old work around implemented through commit
1bacc894c227fad8a7 ("drivers: Move iommu/ before gpu/ in Makefile"),
instead of making the dependency implicit by linker order this
makes the ordering requirement explicit through proper kernel
APIs.
Sorry for not responding earlier. Just coming back to all the stuff on my TODO list.
Christian,
Just wanted to be sure if you tested this patch-set or not.
I did NOT tested it. If AMD IOMMU requires something which will now
initialize after the IOMMU module we will obviously run into trouble again.

I assumed that the creator of the patch did some testing.
Post by Oded Gabbay
I don't think it should be merged without testing. If you already
tested it than fine. If not, I think I can do it in the next week or
so (just came back from PTO).
Yeah, agree totally.

Regards,
Christian.
Post by Oded Gabbay
Oded
Regards,
Christian.
Post by Luis R. Rodriguez
*poke*
Luis
Luis R. Rodriguez
2016-04-12 22:07:15 UTC
Permalink
Post by Christian König
Post by Oded Gabbay
On Mon, Apr 11, 2016 at 4:28 PM, Christian König
We need to ensure amd iommu v2 initializes before
driver uses such as drivers/gpu/drm/amd/amdkfd/kfd_module.c,
to do this make its init routine a subsys_initcall() which
ensures its load init is called first than modules when
built-in.
This reverts the old work around implemented through commit
1bacc894c227fad8a7 ("drivers: Move iommu/ before gpu/ in Makefile"),
instead of making the dependency implicit by linker order this
makes the ordering requirement explicit through proper kernel
APIs.
Sorry for not responding earlier. Just coming back to all the stuff on my TODO list.
Christian,
Just wanted to be sure if you tested this patch-set or not.
I did NOT tested it. If AMD IOMMU requires something which will now
initialize after the IOMMU module we will obviously run into trouble again.
I assumed that the creator of the patch did some testing.
Nope, hence [RTF] Request For Testing.
Post by Christian König
Post by Oded Gabbay
I don't think it should be merged without testing. If you already
tested it than fine. If not, I think I can do it in the next week or
so (just came back from PTO).
Yeah, agree totally.
Agreed, please let me know if someone is able to test and confirm
this works. It should work.

Luis
Oded Gabbay
2016-04-18 06:48:11 UTC
Permalink
Post by Luis R. Rodriguez
Post by Christian König
Post by Oded Gabbay
On Mon, Apr 11, 2016 at 4:28 PM, Christian König
We need to ensure amd iommu v2 initializes before
driver uses such as drivers/gpu/drm/amd/amdkfd/kfd_module.c,
to do this make its init routine a subsys_initcall() which
ensures its load init is called first than modules when
built-in.
This reverts the old work around implemented through commit
1bacc894c227fad8a7 ("drivers: Move iommu/ before gpu/ in Makefile"),
instead of making the dependency implicit by linker order this
makes the ordering requirement explicit through proper kernel
APIs.
Sorry for not responding earlier. Just coming back to all the stuff on my TODO list.
Christian,
Just wanted to be sure if you tested this patch-set or not.
I did NOT tested it. If AMD IOMMU requires something which will now
initialize after the IOMMU module we will obviously run into trouble again.
I assumed that the creator of the patch did some testing.
Nope, hence [RTF] Request For Testing.
Post by Christian König
Post by Oded Gabbay
I don't think it should be merged without testing. If you already
tested it than fine. If not, I think I can do it in the next week or
so (just came back from PTO).
Yeah, agree totally.
Agreed, please let me know if someone is able to test and confirm
this works. It should work.
Luis
Hi,
So I finally got to test this patch and it's not working.
The reason is that AMD IOMMUv2 gets initialized *before* AMD IOMMUv1 driver !
So, we get this from dmesg:

[ 0.439612] AMD IOMMUv2 driver by Joerg Roedel <***@suse.de>
[ 0.439614] AMD IOMMUv2 functionality not available on this system

later we get:
[ 1.084749] AMD-Vi: Found IOMMU at 0000:00:00.2 cap 0x40
[ 1.084750] AMD-Vi: Extended features: PPR GT IA PC
[ 1.084754] AMD-Vi: Interrupt remapping enabled
[ 1.085125] AMD-Vi: Lazy IO/TLB flushing enabled


And when I run some HSA samples, at the tear-down process stage we get this:

[ 7937.740306] BUG: unable to handle kernel NULL pointer dereference
at 0000000000000020
[ 7937.740932] IP: [<ffffffff819661c2>] mutex_lock+0x12/0x30
[ 7937.741534] PGD 55a57067 PUD 55a56067 PMD 0
[ 7937.742127] Oops: 0002 [#1] SMP
[ 7937.742709] Modules linked in: xt_CHECKSUM iptable_mangle
ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat
nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack tun bridge
stp llc ebtable_filter ebtables ip6table_filter ip6_tables
snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi
snd_hda_intel snd_hda_codec kvm_amd kvm snd_hda_core snd_hwdep snd_seq
edac_mce_amd edac_core snd_seq_device irqbypass crct10dif_pclmul
crc32_pclmul ppdev pcspkr crc32c_intel snd_pcm fuse
ghash_clmulni_intel sp5100_tco acpi_cpufreq parport_pc i2c_piix4
snd_timer fam15h_power k10temp video tpm_infineon tpm_tis parport
shpchp snd tpm soundcore nfsd auth_rpcgss nfs_acl lockd grace sunrpc
serio_raw uas usb_storage r8169 mii fjes
[ 7937.746867] CPU: 2 PID: 2260 Comm: kfdtest Not tainted 4.6.0-rc3 #5
[ 7937.747600] Hardware name: Gigabyte Technology Co., Ltd. To be
filled by O.E.M./F2A88XM-D3H, BIOS F5 01/09/2014
[ 7937.748363] task: ffff88006944d880 ti: ffff88005b734000 task.ti:
ffff88005b734000
[ 7937.749136] RIP: 0010:[<ffffffff819661c2>] [<ffffffff819661c2>]
mutex_lock+0x12/0x30
[ 7937.749918] RSP: 0018:ffff88005b737cc8 EFLAGS: 00010246
[ 7937.750695] RAX: 0000000000000000 RBX: 0000000000000020 RCX: 0000000000000000
[ 7937.751491] RDX: 0000000080000000 RSI: ffff880000000000 RDI: 0000000000000020
[ 7937.752286] RBP: ffff88005b737cd0 R08: 0000000000000002 R09: ffffffff815321e2
[ 7937.753084] R10: ffffea00016f71c0 R11: 0000000000000246 R12: ffff880035239680
[ 7937.753884] R13: 0000000000000000 R14: 0000000000000001 R15: ffff88005b737d10
[ 7937.754684] FS: 00007fe0bc68a740(0000) GS:ffff88006d500000(0000)
knlGS:0000000000000000
[ 7937.755504] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7937.756328] CR2: 0000000000000020 CR3: 0000000057198000 CR4: 00000000000406e0
[ 7937.757168] Stack:
[ 7937.758004] 0000000000000000 ffff88005b737d80 ffffffff810bae2c
ffff88005b737d48
[ 7937.758872] 0000000000000020 ffff880000000000 0000000000000000
ffff88005b737d38
[ 7937.759737] ffff88005b737d38 ffff88005b737d10 ffff88005b737d10
ffff8800ffffffff
[ 7937.760606] Call Trace:
[ 7937.761474] [<ffffffff810bae2c>] flush_workqueue+0x9c/0x530
[ 7937.762348] [<ffffffff818107a4>] ? amd_iommu_domain_clear_gcr3+0x84/0xa0
[ 7937.763225] [<ffffffff818157d0>] mn_release+0x60/0x70
[ 7937.764107] [<ffffffff81210c64>] __mmu_notifier_release+0x44/0xc0
[ 7937.764998] [<ffffffff811efbca>] exit_mmap+0x15a/0x170
[ 7937.765889] [<ffffffff811e8b33>] ? handle_mm_fault+0x14e3/0x1d50
[ 7937.766784] [<ffffffff814b06d0>] ? n_tty_open+0xd0/0xd0
[ 7937.767677] [<ffffffff81124cac>] ? exit_robust_list+0x5c/0x110
[ 7937.768573] [<ffffffff810a199b>] mmput+0x5b/0x100
[ 7937.769463] [<ffffffff810a8566>] do_exit+0x276/0xb30
[ 7937.770349] [<ffffffff810684a5>] ? __do_page_fault+0x205/0x4d0
[ 7937.771226] [<ffffffff810a8ea7>] do_group_exit+0x47/0xb0
[ 7937.772090] [<ffffffff810a8f24>] SyS_exit_group+0x14/0x20
[ 7937.772941] [<ffffffff819684f2>] entry_SYSCALL_64_fastpath+0x1a/0xa4
[ 7937.773801] Code: 83 f8 01 0f 85 6d ff ff ff eb db e8 f9 e4 73 ff
66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb e8
1e e4 ff ff <f0> ff 0b 79 08 48 89 df e8 c1 fe ff ff 65 48 8b 04 25 c0
d2 00
[ 7937.775671] RIP [<ffffffff819661c2>] mutex_lock+0x12/0x30
[ 7937.776588] RSP <ffff88005b737cc8>
[ 7937.777499] CR2: 0000000000000020
[ 7937.781746] ---[ end trace 46aeb31f738c07ff ]---
[ 7937.781748] Fixing recursive fault but reboot is needed!
Luis R. Rodriguez
2016-04-18 06:55:45 UTC
Permalink
Post by Oded Gabbay
Post by Luis R. Rodriguez
Post by Christian König
On Mon, Apr 11, 2016 at 4:28 PM, Christian König
On Tue, Mar 29, 2016 at 10:41 AM, Luis R. Rodriguez <
We need to ensure amd iommu v2 initializes before
driver uses such as drivers/gpu/drm/amd/amdkfd/kfd_module.c,
to do this make its init routine a subsys_initcall() which
ensures its load init is called first than modules when
built-in.
This reverts the old work around implemented through commit
1bacc894c227fad8a7 ("drivers: Move iommu/ before gpu/ in Makefile"),
instead of making the dependency implicit by linker order this
makes the ordering requirement explicit through proper kernel
APIs.
Sorry for not responding earlier. Just coming back to all the stuff on my TODO list.
Christian,
Just wanted to be sure if you tested this patch-set or not.
I did NOT tested it. If AMD IOMMU requires something which will now
initialize after the IOMMU module we will obviously run into trouble again.
I assumed that the creator of the patch did some testing.
Nope, hence [RTF] Request For Testing.
Post by Christian König
I don't think it should be merged without testing. If you already
tested it than fine. If not, I think I can do it in the next week or
so (just came back from PTO).
Yeah, agree totally.
Agreed, please let me know if someone is able to test and confirm
this works. It should work.
Luis
Hi,
So I finally got to test this patch and it's not working.
The reason is that AMD IOMMUv2 gets initialized *before* AMD IOMMUv1 driver !
Thanks can you try using late_initcall() instead then?

Luis
Oded Gabbay
2016-04-18 07:02:24 UTC
Permalink
Post by Luis R. Rodriguez
Post by Oded Gabbay
Post by Luis R. Rodriguez
Post by Christian König
Post by Oded Gabbay
On Mon, Apr 11, 2016 at 4:28 PM, Christian König
Post by Christian König
On Tue, Mar 29, 2016 at 10:41 AM, Luis R. Rodriguez
We need to ensure amd iommu v2 initializes before
driver uses such as drivers/gpu/drm/amd/amdkfd/kfd_module.c,
to do this make its init routine a subsys_initcall() which
ensures its load init is called first than modules when
built-in.
This reverts the old work around implemented through commit
1bacc894c227fad8a7 ("drivers: Move iommu/ before gpu/ in Makefile"),
instead of making the dependency implicit by linker order this
makes the ordering requirement explicit through proper kernel
APIs.
Sorry for not responding earlier. Just coming back to all the stuff
on my TODO list.
Christian,
Just wanted to be sure if you tested this patch-set or not.
I did NOT tested it. If AMD IOMMU requires something which will now
initialize after the IOMMU module we will obviously run into trouble again.
I assumed that the creator of the patch did some testing.
Nope, hence [RTF] Request For Testing.
Post by Christian König
Post by Oded Gabbay
I don't think it should be merged without testing. If you already
tested it than fine. If not, I think I can do it in the next week or
so (just came back from PTO).
Yeah, agree totally.
Agreed, please let me know if someone is able to test and confirm
this works. It should work.
Luis
Hi,
So I finally got to test this patch and it's not working.
The reason is that AMD IOMMUv2 gets initialized *before* AMD IOMMUv1 driver !
Thanks can you try using late_initcall() instead then?
Luis
That will make it initialize *after* drm subsystem, which will cause
another bug.

Oded
Luis R. Rodriguez
2016-04-18 12:03:50 UTC
Permalink
Post by Oded Gabbay
Post by Luis R. Rodriguez
Post by Oded Gabbay
Post by Luis R. Rodriguez
Post by Christian König
Post by Oded Gabbay
On Mon, Apr 11, 2016 at 4:28 PM, Christian König
Post by Christian König
On Tue, Mar 29, 2016 at 10:41 AM, Luis R. Rodriguez
We need to ensure amd iommu v2 initializes before
driver uses such as drivers/gpu/drm/amd/amdkfd/kfd_module.c,
to do this make its init routine a subsys_initcall() which
ensures its load init is called first than modules when
built-in.
This reverts the old work around implemented through commit
1bacc894c227fad8a7 ("drivers: Move iommu/ before gpu/ in Makefile"),
instead of making the dependency implicit by linker order this
makes the ordering requirement explicit through proper kernel
APIs.
Sorry for not responding earlier. Just coming back to all the stuff
on my TODO list.
Christian,
Just wanted to be sure if you tested this patch-set or not.
I did NOT tested it. If AMD IOMMU requires something which will now
initialize after the IOMMU module we will obviously run into trouble again.
I assumed that the creator of the patch did some testing.
Nope, hence [RTF] Request For Testing.
Post by Christian König
Post by Oded Gabbay
I don't think it should be merged without testing. If you already
tested it than fine. If not, I think I can do it in the next week or
so (just came back from PTO).
Yeah, agree totally.
Agreed, please let me know if someone is able to test and confirm
this works. It should work.
Luis
Hi,
So I finally got to test this patch and it's not working.
The reason is that AMD IOMMUv2 gets initialized *before* AMD IOMMUv1 driver !
Thanks can you try using late_initcall() instead then?
Luis
That will make it initialize *after* drm subsystem, which will cause
another bug.
Hold up, I thought that we needed AMD IOMMUv2 to get initialized
before AMD IOMMUv1 ? That's what the patch did. Can someone clarify
the requirements then?

I'll provide some review of the current state of affairs first, without the
patch. AMD IOMMUv1 uses x86_init.iommu.iommu_init and that has its own init
semantics. Specifically that gets called via pci_iommu_init() which is pegged
on the init order via rootfs_initcall(pci_iommu_init);

Then AMD IOMMUv2 uses module_init() and that when is built-in falls on to
__initcall() which is device_initcall().

The order is:

#define pure_initcall(fn) __define_initcall(fn, 0)

#define core_initcall(fn) __define_initcall(fn, 1)
#define core_initcall_sync(fn) __define_initcall(fn, 1s)
#define postcore_initcall(fn) __define_initcall(fn, 2)
#define postcore_initcall_sync(fn) __define_initcall(fn, 2s)
#define arch_initcall(fn) __define_initcall(fn, 3)
#define arch_initcall_sync(fn) __define_initcall(fn, 3s)
#define subsys_initcall(fn) __define_initcall(fn, 4)
#define subsys_initcall_sync(fn) __define_initcall(fn, 4s)
#define fs_initcall(fn) __define_initcall(fn, 5)
#define fs_initcall_sync(fn) __define_initcall(fn, 5s)
#define rootfs_initcall(fn) __define_initcall(fn, rootfs)
#define device_initcall(fn) __define_initcall(fn, 6)
#define device_initcall_sync(fn) __define_initcall(fn, 6s)
#define late_initcall(fn) __define_initcall(fn, 7)
#define late_initcall_sync(fn) __define_initcall(fn, 7s)

So technically rootfs_initcall() (v1 amd) should be being called
first already, and after that AMD IOMMUv2 gets called next.

You said that with my patch you saw AMD IOMMUv2 kick off first,
that was intentional as I thought that's what you needed. Can
someone please describe the requirements?

Also what does drm use that you say has a conflict already? What
drm code are we talking about exactly ?

Luis
Joerg Roedel
2016-04-25 10:23:51 UTC
Permalink
Post by Luis R. Rodriguez
You said that with my patch you saw AMD IOMMUv2 kick off first,
that was intentional as I thought that's what you needed. Can
someone please describe the requirements?
Also what does drm use that you say has a conflict already? What
drm code are we talking about exactly ?
The required order is:

1. AMD IOMMUv1 (in-kernel only, initialized at boot time after PCI)
2. AMD IOMMUv2 (in-kernel or as module, provides demand paging
and uses v1 interfaces to talk to the IOMMU)
3. AMD-KFD (Implements compute offloading to the GPU and
uses AMD IOMMUv2 functionality, also provides a
symbol for the radeon driver)
4. DRM with (Checks if the symbol provided by AMD-KFD is
Radeon available at init time and does the KFD
initialization from there, because the GPU needs
to be up first)

So AMD IOMMUv2 does not initialize (but it does load to have the symbols
available for drivers that optionally use its functionality) without the
AMD IOMMUv1 driver, as it can't access the IOMMU hardware then.

AMD-KFD does not load without AMD IOMMUv2 being loaded, as it depends on
its symbols. AMD-KFD on the other hand needs to be loaded before the
radeon driver (but this it not enforced by symbols), because otherwise
the radeon driver will not initialize the AMD-KFD driver.

When AMD-KFD is loaded and you load radeon then, you get the KFD
functionality in the kernel. Then you can move on to the fun getting the
userspace running and actually execute anything on the GPU. But thats
another story.

I know what you think, and I agree: It's a big mess :)

Regards,

Joerg
Luis R. Rodriguez
2016-05-27 00:46:42 UTC
Permalink
Post by Joerg Roedel
Post by Luis R. Rodriguez
You said that with my patch you saw AMD IOMMUv2 kick off first,
that was intentional as I thought that's what you needed. Can
someone please describe the requirements?
Also what does drm use that you say has a conflict already? What
drm code are we talking about exactly ?
1. AMD IOMMUv1 (in-kernel only, initialized at boot time after PCI)
2. AMD IOMMUv2 (in-kernel or as module, provides demand paging
and uses v1 interfaces to talk to the IOMMU)
3. AMD-KFD (Implements compute offloading to the GPU and
uses AMD IOMMUv2 functionality, also provides a
symbol for the radeon driver)
4. DRM with (Checks if the symbol provided by AMD-KFD is
Radeon available at init time and does the KFD
initialization from there, because the GPU needs
to be up first)
So AMD IOMMUv2 does not initialize (but it does load to have the symbols
available for drivers that optionally use its functionality) without the
AMD IOMMUv1 driver, as it can't access the IOMMU hardware then.
AMD-KFD does not load without AMD IOMMUv2 being loaded, as it depends on
its symbols. AMD-KFD on the other hand needs to be loaded before the
radeon driver (but this it not enforced by symbols), because otherwise
the radeon driver will not initialize the AMD-KFD driver.
When AMD-KFD is loaded and you load radeon then, you get the KFD
functionality in the kernel. Then you can move on to the fun getting the
userspace running and actually execute anything on the GPU. But thats
another story.
I know what you think, and I agree: It's a big mess :)
Its no concern for me that its a mess, frankly most drivers are. This however
illustrates a semantic gap in the driver core when you have more than 3
dependencies after a rootfs_initcall(), given that the buck stops at
late_initcall(), only 3 levels above. With 4 dependencies you run out
of semantics to specify explicit requirements.

Ties can be disputed through link order though, but this is obviously fragile,
and the dependencies are then left implicit. Nothing vets for correctness,
either in software or through static analysers.

To summarize the specific code in question is (and order required):

AMD IOMMUv1: arch/x86/kernel/pci-dma.c rootfs_initcall(pci_iommu_init);
AMD IOMMUv2: drivers/iommu/amd_iommu_v2.c module_init(amd_iommu_v2_init); (device_initcall())
AMD KFD: drivers/gpu/drm/amd/amdkfd/kfd_module.c module_init(kfd_module_init); (device_initcall())
AMD Radeon: drivers/gpu/drm/radeon/radeon_drv.c module_init(radeon_init); (device_initcall())

Commit 1bacc894c227fad8a7 ("drivers: Move iommu/ before gpu/ in Makefile") resolved
the race between IOMMU and GPU drivers this way, however an alternative is to make
the dependencies in levels explicit by making the amdkfd and radeon drivers both
use late_initcall(), this however is technically still racy specially since you
note that the amdkfd driver is not loaded prior to radeon through symbol
dependencies, if happens to be loaded it then you get KFD functionality,
otherwise you don't.

Making both amdkfd and radeon use late_initcall() would actually work now though
but that's only because the link order also happens to match the dependency
requirements:

drivers/gpu/drm/Makefile
...
obj-$(CONFIG_HSA_AMD) += amd/amdkfd/
obj-$(CONFIG_DRM_RADEON)+= radeon/
...

Since we currently lack proper semantics to define clear dependencies for all
this reverting 1bacc894c227fad8a7 after using late_initcall() on both amdkfd
and radeon would not be useful other than to just test the race fix, given that
such work around would still be present also on drivers/gpu/drm/Makefile to
account for the race between amdkfd and radeon. Reverting 1bacc894c227fad8a7
after using late_initcall() on both amdkfd and radeon would just bump the
race work around another level.

Modifying both amdkfd and radeon to use late_initcall() however seems well
justified for now, and given the current state of affairs with link order
one should be able to then correctly build all these things as built-in
and it should work correctly.

I'm still not satisfied with the issues on semantics here though. A fix
for that, if desired, should be possible using linker-tables, currently
in RFCv2 [0] stage. Linker tables offer practically infinite number (as
long as a valid ELF section name, as the order level is part of the
section name) of order levels. It also would offer the opportunity
to build dependencies outside of the driver core layer, so you can
customize the dependency map per subsystem as you see fit.

So -- for now I'll submit a patch to just use late_initcall(), in the
future we can evaluate linker table uses to make these dependencies
explicit. This is perhaps a good test case to illustrate limitations
of the current driver model.

[0] http://lkml.kernel.org/r/1455889559-9428-1-git-send-email-***@kernel.org

Luis
Luis R. Rodriguez
2016-05-27 01:18:36 UTC
Permalink
To get KFD support in radeon we need the following
initialization to happen in this order, their
respective driver file that has its init routine
listed next to it:

0. AMD IOMMUv1: arch/x86/kernel/pci-dma.c
1. AMD IOMMUv2: drivers/iommu/amd_iommu_v2.c
2. AMD KFD: drivers/gpu/drm/amd/amdkfd/kfd_module.c
3. AMD Radeon: drivers/gpu/drm/radeon/radeon_drv.c

Order is rather implicit, but these drivers can currently
only do so much given the amount of leg room available.
Below are the respective init routines and how they are
initialized:

arch/x86/kernel/pci-dma.c rootfs_initcall(pci_iommu_init);
drivers/iommu/amd_iommu_v2.c module_init(amd_iommu_v2_init);
drivers/gpu/drm/amd/amdkfd/kfd_module.c module_init(kfd_module_init);
drivers/gpu/drm/radeon/radeon_drv.c module_init(radeon_init);

When a driver is built-in module_init() folds to use
device_initcall(), and we have the following possible
orders:

#define pure_initcall(fn) __define_initcall(fn, 0)
#define core_initcall(fn) __define_initcall(fn, 1)
#define postcore_initcall(fn)__define_initcall(fn, 2)
#define arch_initcall(fn) __define_initcall(fn, 3)
#define subsys_initcall(fn) __define_initcall(fn, 4)
#define fs_initcall(fn) __define_initcall(fn, 5)
---------------------------------------------------------
#define rootfs_initcall(fn) __define_initcall(fn, rootfs)
#define device_initcall(fn) __define_initcall(fn, 6)
#define late_initcall(fn) __define_initcall(fn, 7)

Since we start off from rootfs_initcall(), it gives us 3 more
levels of leg room to play with for order semantics, this isn't
enough to address all required levels of dependencies, this
is specially true given that AMD-KFD needs to be loaded before
the radeon driver -- -but this it not enforced by symbols.
If the AMD-KFD driver is not loaded prior to the radeon driver
because otherwise the radeon driver will not initialize the
AMD-KFD driver and you get no KFD functionality in userspace.

Commit 1bacc894c227fad8a7 ("drivers: Move iommu/ before gpu/ in
Makefile") works around some of the possibe races between
the AMD IOMMU v2 and GPU drivers by changing the link order.
This is fragile, however its the bets we can do, given that
making the GPU drivers use late_initcall() would also implicate
a similar race between them. That possible race is fortunatley
addressed given that the drm Makefile currently has amdkfd
linked prior to radeon:

drivers/gpu/drm/Makefile
...
obj-$(CONFIG_HSA_AMD) += amd/amdkfd/
obj-$(CONFIG_DRM_RADEON)+= radeon/
...

Changing amdkfd and radeon to late_initcall() however is
still the right call in orde to annotate explicitly a
delayed dependency requirement between the GPU drivers
and the IOMMUs.

We can't address the fragile nature of the link order
right now, but in the future that might be possible.

Signed-off-by: Luis R. Rodriguez <***@kernel.org>
---

Please note, the changes to drivers/Makefile are just
for the sake of forcing the possible race to occur,
if this works well the actual [PATCH] submission will
skip those changes as its pointless to remove those
work arounds as it stands, due to the limited nature
of the levels available for addressing requirements.

Also, if you are aware of further dependency hell
things like these -- please do let me know as I am
interested in looking at addressing them.

drivers/Makefile | 6 ++----
drivers/gpu/drm/amd/amdkfd/kfd_module.c | 2 +-
drivers/gpu/drm/radeon/radeon_drv.c | 2 +-
3 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/Makefile b/drivers/Makefile
index 0b6f3d60193d..0fbe3982041f 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -50,10 +50,7 @@ obj-$(CONFIG_RESET_CONTROLLER) += reset/
obj-y += tty/
obj-y += char/

-# iommu/ comes before gpu as gpu are using iommu controllers
-obj-$(CONFIG_IOMMU_SUPPORT) += iommu/
-
-# gpu/ comes after char for AGP vs DRM startup and after iommu
+# gpu/ comes after char for AGP vs DRM startup
obj-y += gpu/

obj-$(CONFIG_CONNECTOR) += connector/
@@ -147,6 +144,7 @@ obj-y += clk/

obj-$(CONFIG_MAILBOX) += mailbox/
obj-$(CONFIG_HWSPINLOCK) += hwspinlock/
+obj-$(CONFIG_IOMMU_SUPPORT) += iommu/
obj-$(CONFIG_REMOTEPROC) += remoteproc/
obj-$(CONFIG_RPMSG) += rpmsg/

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
index 850a5623661f..3d1dab8a31c7 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
@@ -141,7 +141,7 @@ static void __exit kfd_module_exit(void)
dev_info(kfd_device, "Removed module\n");
}

-module_init(kfd_module_init);
+late_initcall(kfd_module_init);
module_exit(kfd_module_exit);

MODULE_AUTHOR(KFD_DRIVER_AUTHOR);
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
index b55aa740171f..1fa1b7f3a89c 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -609,7 +609,7 @@ static void __exit radeon_exit(void)
radeon_unregister_atpx_handler();
}

-module_init(radeon_init);
+late_initcall(radeon_init);
module_exit(radeon_exit);

MODULE_AUTHOR(DRIVER_AUTHOR);
--
2.8.2
Luis R. Rodriguez
2016-05-31 17:15:55 UTC
Permalink
Post by Luis R. Rodriguez
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
index b55aa740171f..1fa1b7f3a89c 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -609,7 +609,7 @@ static void __exit radeon_exit(void)
radeon_unregister_atpx_handler();
}
-module_init(radeon_init);
+late_initcall(radeon_init);
module_exit(radeon_exit);
Need to modify also amdgpu module_init
Thanks, so other than considering the first two hunks are only for testing purposes
(a proper [PATCH] will drop these hunks on the Makefile), you're suggestng this
then, is that right?:

---
drivers/Makefile | 6 ++----
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 +-
drivers/gpu/drm/amd/amdkfd/kfd_module.c | 2 +-
drivers/gpu/drm/radeon/radeon_drv.c | 2 +-
4 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/Makefile b/drivers/Makefile
index 0b6f3d60193d..0fbe3982041f 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -50,10 +50,7 @@ obj-$(CONFIG_RESET_CONTROLLER) += reset/
obj-y += tty/
obj-y += char/

-# iommu/ comes before gpu as gpu are using iommu controllers
-obj-$(CONFIG_IOMMU_SUPPORT) += iommu/
-
-# gpu/ comes after char for AGP vs DRM startup and after iommu
+# gpu/ comes after char for AGP vs DRM startup
obj-y += gpu/

obj-$(CONFIG_CONNECTOR) += connector/
@@ -147,6 +144,7 @@ obj-y += clk/

obj-$(CONFIG_MAILBOX) += mailbox/
obj-$(CONFIG_HWSPINLOCK) += hwspinlock/
+obj-$(CONFIG_IOMMU_SUPPORT) += iommu/
obj-$(CONFIG_REMOTEPROC) += remoteproc/
obj-$(CONFIG_RPMSG) += rpmsg/

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 1dab5f2b725b..1ca448f2b4d2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -589,7 +589,7 @@ static void __exit amdgpu_exit(void)
amdgpu_sync_fini();
}

-module_init(amdgpu_init);
+late_initcall(amdgpu_init);
module_exit(amdgpu_exit);

MODULE_AUTHOR(DRIVER_AUTHOR);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
index 850a5623661f..3d1dab8a31c7 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
@@ -141,7 +141,7 @@ static void __exit kfd_module_exit(void)
dev_info(kfd_device, "Removed module\n");
}

-module_init(kfd_module_init);
+late_initcall(kfd_module_init);
module_exit(kfd_module_exit);

MODULE_AUTHOR(KFD_DRIVER_AUTHOR);
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
index b55aa740171f..1fa1b7f3a89c 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -609,7 +609,7 @@ static void __exit radeon_exit(void)
radeon_unregister_atpx_handler();
}

-module_init(radeon_init);
+late_initcall(radeon_init);
module_exit(radeon_exit);

MODULE_AUTHOR(DRIVER_AUTHOR);
--
2.8.2
I tested this on Kaveri, and amdkfd is working. For amdkfd that's
fine, but IMO that's not enough testing for radeon/amdgpu. I would
like to hear AMD's developers take on this.
Sure. Hopefully the above extensions suffice. In the future we should be able
to also replace the radeon amdkfd -EPROBE_DEFER kludge and the implicit
sensitivity in Makefiles over GPU drivers and IOMMUs, but a bit more work is
needed before that happens.

Luis
Oded Gabbay
2016-05-31 17:33:33 UTC
Permalink
Post by Luis R. Rodriguez
Post by Luis R. Rodriguez
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
index b55aa740171f..1fa1b7f3a89c 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -609,7 +609,7 @@ static void __exit radeon_exit(void)
radeon_unregister_atpx_handler();
}
-module_init(radeon_init);
+late_initcall(radeon_init);
module_exit(radeon_exit);
Need to modify also amdgpu module_init
Thanks, so other than considering the first two hunks are only for testing purposes
(a proper [PATCH] will drop these hunks on the Makefile), you're suggestng this
Yes, the below should cover the amdgpu case as well.
Post by Luis R. Rodriguez
---
drivers/Makefile | 6 ++----
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 +-
drivers/gpu/drm/amd/amdkfd/kfd_module.c | 2 +-
drivers/gpu/drm/radeon/radeon_drv.c | 2 +-
4 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/drivers/Makefile b/drivers/Makefile
index 0b6f3d60193d..0fbe3982041f 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -50,10 +50,7 @@ obj-$(CONFIG_RESET_CONTROLLER) += reset/
obj-y += tty/
obj-y += char/
-# iommu/ comes before gpu as gpu are using iommu controllers
-obj-$(CONFIG_IOMMU_SUPPORT) += iommu/
-
-# gpu/ comes after char for AGP vs DRM startup and after iommu
+# gpu/ comes after char for AGP vs DRM startup
obj-y += gpu/
obj-$(CONFIG_CONNECTOR) += connector/
@@ -147,6 +144,7 @@ obj-y += clk/
obj-$(CONFIG_MAILBOX) += mailbox/
obj-$(CONFIG_HWSPINLOCK) += hwspinlock/
+obj-$(CONFIG_IOMMU_SUPPORT) += iommu/
obj-$(CONFIG_REMOTEPROC) += remoteproc/
obj-$(CONFIG_RPMSG) += rpmsg/
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 1dab5f2b725b..1ca448f2b4d2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -589,7 +589,7 @@ static void __exit amdgpu_exit(void)
amdgpu_sync_fini();
}
-module_init(amdgpu_init);
+late_initcall(amdgpu_init);
module_exit(amdgpu_exit);
MODULE_AUTHOR(DRIVER_AUTHOR);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
index 850a5623661f..3d1dab8a31c7 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
@@ -141,7 +141,7 @@ static void __exit kfd_module_exit(void)
dev_info(kfd_device, "Removed module\n");
}
-module_init(kfd_module_init);
+late_initcall(kfd_module_init);
module_exit(kfd_module_exit);
MODULE_AUTHOR(KFD_DRIVER_AUTHOR);
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
index b55aa740171f..1fa1b7f3a89c 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -609,7 +609,7 @@ static void __exit radeon_exit(void)
radeon_unregister_atpx_handler();
}
-module_init(radeon_init);
+late_initcall(radeon_init);
module_exit(radeon_exit);
MODULE_AUTHOR(DRIVER_AUTHOR);
--
2.8.2
I tested this on Kaveri, and amdkfd is working. For amdkfd that's
fine, but IMO that's not enough testing for radeon/amdgpu. I would
like to hear AMD's developers take on this.
Sure. Hopefully the above extensions suffice. In the future we should be able
to also replace the radeon amdkfd -EPROBE_DEFER kludge and the implicit
sensitivity in Makefiles over GPU drivers and IOMMUs, but a bit more work is
needed before that happens.
Luis
Luis R. Rodriguez
2016-05-31 16:58:34 UTC
Permalink
Post by Luis R. Rodriguez
To get KFD support in radeon we need the following
initialization to happen in this order, their
respective driver file that has its init routine
0. AMD IOMMUv1: arch/x86/kernel/pci-dma.c
1. AMD IOMMUv2: drivers/iommu/amd_iommu_v2.c
2. AMD KFD: drivers/gpu/drm/amd/amdkfd/kfd_module.c
3. AMD Radeon: drivers/gpu/drm/radeon/radeon_drv.c
Order is rather implicit, but these drivers can currently
only do so much given the amount of leg room available.
Below are the respective init routines and how they are
arch/x86/kernel/pci-dma.c rootfs_initcall(pci_iommu_init);
drivers/iommu/amd_iommu_v2.c module_init(amd_iommu_v2_init);
drivers/gpu/drm/amd/amdkfd/kfd_module.c module_init(kfd_module_init);
drivers/gpu/drm/radeon/radeon_drv.c module_init(radeon_init);
When a driver is built-in module_init() folds to use
device_initcall(), and we have the following possible
#define pure_initcall(fn) __define_initcall(fn, 0)
#define core_initcall(fn) __define_initcall(fn, 1)
#define postcore_initcall(fn)__define_initcall(fn, 2)
#define arch_initcall(fn) __define_initcall(fn, 3)
#define subsys_initcall(fn) __define_initcall(fn, 4)
#define fs_initcall(fn) __define_initcall(fn, 5)
---------------------------------------------------------
#define rootfs_initcall(fn) __define_initcall(fn, rootfs)
#define device_initcall(fn) __define_initcall(fn, 6)
#define late_initcall(fn) __define_initcall(fn, 7)
Since we start off from rootfs_initcall(), it gives us 3 more
levels of leg room to play with for order semantics, this isn't
enough to address all required levels of dependencies, this
is specially true given that AMD-KFD needs to be loaded before
the radeon driver -- -but this it not enforced by symbols.
If the AMD-KFD driver is not loaded prior to the radeon driver
because otherwise the radeon driver will not initialize the
AMD-KFD driver and you get no KFD functionality in userspace.
Commit 1bacc894c227fad8a7 ("drivers: Move iommu/ before gpu/ in
Makefile") works around some of the possibe races between
the AMD IOMMU v2 and GPU drivers by changing the link order.
This is fragile, however its the bets we can do, given that
making the GPU drivers use late_initcall() would also implicate
a similar race between them. That possible race is fortunatley
addressed given that the drm Makefile currently has amdkfd
drivers/gpu/drm/Makefile
...
obj-$(CONFIG_HSA_AMD) += amd/amdkfd/
obj-$(CONFIG_DRM_RADEON)+= radeon/
...
Changing amdkfd and radeon to late_initcall() however is
still the right call in orde to annotate explicitly a
delayed dependency requirement between the GPU drivers
and the IOMMUs.
We can't address the fragile nature of the link order
right now, but in the future that might be possible.
---
Please note, the changes to drivers/Makefile are just
for the sake of forcing the possible race to occur,
if this works well the actual [PATCH] submission will
skip those changes as its pointless to remove those
work arounds as it stands, due to the limited nature
of the levels available for addressing requirements.
Also, if you are aware of further dependency hell
things like these -- please do let me know as I am
interested in looking at addressing them.
This should be fixed with -EDEFER_PROBE instead. Frobbing initcall
levels should then just be done as an optimization to avoid too much
reprobing.
radeon already uses -EDEFER_PROBE but it assumes that amdkfd *is* loaded first,
and only if it is already loaded can it count on getting -EDEFER_PROBE. The
radeon driver will deffer probe *iff* kgd2kfd_init() returns -EDEFER_PROBE,
which only happens when amdkfd_init_completed is no longer present. If radeon
gets linked first though the symbol fetch for kgd2kfd_init() will make it as
not-present though. So the current heuristic used does not address proper link
/ load order. Part of the issue mentioned on the commit log is another race
underneath the hood with the AMD IOMMU v2 which is needed for amdkfd. The
underlying issue however really is the lack of available clear semantics for
dependencies over 3 levels here. This is solved one way or another by link
order in the Makefiles, but as I've noted so far this has been rather implicit
and fragile. The change here makes at least the order of the GPU drivers
explicitly later than the IOMMUs. The specific race between radeon and amdfkd
is solved currently through link order through the Makefiles. In the future we
maybe able to make things more explicit.

-EDEFER_PROBE also introduces a latency on load which we should not need if we
can handle proper link / load order dependency annotations. This change is a
small part of that work, but as it the commit log also notes future further
work is possible to build stronger semantics. Some of the work I'm doing with
linker-tables may help with this in the future [0], but for now this should
help with what the semantics we have in place.

[0] http://lkml.kernel.org/r/1455889559-9428-1-git-send-email-***@kernel.org

Luis
Luis R. Rodriguez
2016-06-01 21:11:59 UTC
Permalink
Post by Luis R. Rodriguez
To get KFD support in radeon we need the following
initialization to happen in this order, their
respective driver file that has its init routine
0. AMD IOMMUv1: arch/x86/kernel/pci-dma.c
1. AMD IOMMUv2: drivers/iommu/amd_iommu_v2.c
2. AMD KFD: drivers/gpu/drm/amd/amdkfd/kfd_module.c
3. AMD Radeon: drivers/gpu/drm/radeon/radeon_drv.c
Order is rather implicit, but these drivers can currently
only do so much given the amount of leg room available.
Below are the respective init routines and how they are
arch/x86/kernel/pci-dma.c rootfs_initcall(pci_iommu_init);
drivers/iommu/amd_iommu_v2.c module_init(amd_iommu_v2_init);
drivers/gpu/drm/amd/amdkfd/kfd_module.c module_init(kfd_module_init);
drivers/gpu/drm/radeon/radeon_drv.c module_init(radeon_init);
When a driver is built-in module_init() folds to use
device_initcall(), and we have the following possible
#define pure_initcall(fn) __define_initcall(fn, 0)
#define core_initcall(fn) __define_initcall(fn, 1)
#define postcore_initcall(fn)__define_initcall(fn, 2)
#define arch_initcall(fn) __define_initcall(fn, 3)
#define subsys_initcall(fn) __define_initcall(fn, 4)
#define fs_initcall(fn) __define_initcall(fn, 5)
---------------------------------------------------------
#define rootfs_initcall(fn) __define_initcall(fn, rootfs)
#define device_initcall(fn) __define_initcall(fn, 6)
#define late_initcall(fn) __define_initcall(fn, 7)
Since we start off from rootfs_initcall(), it gives us 3 more
levels of leg room to play with for order semantics, this isn't
enough to address all required levels of dependencies, this
is specially true given that AMD-KFD needs to be loaded before
the radeon driver -- -but this it not enforced by symbols.
If the AMD-KFD driver is not loaded prior to the radeon driver
because otherwise the radeon driver will not initialize the
AMD-KFD driver and you get no KFD functionality in userspace.
Commit 1bacc894c227fad8a7 ("drivers: Move iommu/ before gpu/ in
Makefile") works around some of the possibe races between
the AMD IOMMU v2 and GPU drivers by changing the link order.
This is fragile, however its the bets we can do, given that
making the GPU drivers use late_initcall() would also implicate
a similar race between them. That possible race is fortunatley
addressed given that the drm Makefile currently has amdkfd
drivers/gpu/drm/Makefile
...
obj-$(CONFIG_HSA_AMD) += amd/amdkfd/
obj-$(CONFIG_DRM_RADEON)+= radeon/
...
Changing amdkfd and radeon to late_initcall() however is
still the right call in orde to annotate explicitly a
delayed dependency requirement between the GPU drivers
and the IOMMUs.
We can't address the fragile nature of the link order
right now, but in the future that might be possible.
---
Please note, the changes to drivers/Makefile are just
for the sake of forcing the possible race to occur,
if this works well the actual [PATCH] submission will
skip those changes as its pointless to remove those
work arounds as it stands, due to the limited nature
of the levels available for addressing requirements.
Also, if you are aware of further dependency hell
things like these -- please do let me know as I am
interested in looking at addressing them.
This should be fixed with -EPROBE_DEFER instead. Frobbing initcall
levels should then just be done as an optimization to avoid too much
reprobing.
radeon already uses -EPROBE_DEFER but it assumes that amdkfd *is* loaded first,
and only if it is already loaded can it count on getting -EPROBE_DEFER. The
radeon driver will defer probe *iff* kgd2kfd_init() returns -EPROBE_DEFER,
which only happens when amdkfd_init_completed is no longer 0. If radeon
gets linked first though the symbol fetch for kgd2kfd_init() will make it as
not-present though.
I did some more homework and I no longer believe this was the issue. More below.
So the current heuristic used does not address proper link
/ load order. Part of the issue mentioned on the commit log is another race
underneath the hood with the AMD IOMMU v2 which is needed for amdkfd. The
underlying issue however really is the lack of available clear semantics for
dependencies over 3 levels here. This is solved one way or another by link
order in the Makefiles, but as I've noted so far this has been rather implicit
and fragile. The change here makes at least the order of the GPU drivers
explicitly later than the IOMMUs. The specific race between radeon and amdfkd
is solved currently through link order through the Makefiles. In the future we
maybe able to make things more explicit.
Sounds like the EPROBE_DEFER handling is broken - if the module isn't set
up yet but selected in Kconfig, and needed for that hw generation then it
should not just silently fail.
Although I cannot confirm through testing, I did an under the hood inspection
of symbol_request() which both radeon and amdgpu uses and have a better idea
of why things where failing, it should not really be the inability to trust
symbol_request() to work if link order changes between amdkfd and radeon or
amdgpu, its the issue of link order also needed of the AMD IOMMU *and* amdkfd.
So my above assumption here that -EPROBE_DEFER could fail should be
invalid given that the real issue should have been that amdkfd was being
kicked off prior to the AMD IOMMU v2, at that point kgd2kfd_init() would
fail. The silent failure then was an issue not of radeon but rather higher
order drivers, and in this case neither radeon nor amdgpu could address
that regardless of what they do. Oded fixed this by changing the link order
between all IOMMUs and GPU drivers via commit 1bacc894c227fad8a7 ("drivers:
Move iommu/ before gpu/ in Makefile").
-EPROBE_DEFER also introduces a latency on load which we should not need if we
can handle proper link / load order dependency annotations. This change is a
small part of that work, but as it the commit log also notes future further
work is possible to build stronger semantics. Some of the work I'm doing with
linker-tables may help with this in the future [0], but for now this should
help with what the semantics we have in place.
That's what I meant with "avoiding too much reprobing". But in the end the
current solution to cross-driver deps we have is EPROBE_DEFER. Fiddling
with the link order is all well for optimizing stuff, but imo _way_ too
fragile for correctness.
Agreed, but EPROBE_DEFER cannot ensure layers below are correct either. By
moving the GPU drivers radon and amdgpu to late_initcall() we'd actually be
taking one more explicit ordering step for correctness to ensure that in case
the Makefile order is different in other environments at least the IOMMU and
GPU driver initialization is explicitly correct.

Luis
Luis R. Rodriguez
2016-11-10 22:12:44 UTC
Permalink
Post by Luis R. Rodriguez
Post by Luis R. Rodriguez
To get KFD support in radeon we need the following
initialization to happen in this order, their
respective driver file that has its init routine
0. AMD IOMMUv1: arch/x86/kernel/pci-dma.c
1. AMD IOMMUv2: drivers/iommu/amd_iommu_v2.c
2. AMD KFD: drivers/gpu/drm/amd/amdkfd/kfd_module.c
3. AMD Radeon: drivers/gpu/drm/radeon/radeon_drv.c
Order is rather implicit, but these drivers can currently
only do so much given the amount of leg room available.
Below are the respective init routines and how they are
arch/x86/kernel/pci-dma.c rootfs_initcall(pci_iommu_init);
drivers/iommu/amd_iommu_v2.c module_init(amd_iommu_v2_init);
drivers/gpu/drm/amd/amdkfd/kfd_module.c module_init(kfd_module_init);
drivers/gpu/drm/radeon/radeon_drv.c module_init(radeon_init);
When a driver is built-in module_init() folds to use
device_initcall(), and we have the following possible
#define pure_initcall(fn) __define_initcall(fn, 0)
#define core_initcall(fn) __define_initcall(fn, 1)
#define postcore_initcall(fn)__define_initcall(fn, 2)
#define arch_initcall(fn) __define_initcall(fn, 3)
#define subsys_initcall(fn) __define_initcall(fn, 4)
#define fs_initcall(fn) __define_initcall(fn, 5)
---------------------------------------------------------
#define rootfs_initcall(fn) __define_initcall(fn, rootfs)
#define device_initcall(fn) __define_initcall(fn, 6)
#define late_initcall(fn) __define_initcall(fn, 7)
Since we start off from rootfs_initcall(), it gives us 3 more
levels of leg room to play with for order semantics, this isn't
enough to address all required levels of dependencies, this
is specially true given that AMD-KFD needs to be loaded before
the radeon driver -- -but this it not enforced by symbols.
If the AMD-KFD driver is not loaded prior to the radeon driver
because otherwise the radeon driver will not initialize the
AMD-KFD driver and you get no KFD functionality in userspace.
Commit 1bacc894c227fad8a7 ("drivers: Move iommu/ before gpu/ in
Makefile") works around some of the possibe races between
the AMD IOMMU v2 and GPU drivers by changing the link order.
This is fragile, however its the bets we can do, given that
making the GPU drivers use late_initcall() would also implicate
a similar race between them. That possible race is fortunatley
addressed given that the drm Makefile currently has amdkfd
drivers/gpu/drm/Makefile
...
obj-$(CONFIG_HSA_AMD) += amd/amdkfd/
obj-$(CONFIG_DRM_RADEON)+= radeon/
...
Changing amdkfd and radeon to late_initcall() however is
still the right call in orde to annotate explicitly a
delayed dependency requirement between the GPU drivers
and the IOMMUs.
We can't address the fragile nature of the link order
right now, but in the future that might be possible.
---
Please note, the changes to drivers/Makefile are just
for the sake of forcing the possible race to occur,
if this works well the actual [PATCH] submission will
skip those changes as its pointless to remove those
work arounds as it stands, due to the limited nature
of the levels available for addressing requirements.
Also, if you are aware of further dependency hell
things like these -- please do let me know as I am
interested in looking at addressing them.
This should be fixed with -EPROBE_DEFER instead. Frobbing initcall
levels should then just be done as an optimization to avoid too much
reprobing.
radeon already uses -EPROBE_DEFER but it assumes that amdkfd *is* loaded first,
and only if it is already loaded can it count on getting -EPROBE_DEFER. The
radeon driver will defer probe *iff* kgd2kfd_init() returns -EPROBE_DEFER,
which only happens when amdkfd_init_completed is no longer 0. If radeon
gets linked first though the symbol fetch for kgd2kfd_init() will make it as
not-present though.
I did some more homework and I no longer believe this was the issue. More below.
So the current heuristic used does not address proper link
/ load order. Part of the issue mentioned on the commit log is another race
underneath the hood with the AMD IOMMU v2 which is needed for amdkfd. The
underlying issue however really is the lack of available clear semantics for
dependencies over 3 levels here. This is solved one way or another by link
order in the Makefiles, but as I've noted so far this has been rather implicit
and fragile. The change here makes at least the order of the GPU drivers
explicitly later than the IOMMUs. The specific race between radeon and amdfkd
is solved currently through link order through the Makefiles. In the future we
maybe able to make things more explicit.
Sounds like the EPROBE_DEFER handling is broken - if the module isn't set
up yet but selected in Kconfig, and needed for that hw generation then it
should not just silently fail.
Although I cannot confirm through testing, I did an under the hood inspection
of symbol_request() which both radeon and amdgpu uses and have a better idea
of why things where failing, it should not really be the inability to trust
symbol_request() to work if link order changes between amdkfd and radeon or
amdgpu, its the issue of link order also needed of the AMD IOMMU *and* amdkfd.
So my above assumption here that -EPROBE_DEFER could fail should be
invalid given that the real issue should have been that amdkfd was being
kicked off prior to the AMD IOMMU v2, at that point kgd2kfd_init() would
fail. The silent failure then was an issue not of radeon but rather higher
order drivers, and in this case neither radeon nor amdgpu could address
that regardless of what they do. Oded fixed this by changing the link order
Move iommu/ before gpu/ in Makefile").
-EPROBE_DEFER also introduces a latency on load which we should not need if we
can handle proper link / load order dependency annotations. This change is a
small part of that work, but as it the commit log also notes future further
work is possible to build stronger semantics. Some of the work I'm doing with
linker-tables may help with this in the future [0], but for now this should
help with what the semantics we have in place.
That's what I meant with "avoiding too much reprobing". But in the end the
current solution to cross-driver deps we have is EPROBE_DEFER. Fiddling
with the link order is all well for optimizing stuff, but imo _way_ too
fragile for correctness.
Agreed, but EPROBE_DEFER cannot ensure layers below are correct either. By
moving the GPU drivers radon and amdgpu to late_initcall() we'd actually be
taking one more explicit ordering step for correctness to ensure that in case
the Makefile order is different in other environments at least the IOMMU and
GPU driver initialization is explicitly correct.
Other than addressing more init levels in the future for more device
categories in the kernel, a module section, say with MODULE_SUGGESTS()
might help to enable the core to request_module() on behalf of
drivers... if this seems like it could help I could try it out.

Luis

Luis R. Rodriguez
2016-05-27 00:12:43 UTC
Permalink
You have to take carefully to arrange the calling sequence for
v1 ->v2->kfd, drm.
iommuv1 -- rootfs_initcall(fn)
IOMMUV2 -- device_initcall(fn)
kfd module -- late_initcall(fn)
drm -- late_initcall(fn)
Thanks, it turns out this is not exactly enough, given as
Joerg notes:

--
AMD-KFD on the other hand needs to be loaded before the
radeon driver (but this it not enforced by symbols), because otherwise
the radeon driver will not initialize the AMD-KFD driver.
---

We have a theoretical race still possible between the kfd module
and the drm driver. I'll reply to Joerg's e-mail with more feedback.

Luis
Loading...