I came to graphics drivers from embedded. At Baykar I did board bring-up on a TI AM67A — writing device tree, setting up DDR, poking registers directly. That background made the graphics stack look familiar in some ways and strange in others.
There are basically three layers if you ignore firmware: the kernel driver, the userspace driver, and the shader compiler.
The kernel driver handles what has to live in kernel space: memory management, command submission from multiple processes, display, power management, and resetting the GPU when it hangs. On Linux it sits in the DRM subsystem; for AMD it's amdgpu.
The userspace driver implements the API: Vulkan, OpenGL, and so on. Most of the logic lives here. For AMD Vulkan it's RADV, which is part of Mesa. It records command buffers, manages pipeline state, and uploads resources. When you call vkQueueSubmit, it eventually issues an ioctl into the kernel driver, which is what actually gets the work onto the GPU.
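To make the ioctl step less abstract, here's a rough Python sketch of the simplest DRM ioctl I know of, DRM_IOCTL_VERSION, which just asks the kernel driver who it is. This is not what RADV does on submit (that goes through amdgpu-specific ioctls); it only shows the mechanism. The device path /dev/dri/renderD128, the 64-bit struct layout, and the asm-generic ioctl encoding are assumptions about the machine, and the helper names iowr and query_driver_version are mine.

```python
import ctypes
import fcntl
import os


class DrmVersion(ctypes.Structure):
    """Mirrors struct drm_version from the kernel's DRM uapi header."""
    _fields_ = [
        ("version_major", ctypes.c_int),
        ("version_minor", ctypes.c_int),
        ("version_patchlevel", ctypes.c_int),
        ("name_len", ctypes.c_size_t),
        ("name", ctypes.c_void_p),
        ("date_len", ctypes.c_size_t),
        ("date", ctypes.c_void_p),
        ("desc_len", ctypes.c_size_t),
        ("desc", ctypes.c_void_p),
    ]


def iowr(ioc_type, nr, size):
    """Build a read/write ioctl number (asm-generic layout, e.g. x86 and Arm)."""
    return (3 << 30) | (size << 16) | (ioc_type << 8) | nr


# DRM ioctls use type 'd'; the version query is number 0x00.
DRM_IOCTL_VERSION = iowr(ord("d"), 0x00, ctypes.sizeof(DrmVersion))


def query_driver_version(path="/dev/dri/renderD128"):
    """Return (name, major, minor, patch), or None if the node is unusable.

    The default path is an assumption; the render node number varies by system.
    """
    try:
        fd = os.open(path, os.O_RDWR)
    except OSError:
        return None
    try:
        v = DrmVersion()
        fcntl.ioctl(fd, DRM_IOCTL_VERSION, v)  # first call: kernel reports name_len
        buf = ctypes.create_string_buffer(v.name_len + 1)
        v.name = ctypes.addressof(buf)
        v.date_len = v.desc_len = 0            # only the name is wanted here
        fcntl.ioctl(fd, DRM_IOCTL_VERSION, v)  # second call: kernel fills buf
        return (buf.value.decode(), v.version_major,
                v.version_minor, v.version_patchlevel)
    except OSError:
        return None
    finally:
        os.close(fd)


if __name__ == "__main__":
    print(query_driver_version())
```

On a machine with an AMD card I'd expect the name to come back as amdgpu, but the point is the shape: open a device node, fill a struct, call ioctl, read the struct back. Coming from embedded, it's the closest thing in this stack to poking a register.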
The shader compiler takes SPIR-V bytecode and produces actual GPU machine code. GPU ISAs aren't standardized: gfx9 and gfx11 are not binary compatible, so you can't run the same binary on both. That means compilation happens at runtime, on your machine, for your specific GPU. For RADV, the backend that emits the machine code is ACO.
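It helps to see how little the input side knows about the hardware. A SPIR-V module is a stream of 32-bit words starting with a five-word header: magic number, version, generator ID, ID bound, and a reserved word. Nothing in there says gfx9 or gfx11. Here's a small Python sketch that decodes that header; the function name and the hand-built example module are mine, made up for illustration.

```python
import struct

SPIRV_MAGIC = 0x07230203  # first word of every SPIR-V module


def parse_spirv_header(blob):
    """Decode the five-word SPIR-V module header from raw bytes."""
    if len(blob) < 20:
        raise ValueError("too short to be a SPIR-V module")
    # The magic number also reveals which byte order the module was written in.
    for endian in ("<", ">"):
        magic, version, generator, bound, _reserved = struct.unpack(
            endian + "5I", blob[:20])
        if magic == SPIRV_MAGIC:
            break
    else:
        raise ValueError("bad SPIR-V magic number")
    return {
        # Version word layout: 0x00 | major | minor | 0x00
        "version": ((version >> 16) & 0xFF, (version >> 8) & 0xFF),
        "generator": generator,
        "id_bound": bound,
    }


# Hand-built header for illustration: SPIR-V 1.5, generator 0, ID bound 8.
example = struct.pack("<5I", SPIRV_MAGIC, 0x00010500, 0, 8, 0)
print(parse_spirv_header(example))
```

Everything hardware-specific (register allocation, instruction selection, scheduling) happens after this point, inside the driver.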
On a microcontroller you cross-compile once and flash the binary. GPUs don't work that way. Every generation has a different ISA, different register counts, different instruction latencies. Even same-generation cards differ in shader core count, cache sizes, memory bandwidth — all of which affect how a good compiler schedules instructions.
SPIR-V (Vulkan's shader bytecode) is the portable format. The driver compiles it the first time the game hands it over, which in Vulkan means when the game creates a pipeline that uses it. That's the "compiling shaders" stutter in new games: the driver is compiling every shader the first time it sees it.
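A toy model of that first-use behavior, in Python. This is a sketch of the idea, not of how RADV or Mesa are structured: ShaderCache and fake_compile are hypothetical names, and the "compiler" just stamps the target name onto a hash. What it does show is the two things that matter: the binary depends on the target GPU, and the expensive step runs only the first time a given shader shows up.

```python
import hashlib


def fake_compile(spirv, target):
    """Stand-in for a real backend such as ACO; returns a fake 'binary'."""
    return b"ISA[" + target.encode() + b"]" + hashlib.sha256(spirv).digest()[:4]


class ShaderCache:
    """Compile on first use, then reuse. Keyed by shader contents and target."""

    def __init__(self, target):
        self.target = target      # e.g. "gfx9" or "gfx11"
        self.binaries = {}
        self.compiles = 0         # how many times the slow path was taken

    def get(self, spirv):
        key = (hashlib.sha256(spirv).hexdigest(), self.target)
        if key not in self.binaries:
            # Slow path: this is where the visible stutter would come from.
            self.compiles += 1
            self.binaries[key] = fake_compile(spirv, self.target)
        return self.binaries[key]


gfx11 = ShaderCache("gfx11")
first = gfx11.get(b"shader A")    # compiled now
second = gfx11.get(b"shader A")   # served from the cache
gfx9 = ShaderCache("gfx9")
other = gfx9.get(b"shader A")     # same SPIR-V, different target, new binary
```

After this runs, gfx11.compiles is 1 even though the shader was requested twice, and the gfx9 result differs from the gfx11 one despite identical input bytes.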