How to get IBM HS22 with GPU Expansion Blade to work with XenServer 6.x GPU Passthrough
This is an article about how to get an IBM HS22 blade and the IBM BladeCenter GPU Expansion Blade to work with XenServer 6.0 Multi-GPU Passthrough and XenDesktop 5.6 HDX 3D Pro. After receiving our brand new IBM HS22 blade with the GPU expansion bay, we plugged it in and put it into our BladeCenter. The next step was to install XenServer 6 to get the GPU Passthrough features up and running. This is where all the problems started: neither XenServer 6.0 nor a plain physical installation of Windows Server 2008 R2 would talk 100% with the GPU. In our case the GPU expansion blade has a Tesla M2070-Q card onboard. We talked to IBM (they actually took the blade with them to their HQ here in Norway), but they found no problems with either the HS22 itself or the GPU expansion blade, so we got it back from IBM without any luck. During this whole time Citrix Support gave us excellent support! Just needed to say that, because it’s true!
So, where do we go from here? We sent an email describing our issues to the support address over at NVIDIA. Since this is an NVIDIA Tesla card, that is a good way to get somebody talking… and boy, did we get the right answer. It turns out that this card is in a 3D mode by default. So, the NVIDIA Tesla card is actually a 3D card. Remember the old 3DFX cards? Monster 3DFX 1 and 2, and so on… where you had to attach a VGA pass-through cable 🙂
It’s the same idea. I guess the card may have worked with XenApp and published applications (if the apps use OpenGL etc.) with plain HDX 3D, but not when connecting to it directly, which is what we want when we are talking about Citrix HDX 3D Pro.
So, what we need to do in order to get this GPU talking with our hardware in the right way is to flash it. We actually got a GPU BIOS flash from Citrix, but since the card did not respond 100%, Windows said that we could not flash the GPU because it was not present.
So, NVIDIA gave us a firmware patch. The problem with this patch is that it is hardware dependent.
For the IBM HS22 and NVIDIA, the GPU card gets, for example, PCI info 19:00:0. On other servers, or if we have different hardware specs (e.g. no SAN card), the PCI info may become 18:00:0 or something similar.
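To illustrate why a hardware-dependent patch is fragile, here is a minimal Python sketch that parses a PCI address written as bus:device:function in hexadecimal and compares it with the address a patch might expect. The function names and the accepted formats are assumptions made for illustration only; this is not how the NVIDIA patch itself performs its check.

```python
import re

# Accepts "19:00:0" (as written in this article) or "19:00.0" (lspci style).
# Both forms are assumed to mean bus:device:function in hexadecimal.
_PCI_RE = re.compile(r"^([0-9a-fA-F]{1,2}):([0-9a-fA-F]{1,2})[:.]([0-7])$")


def parse_pci_address(text):
    """Return (bus, device, function) as integers, or raise ValueError."""
    match = _PCI_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a PCI address: {text!r}")
    bus, device, function = (int(part, 16) for part in match.groups())
    return bus, device, function


def same_slot(expected, actual):
    """True when both addresses point at the same bus, device and function."""
    return parse_pci_address(expected) == parse_pci_address(actual)


if __name__ == "__main__":
    # Hypothetical check: the address the patch was built for vs. what we see.
    print(same_slot("19:00:0", "19:00.0"))
    print(same_slot("19:00:0", "18:00:0"))
```

A server where the card shows up at 18:00:0 would fail this kind of comparison, which is the situation where you would need to ask NVIDIA for a patch that matches your hardware.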
This new GPU BIOS flash did the trick! As you can see from my screen dumps, we first installed the latest NVIDIA Tesla graphics driver and rebooted. After that we used the m2070q-ibm-vga.exe file to flash our GPU card. Keep in mind that I did all this on a normally installed Windows Server 2008 R2, not in a virtual machine (VM), but it might work with XenServer 6.
We can now communicate with our Tesla M2070-Q card both directly from a regular, non-virtualized Windows Server 2008 R2 and from XenServer 6.0.2 with GPU Passthrough.
So, what did we learn? From what we found, it looks like NVIDIA Tesla cards have three work modes:
0 – Normal mode
1 – COMPUTE exclusive mode (only one COMPUTE context is allowed to run on the GPU)
2 – COMPUTE prohibited mode (no COMPUTE contexts are allowed to run on the GPU)
After you install the NVIDIA drivers you should have the nvidia-smi utility, possibly in C:\Program Files\NVIDIA Corporation\NVSMI\. This tool allows you to switch the card’s mode.
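As a rough sketch of switching the mode, the Python snippet below builds (but does not run) an nvidia-smi command line for one of the three modes listed above. The executable path is an assumption based on the "possibly" location mentioned here, and the `-i` (select GPU) and `-c` (set compute mode) options should be checked against `nvidia-smi -h` for your driver version, since options have changed between releases. The function name is made up for this example.

```python
# Mode numbers as listed in this article.
COMPUTE_MODES = {
    0: "Normal mode",
    1: "COMPUTE exclusive mode",
    2: "COMPUTE prohibited mode",
}

# Assumed install location; the article only says "possibly" here.
NVIDIA_SMI = r"C:\Program Files\NVIDIA Corporation\NVSMI\nvidia-smi.exe"


def build_mode_command(gpu_index, mode, executable=NVIDIA_SMI):
    """Build (but do not run) an nvidia-smi command that sets the compute mode."""
    if mode not in COMPUTE_MODES:
        raise ValueError(f"unknown compute mode: {mode}")
    if gpu_index < 0:
        raise ValueError("gpu_index must be non-negative")
    # -i selects the GPU and -c sets the compute mode; verify with `nvidia-smi -h`.
    return [executable, "-i", str(gpu_index), "-c", str(mode)]


if __name__ == "__main__":
    # Show the command for putting GPU 0 back into normal mode.
    print(build_mode_command(0, 0))
```

On a machine with the driver installed, the returned list could be passed to `subprocess.run` from an elevated prompt. Building the command separately makes it easy to review before anything touches the card.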
First off, install the latest NVIDIA drivers from www.nvidia.com.
After contacting NVIDIA Support you should have three files from them. Depending on what you want to do, and I guess you want to make this work, you should run the m2070q-ibm-vga.exe file.
If you get the error messages you see above, you need to contact NVIDIA again. I wanted to include them here so you can see what you don’t want to have!
Running it again with the right firmware flash for our Tesla M2070-Q card from IBM, the tool now searches for the card and flashes it into the correct operating mode.