CPU-pinning and further performance tweaks for virtual machines on AMD Ryzen CPUs

Since gaming is the main use case for my passthrough setup, I try to minimize latency on the guest (if you don’t care about the background, skip straight to the config). One knob to turn in order to squeeze out more snappiness is CPU-pinning.

This allocates CPU cores mainly (or solely) to guest tasks while the guest is running. One could go even further and restrict access to the guest cores completely, even when the guest isn’t running, by using the isolcpus kernel command line flag at boot time. I do not use this feature, as I would like to have maximum host performance when the guest is not running.

My test system runs:

  • AMD Ryzen 7 1800X (max boost clock 4GHz)

Decisions

I started by benchmarking several setups (found on Level1Techs, Reddit and the Arch Wiki) using Cinebench, Superposition, etc. In the end the results were, unfortunately, not as meaningful as I had hoped. Sure, more cores lead to better CPU benchmark results, but in terms of pure FPS performance the results were not significantly different. Thus I went with common sense.

Ryzen CPU architecture

The Ryzen 7 1800X houses 8 physical cores, each capable of handling two threads. This leads to a total of 16 logical CPUs available for pinning. The 8 cores are split into two complexes of 4 cores each, called CCX. Each CCX has its own L3 cache.

The plan is to have one CCX for the host and one CCX for the guest. As the host runs first, I’ll assume it uses the first CCX with cores 0-3. The second CCX (cores 4-7) is used for the virtual machine.
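Before splitting cores between host and guest, it can help to confirm which logical CPUs actually share an L3 cache, since the numbering of logical CPUs is not guaranteed to be the same on every system. The following is a minimal Python sketch, not part of my original setup. It assumes that the Linux sysfs entry cache/index3 describes the L3 cache (which is common but worth verifying), and the helper names parse_cpu_list, group_by_l3 and read_l3_lists are made up for illustration.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Expand a sysfs CPU list such as '0-3,8-11' into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def group_by_l3(shared_lists):
    """Return one list of logical CPUs per shared L3 cache (one per CCX).

    shared_lists maps a logical CPU number to the raw text of its
    cache/index3/shared_cpu_list file.
    """
    groups = {tuple(parse_cpu_list(raw)) for raw in shared_lists.values()}
    return [list(group) for group in sorted(groups)]


def read_l3_lists(sysfs_root="/sys/devices/system/cpu"):
    """Read shared_cpu_list for every logical CPU.

    Assumption: index3 is the L3 cache on this machine.
    """
    result = {}
    pattern = "cpu[0-9]*/cache/index3/shared_cpu_list"
    for path in Path(sysfs_root).glob(pattern):
        cpu = int(path.parts[-4][3:])  # directory name 'cpu12' -> 12
        result[cpu] = path.read_text()
    return result


# Sample data matching the layout used in this article:
# logical CPUs 0-7 share one L3 cache, logical CPUs 8-15 the other.
sample = {cpu: "0-7" for cpu in range(8)}
sample.update({cpu: "8-15" for cpu in range(8, 16)})
for ccx, cpus in enumerate(group_by_l3(sample)):
    print(f"CCX {ccx}: logical CPUs {cpus}")
```

On a real host one would call group_by_l3(read_l3_lists()) instead of using the sample dictionary. If the groups come out differently from the sample, the cpuset values used for pinning would need to be adapted accordingly.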

I used a 12-pin setup (6 cores) for the guest for half a year. Sometimes I encountered micro lag spikes whose source I was unable to track down.

Then I switched to 8 CPUs (4 cores). The benchmarks indicated that the 6-core pinning had better CPU mark results and slightly higher FPS, but since I switched to the CCX separation the lag spikes have gone away. Here are my settings:

Figure: Core separation between host and guest system. Second core pins marked for CPU-pinning; guest cores marked red.

CPU-pinning configuration for passthrough optimal performance

To edit the virtual machine, run:

cd /etc/libvirt/qemu
sudo virsh define windows10.xml   # change this according to your virtual machine name
sudo virsh edit windows10

Once you’re done with the edits and have saved your config, re-run:

sudo virsh define windows10.xml

First of all find the very first line, which should read:

<domain type='kvm'>

and replace it with:

<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>

Now find the line which ends with </vcpu> and replace it with the following block:

<vcpu placement='static'>8</vcpu>   
<iothreads>1</iothreads>   
<cputune>     
   <vcpupin vcpu='0' cpuset='8'/>     
   <vcpupin vcpu='1' cpuset='9'/>     
   <vcpupin vcpu='2' cpuset='10'/>     
   <vcpupin vcpu='3' cpuset='11'/>     
   <vcpupin vcpu='4' cpuset='12'/>     
   <vcpupin vcpu='5' cpuset='13'/>     
   <vcpupin vcpu='6' cpuset='14'/>     
   <vcpupin vcpu='7' cpuset='15'/>     
   <emulatorpin cpuset='0-1'/>     
   <iothreadpin iothread='1' cpuset='0-1'/>   
</cputune>
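Writing the pin entries by hand is error-prone when the core count changes, so here is a small Python sketch that generates the same kind of <cputune> block from a list of host CPUs. It is only an illustration under the assumption that vCPU n should map to the n-th host CPU in the list; the function name build_cputune is hypothetical and not part of libvirt.

```python
def build_cputune(guest_cpus, emulator_cpus, iothread_cpus):
    """Return a <cputune> block pinning vCPU n to the n-th entry of guest_cpus.

    emulator_cpus and iothread_cpus are cpuset strings such as '0-1'.
    Only a single iothread (id 1) is handled in this sketch.
    """
    lines = ["<cputune>"]
    for vcpu, host_cpu in enumerate(guest_cpus):
        lines.append(f"  <vcpupin vcpu='{vcpu}' cpuset='{host_cpu}'/>")
    lines.append(f"  <emulatorpin cpuset='{emulator_cpus}'/>")
    lines.append(f"  <iothreadpin iothread='1' cpuset='{iothread_cpus}'/>")
    lines.append("</cputune>")
    return "\n".join(lines)


# Reproduces the pinning shown above: host CPUs 8-15 for the guest,
# host CPUs 0-1 for the emulator and the iothread.
cputune_xml = build_cputune(range(8, 16), "0-1", "0-1")
print(cputune_xml)
```

The printed block can be pasted into the domain XML in place of a hand-written one. The number of entries in guest_cpus should match the value in <vcpu>.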

Attention: Make sure <vcpu>, <iothreads> and <cputune> have the same indent.

Find the <features> block and add the following block at the same level as the <acpi> element:

<hyperv>
  <relaxed state='on'/>
  <vapic state='on'/>
  <spinlocks state='on' retries='8191'/>
</hyperv>

Attention: Make sure <hyperv> and <acpi> have the same indent.

Find the <cpu> block and adapt it to look like this:

  <cpu mode='host-passthrough' check='none'>
    <topology sockets='1' cores='4' threads='2'/>
    <cache level='3' mode='emulate'/>
  </cpu>
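The vCPU count, the pinning entries and the guest topology have to agree with each other: 1 socket x 4 cores x 2 threads gives the 8 vCPUs configured earlier. As a rough sanity check, the following Python sketch parses a domain XML string and reports mismatches. It assumes the XML contains a <vcpu> element, and check_domain as well as the SAMPLE fragment are hypothetical names used only for this illustration.

```python
import xml.etree.ElementTree as ET


def check_domain(xml_text):
    """Return a list of human-readable problems found in a domain XML string."""
    root = ET.fromstring(xml_text)
    problems = []
    vcpus = int(root.findtext("vcpu"))

    # sockets * cores * threads should equal the number of vCPUs.
    topology = root.find("cpu/topology")
    if topology is not None:
        guest_cpus = 1
        for attr in ("sockets", "cores", "threads"):
            guest_cpus *= int(topology.get(attr, "1"))
        if guest_cpus != vcpus:
            problems.append(
                f"topology describes {guest_cpus} CPUs, <vcpu> says {vcpus}"
            )

    # Every vCPU from 0 to vcpus-1 should have exactly one pin entry.
    pins = root.findall("cputune/vcpupin")
    pinned_vcpus = sorted(int(pin.get("vcpu")) for pin in pins)
    if pinned_vcpus != list(range(vcpus)):
        problems.append("not every vCPU has exactly one <vcpupin> entry")

    # No two vCPUs should share the same host cpuset.
    host_cpus = [pin.get("cpuset") for pin in pins]
    if len(set(host_cpus)) != len(host_cpus):
        problems.append("two vCPUs are pinned to the same host cpuset")
    return problems


# Reduced sample fragment mirroring the settings from this article.
PINS = "\n".join(
    f"    <vcpupin vcpu='{n}' cpuset='{n + 8}'/>" for n in range(8)
)
SAMPLE = f"""<domain type='kvm'>
  <vcpu placement='static'>8</vcpu>
  <cputune>
{PINS}
  </cputune>
  <cpu mode='host-passthrough' check='none'>
    <topology sockets='1' cores='4' threads='2'/>
  </cpu>
</domain>"""

print(check_domain(SAMPLE) or "no problems found")
```

To check a real machine, one could feed the output of virsh dumpxml into check_domain instead of the sample fragment.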

Updated AMD Ryzen CPU-pinning for QEMU 4.1 (todo)


0. Check updated pin numbers for Ryzen (QEMU, BIOS or hardware update?!)

2. Describe Hyper-V and the different timers

cat /sys/devices/system/clocksource/clocksource0/current_clocksource

If it is running as tsc, make sure it is enabled:

<timer name='tsc' present='yes' mode='native'/>

3. Use the command chrt -f -p 99 *PID* to switch the process with that PID to the SCHED_FIFO real-time scheduling policy at priority 99.

4. Describe testing tools
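Two of the todo items above, checking the clocksource and raising a process to real-time priority, can also be scripted. The following Python sketch is only an illustration for a Linux system: read_clocksource and set_fifo_priority are hypothetical helper names, and changing the scheduling policy normally requires root privileges (or CAP_SYS_NICE), so the second helper is defined but not called here.

```python
import os
from pathlib import Path

CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksource(base=CLOCKSOURCE_DIR):
    """Return (current clocksource, list of available clocksources)."""
    base = Path(base)
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available


def set_fifo_priority(pid, priority=99):
    """Python counterpart of 'chrt -f -p PRIORITY PID'.

    Switches the given process to SCHED_FIFO. Raises ValueError when the
    priority is outside the range the kernel reports for SCHED_FIFO.
    """
    lowest = os.sched_get_priority_min(os.SCHED_FIFO)
    highest = os.sched_get_priority_max(os.SCHED_FIFO)
    if not lowest <= priority <= highest:
        raise ValueError(f"priority must be between {lowest} and {highest}")
    os.sched_setscheduler(pid, os.SCHED_FIFO, os.sched_param(priority))


if CLOCKSOURCE_DIR.exists():
    current, available = read_clocksource()
    print(f"current clocksource: {current} (available: {', '.join(available)})")
```

A call such as set_fifo_priority(pid) with the PID of the process in question would then do the same as the chrt command from item 3.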

Passages from Libvirt XML

 

<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <name>win10-q35-hyperv</name>
  <uuid>cc37803d-a904-44cd-a333-5830ce22d20f</uuid>
  <memory unit='KiB'>16777216</memory>
  <currentMemory unit='KiB'>16777216</currentMemory>
  <memoryBacking>
    <hugepages/>
  </memoryBacking>
  <vcpu placement='static'>8</vcpu>
  <iothreads>2</iothreads>
  <cputune>
    <vcpupin vcpu='0' cpuset='8'/>
    <vcpupin vcpu='1' cpuset='9'/>
    <vcpupin vcpu='2' cpuset='10'/>
    <vcpupin vcpu='3' cpuset='11'/>
    <vcpupin vcpu='4' cpuset='12'/>
    <vcpupin vcpu='5' cpuset='13'/>
    <vcpupin vcpu='6' cpuset='14'/>
    <vcpupin vcpu='7' cpuset='15'/>
    <emulatorpin cpuset='0-3'/>
    <iothreadpin iothread='1' cpuset='0-1'/>
    <iothreadpin iothread='2' cpuset='2-3'/>
    <vcpusched vcpus='0' scheduler='fifo' priority='1'/>
    <vcpusched vcpus='1' scheduler='fifo' priority='1'/>
    <vcpusched vcpus='2' scheduler='fifo' priority='1'/>
    <vcpusched vcpus='3' scheduler='fifo' priority='1'/>
    <vcpusched vcpus='4' scheduler='fifo' priority='1'/>
    <vcpusched vcpus='5' scheduler='fifo' priority='1'/>
    <vcpusched vcpus='6' scheduler='fifo' priority='1'/>
    <vcpusched vcpus='7' scheduler='fifo' priority='1'/>
  </cputune>
  <os>
    <type arch='x86_64' machine='pc-q35-4.1'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.fd</loader>
   [...]
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
      <vendor_id state='on' value='1234567890ab'/>
    </hyperv>
    <kvm>
      <hidden state='off'/>
    </kvm>
    <vmport state='off'/>
    <ioapic driver='kvm'/>
  </features>
  <cpu mode='host-passthrough' check='none'>
    <topology sockets='1' cores='4' threads='2'/>
    <cache level='3' mode='emulate'/>
    <feature policy='require' name='topoext'/>
    <feature policy='require' name='invtsc'/>
    <feature policy='require' name='svm'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='apic'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='rtc' present='no' tickpolicy='catchup'/>
    <timer name='pit' present='no' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
    <timer name='kvmclock' present='no'/>
    <timer name='hypervclock' present='yes'/>
    <timer name='tsc' present='yes' mode='native'/>
  </clock>
  [...]
  <devices>
    <emulator>/usr/local/bin/qemu4.1-system-x86_64</emulator>
    [...]
    <memballoon model='none'/>
  </devices>
  <qemu:commandline>
    <qemu:arg value='-cpu'/>
    <qemu:arg value='host,kvm=off,hv_vendor_id=null,hv_time,migratable=no,+invtsc'/>
    <qemu:env name='QEMU_AUDIO_DRV' value='pa'/>
    <qemu:env name='QEMU_PA_SAMPLES' value='8192'/>
    <qemu:env name='QEMU_AUDIO_TIMER_PERIOD' value='99'/>
    <qemu:env name='QEMU_PA_SERVER' value='/run/user/1000/pulse/native'/>
  </qemu:commandline>
</domain>


Post updates

21.08.2019 – Added further information and todos

30.09.2019 – Updated Hyper-V settings

5 Comments

  1. JK says:

    Thanks for great configuration. Just bought Ryzen 2700 due to similar reasons – I had problems with guest latency, hope extra cores solve this issue.

    1. Mathias Hueber says:

      Let me know if it helps, I am always looking for further tweaking suggestions.

  2. […] I used AMD μProf to help with mapping out my 3700X (and this writeup on CPU-pinning): […]

  3. Ian says:

    Not sure why, but it looks like the physical cores for my 1700 are now arranged differently.
    Core 0 uses threads 0&8, core 1 uses 1&9, core 2 uses 2&10, etc.
    However the logical indexes haven’t changed, so not quite sure which one I must use

    1. Mathias Hueber says:

      Ohh, that is strange. Which BIOS version are you running? Is it still working though? I have read that some newer BIOS versions might break vfio passthrough altogether. See https://forum.level1techs.com/t/attention-amd-vfio-users-do-not-update-your-bios/142685
