...

...

Series fixing the regression: https://lore.kernel.org/qemu-devel/20241007172317.1439564-2-pbonzini@redhat.com/

This series was merged in QEMU and will be available in the next stable release and in QEMU 9.2.0. Meanwhile, build QEMU from the master branch.

The following results were obtained with this fix.

...

qemu-system-x86_64 -accel tcg -cpu max

  • -smp 1: 1036s (x34) (image: x64_perf.data.smp_1.run0.svg)

  • -smp 2: 410s (x13) (image: x64_perf.data.smp_2.run0.svg)

  • -smp 4: 280s (x9) (image: x64_perf.data.smp_4.run1.svg)

  • -smp 6: 260s (x8) (image: x64_perf.data.smp_6.run0.svg)

  • -smp 8: 260s (x8) (image: x64_perf.data.smp_8.run0.svg)

We can see that the speedup beyond -smp 2 is not linear. While booting, the QEMU process barely reaches 500% CPU time in top. This appears to be a limitation of the Android boot sequence, which does not seem able to use more than 4 cores.
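To make the non-linear scaling concrete, here is a minimal Python sketch that derives the speedup relative to -smp 2 from the boot times listed above. The function name `scaling_vs_baseline` and the "efficiency" column (measured speedup divided by the ideal linear speedup) are illustrative choices, not part of the original measurements.

```python
# Boot times in seconds for qemu-system-x86_64 -accel tcg -cpu max,
# copied from the list above (key = value passed to -smp).
X64_BOOT_SECONDS = {1: 1036, 2: 410, 4: 280, 6: 260, 8: 260}


def scaling_vs_baseline(boot_seconds, baseline_smp=2):
    """Return (smp, speedup, efficiency) rows relative to baseline_smp.

    speedup    = baseline boot time / boot time at this smp
    efficiency = speedup / (smp / baseline_smp), i.e. 1.0 means linear scaling
    (hypothetical helper, for illustration only)
    """
    base_time = boot_seconds[baseline_smp]
    rows = []
    for smp in sorted(boot_seconds):
        if smp < baseline_smp:
            continue  # only look at configurations at or above the baseline
        speedup = base_time / boot_seconds[smp]
        ideal = smp / baseline_smp
        rows.append((smp, speedup, speedup / ideal))
    return rows


if __name__ == "__main__":
    for smp, speedup, efficiency in scaling_vs_baseline(X64_BOOT_SECONDS):
        print(f"-smp {smp}: x{speedup:.2f} vs -smp 2, efficiency {efficiency:.0%}")
```

With these numbers, doubling from 2 to 4 cores yields well under a 2x gain, and going from 6 to 8 cores yields no gain at all, which is consistent with the boot sequence not using more than about 4 cores.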

...

qemu-system-aarch64 -accel tcg -cpu max,pauth=off

  • -smp 1: 1034s (x34) (image: arm64_perf.data.smp_1.run1.svg)

  • -smp 2: 512s (x17) (image: arm64_perf.data.smp_2.run0.svg)

  • -smp 4: 380s (x12) (image: arm64_perf.data.smp_4.run0.svg)

  • -smp 6: 360s (x12) (image: arm64_perf.data.smp_6.run0.svg)

  • -smp 8: 375s (x12) (image: arm64_perf.data.smp_8.run0.svg)

We can see that disabling pointer authentication (pauth=off) results in much faster execution, as expected.
Performance is close to what we observe when booting the x64 version at -smp 1 (1034s vs 1036s). With more cores aarch64 is slower, by about 25% at -smp 2 and about 44% at -smp 8.
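The aarch64 overhead relative to x64 can be computed directly from the two result lists. The following Python sketch does this; the helper name `overhead_percent` is hypothetical, and the numbers are only the boot times reported above.

```python
# Boot times in seconds, copied from the two result lists above
# (key = value passed to -smp).
X64_BOOT_SECONDS = {1: 1036, 2: 410, 4: 280, 6: 260, 8: 260}
ARM64_BOOT_SECONDS = {1: 1034, 2: 512, 4: 380, 6: 360, 8: 375}


def overhead_percent(arm64, x64):
    """Return {smp: extra boot time of aarch64 over x64, in percent}.

    A negative value means aarch64 booted faster than x64 at that smp.
    (hypothetical helper, for illustration only)
    """
    return {
        smp: 100.0 * (arm64[smp] - x64[smp]) / x64[smp]
        for smp in sorted(x64)
        if smp in arm64
    }


if __name__ == "__main__":
    for smp, pct in overhead_percent(ARM64_BOOT_SECONDS, X64_BOOT_SECONDS).items():
        print(f"-smp {smp}: aarch64 overhead {pct:+.1f}%")
```

Each configuration here comes from a single listed timing, so small differences (such as the one at -smp 1) should be read as noise rather than as a real advantage.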

...

  • 4 cores

  • -cpu max (,pauth=off on aarch64)

  • ensure that cmpxchg is used on x64 (massive difference with smp > 1). This series https://lore.kernel.org/qemu-devel/20241007172317.1439564-2-pbonzini@redhat.com/ was merged in QEMU and will be available in the next stable release and in 9.2.0. Meanwhile, use a QEMU compiled from the master branch.

  • The performance difference between aarch64 and x64 can be explained by TLB management on aarch64 and by some helpers.