qemu vs qemu-kvm: some performance measurements

Posted on 2024-10-26 10:17:50


I conducted the following benchmark in qemu and qemu-kvm, with the following configuration:

CPU: AMD 4400 dual-core processor with SVM enabled, 2 GB RAM
Host OS: OpenSUSE 11.3 with the latest patches, running KDE4
Guest OS: FreeDOS
Emulated Memory: 256 MB
Network: none
Language: Turbo C 2.0
Benchmark Program: Count from 0000000 to 9999999, displaying the counter on the screen
     by directly accessing the screen memory (i.e. 0xB800:xxxx)

It only takes 6 sec when running in qemu.

But it takes 89 sec when running in qemu-kvm.

I ran the benchmark one by one, not in parallel.

I scratched my head the whole night, but I still have no idea why this happens. Could somebody give me some hints?


Comments (3)

旧瑾黎汐 2024-11-02 10:17:50


KVM uses QEMU as its device emulator; any device operation is emulated by the user-space QEMU process. When you write to 0xB8000, the graphics display is updated, which involves the guest doing a CPU `vmexit` from guest mode back into the KVM module, which in turn sends a device-emulation request to the user-space QEMU backend.

In contrast, QEMU without KVM does all the work in a single process, apart from the usual system calls, so there are far fewer CPU context switches. Meanwhile, your benchmark code is a simple loop that only needs to be translated into a code block once. That costs almost nothing compared to the vmexit and kernel-user communication on every iteration in the KVM case.

This is the most probable cause.

溺孤伤于心 2024-11-02 10:17:50


Your benchmark is an IO-intensive benchmark, and all the IO devices are actually the same for qemu and qemu-kvm; in qemu's source code they can be found under hw/*.

This explains why qemu-kvm cannot be much faster than qemu here. However, I have no definitive answer for the slowdown. I have the following explanation, and I think it is correct to a large extent.

"The qemu-kvm module uses the kvm kernel module in the Linux kernel. This runs the guest in x86 guest mode, which causes a trap on every privileged instruction. By contrast, qemu uses the very efficient TCG, which translates the instructions it sees the first time. I think the high cost of those traps is what shows up in your benchmark." This isn't true for all IO devices, though. An Apache benchmark would run better on qemu-kvm, because the library does the buffering and uses the minimum number of privileged instructions to do the IO.

调妓 2024-11-02 10:17:50


The reason is that too many VMEXITs take place.
