What is the fastest virtual machine design for x86?

Posted on 2024-07-10 20:20:22

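I am going to implement a virtual machine on x86, and I wonder what kind of design would yield the best results. What should I concentrate on to squeeze out the most performance? I will implement the whole virtual machine in x86 assembly.

I don't have many instructions and I can choose their form. The instructions map directly onto Smalltalk's syntax within blocks. Here is the instruction design I was thinking of: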

^ ...       # return
^null     # return nothing
object    # address to object
... selector: ... # message pass (in this case arity:1 selector: #selector:)
var := ... # set
var # get

The sort of VM I was thinking about:

mov eax, [esi]            # load from the instruction pointer
add esi, 2                # each instruction is 2 bytes wide
mov ecx, eax
and eax, 0xff             # keep the low byte
and ecx, 0xff00           # keep the high byte (*256)
shr ecx, 5                # *8
jmp [ecx*4 + operations]
align 8
operations:
    dd retnull
    dd ret
    # so on...
retnull:                  # jumps here at retnull
    # ... retnull action
ret:
    # ... ret action
    # etc.
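
For comparison, the decode-and-dispatch idea in the listing above can be sketched in C. This is only an illustration under assumptions: the 16-bit word layout (operand in the low byte, opcode in the high byte) is read off the masks in the assembly, while the opcodes `OP_RET`, `OP_PUSH`, `OP_ADD`, the `Vm` struct and the tiny stack machine are hypothetical and are not the Smalltalk-like instruction set listed earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes, for illustration only. */
enum { OP_RET = 0, OP_PUSH = 1, OP_ADD = 2, OP_COUNT };

typedef struct {
    const uint16_t *ip;   /* instruction pointer (ESI in the assembly) */
    int32_t stack[64];
    size_t sp;
    int running;
} Vm;

typedef void (*Handler)(Vm *vm, uint8_t operand);

static void op_ret(Vm *vm, uint8_t operand) {
    (void)operand;
    vm->running = 0;
}

static void op_push(Vm *vm, uint8_t operand) {
    assert(vm->sp < 64);
    vm->stack[vm->sp++] = operand;
}

static void op_add(Vm *vm, uint8_t operand) {
    (void)operand;
    assert(vm->sp >= 2);
    vm->stack[vm->sp - 2] += vm->stack[vm->sp - 1];
    vm->sp--;
}

/* Jump table indexed by opcode, in the spirit of the `operations` table. */
static const Handler operations[OP_COUNT] = { op_ret, op_push, op_add };

/* Build one 16-bit instruction word: opcode in the high byte, operand low. */
#define INSN(op, arg) ((uint16_t)(((op) << 8) | ((arg) & 0xff)))

/* Runs until OP_RET and returns the top of the stack (0 if it is empty). */
int32_t vm_run(const uint16_t *code) {
    Vm vm = { .ip = code, .sp = 0, .running = 1 };
    while (vm.running) {
        uint16_t word = *vm.ip++;                 /* fetch, then advance 2 bytes */
        uint8_t operand = (uint8_t)(word & 0xff); /* low byte */
        uint8_t opcode = (uint8_t)(word >> 8);    /* high byte selects the handler */
        assert(opcode < OP_COUNT);
        operations[opcode](&vm, operand);
    }
    return vm.sp ? vm.stack[vm.sp - 1] : 0;
}
```

With this encoding, every dispatch pays for a load, two masks or shifts, and an indirect call through the table; that decode cost is what the dispatch-related answers below try to reduce.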

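Don't start asking why I need yet another virtual machine implementation. Interpretive routines aren't stock items you just pick up whenever you need them. Most of the virtual machines proposed elsewhere favor portability at the cost of performance. My goal is not portability; my goal is performance.

The reason this interpreter is needed at all is that Smalltalk blocks don't all end up being interpreted the same way: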

A := B subclass: [
    def a:x [^ x*x]
    clmet b [...]
    def c [...]
    def d [...]
]

[ 2 < x ] whileTrue: [...]

(i isNeat) ifTrue: [...] ifFalse: [...]

List fromBlock: [
    "carrots"
    "apples"
    "oranges" toUpper
]

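I need the real benefit that interpretive routines provide, namely the choice of the context in which the program is read. Of course, a good compiler should, most of the time, simply compile the obvious cases such as 'ifTrue:ifFalse:' or 'whileTrue:', or the list example. The need for an interpreter doesn't just disappear, though, because you may always hit a case where you can't be sure the block gets the treatment you expect.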

攒眉千度 2024-07-17 20:20:23

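I have to ask: why create a virtual machine with a focus on performance? Why not just write x86 code directly? Nothing can possibly be faster.

If you want a very fast interpreted language, look at Forth. Its design is very tidy and very easy to copy.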

梦在深巷 2024-07-17 20:20:23

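If you don't like JIT and your goal is not portability, I think you may be interested in the Google NativeClient project. They do static analysis, sandboxing and so on, and they allow the host to execute raw x86 instructions.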

一花一树开 2024-07-17 20:20:22

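I see there is some confusion about portability here, so I feel obliged to clarify a few things. These are my humble opinions, so you are of course free to object to them.

I assume you have come across http://www.complang.tuwien.ac.at/forth/threading/ if you are seriously considering writing a VM, so I won't dwell on the techniques described there.

As already mentioned, targeting a VM has some advantages, such as reduced code size, reduced compiler complexity (which often translates to faster compilation), and portability (note that the point of a VM is portability of the language, so it doesn't matter if the VM itself is not portable).

Considering the dynamic nature of your example, your VM will resemble a JIT compiler more than other, more popular ones do. So, although S.Lott missed the point in this case, his mention of Forth is very much on the spot. If I were to design a VM for a very dynamic language, I would separate interpretation into two stages:

1. A producer stage, which consults an AST stream on demand and transforms it into a more meaningful form (for example, taking a block and deciding whether it should be executed right away or stored somewhere for later execution), possibly introducing new kinds of tokens. Essentially, this is where you recover the context-sensitive information that may have been lost in parsing.
2. A consumer stage, which fetches the stream generated by stage 1 and executes it blindly, like any other machine. If you make it Forth-like, you can just push a stored stream and be done with it, instead of jumping the instruction pointer around.

As you say, just mimicking the way the damn processor works in another way doesn't accomplish any of the dynamism (or any other feature worth a damn, such as security) that you require. Otherwise, you would be writing a compiler.

Of course, you can add arbitrarily complex optimizations in stage 1.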
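
The two stages described above could be sketched roughly as follows in C. Everything here is a hypothetical illustration rather than the answerer's design: the token kinds (`T_VALUE` for "run this block now", `T_TWICE` for "run this block twice later"), the `Program` layout, and the assumption that blocks do not nest and contain only literals and additions.

```c
#include <assert.h>
#include <stddef.h>

/* Tokens as a parser might hand them over (hypothetical). */
typedef enum { T_LIT, T_ADD, T_OPEN, T_CLOSE, T_VALUE, T_TWICE, T_END } TokKind;
typedef struct { TokKind kind; int value; } Token;

/* Operations the consumer executes blindly (hypothetical). */
typedef enum { O_LIT, O_ADD, O_CALL, O_END } OpKind;
typedef struct { OpKind kind; int value; } Op;

enum { MAX_OPS = 64, MAX_BLOCKS = 8 };

typedef struct {
    Op main_ops[MAX_OPS];
    Op block_ops[MAX_BLOCKS][MAX_OPS];   /* stored streams, one per block */
    int block_count;
} Program;

/* Translate one token that is not part of the block syntax. */
static void emit_simple(const Token *t, Op *dst, size_t *n) {
    assert(*n < MAX_OPS - 1);            /* keep room for the final O_END */
    if (t->kind == T_LIT) dst[(*n)++] = (Op){ O_LIT, t->value };
    else if (t->kind == T_ADD) dst[(*n)++] = (Op){ O_ADD, 0 };
    else assert(0 && "token not handled in this sketch");
}

/* Stage 1: decide per block whether to inline it or store it for later. */
void produce(const Token *in, Program *out) {
    size_t n = 0;
    out->block_count = 0;
    while (in->kind != T_END) {
        if (in->kind != T_OPEN) { emit_simple(in++, out->main_ops, &n); continue; }
        const Token *body = in + 1, *close = body;
        while (close->kind != T_CLOSE) { assert(close->kind != T_END); close++; }
        const Token *after = close + 1;
        if (after->kind == T_VALUE) {
            /* Context says "run now": inline the body into the main stream. */
            for (; body < close; body++) emit_simple(body, out->main_ops, &n);
        } else {
            /* Otherwise store the body as its own stream ... */
            assert(after->kind == T_TWICE && out->block_count < MAX_BLOCKS);
            int id = out->block_count++;
            size_t m = 0;
            for (; body < close; body++) emit_simple(body, out->block_ops[id], &m);
            out->block_ops[id][m] = (Op){ O_END, 0 };
            /* ... and call it twice from the main stream. */
            assert(n < MAX_OPS - 2);
            out->main_ops[n++] = (Op){ O_CALL, id };
            out->main_ops[n++] = (Op){ O_CALL, id };
        }
        in = after + 1;
    }
    out->main_ops[n] = (Op){ O_END, 0 };
}

/* Stage 2: execute the stream; O_CALL pushes a stored stream, Forth-style. */
int consume(const Program *p) {
    int stack[MAX_OPS];
    size_t sp = 0;
    const Op *rstack[MAX_BLOCKS];        /* return stack of suspended streams */
    size_t rsp = 0;
    const Op *ip = p->main_ops;
    for (;;) {
        Op op = *ip++;
        switch (op.kind) {
        case O_LIT: assert(sp < MAX_OPS); stack[sp++] = op.value; break;
        case O_ADD: assert(sp >= 2); stack[sp - 2] += stack[sp - 1]; sp--; break;
        case O_CALL: assert(rsp < MAX_BLOCKS); rstack[rsp++] = ip; ip = p->block_ops[op.value]; break;
        case O_END:
            if (rsp == 0) return sp ? stack[sp - 1] : 0;
            ip = rstack[--rsp];
            break;
        }
    }
}
```

In this sketch the consumer never inspects context: all decisions about when a block runs are made once in `produce`, which is also where any extra optimization would go.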

つ可否回来 2024-07-17 20:20:22

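If you want something really fast, try using LLVM. It can generate native code for most processors from a high-level program description. You can either go with your own assembly language or skip the assembly phase and generate the LLVM structures directly, depending on what you find most convenient.

I'm not sure whether it is the best fit for your problem, but it is definitely what I would use if I had to do performance-critical execution of code that can't be compiled together with the rest of the program.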

思慕 2024-07-17 20:20:22

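Most of the time, the point of an interpreter is portability. The fastest approach I can think of is to generate x86 code directly in memory, just as JIT compilers do, but then, of course, you no longer have an interpreter. You have a compiler.

However, I'm not sure that writing the interpreter in assembler will give you the best performance (unless you are an assembler guru and your project is very limited in scope). Using a higher-level language can help you focus on better algorithms for, say, symbol lookup and register allocation strategies.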

后来的我们 2024-07-17 20:20:22

You can speed up your dispatch routine by using an unencoded instruction set:

mov eax, [esi]
add esi, 4
add eax, pOpcodeTable
jmp eax

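This should have an overhead of fewer than 4 cycles per dispatch on CPUs later than the Pentium 4.

In addition, for performance reasons it is better to increment ESI (the IP) inside each primitive routine, because chances are high that the increment can be paired with other instructions: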

mov eax, [esi]
add eax, pOpcodeTable
jmp eax

That leaves roughly 1-2 cycles of overhead.
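
A rough C analogue of this "unencoded" scheme is sketched below, under stated assumptions: the `Cell` union, the primitives `prim_lit`, `prim_add`, `prim_halt` and the small stack machine are hypothetical, and since portable C has no equivalent of `jmp eax`, a small loop that calls through the stored pointer stands in for the tail jump. The two ideas it tries to mirror are that each cell of the program already holds the handler's address (so nothing is decoded), and that each primitive advances the instruction pointer itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct Vm;
typedef void (*Prim)(struct Vm *vm);

/* One cell of threaded code: a primitive's address or inline data. */
typedef union { Prim prim; int32_t imm; } Cell;

typedef struct Vm {
    const Cell *ip;       /* plays the role of ESI */
    int32_t stack[64];
    size_t sp;
    int running;
} Vm;

/* Each primitive advances ip itself, as the answer suggests. */
static void prim_lit(Vm *vm) {
    assert(vm->sp < 64);
    vm->stack[vm->sp++] = vm->ip[1].imm;   /* operand follows the handler cell */
    vm->ip += 2;
}

static void prim_add(Vm *vm) {
    assert(vm->sp >= 2);
    vm->stack[vm->sp - 2] += vm->stack[vm->sp - 1];
    vm->sp--;
    vm->ip += 1;
}

static void prim_halt(Vm *vm) {
    vm->running = 0;
}

/* Runs until prim_halt and returns the top of the stack (0 if it is empty). */
int32_t run_threaded(const Cell *code) {
    Vm vm = { .ip = code, .sp = 0, .running = 1 };
    while (vm.running)
        vm.ip->prim(&vm);   /* no decode step: the cell is the jump target */
    return vm.sp ? vm.stack[vm.sp - 1] : 0;
}
```

A program is then just an array of cells, for example `{.prim = prim_lit}, {.imm = 40}, {.prim = prim_lit}, {.imm = 2}, {.prim = prim_add}, {.prim = prim_halt}`. The cost of this layout is larger code, since every instruction occupies a full pointer-sized cell instead of two bytes.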

About the author

祁梦
