What is a virtual machine? Why do dynamic languages need one?

Posted 2024-10-11 03:20:26

So, for example, Python and Java have a VM, while C and Haskell do not. (Correct me if I'm wrong.)

Looking at which languages fall on each side of that line, I can't find the reason. Java is static in a lot of ways, while Haskell provides a lot of dynamic features.

Comments (8)

花开柳相依 2024-10-18 03:20:26

It has nothing to do with static vs. dynamic.

Rather, it's about becoming independent of the underlying hardware platform ("build once, run everywhere", in theory...).

Actually, it has nothing to do with the language either. One could write a C compiler that generates bytecode for the JVM. One could write a Java compiler that generates x86 machine code.

爱本泡沫多脆弱 2024-10-18 03:20:26

Let's forget about VMs for a sec (we'll get back to those below, I promise), and start with this important fact:

C doesn't have garbage collection.

For a language to provide garbage collection, there has to be some sort of "runtime"/runtime-environment/thing that will perform it.

That's why Python, Java, and Haskell require a "runtime", while C, which has no garbage collection, can compile straightforwardly to native code.

Note that psyco was a Python optimizer that compiled Python code to machine code; however, a lot of that machine code consisted of calls to CPython runtime functions such as PyImport_AddModule, PyImport_GetModuleDict, etc.

Haskell/GHC is in a similar boat to psyco-compiled Python. Ints are added with simple machine instructions, but more complicated things, such as allocating objects, invoke the runtime.

What else?

C doesn't have "exceptions"

If we were to add exceptions to C, our generated machine code would need to do some stuff for every function and for every function call.

If we then add "closures" as well, there would be more stuff added.

Now, instead of having this boilerplate machine code repeated in every function, we could make it instead call a subprocedure to do the necessary stuff, something like PyErr_Occurred.

So now, basically every original source line maps to some calls to some functions and a smaller unique part.

But as long as we're doing so much stuff per original source code line, why even bother with machine code?

Here's an idea (btw let's call this idea a "Virtual Machine").

Let's represent your Python code, which is for example:

def has_no_letters(text):
  return text.upper() == text.lower()

As an in-memory data-structure, for example:

{ 'func_name': 'has_no_letters',
  'num_args': 1,
  'kwargs': [],
  'codez': [
    ('get_attr', 'tmp_a', 'arg_0', 'upper'),  # tmp_a = arg_0.upper
    ('func_call', 'tmp_b', 'tmp_a', []),  # tmp_b = tmp_a() # tmp_b = arg_0.upper()
    ('get_attr', 'tmp_c', 'arg_0', 'lower'),
    ('func_call', 'tmp_d', 'tmp_c', []),
    ('get_global', 'tmp_e', '=='),
    ('func_call', 'tmp_f', 'tmp_e', ['tmp_b', 'tmp_d']),
    ('return', 'tmp_f'),
  ]
}

Now, let's write an interpreter that executes this in-memory data structure.
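
As a rough illustration of what such an interpreter could look like, here is a minimal sketch. It assumes a register-style design where every name such as arg_0 or tmp_a is a slot in a dictionary, and that 'get_global' looks names up in a small table where '==' maps to operator.eq. The names run, GLOBALS, and HAS_NO_LETTERS are made up for this example and are not part of the answer.

```python
import operator

# The function description from the answer, unchanged.
HAS_NO_LETTERS = {
    'func_name': 'has_no_letters',
    'num_args': 1,
    'kwargs': [],
    'codez': [
        ('get_attr', 'tmp_a', 'arg_0', 'upper'),
        ('func_call', 'tmp_b', 'tmp_a', []),
        ('get_attr', 'tmp_c', 'arg_0', 'lower'),
        ('func_call', 'tmp_d', 'tmp_c', []),
        ('get_global', 'tmp_e', '=='),
        ('func_call', 'tmp_f', 'tmp_e', ['tmp_b', 'tmp_d']),
        ('return', 'tmp_f'),
    ],
}

# Assumption: "globals" live in this table, and '==' is just a function.
GLOBALS = {'==': operator.eq}


def run(func, *args):
    """Execute one function description and return its result."""
    if len(args) != func['num_args']:
        raise TypeError(f"{func['func_name']} expects {func['num_args']} argument(s)")
    # Registers: arg_0, arg_1, ... plus whatever tmp_* slots the code writes.
    regs = {f'arg_{i}': value for i, value in enumerate(args)}
    for instr in func['codez']:
        op = instr[0]
        if op == 'get_attr':
            _, dest, obj, name = instr
            regs[dest] = getattr(regs[obj], name)
        elif op == 'get_global':
            _, dest, name = instr
            regs[dest] = GLOBALS[name]
        elif op == 'func_call':
            _, dest, callee, arg_names = instr
            regs[dest] = regs[callee](*(regs[a] for a in arg_names))
        elif op == 'return':
            return regs[instr[1]]
        else:
            raise ValueError(f'unknown instruction: {op!r}')
    return None


print(run(HAS_NO_LETTERS, '1234'))  # True
print(run(HAS_NO_LETTERS, 'abc'))   # False
```

The whole "machine" is one loop with a branch per instruction kind, which is why adding a new instruction is usually a matter of adding one more branch.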

Let's discuss the benefits of this over direct-from-text-interpreters, and then the benefits over compiling to machine code.

The benefits of VMs over direct-from-text-interpreters

  • The VM system gives you all the syntax errors before executing the code.
  • When evaluating a loop, a VM system doesn't parse the source code each time it runs.
    • This makes the VM faster than the direct-from-text interpreter.
    • A direct interpreter also runs slower with long variable names and faster with short ones, which encourages people to write crappy mathematician-style code such as wt(f, d(o, e), s) <= th(i, s) + cr(a, p * d + o)
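
Both points above can be seen with Python's built-in compile() and exec(). This is only a small sketch: the loop body, the "<loop-body>" and "<broken>" labels, and the variable names are invented for the example.

```python
# Parse once: compile() turns the source text into a code object up front.
SOURCE = "total = total + i * i"
step = compile(SOURCE, "<loop-body>", "exec")

env = {"total": 0}
for i in range(5):
    env["i"] = i
    exec(step, env)   # no re-parsing here, only execution
print(env["total"])   # 0 + 1 + 4 + 9 + 16 = 30

# Syntax errors surface at compile time, before any statement has run,
# so the print() on the first line of BROKEN is never executed.
BROKEN = "print('side effect')\nx = = 1"
try:
    compile(BROKEN, "<broken>", "exec")
    caught = False
except SyntaxError:
    caught = True
print(caught)         # True
```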

The benefits of VMs over compiling to machine code

  • The in-memory data structure describing the program, or the "VM code", will probably be much more compact than boilerplate-heavy machine code that repeats the same work for every original line of code. This can make the VM system run faster, because fewer "instructions" need to be fetched from memory.
  • Creating a VM is much simpler than creating a compiler to machine code. You can probably do this now without even knowing any assembly/machine-code.

遥远的她 2024-10-18 03:20:26

A virtual machine is basically an interpreter that interprets a language closer to machine code. Where a real machine interprets real machine code, a virtual machine interprets a made-up machine code. Some VMs interpret the machine code of an actual computer; these are called emulators.

It's easier to write an interpreter for a simple assembly-like language than for the full high-level language. Besides, a lot of high-level code constructs are often just syntactic sugar over some basic principles. So it's easier to write a compiler that translates all those complex concepts into a simple VM language; that way we don't have to write a complex interpreter and can get away with a simple one (a VM). And then you have more time for optimizing the VM.
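
To make that split concrete, here is a toy sketch of a compiler plus a very simple VM. It borrows Python's own ast module as the front end, and the instruction names (PUSH, LOAD, ADD, SUB, MUL) as well as compile_expr and run are invented for this example. It handles only arithmetic expressions.

```python
import ast
import operator

# Hypothetical VM instruction set: PUSH <const>, LOAD <name>, and one
# opcode per supported binary operator.
BINARY_OPS = {
    ast.Add: ('ADD', operator.add),
    ast.Sub: ('SUB', operator.sub),
    ast.Mult: ('MUL', operator.mul),
}
OPCODE_IMPL = {name: fn for name, fn in BINARY_OPS.values()}


def compile_expr(source):
    """Translate an arithmetic expression into a flat list of VM instructions."""
    code = []

    def emit(node):
        if isinstance(node, ast.Constant):
            code.append(('PUSH', node.value))
        elif isinstance(node, ast.Name):
            code.append(('LOAD', node.id))
        elif isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
            emit(node.left)    # operands first ...
            emit(node.right)
            code.append((BINARY_OPS[type(node.op)][0],))  # ... then the operator
        else:
            raise SyntaxError(f'unsupported construct: {ast.dump(node)}')

    emit(ast.parse(source, mode='eval').body)
    return code


def run(code, variables):
    """The whole VM: a loop over instructions and a value stack."""
    stack = []
    for instr in code:
        op = instr[0]
        if op == 'PUSH':
            stack.append(instr[1])
        elif op == 'LOAD':
            stack.append(variables[instr[1]])
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPCODE_IMPL[op](left, right))
    return stack.pop()


program = compile_expr('price * 2 + 3')
print(program)
print(run(program, {'price': 10}))  # 23
```

All knowledge of syntax, such as operator precedence and parentheses, stays in the compiler. The VM only ever sees a flat instruction list, which is what keeps it simple.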

That's basically how most languages these days (that don't compile down to real machine code) are implemented.

The interpreter (VM) and compiler can either be separate programs (like java and javac), or they can be just one program (like with Ruby or Python).
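
For the Python case, the built-in compiler can be observed from inside the same process with the standard dis module. This sketch assumes CPython; the function is just a small example, and the exact opcode names differ between versions, so no fixed output is shown.

```python
import dis


def has_no_letters(text):
    return text.upper() == text.lower()


# CPython compiled the function body to bytecode when the `def` ran; the
# resulting code object is attached to the function as __code__.
opnames = [instr.opname for instr in dis.get_instructions(has_no_letters)]
print(opnames)                                 # varies by CPython version
print(type(has_no_letters.__code__.co_code))   # <class 'bytes'>
```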

月朦胧 2024-10-18 03:20:26

A VM (Virtual Machine) is actually a tool that lets a language designer avoid some of the complexity of writing a language implementation.

It is basically a specification of a virtual computer and of how each piece of that computer interacts with the others. You can encode assumptions in this specification, which the actual language may or may not use.

In this specification you usually define how the processor or processors work, how the memory works, which read/write barriers are possible, etc., along with a simpler assembly language for interacting with it.

The final language is usually translated (compiled) from the text files you write into a representation written for that machine.

This has some advantages:

  • you decouple the language from a specific hardware architecture
  • it usually lets you control what happens
  • different people can port it to different architectures
  • you have more information with which to optimize the code
  • etc.

There is also the coolness factor: Look Ma, I made a virtual machine :).

木格 2024-10-18 03:20:26

From the Wikipedia entry on virtual machines:

"A virtual machine (VM) is a software implementation of a machine (i.e. a computer) that executes programs like a physical machine."

The greatest asset of virtual machines is, in theory, code portability: "write once, run anywhere".

Probably the best-known example of a virtual machine is the JVM, originally designed to run Java code, but now also increasingly used for languages such as Clojure and Scala.

There's nothing specific to dynamic languages that means they need a VM. They do however need an interpreter, which could be built on a VM.

乖乖 2024-10-18 03:20:26

There's no "need": any of these languages could provide a compiler that directly emits machine code implementing the language's semantics on a given architecture.

The idea of a virtual machine is to abstract away the architectural differences between all the different hardware and software manufacturers so that developers have a single machine to write to.

捎一片雪花 2024-10-18 03:20:26

Java and Python can be compiled in a way that maintains platform independence. This holds even for C#. The advantage is that VMs can convert this mostly strongly typed bytecode into very good platform-specific code with relatively low overhead. Since Java is intended to be "build once, run anywhere", the JVM was created.

迷离° 2024-10-18 03:20:26

Imagine you created a programming language: you figured out the language semantics and developed a nice syntax.

However, a textual representation isn't enough: having to parse text again and again when executing a program is inefficient, so it's natural to add an in-memory binary representation. Couple that with a custom memory manager, and you've basically got a VM.

Now, for extra points, either develop a bytecode format for serialization of your in-memory representation and a runtime loader, or, if you want to go the way of scripting languages, an eval() function.

For the grand finale, add a JIT.
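
CPython happens to expose both of the "extra points" options, which makes for a compact sketch. The standard marshal module can serialize a code object to bytes and load it back (its format is specific to the CPython version, so this is an illustration, not a portable file format), and eval() covers the scripting-language route. The source string and variable names below are invented for the example.

```python
import marshal

source = "result = sum(n * n for n in range(4))"
code = compile(source, "<demo>", "exec")

# Serialize the in-memory representation to bytes ...
blob = marshal.dumps(code)

# ... and later load and run it without touching the source text again.
loaded = marshal.loads(blob)
env = {}
exec(loaded, env)
print(env["result"])   # 0 + 1 + 4 + 9 = 14

# The scripting-language route: evaluate source text directly at run time.
answer = eval("6 * 7")
print(answer)          # 42
```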
