User-defined data types/operations and the CPU instruction set

Posted on 2024-09-26 03:50:23

In any programming environment, whatever data type I choose, the CPU ultimately performs only arithmetic and logical operations.

How does this transition (from user-defined data types and operations to the CPU instruction set) happen, and what roles do the compiler, interpreter, assembler, and linker play in this life cycle?

Also, how does OOP handle this mapping, given that in the worst case almost everything is an object (I mean in the Java language)?

Comments (2)

花开雨落又逢春i 2024-10-03 03:50:23

Java source --> native code translation actually happens in two distinct steps: the conversion from source code to bytecode at compile time (that's what javac does), and the conversion from bytecode to native CPU instructions at runtime (that's what java does).

When the source code is being "compiled", the fields and methods get condensed into entries in a symbol table. You say "System.out.println()", and javac turns it into something like "get the static field referenced by symbol #2004, and invoke the method referred to by symbol #300 on it" (where #2004 might be "System.out" and #300 might be "void java.io.PrintStream.println()"). (Note: I'm oversimplifying a lot. The symbols look nothing like that, and they're split up a bit more. But they do contain that kind of info.)

At runtime, the JVM looks at those symbols, loads the classes referred to in them, and runs (or generates, if it's JITting) the native instructions necessary to find and execute the method. There's no real "linker" in Java; all the linking is done at runtime, based on the classes referenced. It's a lot like how DLLs work in Windows.

The JIT is about the closest thing there is to an "assembler". It takes the bytecode and generates equivalent native code on the fly. The bytecode isn't in human-readable form, though, so I wouldn't normally count the translation as "assembling".

...

In languages like C and C++ (not C++/CLI), the story is quite different. All of the translation (and a good bit of linking) happens at compile time. Access to members of a struct gets converted into something like "give me the int 4 bytes from the beginning of this particular bunch of bytes". There's no flexibility there; if the struct's layout changes, generally the whole app has to be recompiled.
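As a minimal sketch of the "int N bytes from the beginning" idea, the C snippet below reads a struct member through a raw byte offset instead of by name. The `Record` struct and the functions `read_score_by_offset` and `check_record_layout` are hypothetical names invented for this illustration; a real compiler emits the equivalent load directly rather than calling `memcpy`.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof */
#include <string.h>  /* memcpy */

/* Hypothetical struct, used only for illustration. */
struct Record {
    int id;     /* first member: always at offset 0 */
    int score;  /* at a fixed offset the compiler knows at compile time */
};

/* Roughly what `r->score` amounts to: read an int located a fixed
   number of bytes after the start of the object. */
int read_score_by_offset(const void *bytes)
{
    int value;
    memcpy(&value,
           (const unsigned char *)bytes + offsetof(struct Record, score),
           sizeof value);
    return value;
}

/* Named access and offset-based access see the same bytes. */
void check_record_layout(void)
{
    struct Record r = { 7, 42 };
    assert(read_score_by_offset(&r) == r.score);
}
```

Because the offset is baked into the compiled code, adding a member before `score` would change the offset, which is why code using the struct generally has to be recompiled when its layout changes.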

柏拉图鍀咏恒 2024-10-03 03:50:23

Consider the starting point of a language that has only integers and floats of various sizes, and a type that points into memory that lets us have pointers to those types.

The correlation from this to the machine code the CPU uses would be relatively clear (though in fact we might well optimise beyond that).

Characters we can add by storing code-points in some encoding, and strings we build as arrays of such characters.

Now let's say we want to move this to the point where we can have something like:

class User
{
  int _id;
  char* _username;
  public User(int id, char* username)
  {
    _id = id;
    _username = username;
  }
  public virtual bool IsDefaultUser()
  {
    return _id == 0;
  }
}

The first thing we need to add to our language is some sort of struct/class construct that contains the members. Then we can have as far as:

class User
{
  int _id;
  char* _username;
}

Our compiling process knows that this means storing an integer followed by a pointer to an array of characters. It therefore knows that accessing _id means accessing the integer at the address of the start of the structure, and accessing _username means accessing the pointer to char at a given offset from the address of the start of the structure.

Given this, the constructor can exist as a function that does something like:

  User* _ctor_User(int id, char* username)
  {
    User* toMake = ObtainMemoryForUser();
    toMake->_id = id;
    toMake->_username = ObtainMemoryAndCopyString(username);
    return toMake;
  }

Obtaining memory and cleaning it up when appropriate is complicated. For one way this could be done, take a look at the sections in K&R on how to use pointers to structures and on what malloc looks like.
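The constructor idea above can be sketched in plain C, assuming `malloc` and `free` stand in for the memory-obtaining helpers. The names `ctor_User`, `dtor_User`, and `demo_ctor` are hypothetical (the leading underscore is dropped because such file-scope names are reserved in C), and the error handling shown is only one possible choice.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct User {
    int   _id;
    char *_username;
} User;

/* Lowered constructor: allocate the object, copy the name, fill the fields.
   Returns NULL if either allocation fails. */
User *ctor_User(int id, const char *username)
{
    User *toMake = malloc(sizeof *toMake);
    if (toMake == NULL)
        return NULL;

    size_t len = strlen(username) + 1;   /* include the terminating '\0' */
    toMake->_username = malloc(len);
    if (toMake->_username == NULL) {
        free(toMake);
        return NULL;
    }
    memcpy(toMake->_username, username, len);
    toMake->_id = id;
    return toMake;
}

/* Matching cleanup: release the copied string, then the object itself. */
void dtor_User(User *u)
{
    if (u == NULL)
        return;
    free(u->_username);
    free(u);
}

/* The object owns its own copy of the name. */
void demo_ctor(void)
{
    char name[] = "alice";
    User *u = ctor_User(3, name);
    assert(u != NULL);
    name[0] = 'A';                       /* changing the caller's buffer... */
    assert(strcmp(u->_username, "alice") == 0);  /* ...does not affect the copy */
    dtor_User(u);
}
```

Calling `demo_ctor()` from a `main` function is enough to exercise it. A garbage-collected language would replace `dtor_User` with runtime machinery, which this sketch does not attempt to model.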

From this point we can also implement IsDefaultUser with something like:

bool _impl_IsDefaultUser(User* this)
{
  return this->_id == 0;
}

This can't be overridden, though. To allow for overriding, we change User to be:

class User
{
  UserVTable* _vTable;
  int _id;
  char* _username;
}

Then _vTable points at a table of pointers to functions, which in this case contains a single entry, which is a pointer to the function above. Then calling the virtual member becomes a matter of looking at the correct offset into that table, and calling the appropriate function found. A derived class would have a different _vTable that would be the same except for having different function pointers for those methods that are overridden.
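The v-table lookup can be sketched in C with a struct of function pointers. Everything beyond `User`, `UserVTable`, `_vTable`, `_id`, `_username`, and `IsDefaultUser` is an assumption made for illustration: the "guest" override, the table objects `User_vtable` and `Guest_vtable`, and the helpers `call_IsDefaultUser` and `demo_vtable` are hypothetical, and real compilers lay these tables out in their own ways.

```c
#include <assert.h>
#include <stdbool.h>

struct User;  /* forward declaration so the table can mention it */

/* One slot per virtual method; here there is just one. */
typedef struct UserVTable {
    bool (*IsDefaultUser)(const struct User *self);
} UserVTable;

typedef struct User {
    const UserVTable *_vTable;  /* selects the implementations for this object */
    int               _id;
    const char       *_username;
} User;

/* Base implementation. */
bool User_IsDefaultUser(const User *self)  { return self->_id == 0; }

/* Hypothetical override for a derived "guest" class:
   any non-positive id counts as the default user. */
bool Guest_IsDefaultUser(const User *self) { return self->_id <= 0; }

/* Same table shape, different function pointer in the overridden slot. */
const UserVTable User_vtable  = { User_IsDefaultUser };
const UserVTable Guest_vtable = { Guest_IsDefaultUser };

/* What a virtual call lowers to: load the table, load the slot, call it. */
bool call_IsDefaultUser(const User *u)
{
    return u->_vTable->IsDefaultUser(u);
}

void demo_vtable(void)
{
    User plain = { &User_vtable,  -1, "alice" };
    User guest = { &Guest_vtable, -1, "guest" };
    assert(!call_IsDefaultUser(&plain));  /* base rule: only id 0 is default */
    assert(call_IsDefaultUser(&guest));   /* override: id <= 0 is default */
}
```

The call site in `call_IsDefaultUser` never names a concrete function, so which code runs depends only on the table the object carries.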

This is glossing over an awful lot, and not the only possibility in each case (e.g. v-tables are not the only way to implement overridable methods), but does show how we can build an object-oriented language which can be compiled down to more primitive operations on more primitive data types.

It also glosses over the possibility of doing something like the way C# is compiled to IL, which is then in turn compiled to machine code, so that there are two steps between the OO language and the machine code that will actually be executed.
