Why do some languages need boxing and unboxing?

Posted on 2024-07-25 14:07:18

This is not a question about what boxing and unboxing are, but rather about why languages like Java and C# need them.

I am very familiar with C++, the STL, and Boost.

In C++ I can very easily write something like this:

std::vector<double> dummy;

I have some experience with Java, but I was really surprised that I had to write something like this:

ArrayList<Double> dummy = new ArrayList<Double>();

My question is: why does it have to be an Object? What is so hard, technically, about supporting primitive types in generics?

Comments (6)

柠栀 2024-08-01 14:07:18

What is so hard technically to include primitive types when talking about generics?

In Java's case, it's because of the way generics work. In Java, generics are a compile-time trick that prevents you from putting an Image object into an ArrayList<String>. However, Java's generics are implemented with type erasure: the generic type information is lost at run time. This was done for compatibility reasons, because generics were added fairly late in Java's life. It means that, at run time, an ArrayList<String> is effectively an ArrayList<Object> (or better: just an ArrayList that expects and returns Object in all of its methods), with a cast to String inserted automatically when you retrieve a value.
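A minimal sketch of what erasure means in practice. The class and method names (ErasureDemo, sameRuntimeClass) are made up for illustration; the point is only that two lists with different type arguments share a single runtime class.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Returns true when both lists report the same runtime class, i.e. the
    // type arguments <String> and <Integer> left no trace at run time.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
    }
}
```

Both variables refer to plain java.util.ArrayList instances, so the comparison holds.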

But since int doesn't derive from Object, you can't put it in an ArrayList that expects (at run time) Object, and you can't cast an Object to int either. This means that the primitive int must be wrapped in a type that does inherit from Object, such as Integer.
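As a rough illustration of that wrapping, the sketch below shows the same list operation written twice: once relying on autoboxing, and once with the conversions spelled out by hand. The class and method names are hypothetical, and the second method only approximates what the compiler generates.

```java
import java.util.ArrayList;
import java.util.List;

class BoxingInListDemo {
    // What the programmer writes: the compiler inserts the conversions.
    static int withAutoboxing() {
        List<Integer> values = new ArrayList<>();
        values.add(5);            // the int is boxed into an Integer
        return values.get(0);     // the Integer is unboxed back to an int
    }

    // Roughly the same thing with the conversions written explicitly.
    static int withExplicitBoxing() {
        List<Integer> values = new ArrayList<>();
        values.add(Integer.valueOf(5));
        return values.get(0).intValue();
    }

    public static void main(String[] args) {
        System.out.println(withAutoboxing() + " " + withExplicitBoxing());
    }
}
```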

C#, for example, works differently. Generics in C# are also enforced at run time, and no boxing is required with a List<int>. Boxing in C# only happens when you try to store a value type like int in a reference-type variable like object. Since int in C# inherits from Object, writing object obj = 2 is perfectly valid; however, the int will be boxed, which the compiler does automatically (no Integer reference type is exposed to the user or anything).

栩栩如生 2024-08-01 14:07:18

Boxing and unboxing are a necessity born out of the way that languages (like C# and Java) implement their memory allocation strategies.

Certain types are allocated on the stack and others on the heap. In order to treat a stack-allocated type as a heap-allocated type, boxing is required to move the stack-allocated value onto the heap. Unboxing is the reverse process.

In C#, types that are typically stack-allocated are called value types (e.g. System.Int32 and System.DateTime) and heap-allocated types are called reference types (e.g. System.IO.Stream and System.String).

In some cases it is advantageous to be able to treat a value type like a reference type (reflection is one example) but in most cases, boxing and unboxing are best avoided.
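A Java analogue of why boxing is usually worth avoiding (the answer above is phrased in C# terms, so treating Java's Long as the counterpart is an assumption). The two hypothetical methods compute the same sum; the first accumulates in a boxed Long, so each iteration unboxes, adds, and boxes again, which may allocate wrapper objects.

```java
class BoxedSumDemo {
    // Accumulates in a boxed Long: each += unboxes, adds, and boxes again.
    static long sumBoxed(int n) {
        Long sum = 0L;
        for (long i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    // Accumulates in a primitive long: no wrapper objects are involved.
    static long sumPrimitive(int n) {
        long sum = 0L;
        for (long i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(1000) + " " + sumPrimitive(1000));
    }
}
```

The results are identical; only the amount of work done behind the scenes differs.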

笛声青案梦长安 2024-08-01 14:07:18

I believe this is also because primitives do not inherit from Object. Suppose you have a method that wants to be able to accept anything at all as its parameter, e.g.:

class Printer {
    public void print(Object o) {
        ...
    }
}

You may need to pass a simple primitive value to that method, like:

printer.print(5);

You would not be able to do that without boxing/unboxing, because 5 is a primitive and is not an Object. You could overload the print method for each primitive type to enable such functionality, but it's a pain.
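To make this concrete, here is a small sketch using a hypothetical DescribingPrinter class (not the Printer above, whose body is elided). Its describe method takes an Object and reports the runtime class of whatever arrives, which shows that a primitive argument reaches the method as a wrapper object.

```java
class DescribingPrinter {
    // Accepts any reference and reports the runtime class of the argument.
    public String describe(Object o) {
        return o.getClass().getSimpleName() + ": " + o;
    }
}

class DescribingPrinterDemo {
    public static void main(String[] args) {
        DescribingPrinter printer = new DescribingPrinter();
        System.out.println(printer.describe(5));    // the int literal is boxed first
        System.out.println(printer.describe(2.5));  // the double literal is boxed first
    }
}
```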

行雁书 2024-08-01 14:07:18

I can only tell you why Java doesn't support primitive types in generics.

First, there was the problem that whenever the question of supporting this came up, it triggered a discussion about whether Java should have primitive types at all, which of course hindered discussion of the actual question.

Second, the main reason not to include it was that they wanted binary backward compatibility, so that generic code would run unmodified on a VM that was not aware of generics. This backward compatibility/migration compatibility is also why the existing Collections API now supports generics and otherwise stayed the same, and why there isn't (as in C# when generics were introduced there) a completely new set of generics-aware collection APIs.

The compatibility was achieved using erasure (generic type parameter information is removed at compile time), which is also the reason you get so many unchecked cast warnings in Java.
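A small sketch of what those unchecked warnings guard against, assuming a hypothetical RawTypeDemo class. Because a raw List is still legal for the sake of pre-generics code, an Integer can be slipped into a List<String>; the mistake only surfaces later, at the cast the compiler inserted where the element is read.

```java
import java.util.ArrayList;
import java.util.List;

class RawTypeDemo {
    // Returns true if reading the polluted list fails with ClassCastException.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean pollutedListFailsOnRead() {
        List<String> strings = new ArrayList<>();
        List raw = strings;   // raw type: allowed so that old code keeps working
        raw.add(42);          // unchecked call: the compiler warns but cannot reject it
        try {
            String first = strings.get(0);  // compiler-inserted cast to String fails here
            return false;                   // not reached
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(pollutedListFailsOnRead());
    }
}
```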

You could still add reified generics, but it's not that easy. Just keeping the type information at run time instead of removing it won't work, as it breaks source and binary compatibility (you can't continue to use raw types, and you can't call existing compiled code because it doesn't have the corresponding methods).

The other approach is the one C# chose: see above.

And automatic boxing/unboxing wasn't supported for this use case because autoboxing costs too much.

Java theory and practice: Generics gotchas

勿挽旧人 2024-08-01 14:07:18

Every non-array, non-string object stored on the heap contains an 8- or 16-byte header (sizes for 32/64-bit systems), followed by the contents of that object's public and private fields. Arrays and strings have the above header, plus some more bytes defining the length of the array and the size of each element (and possibly the number of dimensions, the length of each extra dimension, etc.), followed by all of the fields of the first element, then all the fields of the second, etc. Given a reference to an object, the system can easily examine the header and determine what type it is.

Reference-type storage locations hold a four- or eight-byte value which uniquely identifies an object stored on the heap. In present implementations, that value is a pointer, but it's easier (and semantically equivalent) to think of it as an "object ID".

Value-type storage locations hold the contents of the value type's fields, but do not have any associated header. If code declares a variable of type Int32, there's no need to store information with that Int32 saying what it is. The fact that the location holds an Int32 is effectively stored as part of the program, and so it doesn't have to be stored in the location itself. This can represent a big saving if, e.g., one has a million objects, each of which has a field of type Int32. Each of the objects holding the Int32 has a header which identifies the class that can operate on it. Since one copy of that class's code can operate on any of the million instances, having the fact that the field is an Int32 be part of the code is much more efficient than having the storage for every one of those fields include information about what it is.

Boxing is necessary when a request is made to pass the contents of a value-type storage location to code which doesn't know to expect that particular value type. Code which expects objects of unknown type can accept a reference to an object stored on the heap. Since every object stored on the heap has a header identifying what type of object it is, code can use that header whenever it's necessary to use an object in a way which would require knowing its type.
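The answer describes .NET, but the same idea can be sketched in Java (an analogy, with hypothetical names). Code that receives an Object has no static knowledge of what it holds, yet it can recover the type from the boxed object itself.

```java
class RuntimeTypeDemo {
    // Code that knows nothing about its argument's static type can still
    // recover the type from the object it was handed.
    static String kindOf(Object value) {
        if (value instanceof Integer) {
            return "boxed int";
        } else if (value instanceof Double) {
            return "boxed double";
        }
        return "other object";
    }

    public static void main(String[] args) {
        // Each primitive literal is boxed when stored in the Object array.
        Object[] mixed = {42, 2.5, "text"};
        for (Object item : mixed) {
            System.out.println(kindOf(item));
        }
    }
}
```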

Note that in .NET, it is possible to declare what are called generic classes and methods. Each such declaration automatically generates a family of classes or methods which are identical except for the type of object upon which they expect to act. If one passes an Int32 to a routine DoSomething<T>(T param), that will automatically generate a version of the routine in which every instance of type T is effectively replaced with Int32. That version of the routine will know that every storage location declared as type T holds an Int32, so just as in the case where a routine was hard-coded to use an Int32 storage location, it will not be necessary to store type information with those locations themselves.

心碎无痕… 2024-08-01 14:07:18

In Java and C# (unlike C++) everything extends Object, so collection classes like ArrayList can hold Object or any of its descendants (basically anything).

For performance reasons, however, primitives in Java, or value types in C#, were given a special status. They are not objects. You cannot do something like this (in Java):

 7.toString()

Even though toString is a method on Object. To bridge the gap created by this nod to performance, equivalent wrapper objects were created. Autoboxing removes the boilerplate of having to put a primitive into its wrapper class and take it out again, making the code more readable.
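For illustration, two ways to get the string form of 7 that do compile in Java. The class and method names are hypothetical; Integer.valueOf, Integer.toString(int), and the instance toString are standard library methods.

```java
class WrapperMethodsDemo {
    // Goes through the wrapper object, on which toString can be called.
    static String viaWrapperObject() {
        Integer boxed = 7;           // autoboxing, equivalent to Integer.valueOf(7)
        return boxed.toString();
    }

    // Uses the static helper, which takes the primitive directly.
    static String viaStaticHelper() {
        return Integer.toString(7);
    }

    public static void main(String[] args) {
        System.out.println(viaWrapperObject() + " " + viaStaticHelper());
    }
}
```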

The difference between value types and objects in C# is more grey. See here about how they are different.
