Runtime casting in C#?
I'm reading data from a custom data format that conceptually stores data in a table. Each column can have a distinct type. The types are specific to the file format and map to C# types.
I have a Column type that encapsulates the idea of a column, with generic parameter T indicating the C# type that is in the column. The Column.FormatType indicates the type in terms of the format types. So to read a value for a column, I have a simple method:
protected T readColumnValue<T>(Column<T> column)
{
    switch (column.FormatType)
    {
        case FormatType.Int:
            return (T)readInt();
    }
}
How simple and elegant! Now all I have to do is:
Column<int> column = new Column<int>(...);
int value = readColumnValue(column);
The above cast to type T would work in Java (albeit with a warning), and because of erasure the cast would not be evaluated until the value was actually used by the caller---at which point a ClassCastException would be thrown if the cast wasn't correct.
This doesn't work in C#. However, because C# doesn't throw away the generic types, it should be possible to make it even better! It appears that I can ask for the type of T at runtime:
Type valueType = typeof(T);
Great, so I have the type of the value that I'll be returning. What can I do with it? If this were Java, where there is a Class.cast method that performs a runtime cast, I would be home free! (Because each Java Class object has a generic type parameter indicating which class it is for, it would also provide compile-time type safety.) The following is from my dream world, where the C# Type class works like the Java Class class:
protected T readColumnValue<T>(Column<T> column)
{
    Type<T> valueType = typeof(T);
    switch (column.FormatType)
    {
        case FormatType.Int:
            return valueType.Cast(readInt());
    }
}
Obviously there is no Type.Cast()---so what do I do?
(Yes, I know there is a Convert.ChangeType() method, but that seems to perform conversions, not make a simple cast.)
Update: So it seems that this is simply not possible without boxing/unboxing via (T)(object)readInt(). But this is not acceptable. These files are really big: 80MB, for example. Let's say I want to read an entire column of values. I'd have an elegant little method that uses generics and calls the method above like this:
public T[] readColumn<T>(Column<T> column, int rowStart, int rowEnd, T[] values)
{
    ... //seek to column start
    for (int row = rowStart; row < rowEnd; ++row)
    {
        values[row - rowStart] = readColumnValue(column);
        ... //seek to next row
    }
    return values;
}
Boxing/unboxing for millions of values? That doesn't sound good. I find it absurd that I'm going to have to throw away generics and resort to readColumnInt(), readColumnFloat(), etc. and reproduce all this code just to prevent boxing/unboxing!
public int[] readColumnInt(Column<int> column, int rowStart, int rowEnd, int[] values)
{
    ... //seek to column start
    for (int row = rowStart; row < rowEnd; ++row)
    {
        values[row - rowStart] = readInt();
        ... //seek to next row
    }
    return values;
}
public float[] readColumnFloat(Column<float> column, int rowStart, int rowEnd, float[] values)
{
    ... //seek to column start
    for (int row = rowStart; row < rowEnd; ++row)
    {
        values[row - rowStart] = readFloat();
        ... //seek to next row
    }
    return values;
}
This is pitiful. :(
Comments (5)
I think the closest way to make this work is to overload readColumnInfo and not make it generic like so:
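The snippet that accompanied this answer is not shown here. As a rough sketch of the idea only: the names below reuse readColumnValue, readInt and readFloat from the question, while ColumnReader, the empty Column<T> class and the stub bodies are hypothetical stand-ins. Non-generic overloads let the compiler pick the right reader from the static column type, so no cast to T is ever written:

```csharp
using System;

// Hypothetical stand-in for the question's Column<T> type.
public class Column<T> { }

public class ColumnReader
{
    private int nextInt = 0;
    private float nextFloat = 0.5f;

    // Stubs standing in for the real file-reading primitives.
    private int readInt() { return nextInt++; }
    private float readFloat() { return nextFloat++; }

    // Chosen at compile time when the argument is a Column<int>.
    public int readColumnValue(Column<int> column) { return readInt(); }

    // Chosen at compile time when the argument is a Column<float>.
    public float readColumnValue(Column<float> column) { return readFloat(); }
}
```

The trade-off is the one the question complains about: each supported type needs its own overload, and a generic caller such as readColumn<T> cannot call them without knowing T.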
Why don't you implement your own casting operator from Column<T> to T? Then you can easily convert whenever you need to:
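The answer's own code is not shown here, so the following is only a guess at what such an operator might look like. It assumes, hypothetically, that each column is constructed with a Func<T> delegate (readNext) that knows how to fetch the next value; that delegate is not part of the question's design.

```csharp
using System;

public class Column<T>
{
    // Hypothetical: a delegate that supplies the next value for this column.
    private readonly Func<T> readNext;

    public Column(Func<T> readNext)
    {
        this.readNext = readNext;
    }

    // User-defined explicit conversion from Column<T> to T.
    public static explicit operator T(Column<T> column)
    {
        return column.readNext();
    }
}
```

With this in place, `(int)someIntColumn` invokes the operator. The conversion still has to get the value from somewhere, so the type-specific reading logic moves into whatever builds the delegate.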
The short answer to all of this (see the question details) is that C# does not allow explicit casting to generic type T even if you know the type of T and you know the value that you have is T---unless you want to live with boxing/unboxing:
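For reference, the cast through object mentioned in the question's update looks roughly like this. CastDemo and the stubbed readInt are hypothetical names for illustration only.

```csharp
using System;

public static class CastDemo
{
    // Stub standing in for the parser's readInt().
    private static int readInt() { return 42; }

    public static T readValue<T>()
    {
        // A direct (T)readInt() does not compile, because the compiler cannot
        // prove that int converts to T. Going through object compiles, and
        // throws InvalidCastException at run time if T is not int.
        return (T)(object)readInt();
    }
}
```

Whether a box is actually allocated for each call depends on the runtime and JIT in use; inspecting the generated code is the only way to be sure.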
This personally seems like a major deficiency in the language---there is nothing about the situation that says that boxing/unboxing would need to occur.
There is, however, a workaround, if you know ahead of time all the different types of T that are possible. Continuing the example in the question, we have a Column of generic type T representing tabular data in a file, and a parser that reads values from a column based upon the type of the column. I wanted the following in the parser:
As discussed, that doesn't work. But (assuming for this example that the parser is of type MyParser) you can actually create a different Column subclass for each T, like this:
Now I can update my parsing method to delegate to the column:
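The code for these two steps is not shown here. A minimal sketch of the approach as described, where MyParser, Column<T>, readValue, readInt and readFloat come from the text, and IntColumn, FloatColumn and the stub bodies are assumptions made for illustration:

```csharp
using System;

public class MyParser
{
    // Stubs standing in for the real file-reading primitives.
    public int readInt() { return 7; }
    public float readFloat() { return 1.5f; }

    // The parser no longer casts; it delegates to the column.
    public T readColumnValue<T>(Column<T> column)
    {
        return column.readValue(this);
    }
}

public abstract class Column<T>
{
    // Each subclass decides which parser primitive produces a T.
    public abstract T readValue(MyParser parser);
}

// Hypothetical subclass names; one per supported T.
public class IntColumn : Column<int>
{
    public override int readValue(MyParser parser) { return parser.readInt(); }
}

public class FloatColumn : Column<float>
{
    public override float readValue(MyParser parser) { return parser.readFloat(); }
}
```

In IntColumn the override is declared to return int, so the C# source contains no cast to T and no conversion through object.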
Note that the same program logic is occurring---it's just that by subclassing the generic column type, we've allowed specialization of a method to do the casting to T for us. In other words, we still have (T)readInt(), it's just that the (T) cast is happening, not within a single line, but in the override of the method that changes from:
to
So if the compiler can figure out how to cast to T in a method specialization, it should be able to figure it out on a single line cast. Put another way, nothing prevents C# from having a typeof(T).cast() method that would do exactly the same thing being done in method specialization above.
(What's even more frustrating about this whole exercise is that this solution has forced me to mix parsing code into the data object model, after trying so hard to keep it separate.)
Now, if somebody compiles this, looks at the generated CIL, and finds out that .NET is boxing/unboxing the return value just so that the specialized readValue() method can satisfy the generic return type T, I will cry.
Is the data stored in row-major or column-major order? If it's in row-major order, then having to scan the entire data set (millions of values, you said) multiple times to pick out each column will dwarf the cost of boxing.
I really would suggest doing everything in one pass through the data, probably by building a vector of Action<string> (or Predicate<string> to report errors) delegates that each process a single cell into a List<T> associated with the column. Closed delegates could help a whole lot. Something like:
Of course you could separate the element parsing logic from the dataset walking logic, by choosing between alternate storeMap dictionaries for each format. And if you don't store things as strings, you could just as well use Predicate<byte[]> or similar.
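The answer's snippet is not shown here. The one-pass idea could be sketched along these lines, with one Action<string> per column that parses a cell and appends it to that column's typed list. OnePassLoader, MakeStore and Load are hypothetical names, cells are assumed to arrive as strings, and the storeMap dictionary mentioned above is not reproduced.

```csharp
using System;
using System.Collections.Generic;

public static class OnePassLoader
{
    // Builds a delegate that parses one cell and appends it to a typed list.
    // The List<T> is captured by the delegate, so values are stored unboxed.
    public static Action<string> MakeStore<T>(List<T> target, Func<string, T> parse)
    {
        return cell => target.Add(parse(cell));
    }

    // Walks the rows once, handing each cell to the delegate for its column.
    public static void Load(IEnumerable<string[]> rows, Action<string>[] stores)
    {
        foreach (string[] row in rows)
        {
            for (int col = 0; col < stores.Length; ++col)
            {
                stores[col](row[col]);
            }
        }
    }
}
```

A caller would create one list per column, wrap each with MakeStore and a suitable parser such as int.Parse, and pass the resulting array to Load together with the rows.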