Runtime casting in C#?

Posted 2024-09-25 07:32:23 · 2296 characters · 8 views · 0 comments

I'm reading data from a custom data format that conceptually stores data in a table. Each column can have a distinct type. The types are specific to the file format and map to C# types.

I have a Column<T> type that encapsulates the idea of a column, with the generic parameter T indicating the C# type stored in the column. Column.FormatType identifies the column's type in the file format's own terms. So to read a value for a column, I have a simple method:

protected T readColumnValue<T>(Column<T> column)
{
  switch (column.FormatType)
  {
    case FormatType.Int:
      return (T)readInt();
  }
}

How simple and elegant! Now all I have to do is:

Column<int> column=new Column<int>(...);
int value=readColumnValue(column);

The above cast to type T would work in Java (albeit with a warning). Because of erasure, the cast would not be evaluated until the caller actually used the value, at which point a ClassCastException would be thrown if the cast wasn't correct.

This doesn't work in C#. However, because C# doesn't throw away the generic types, it should be possible to make it even better! It appears that I can ask for the type of T at runtime:

Type valueType=typeof(T);

Great, so I have the type of the value that I'll be returning. What can I do with it? If this were Java, I would be home free, because there is a Class.cast method that performs a runtime cast. (Because each Java Class object has a generic type parameter indicating what the class is for, it would also provide compile-time type safety.) The following is from my dream world, where the C# Type class works like the Java Class class:

protected T readColumnValue<T>(Column<T> column)
{
  Type<T> valueType=typeof(T);
  switch (column.FormatType)
  {
    case FormatType.Int:
      return valueType.Cast(readInt());
  }
}

Obviously there is no Type.Cast(), so what do I do?

(Yes, I know there is a Convert.ChangeType() method, but that seems to perform conversions, not make a simple cast.)

Update: It seems that this is simply not possible without boxing/unboxing via (T)(object)readInt(). But that is not acceptable. These files are really big: 80MB, for example. Let's say I want to read an entire column of values. I'd have an elegant little method that uses generics and calls the method above like this:

public T[] readColumn<T>(Column<T> column, int rowStart, int rowEnd, T[] values)
{
  ...  //seek to column start
  for (int row = rowStart; row < rowEnd; ++row)
  {
    values[row - rowStart] = readColumnValue(column);
    ... //seek to next row
  }
  return values;
}

Boxing/unboxing for millions of values? That doesn't sound good. I find it absurd that I'm going to have to throw away generics and resort to readColumnInt(), readColumnFloat(), etc. and reproduce all this code just to prevent boxing/unboxing!

public int[] readColumnInt(Column<int> column, int rowStart, int rowEnd, int[] values)
{
  ...  //seek to column start
  for (int row = rowStart; row < rowEnd; ++row)
  {
    values[row - rowStart] = readInt();
    ... //seek to next row
  }
  return values;
}

public float[] readColumnFloat(Column<float> column, int rowStart, int rowEnd, float[] values)
{
  ...  //seek to column start
  for (int row = rowStart; row < rowEnd; ++row)
  {
    values[row - rowStart] = readFloat();
    ... //seek to next row
  }
  return values;
}

This is pitiful. :(

Comments (5)

单调的奢华 2024-10-02 07:32:23
return (T)(object)readInt();
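
For context, here is a minimal sketch of how that double cast might sit inside the generic method. The `Parser`, `FormatType` and `Column<T>` shapes are assumptions reconstructed from the question, and the readers are stubbed to return constants. The cast compiles because any value converts to `object`, and `object` can then be unboxed to `T`; conceptually that is a box followed by an unbox, which is the cost the question objects to.

```csharp
using System;

public enum FormatType { Int, Float }

// Hypothetical stand-in for the question's Column<T>.
public class Column<T>
{
    public FormatType FormatType { get; }
    public Column(FormatType formatType) { FormatType = formatType; }
}

public class Parser
{
    // Stubs standing in for the real file readers (assumed, not from the question).
    protected int readInt() { return 42; }
    protected float readFloat() { return 1.5f; }

    public T readColumnValue<T>(Column<T> column)
    {
        switch (column.FormatType)
        {
            case FormatType.Int:
                // int -> object (box), then object -> T (unbox).
                // Throws InvalidCastException if T is not int.
                return (T)(object)readInt();
            case FormatType.Float:
                return (T)(object)readFloat();
            default:
                throw new NotSupportedException(column.FormatType.ToString());
        }
    }
}
```

With these stubs, `new Parser().readColumnValue(new Column<int>(FormatType.Int))` yields an `int`, while a mismatched `Column<long>` tagged `FormatType.Int` fails at runtime rather than at compile time. Recent .NET JIT compilers are reported to elide the box when `T` is a value type, but that is an optimization to confirm by measurement, not something the language guarantees.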
同尘 2024-10-02 07:32:23

I think the closest way to make this work is to overload readColumnValue rather than make it generic, like so:

    protected Int32 readColumnValue(Column<Int32> column) {
        return readInt();
    }
    protected Int64 readColumnValue(Column<Int64> column) {
        return readLong();
    }
    protected String readColumnValue(Column<String> column){
        return String.Empty;
    }
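
A small sketch of how those overloads behave at a call site, assuming a bare `Column<T>` and stubbed readers (both hypothetical). The compiler picks the overload from the static type of the argument, so no cast and no boxing is involved.

```csharp
using System;

// Hypothetical empty stand-in for the question's Column<T>.
public class Column<T> { }

public class Parser
{
    // Stubs standing in for the real file readers (assumed).
    private int readInt() { return 7; }
    private long readLong() { return 7000000000L; }

    // One overload per supported column type; overload resolution
    // happens at compile time, based on the argument's static type.
    public int readColumnValue(Column<int> column) { return readInt(); }
    public long readColumnValue(Column<long> column) { return readLong(); }
}
```

Calling `parser.readColumnValue(new Column<int>())` binds to the `int` overload and `parser.readColumnValue(new Column<long>())` to the `long` one. The catch is that this binding is static: a generic `readColumn<T>` loop holding a `Column<T>` cannot call these overloads, so each typed loop still has to be written out.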
︶ ̄淡然 2024-10-02 07:32:23

Why don't you implement your own casting operator from Column<T> to T?

public class Column<T>
{
    // Unwraps the stored value; returning the Column<T> itself would not compile.
    public static explicit operator T(Column<T> column)
    {
        return column.value;
    }

    private T value;
}

Then you can easily convert whenever you need to:

Column<int> column = new Column<int>(...);
int value = (int)column;
蓝色星空 2024-10-02 07:32:23

The short answer to all of this (see the question details) is that C# does not allow an explicit cast to a generic type T, even if you know the type of T and you know that the value you have is a T, unless you are willing to live with boxing/unboxing:

return (T)(object)myvalue;

Personally, this seems like a major deficiency in the language: nothing about the situation says that boxing/unboxing needs to occur.

There is, however, a workaround, if you know ahead of time all the different types of T that are possible. Continuing the example in the question, we have a Column of generic type T representing tabular data in a file, and a parser that reads values from a column based upon the type of the column. I wanted the following in the parser:

protected T readColumnValue<T>(Column<T> column)
{
  switch (column.FormatType)
  {
    case FormatType.Int:
      return (T)readInt();
  }
}

As discussed, that doesn't work. But (assuming for this example that the parser is of type MyParser) you can actually create a different Column subclass for each T, like this:

public abstract class Column<T>
{
  public abstract T readValue(MyParser myParser);
}

public class IntColumn : Column<int>
{
  public override int readValue(MyParser myParser)
  {
    return myParser.readInt();
  }
}

Now I can update my parsing method to delegate to the column:

protected T readColumnValue<T>(Column<T> column)
{
  return column.readValue(this);
}

Note that the same program logic is occurring. By subclassing the generic column type, we've allowed specialization of a method to do the casting to T for us. In other words, we still have (T)readInt(); the (T) cast just happens not within a single line but in the override of the method, which changes from:

  public abstract T readValue(MyParser myParser);

to

  public override int readValue(MyParser myParser)

So if the compiler can figure out how to cast to T in a method specialization, it should be able to figure it out in a single-line cast. Put another way, nothing prevents C# from having a typeof(T).cast() method that would do exactly the same thing as the method specialization above.

(What's even more frustrating about this whole exercise is that this solution has forced me to mix parsing code into the data object model, after trying so hard to keep it separate.)

Now, if somebody compiles this, looks at the generated CIL, and finds out that .NET is boxing/unboxing the return value just so that the specialized readValue() method can satisfy the generic return type T, I will cry.
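
To make the pieces above concrete, here is a self-contained sketch that wires the column subclasses into the generic `readColumn<T>` loop from the question. The stubbed readers, the `next` counter and the `FloatColumn` class are assumptions added for illustration; seeking is omitted.

```csharp
using System;

// Hypothetical parser whose stubbed readers just count upwards.
public class MyParser
{
    private int next = 0;
    public int readInt() { return next++; }
    public float readFloat() { return next++ * 0.5f; }

    // One generic loop serves every column type; the column does the typed read.
    public T[] readColumn<T>(Column<T> column, int rowStart, int rowEnd, T[] values)
    {
        for (int row = rowStart; row < rowEnd; ++row)
        {
            values[row - rowStart] = column.readValue(this);
        }
        return values;
    }
}

public abstract class Column<T>
{
    public abstract T readValue(MyParser myParser);
}

public class IntColumn : Column<int>
{
    public override int readValue(MyParser myParser) { return myParser.readInt(); }
}

// Assumed sibling of IntColumn, following the same pattern.
public class FloatColumn : Column<float>
{
    public override float readValue(MyParser myParser) { return myParser.readFloat(); }
}
```

A call such as `parser.readColumn(new IntColumn(), 0, 3, new int[3])` infers `T` as `int` from both arguments. On the boxing worry: the CLR generates separate native code for each value-type instantiation of a generic type, so `Column<int>.readValue` should return a plain `int` with no box; inspecting the IL for a `box` instruction on your own build is the way to confirm it.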

对不⑦ 2024-10-02 07:32:23

Is the data stored in row-major or column-major order? If it's in row-major order, then having to scan the entire data set (millions of values you said) multiple times to pick out each column will dwarf the cost of boxing.

I really would suggest doing everything in one pass through the data, probably by building a vector of Action<string> (or Predicate<string> to report errors) delegates that process a single cell each into a List<T> associated with the column. Closed delegates could help a whole lot. Something like:

using System;
using System.Collections.Generic;
using System.Reflection;

public class TableParser
{
    // One Store overload per supported cell type; each returns false on a parse failure.
    private static bool Store(List<string> lst, string cell) { lst.Add(cell); return true; }
    private static bool Store(List<int> lst, string cell) { int val; if (!int.TryParse(cell, out val)) return false; lst.Add(val); return true; }
    private static bool Store(List<double> lst, string cell) { double val; if (!double.TryParse(cell, out val)) return false; lst.Add(val); return true; }
    private static readonly Dictionary<Type, MethodInfo> storeMap = new Dictionary<Type, MethodInfo>();

    static TableParser()
    {
        // Index each Store overload by the element type of its List<T> parameter.
        foreach (MethodInfo mi in typeof(TableParser).GetMethods(BindingFlags.NonPublic | BindingFlags.Static))
            if (mi.Name == "Store")
                storeMap[mi.GetParameters()[0].ParameterType.GetGenericArguments()[0]] = mi;
    }

    private readonly List<Predicate<string>> columnHandlers = new List<Predicate<string>>();

    public bool TryBindColumn<T>(List<T> lst)
    {
        MethodInfo storeImpl;
        if (!storeMap.TryGetValue(typeof(T), out storeImpl)) return false;
        // Closed delegate: lst is bound as the first argument of the static Store method.
        columnHandlers.Add((Predicate<string>)Delegate.CreateDelegate(typeof(Predicate<string>), lst, storeImpl));
        return true;
    }

    // adapt your existing logic to grab a row, pull it apart with string.Split or whatever, and walk through columnHandlers passing in each of the pieces
}

Of course you could separate the element parsing logic from the dataset walking logic, by choosing between alternate storeMap dictionaries for each format. And if you don't store things as strings, you could just as well use Predicate<byte[]> or similar.
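
The row-walking step left as a comment in that answer could look roughly like the sketch below. It is a simplified variant, not the answer's code: it swaps the reflection lookup for lambdas that close over each typed list, and the `RowWalker` name, `BindColumn` overloads and comma-separated row format are all assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;

public class RowWalker
{
    private readonly List<Predicate<string>> columnHandlers = new List<Predicate<string>>();

    // Each lambda closes over its own typed list, so parsed values
    // are stored directly as int or string without going through object.
    public void BindColumn(List<string> lst)
    {
        columnHandlers.Add(cell => { lst.Add(cell); return true; });
    }

    public void BindColumn(List<int> lst)
    {
        columnHandlers.Add(cell =>
        {
            int val;
            if (!int.TryParse(cell, out val)) return false;
            lst.Add(val);
            return true;
        });
    }

    // Splits one comma-separated row and feeds each cell to its column handler.
    // Returns false on a column-count mismatch or a cell that fails to parse;
    // cells before the failing one have already been stored.
    public bool ParseRow(string row)
    {
        string[] cells = row.Split(',');
        if (cells.Length != columnHandlers.Count) return false;
        for (int i = 0; i < cells.Length; ++i)
        {
            if (!columnHandlers[i](cells[i])) return false;
        }
        return true;
    }
}
```

Handlers are invoked in the order the columns were bound, so the binding order has to match the column order in the file. A real parser would probably also want to roll back or flag a partially stored row when a later cell fails.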
