Storing large lookup tables
I am developing an app that utilizes very large lookup tables to speed up mathematical computations. The largest of these tables is an int[] that has ~10 million entries. Not all of the lookup tables are int[]. For example, one is a Dictionary with ~200,000 entries. Currently, I generate each lookup table once (which takes several minutes) and serialize it to disk (with compression) using the following snippet:
int[] lut = GenerateLUT();
lut.Serialize("lut");
where Serialize is defined as follows:
// Requires System.IO, System.IO.Compression and System.Runtime.Serialization.Formatters.Binary.
public static void Serialize(this object obj, string file)
{
    using (FileStream stream = File.Open(file, FileMode.Create))
    {
        using (var gz = new GZipStream(stream, CompressionMode.Compress))
        {
            var formatter = new BinaryFormatter();
            formatter.Serialize(gz, obj);
        }
    }
}
The annoyance I am having is that when launching the application, deserialization of these lookup tables takes a very long time (upwards of 15 seconds). This kind of delay will annoy users, as the app is unusable until all the lookup tables are loaded. Currently the deserialization looks like this:
Dictionary<string, int> lut1 = (Dictionary<string, int>) Deserialize("lut1");
int[] lut2 = (int[]) Deserialize("lut2");
...
where Deserialize is defined as:
public static object Deserialize(string file)
{
    using (FileStream stream = File.Open(file, FileMode.Open))
    {
        using (var gz = new GZipStream(stream, CompressionMode.Decompress))
        {
            var formatter = new BinaryFormatter();
            return formatter.Deserialize(gz);
        }
    }
}
At first, I thought it might have been the gzip compression that was causing the slowdown, but removing it only shaved a few hundred milliseconds off the serialization/deserialization routines.
Can anyone suggest a way of speeding up the load times of these lookup tables upon the app's initial startup?
5 Answers
First, deserializing in a background thread will prevent the app from "hanging" while this happens. That alone may be enough to take care of your problem.
However, serialization and deserialization (especially of large dictionaries) tend to be very slow in general. Depending on the data structure, writing your own serialization code can speed this up dramatically, particularly if there are no shared references in the data structures.
That being said, depending on the usage pattern, a database might be a better approach. You could always make something more database-oriented and build the lookup table lazily from the DB (i.e. a lookup first checks the LUT, and if the entry isn't there, loads it from the DB and saves it in the table). This would make startup instantaneous (at least in terms of the LUTs) and probably still keep lookups fairly snappy.
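As a rough sketch of the "write your own serialization code" suggestion: for an int[] with no shared references, the data can simply be written as a length prefix followed by the raw ints with BinaryWriter and read back with BinaryReader. The SaveIntArray/LoadIntArray names and the length-prefix layout are illustrative only, not from the original post.

using System.IO;

public static class RawLutIO
{
    // Length prefix followed by the raw ints; no BinaryFormatter metadata at all.
    public static void SaveIntArray(int[] data, string file)
    {
        using (var stream = File.Open(file, FileMode.Create))
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(data.Length);
            foreach (int value in data)
                writer.Write(value);
        }
    }

    public static int[] LoadIntArray(string file)
    {
        using (var stream = File.Open(file, FileMode.Open))
        using (var reader = new BinaryReader(stream))
        {
            int count = reader.ReadInt32();
            var data = new int[count];
            for (int i = 0; i < count; i++)
                data[i] = reader.ReadInt32();
            return data;
        }
    }
}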
I guess the obvious suggestion is to load them in the background. Once the app has started, the user has opened their project, and selected whatever operation they want, there won't be much of that 15 seconds left to wait.
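A minimal sketch of that background-loading idea, assuming .NET 4+; the StartupLoader name is made up for this example, and the delegate passed in would just be the same Deserialize call the question already uses.

using System;
using System.Threading.Tasks;

public static class StartupLoader
{
    private static Task<int[]> lut2Task;

    // Call as early as possible (e.g. from Main or the app's startup event),
    // passing the same deserialization call the question already performs.
    public static void BeginLoading(Func<int[]> loadLut2)
    {
        lut2Task = Task.Run(loadLut2);
    }

    // The first computation that needs the table blocks only for whatever part
    // of the load is still outstanding.
    public static int[] Lut2
    {
        get { return lut2Task.Result; }
    }
}

At startup this might be invoked as StartupLoader.BeginLoading(() => (int[])Deserialize("lut2")), and the user can open their project and pick an operation while the load runs.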
Just how much data are we talking about here? In my experience, it takes about 20 seconds to read a gigabyte from disk into memory. So if you're reading upwards of half a gigabyte, you're almost certainly running into hardware limitations.
If data transfer rate isn't the problem, then the actual deserialization is what's taking the time. If you have enough memory, you can load all of the tables into memory buffers (using File.ReadAllBytes()) and then deserialize from a memory stream. That will let you determine how much time the reading takes and how much time the deserialization takes.
If deserialization is taking a lot of time and you have multiple processors, you could spawn multiple threads to do the deserialization in parallel. With such a system, you could potentially be deserializing one or more tables while loading the data for another. That pipelined approach could make your entire load/deserialization time almost as fast as the load alone.
Another option is to put your tables into, well, tables: real database tables. Even an engine like Access should yield pretty good performance, because you have an obvious index for every query. Now the app only has to read in data when it's actually about to use it, and even then it's going to know exactly where to look inside the file.
This might make the app's actual performance a bit lower, because you have to do a disk read for every calculation. But it would make the app's perceived performance much better, because there's never a long wait. And, like it or not, the perception is probably more important than the reality.
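One sketch of what such a database-backed table might look like on the application side, combining it with the lazy fill-in idea from the first answer; the queryDatabase delegate stands in for whatever indexed SELECT is actually used, and the LazyLookup name is invented for this example.

using System;
using System.Collections.Generic;

public class LazyLookup
{
    private readonly Dictionary<string, int> cache = new Dictionary<string, int>();
    private readonly Func<string, int> queryDatabase;   // e.g. an indexed SELECT against the table

    public LazyLookup(Func<string, int> queryDatabase)
    {
        this.queryDatabase = queryDatabase;
    }

    public int Lookup(string key)
    {
        int value;
        if (!cache.TryGetValue(key, out value))
        {
            // Cache miss: hit the database once, then remember the answer.
            value = queryDatabase(key);
            cache[key] = value;
        }
        return value;
    }
}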
Why zip them?
Disk is bigger than RAM.
A straight binary read should be pretty quick.
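For instance, if the big int[] is stored uncompressed as nothing but raw ints (an assumed file layout, not the question's current format), loading it can be a single read plus one block copy:

using System;
using System.IO;

public static class PlainBinary
{
    // Stores the array as raw bytes: no compression, no BinaryFormatter metadata.
    public static void Save(int[] data, string file)
    {
        var bytes = new byte[data.Length * sizeof(int)];
        Buffer.BlockCopy(data, 0, bytes, 0, bytes.Length);
        File.WriteAllBytes(file, bytes);
    }

    // Reads the whole file and reinterprets it as ints in a single copy.
    public static int[] Load(string file)
    {
        byte[] bytes = File.ReadAllBytes(file);
        var data = new int[bytes.Length / sizeof(int)];
        Buffer.BlockCopy(bytes, 0, data, 0, bytes.Length);
        return data;
    }
}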