Parallel JSON deserialization from a database

Published 2024-12-26 01:43:12

This is the scenario: in a separate task I read from a data reader that represents a single-column result set of JSON strings. In that task I add each JSON string to a BlockingCollection that wraps a ConcurrentQueue. At the same time, on the main thread, I TryTake/dequeue a JSON string from the collection and yield return it deserialized.

Reading from the database and deserializing happen at roughly the same speed, so a large BlockingCollection should not cause much memory consumption.

When the reading from the database is done, the task completes, and I then deserialize all the remaining, not-yet-deserialized JSON strings.

Questions/thoughts:

1) Does the TryTake lock so that no adding can be done?

2) Don't do it. Just do it in serial and yield return.

using (var q = new BlockingCollection<string>())
{
    Task task = null;

    try
    {
        // Task.Run is simpler than new Task(...) followed by Start().
        task = Task.Run(() =>
        {
            foreach (var json in sourceData)
                q.Add(json);
        });

        // Deserialize items as they arrive while the producer runs.
        // Note: TryTake without a timeout returns immediately, so this
        // loop busy-spins whenever the queue is momentarily empty; a
        // timed TryTake overload would reduce the spinning.
        while (!task.IsCompleted)
        {
            string json;
            if (q.TryTake(out json))
                yield return Deserialize<T>(json);
        }

        // Waiting on the single task observes any exception the
        // producer faulted with.
        task.Wait();
    }
    finally
    {
        // Tasks do not normally need Dispose(), and calling it on an
        // unfinished task throws; marking the collection complete for
        // adding is enough.
        q.CompleteAdding();
    }

    // Drain anything the producer enqueued after the loop exited.
    foreach (var e in q.GetConsumingEnumerable())
        yield return Deserialize<T>(e);
}

Answer by 余生共白头, 2025-01-02 01:43:12

Question 1

Does the TryTake lock so that no adding can be done

There will be a very brief period during which an add cannot be performed; however, this time is negligible. From http://msdn.microsoft.com/en-us/library/dd997305.aspx:

Some of the concurrent collection types use lightweight
synchronization mechanisms such as SpinLock, SpinWait, SemaphoreSlim,
and CountdownEvent, which are new in the .NET Framework 4. These
synchronization types typically use busy spinning for brief periods
before they put the thread into a true Wait state. When wait times are
expected to be very short, spinning is far less computationally
expensive than waiting, which involves an expensive kernel transition.
For collection classes that use spinning, this efficiency means that
multiple threads can add and remove items at a very high rate. For
more information about spinning vs. blocking, see SpinLock and
SpinWait.

The ConcurrentQueue and ConcurrentStack classes do not use locks
at all. Instead, they rely on Interlocked operations to achieve
thread-safety.
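
The lock-free behaviour described above can be observed directly. A minimal sketch (names and counts are illustrative, not from the original post) in which one thread enqueues while another dequeues concurrently, with no explicit locking on either side:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class ConcurrentQueueDemo
{
    static void Main()
    {
        var queue = new ConcurrentQueue<int>();
        const int count = 100_000;

        // Producer enqueues while the consumer dequeues concurrently;
        // ConcurrentQueue handles the synchronization internally.
        var producer = Task.Run(() =>
        {
            for (int i = 0; i < count; i++)
                queue.Enqueue(i);
        });

        int taken = 0;
        while (taken < count)
        {
            // TryDequeue returns false immediately when the queue is
            // momentarily empty, so this loop spins rather than blocks.
            if (queue.TryDequeue(out _))
                taken++;
        }

        producer.Wait();
        Console.WriteLine(taken); // 100000
    }
}
```

Every item enqueued is eventually dequeued, even though neither side takes a lock.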

Question 2:

Don't do it. Just do it in serial and yield return.

This seems like the way to go. As with any optimisation work - do what is simplest and then measure! If there is a bottleneck here, consider optimising, but at least you'll know whether your 'optimisations' are actually helping, by virtue of having metrics to compare against.
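
For reference, the serial approach amounts to reading a row, deserializing it, and yielding it immediately - no queue, no background task. A minimal self-contained sketch (the Deserialize stub and sample data here are stand-ins, not the question's actual implementation, which would use a real JSON library):

```csharp
using System;
using System.Collections.Generic;

class SerialDemo
{
    // Stand-in for the question's Deserialize<T>; a real version
    // would call a JSON serializer such as Json.NET.
    static int Deserialize(string json) => int.Parse(json.Trim('{', '}'));

    // Read each row, deserialize it, and yield it immediately:
    // no collection, no task, no synchronization.
    static IEnumerable<int> ReadAll(IEnumerable<string> rows)
    {
        foreach (var json in rows)
            yield return Deserialize(json);
    }

    static void Main()
    {
        var rows = new[] { "{1}", "{2}", "{3}" };
        Console.WriteLine(string.Join(",", ReadAll(rows))); // 1,2,3
    }
}
```

Because of the iterator, deserialization is still streamed one item at a time; only the parallelism is gone, which keeps memory flat and the code trivially correct.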
