Which collection should I use for reading elements from multiple threads with a periodic full overwrite of the collection?

Posted 2025-02-03 22:36:06

I'm going to use a static collection that will be used for reading by the core process and fully updated every X mins by the background service.

The background process will load updated data from the database every X mins and set the received dataset into this static collection.

The core process will receive many tasks to check whether some values exist in this collection. Each task will be processed in a separate thread. There will be a lot of requests and the lookup should be extremely fast, so I can't query the database for each request and I need an updateable list in memory.

public class LoadedData
{
    public static HashSet<string> Keys { get; set; }
}

public class CoreProcess
{
    public bool ElementExists(string key)
    {
        return LoadedData.Keys.Contains(key);
    }
}

public class BackgroundProcess
{
    public async Task LoadData()
    {
        while (true)
        {
            LoadedData.Keys = GetKeysFromDb();
            await Task.Delay(TimeSpan.FromMinutes(5));
        }
    }
}

So, I'm looking for the best solution for this.
I was thinking about using HashSet<T> because I'm sure that each element in the collection will be unique. But HashSet<T> is not thread-safe. So I started considering BlockingCollection<T>, ConcurrentBag<T>, ConcurrentDictionary<T, byte>, but then I wondered if I needed a thread-safe collection here at all. Looks like not, because I'm not going to add/update/remove particular elements in the collection. Only full rewrite from the database.

  1. So, does it mean that I can just use simple HashSet<T>?

  2. Which collection would you use to solve it?

  3. And in general, will there be any issues with a simultaneous reading by the core process and full overwriting of the collection by the background process?

Comments (1)

葬花如无物 2025-02-10 22:36:06

So the HashSet<string> becomes effectively immutable as soon as it becomes the value of the LoadedData.Keys property. In this case your code is almost OK. The only missing ingredient is to ensure that the latest value of this property is visible to all threads involved.

In theory it is possible that the compiler or the jitter might use a cached/stale value of the property, instead of looking at what is currently stored in the main memory. In practice you might never experience this phenomenon, but if you want to play by the rules you must read and write this property with volatile semantics. If Keys were a field, you could just decorate it with the volatile keyword. Since it's a property, you must do a bit more work:

public class LoadedData
{
    private static volatile HashSet<string> _keys;

    public static HashSet<string> Keys
    {
        get => _keys;
        set => _keys = value;
    }
}

...or using the Volatile class instead of the volatile keyword:

public class LoadedData
{
    private static HashSet<string> _keys;

    public static HashSet<string> Keys
    {
        get => Volatile.Read(ref _keys);
        set => Volatile.Write(ref _keys, value);
    }
}
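
To make the "effectively immutable" contract concrete, here is a minimal sketch of one way the refresh and the lookups could be arranged. The names KeySnapshot, Publish, Current and ContainsBoth are hypothetical and not part of the question's code, and the freshly loaded keys are passed in by the caller rather than fetched from a database. The idea is that a new set is filled completely before it is published, and a reader that needs more than one lookup takes the reference once and works on that local copy:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class KeySnapshot
{
    // Start with an empty set so readers never observe null before the first load.
    private static HashSet<string> _keys = new HashSet<string>();

    // Readers take the current reference once and keep it in a local variable.
    public static HashSet<string> Current => Volatile.Read(ref _keys);

    // Hypothetical refresh entry point: the caller supplies the freshly loaded keys.
    public static void Publish(IEnumerable<string> freshKeys)
    {
        // Fill a brand-new set completely before any reader can see it.
        var next = new HashSet<string>(freshKeys, StringComparer.Ordinal);

        // Swap the reference. The previous set is never modified afterwards,
        // so readers that are still using it remain safe.
        Volatile.Write(ref _keys, next);
    }

    // Both lookups run against the same snapshot, even if a refresh
    // replaces the set between the two Contains calls.
    public static bool ContainsBoth(string first, string second)
    {
        var snapshot = Current;
        return snapshot.Contains(first) && snapshot.Contains(second);
    }
}
```

What this sketch avoids is calling Clear() and Add() on the set that readers are already using; that would be a concurrent mutation, which HashSet<T> does not support.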

A final cautionary note: The immutability of the HashSet<string> is not enforced by the compiler. It's just a verbal contract that you make with your future self, and with any other future maintainers of your code. In case some mutative code finds its way into your codebase, the behavior of your program will become officially undefined. If you want to guard yourself against this scenario, the most semantically correct way to do it is to replace the HashSet<string> with an ImmutableHashSet<string>. The immutable collections are significantly slower than their mutable counterparts (typically at least 10x slower), so it's a trade-off. You can have peace of mind, or ultimate performance, but not both.
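
As a rough sketch of the ImmutableHashSet<string> variant, the holder class could look like the following. The class name ImmutableLoadedData and the Replace helper are assumptions made for illustration, and the keys are again supplied by the caller instead of coming from a database:

```csharp
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Threading;

public static class ImmutableLoadedData
{
    // An empty immutable set avoids null checks before the first load.
    private static ImmutableHashSet<string> _keys = ImmutableHashSet<string>.Empty;

    public static ImmutableHashSet<string> Keys
    {
        get => Volatile.Read(ref _keys);
        set => Volatile.Write(ref _keys, value);
    }

    // Hypothetical helper: builds an immutable set from the loaded keys
    // and publishes it in one step.
    public static void Replace(IEnumerable<string> freshKeys)
    {
        Keys = ImmutableHashSet.CreateRange(freshKeys);
    }
}
```

With this type, Add and Remove return a new set instead of changing the existing one, so code that tries to "mutate" the published set cannot affect what other threads are reading.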
