C# queue filesystemwatcher no-duplicates

如何在一段时间内从队列中删除重复的项目？

发布于 2024-10-17 01:47:34 字数 1756 浏览 5 评论 0原文

我想以有效的方式从队列中删除重复的条目。队列有一个带有 DateTime 和 FullPath 以及其他一些内容的自定义类。

private Queue<MyCustomClass> SharedQueue;

类中的 DateTime 是插入队列时的时间戳。我想使用的逻辑如下：如果完整路径在 4 秒窗口内相同（即，如果在重复完整路径的 4 秒内添加到队列），则从队列中删除重复项。我有一些想要观看的事件，但仍然会出现一些重复的事件，这没关系。

我正在使用 c# 2.0 和 FileSystemWatcher 类和工作队列。

有很多方法可以做到这一点：每次添加项目时都会修剪队列，或者当我在队列上工作时跳过当前重复项目的处理。

或者我应该使用“全局私有”变量 Dictionary<字符串、日期时间> ？这样我就可以快速搜索到吗？或者队列的本地副本？也许最好将本地队列限制为 100 个项目，以防出现许多文件事件？虽然就我而言，它“应该”只有相对较少的文件要在文件夹中监视...但事情总是会发生变化...

感谢您的帮助。

：编辑：美国东部时间 2 月 10 日 8:54：因此，据我所知，我决定实施一个很好的简单解决方案。我不认为我持有 Dict 键太久...

：编辑：2 月 10 日 9:53 EST：更新，因为我的词典不能包含重复的值。

   public void QueueInput(HotSynchUnit.RcdFSWFile rcd)
// start the worker thread when program starts.
// call Terminate.Set() in the programs exit routine or close handler etc.
{
  // lock shared queue
  lock (SharedQueue)
  {
    if (!IsDuplicateQueueInput(rcd))  // only add unique values to queue
    {
      SharedQueue.Enqueue(rcd);
      SomethingToDo.Set();
    }
  }
} // public void QueueInput

private bool IsDuplicateQueueInput(HotSynchUnit.RcdFSWFile rcd)
/* Return true if the object is a duplicate object.
 * Pseudo Code:
 * 
 * isDuplicate = false
 * Lock Dictionary
 * -If lastTimeStamp > 4 seconds ago then       // Optimization: save lastTimeStamp
 *    if Dict.Count > 0 then clear Dictionary
 *    return isDuplicate
 * -If not Dict.TryGetValue(sPath, dtTimeStamp) then
 *    Dict.AddKey()
 * -Else
 *    Compare key timestamp to Currenttime
 *    if key timestamp is <= 4 seconds ago then
 *       IsDuplicate = True
 *
 *    Dict.RemoveKey()
 *    Dict.AddKey()
 * 
 * return isDuplicate
*/
{
  // put real code here
}

原文

I would like to remove duplicate entries from a queue in an efficient way.
The queue has a custom class with DateTime and FullPath and a few other things

private Queue<MyCustomClass> SharedQueue;

The DateTime in the class is the timestamp when inserted into the queue. The logic I would like to use is as following: Remove duplicates from the queue if the FullPath is identical within a 4 second window (i.e. if added to queue within 4 seconds of a duplicate fullpath). I have the events that I want to watch but a few duplicates will still arrive and that is OK.

I am using c# 2.0 and the FileSystemWatcher class and a worker queue.

There are a bunch of ways to do this:
Trim the queue each time an item is added to it, or when I am working on the queue skip the processing of the current duplicate item.

Or should I use a 'global private' variable Dictionary< String, DateTime> ? So I can quickly search it? or a local copy of the queue ? Perhaps it is best to limit the local queue to 100 items in case of many file events? Though in my case it 'should be' only a relatively few files to monitor in a folder... but things always change...

Thanks for any help.

:Edit: Feb 10 8:54 EST:
So I decided to implement a good simple solution as far as I can tell.
I don't think I am holding on to the Dict keys too long...

:Edit: Feb 10 9:53 EST: Updated as my Dictionary cannot contain duplicate values.

   public void QueueInput(HotSynchUnit.RcdFSWFile rcd)
// start the worker thread when program starts.
// call Terminate.Set() in the programs exit routine or close handler etc.
{
  // lock shared queue
  lock (SharedQueue)
  {
    if (!IsDuplicateQueueInput(rcd))  // only add unique values to queue
    {
      SharedQueue.Enqueue(rcd);
      SomethingToDo.Set();
    }
  }
} // public void QueueInput

private bool IsDuplicateQueueInput(HotSynchUnit.RcdFSWFile rcd)
/* Return true if the object is a duplicate object.
 * Pseudo Code:
 * 
 * isDuplicate = false
 * Lock Dictionary
 * -If lastTimeStamp > 4 seconds ago then       // Optimization: save lastTimeStamp
 *    if Dict.Count > 0 then clear Dictionary
 *    return isDuplicate
 * -If not Dict.TryGetValue(sPath, dtTimeStamp) then
 *    Dict.AddKey()
 * -Else
 *    Compare key timestamp to Currenttime
 *    if key timestamp is <= 4 seconds ago then
 *       IsDuplicate = True
 *
 *    Dict.RemoveKey()
 *    Dict.AddKey()
 * 
 * return isDuplicate
*/
{
  // put real code here
}

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

眼趣 2024-10-24 01:47:34

我只是考虑使用类似于通用哈希表的任何集合...类似这样的东西：

Dictionary<string, YourClass> dict = new Dictionary<string, YourClass>();

/// just let's assume you want to add/check for "c:\demo.txt"

if (!dict.ContainsKey(@"c:\demo.txt"))
{
   /// add items to dict by passing fullPath as key and your objects as value
   dict.add(@"c:\demo.txt", obj1);
} 
else if (dict[@"c:\demo.txt"].CheckForIntervall())
{
   /// replace current object in dictionary with new object - in case you want to..
   /// or just do what you want to 
}

编辑 - 您的自定义类可能具有这样的功能：

class YOURCUSTOMCLASS
{
    private DateTime creationTime;

    public DateTime CreationTime
    { get { return creationTime; } }

    public YOURCUSTOMCLASS(parametersGoesHere xyz)
    {
          creationTime = DateTime.Now;
    }

    /// in this case this method will return true
    /// if the timeSpan between this object and otherObject
    /// is greater than 4 seconds
    public bool CheckForInterval(YOURCUSTOMCLASS otherObject)
    {
         TimeSpan diff = otherObj.CreationTime.Subtract(creationTime);

         /// you may replace 4 through any other digit, or even better take
         /// a const/global var/static ...
         return diff.TotalSeconds > 4;
    }

    /// all the other stuff you need ...
}

当然您会< em>放松队列的功能 - 但如果您的队列包含许多元素，您将运行时间大幅增加。

哈

I just thought about using any collection similar to a generic hashtable... Something like this:

Dictionary<string, YourClass> dict = new Dictionary<string, YourClass>();

/// just let's assume you want to add/check for "c:\demo.txt"

if (!dict.ContainsKey(@"c:\demo.txt"))
{
   /// add items to dict by passing fullPath as key and your objects as value
   dict.add(@"c:\demo.txt", obj1);
} 
else if (dict[@"c:\demo.txt"].CheckForIntervall())
{
   /// replace current object in dictionary with new object - in case you want to..
   /// or just do what you want to 
}

edit - your custom class may have some functionality like this:

class YOURCUSTOMCLASS
{
    private DateTime creationTime;

    public DateTime CreationTime
    { get { return creationTime; } }

    public YOURCUSTOMCLASS(parametersGoesHere xyz)
    {
          creationTime = DateTime.Now;
    }

    /// in this case this method will return true
    /// if the timeSpan between this object and otherObject
    /// is greater than 4 seconds
    public bool CheckForInterval(YOURCUSTOMCLASS otherObject)
    {
         TimeSpan diff = otherObj.CreationTime.Subtract(creationTime);

         /// you may replace 4 through any other digit, or even better take
         /// a const/global var/static ...
         return diff.TotalSeconds > 4;
    }

    /// all the other stuff you need ...
}

Of course you will loose the functionality of a queue - but you will get an massive increase in runtime if your queue containts many elements.

hth

回复收藏 0 原文

静赏你的温柔 2024-10-24 01:47:34

我会创建一个子类：

class MyDeduplicatedQueue : Queue<MyCustomObject> {
    /// etc
}

然后您可以将所有适当的过滤逻辑放入 Enqueue 方法中。

I would make a subclass:

class MyDeduplicatedQueue : Queue<MyCustomObject> {
    /// etc
}

Then you can put all the appropriate filtering logic into the Enqueue method.

回复收藏 0 原文

谁的年少不轻狂 2024-10-24 01:47:34

我将创建一个包装类，而不是从队列扩展，因为基本类型 Queue 的用户期望不同的行为。（当您这样做时，.NET 4.0 中的数据协定甚至可能会抱怨。）

在内部，您可以有一个实际的队列来将所需的调用重定向到其中。
每次 Queue() 调用时，您都可以将尚未包含的新元素添加到字典中。在此之前，您可以从该字典中清空所有早于 x 秒的元素，并将它们按顺序添加到内部队列中。

出队时，您必须检查内部队列是否包含元素，否则从字典中选择最早的元素。

这当然只是一种可能的实现。当许多不同的元素可能快速排队时，字典将很快填满，并且可能需要添加额外的逻辑来解决这个问题。

回复收藏 0 原文

只怪假的太真实 2024-10-24 01:47:34

如果插入有重复的路径，为什么不直接拒绝插入呢？您所要做的就是从队列尾部开始线性搜索，并在找到重复项（并拒绝插入）或时间戳超过时间限制（并插入记录）时停止？看起来比保留另一个数据结构和所有相关逻辑要简单得多。

回复收藏 0 原文

~没有更多了~

关于作者

情深缘浅

暂无简介

0 文章

0 评论

23 人气

关注发私信

友情链接

文江博客

如何在一段时间内从队列中删除重复的项目？

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（4）

关于作者

相关话题

热门标签

推荐作者

謌踐踏愛綪

开始看清了

高速公鹿

alipaysp_PLnULTzf66

热情消退

白色月光

友情链接

如何在一段时间内从队列中删除重复的项目？

如果你对这篇内容有疑问，欢迎到本站社区发帖提问 参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

评论（4）

关于作者

相关话题

热门标签

推荐作者

謌踐踏愛綪

开始看清了

高速公鹿

alipaysp_PLnULTzf66

热情消退

白色月光

友情链接

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。