将优先级队列读/写到文本文件的有效方法是什么?

发布于 2024-12-03 07:29:45 字数 389 浏览 0 评论 0原文

我有一个用 Java 实现的优先级队列类,因为它是一个队列数组。我需要一种好方法(不使用序列化)在优先级队列中的对象的每次“事务”或 enqueue()/dequeue() 之后记录和存储优先级队列的内容。如果程序需要从文本文件重建优先级队列,它应该作为备份。

我的一些想法和每个问题的问题:

  • 在每个“事务”之后,循环遍历队列并使用对象之间的分隔符将每个事务写入文件中的一行。 -- 我的问题是,它需要将所有对象出队并重新入队,这看起来效率非常低。

  • 每次入队或出队后,只需写入该对象或从文件中删除该对象即可。 -- 我的问题是:如果这是我应该采取的方法,那么我很难想出一种在出队后轻松查找和删除对象的方法。

任何提示/技巧/建议将不胜感激!

I have a priority queue class that I implemented in Java as it being an array of queues. I need a good way (without using Serialization) of recording and storing the contents of the priority queue after each "transaction" or enqueue()/dequeue() of an object from the priority queue. It should serve as a backup in the event that the priority queue needs to be rebuilt by the program from the text file.

Some ideas I had and my problems with each:

  • After each "transaction", loop through the queues and write each one to a line in the file using delimiters between objects.
    -- My problem with this is that it would require dequeueing and re-enqueueing all the objects and this seems highly inefficient.

  • After each enqueue or dequeue simply write that object or remove that object from the file.
    -- My problem with this is: if this is the approach I should be taking, I am having a hard time coming up with a way to easily find and delete the object after being dequeued.

Any hints/tips/suggestions would be greatly appreciated!

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

神也荒唐 2024-12-10 07:29:45

要循环队列,您只需迭代它即可。这是非破坏性的(但只是松散的线程安全)

每次将队列的内容写入磁盘可能会非常慢。对于典型的硬盘驱动器,一个小队列将需要大约 20 毫秒来写入。即每秒最多 50 次。如果您使用 SSD,这对于小队列来说会快得多,但是即使您不使用序列化,您仍然必须编组数据。

另一种方法是使用 JMS 服务器,该服务器旨在支持事务、队列和持久性。典型的 JMS 服务器每秒可以处理大约 10,000 条消息。有许多优秀的免费服务器可用。

To loop through a queue you can just iterate over it. This is non-destructive (but only loosely thread safe)

Writing the contents of the queue to disk every time is likely to be very slow. For a typical hard drive, a small queue will take about 20 ms to write. i.e. 50 times per second at best. If you use an SSD this will be much faster for a small queue, however you still have to marshal your data even if you don't use Serialisation.

An alternative is to use a JMS server which is designed to support transactions, queues and persistence. A typical JMS server can handle about 10,000 messages per second. There are a number of good free servers available.

缱倦旧时光 2024-12-10 07:29:45

我会将您的要求实现为日志模式。在文件的末尾,附加每个入队及其优先级,附加每个出队。如果您的消息传递服务器崩溃,您可以重播日志文件,最终会得到适当的状态。

显然,您的日志文件会随着时间的推移而变得巨大。为了解决这个问题,您需要经常轮换日志文件。为此,请在某个时间点序列化队列,然后开始登录新文件。您甚至可以在不锁定状态(冻结队列请求)的情况下完成此操作,方法是在将数据结构的快照写入磁盘的同时将事务同时记录到旧日志和新日志中。快照完成后,将指示该情况的指针写入磁盘,然后您可以删除旧日志。

写的时间和空间是n,重播应该很少,而且比较快。

I would implement your requirements as a log pattern. At the end of your file, append every enqueue and its priority, append every dequeue. If your messaging server crashes, you can replay the log file and you'll end up with the appropriate state.

Obviously, your log file will grow huge over time. To combat this, you'll want to rotate log files every so often. To do this, serialize your queue at a point in time, and then begin logging in a new file. You can even accomplish this without locking the state (freezing queu requests) by simultaneously logging transactions to the old and new logs while a snapshot of the data structure is written to disk. When the snapshot is complete, write a pointer indicating that to disk and you can delete your old log.

Write time and space is n, replays should be rare and are relatively fast.

装迷糊 2024-12-10 07:29:45

要在第二种方法中轻松查找对象...我有几个建议 ::

  1. 您可以使用优先级函数来保持文件中对象的排序。
  2. 要管理不同位置新添加的对象,请在文本文件中的每个插入对象之间保留一些空间,并且在插入对象时,您可以使用一些类似指针的行为来指定偏移量或其他可以轻松管理的内容。
  3. 使用缓冲区,因为每次写入内容可能会非常慢。
  4. 如果您仔细使用优先级功能,删除将是微不足道的。
  5. 此外,在指针指向的小存储桶中进行排序将非常快,并且您始终可以通过在一段时间后压缩所有对象来使用垃圾收集类型的行为。

To find objects easily in second approach...I've couple of suggestions ::

  1. You can use your priority function to keep objects sorted in the file.
  2. To manage newly added objects at different positions, keep some space between every inserted object in the text file and when an object is inserted, you can use some pointer like behavior to specify the offset or something else which can be easily managed.
  3. Use a buffer since writing content evreytime can be very slow.
  4. Deletion will be trivial if you use your priority function carefully.
  5. Also sorting in small buckets pointed by pointers will be very fast and you can always use a garbage collection type of behavior by compacting all the objects after sometime.
伴随着你 2024-12-10 07:29:45

还有一个建议:(考虑是否必须使用一个文件):

如果您的对象数量不是很大,请将每个对象存储到一个单独的文件中。当然,您需要为每个对象创建一个唯一的标识符,并且您也可以使用该标识符作为文件名。这样,您始终可以根据对象中存储的标识符添加或删除单个文件。如果对象属于无法修改的各种类,您只需存储将标识符映射到对象的哈希图即可。因此,在将对象添加到队列之前,您需要创建一个标识符,然后将对象和标识符作为对添加到映射中,并编写一个新文件名作为标识符并包含该对象。我留下了如何删除和重新加载,因为这只不过是练习。

就我个人而言,我赞成罗伯特·哈维在对该问题的评论中提出的建议。考虑使用数据库,特别是如果您的项目已经有一个数据库。这将使存储对象和删除对象比在文件中定位位置更容易、更快。因为即使您在文件中找到该对象的位置,很可能您也需要再次写入整个文件(仅在没有该对象的情况下)。这与循环没有什么不同。使用数据库,您可以避免所有这些麻烦。

one more suggestions: (to consider if usage one file exactly is not a must):

If your object number is not very large, store each object to a seperate file. Of'course, you will need to make a unique identifier for each object and you can use this identifier to be the file name too. this way, you always add or delete a single file based on the identifier stored in the object. If the objects are of various classes that can't be modified, you simply can store a hashmap that maps identifiers to objects. so before you add an object to a queue, you create an identifier and then add the object and the identifier to the map as a pair and you write a new file names as the identifier and containing the object. I leave what to do on delete and reload as it is nothing more than practice.

personally, I favour what was suggested by Robert Harvey in his comment on the question. consider the use of a database, especially if your project has one already. this will make storing objects and deleting objects easier and faster than locating positions within a file. because even if you find a location of the object in a file, most probably you will need to write the whole file again (only without that object). and that is not different from looping. using a database, you avoid all of this trouble.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文