Which is more time-efficient: file storage or SQLite?

Posted on 2024-12-21 10:14:40

I am developing a Firefox extension that will serve as a user profiling and web personalization engine. It needs to store TF-IDF-related data for web pages. My question is: which approach would produce faster results for simple searches?

a. Using a custom data structure, storing the entire structure in a file, loading it into memory, and querying it there?

OR

b. Storing the data in an SQLite database and querying it there?

It is safe to assume a worst-case scenario of around 250,000 rows in one of the tables.

Comments (2)

千紇 2024-12-28 10:14:40

Your question basically boils down to:

a. Should I write my own custom implementation of a data storage system?

or

b. Should I use an off-the-shelf, proven data storage system?

If you go with the first approach, I would say:

  • You'll obviously end up spending time writing this code. You need to weigh that against the time you would spend learning an existing library and writing code on top of it.
  • You'll inevitably start adding features over time. You'll have to keep re-evaluating the cost of adding more code versus throwing away the work you've put in and switching to an existing library.
  • You may run into serious performance or other issues. Are you willing to take that risk when something like SQLite has already had a lot of production use to find these issues?
  • How much time are you going to spend dealing with bugs caused by your data storage that could be avoided by using an off-the-shelf library?

Another way of looking at this is: why would you NOT use SQLite? Is there some kind of problem with it for your scenario? I can't think of any.

I would certainly be inclined to start with SQLite (or something similar). If it proves not to work in some way, I would consider writing my own data storage library only after exhausting the other off-the-shelf alternatives.
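
As a rough illustration of the off-the-shelf route, here is a minimal sketch using Python's standard `sqlite3` module. The table name `tfidf`, its columns, the helper names, and the sample URLs are all assumptions made up for this example; nothing in the question specifies a schema, and a Firefox extension would use whatever SQLite binding its platform offers. The composite primary key gives SQLite an index on `(term, url)`, so a lookup by term should not need to scan every row.

```python
import sqlite3


def build_db(rows, path=":memory:"):
    """Create a hypothetical tfidf table and bulk-insert (term, url, score) rows."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tfidf ("
        " term TEXT NOT NULL,"
        " url TEXT NOT NULL,"
        " score REAL NOT NULL,"
        " PRIMARY KEY (term, url))"  # also serves as the lookup index
    )
    with conn:  # wraps the inserts in one transaction and commits on success
        conn.executemany("INSERT OR REPLACE INTO tfidf VALUES (?, ?, ?)", rows)
    return conn


def top_pages(conn, term, limit=10):
    """Return up to `limit` (url, score) pairs for a term, highest score first."""
    cur = conn.execute(
        "SELECT url, score FROM tfidf WHERE term = ? ORDER BY score DESC LIMIT ?",
        (term, limit),
    )
    return cur.fetchall()


# Sample data (invented for illustration only).
conn = build_db([
    ("sqlite", "https://example.com/a", 0.42),
    ("sqlite", "https://example.com/b", 0.17),
    ("firefox", "https://example.com/a", 0.31),
])
results = top_pages(conn, "sqlite")
```

Whether this beats a hand-rolled structure at 250,000 rows is something to measure on the target machine rather than assume; the sketch only shows how little code the SQLite route needs.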

澜川若宁 2024-12-28 10:14:40

Why not use a data structure such as a dictionary or a binary tree? Choose the data structure based on the expected number of searches, retrievals, inserts, and deletes.
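
For comparison, a dictionary-based version of the in-memory approach might look like the sketch below. The nested `term -> {url: score}` layout, the JSON file format, and the function names are assumptions chosen for illustration, not something proposed in the thread.

```python
import json
import os
import tempfile
from collections import defaultdict


def build_index(rows):
    """Group (term, url, score) rows into a nested dict: term -> {url: score}."""
    index = defaultdict(dict)
    for term, url, score in rows:
        index[term][url] = score
    return dict(index)


def save_index(index, path):
    """Write the whole index to a single JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(index, f)


def load_index(path):
    """Read the whole index back into memory."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def lookup(index, term, limit=10):
    """Return up to `limit` (url, score) pairs for a term, highest score first."""
    pages = index.get(term, {})
    return sorted(pages.items(), key=lambda kv: kv[1], reverse=True)[:limit]


# Sample data (invented for illustration only).
index = build_index([
    ("sqlite", "https://example.com/a", 0.42),
    ("sqlite", "https://example.com/b", 0.17),
    ("firefox", "https://example.com/a", 0.31),
])
path = os.path.join(tempfile.mkdtemp(), "tfidf.json")
save_index(index, path)
loaded = load_index(path)
results = lookup(loaded, "sqlite")
```

Lookups in the loaded dictionary are hash-table accesses, but the entire file has to be parsed before the first query and rewritten in full on every save. That startup and write cost is what needs weighing against a database that reads and writes incrementally.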
