Opinions on a high-performance database
I'm developing software that uses a MySQL database, with Hibernate to access it.
The problem I'm having is that looking up a single keyword already takes 40,000 queries, and the application I'm developing should be able to process multiple keywords.
So basically we are dealing with a database filled with String values, and a lot of comparing has to be done. For now, I use a filter to load all possible matches into memory and compare them in the Java code. This is highly recursive and slow.
So it seems MySQL, and above all Hibernate, are not the way to go.
Could anyone please provide some information on which database would give better performance?
I'm looking into Hypertable, MongoDB, HBase, graph databases, ... but I'm not sure which way to go.
Please help.
Thanks
Comments (3)
Your approach is wrong: you're doing something MySQL already does natively. It can keep the dataset in RAM and work with it from there, which is exactly what your algorithm is doing by hand.
The other thing is that for specific tasks like text searching, there are known methods and various storage engines specialized for that purpose.
For example, Sphinx is one of those.
Another option is to use a data structure that makes searches quick, such as a trie, which is incredibly useful for things like autocomplete (this is just an example that doesn't have to be directly connected to your question - it's a hint that there are known data structures that work fast with strings).
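To make the trie hint concrete, here is a minimal sketch in Java. It is not taken from the question; the class name `KeywordTrie` and its methods are hypothetical, and it assumes the strings fit in memory. Lookups cost time proportional to the length of the search string rather than to the number of stored strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative trie for exact and prefix lookups (hypothetical helper, not from the question). */
class KeywordTrie {
    private static final class Node {
        // TreeMap keeps children sorted, so prefix results come back in lexicographic order.
        final Map<Character, Node> children = new TreeMap<>();
        boolean isWord;
    }

    private final Node root = new Node();

    /** Adds a word; cost is proportional to the word's length. */
    void insert(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            node = node.children.computeIfAbsent(word.charAt(i), c -> new Node());
        }
        node.isWord = true;
    }

    /** Returns true only if this exact word was inserted. */
    boolean contains(String word) {
        Node node = find(word);
        return node != null && node.isWord;
    }

    /** Returns every stored word that starts with the given prefix. */
    List<String> withPrefix(String prefix) {
        List<String> results = new ArrayList<>();
        Node node = find(prefix);
        if (node != null) {
            collect(node, new StringBuilder(prefix), results);
        }
        return results;
    }

    /** Walks down the trie along s; returns null if the path does not exist. */
    private Node find(String s) {
        Node node = root;
        for (int i = 0; i < s.length() && node != null; i++) {
            node = node.children.get(s.charAt(i));
        }
        return node;
    }

    /** Depth-first traversal collecting all complete words below node. */
    private void collect(Node node, StringBuilder path, List<String> out) {
        if (node.isWord) {
            out.add(path.toString());
        }
        for (Map.Entry<Character, Node> e : node.children.entrySet()) {
            path.append(e.getKey());
            collect(e.getValue(), path, out);
            path.deleteCharAt(path.length() - 1);
        }
    }
}
```

After inserting a few words, `withPrefix("dat")` walks three nodes and then collects only the matching subtree, instead of comparing the prefix against every stored string.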
Also, why do you think a NoSQL solution would be quicker when it comes to comparing a large volume of string data?
As others have pointed out, it seems your app design and algorithm are the culprits here, not the underlying technology. You should be more exact in your question and outline what you're doing, how you're doing it, and what you'd like it to do. Once you answer those questions, people may be able to point you in the right direction, because it looks like you took the wrong approach.
Perhaps I misunderstand your question, but ...
Sounds like you're trying to do the job of your database in memory? Create an index, write a better SQL query or something, but instead you're loading all possible matches and then iterating through them? At that point, why even use a database?
Basically, I don't think it's your choice of database (MySQL can handle much larger queries than 40,000 records with no problem). I think your algorithm needs some work.
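As one illustration of "write a better SQL query", several keywords can often be matched in a single parameterized statement instead of one query per candidate. The sketch below only builds the SQL text; the table `entries`, its columns `id` and `keyword`, and the class `KeywordQueryBuilder` are all hypothetical, since the question does not show the schema. It also assumes exact matching on a column that has an index.

```java
import java.util.Collections;

/** Builds one IN (...) query for many keywords (hypothetical schema, for illustration only). */
class KeywordQueryBuilder {
    /** Returns a parameterized statement with one placeholder per keyword. */
    static String buildInQuery(int keywordCount) {
        if (keywordCount < 1) {
            throw new IllegalArgumentException("need at least one keyword");
        }
        // One "?" per keyword, so values are bound rather than concatenated into the SQL.
        String placeholders = String.join(", ", Collections.nCopies(keywordCount, "?"));
        // `entries`, `id` and `keyword` are assumed names, not taken from the question.
        return "SELECT id, keyword FROM entries WHERE keyword IN (" + placeholders + ")";
    }
}
```

The resulting string would be passed to a `PreparedStatement`, with each keyword bound through `setString`, so the database does the matching in one round trip.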
Your real problem is that you're using 40,000 queries.
Can you explain your problem and the process that leads to so many queries?
Regardless of what database you go with, your algorithm sounds excessive, so it will always be slow.
Let's fix it first.