使用 Java 更快地检索 SQL 数据并搜索大数据
我有一个包含超过 10 万条由数字对组成的数据的表。其示例如下所示。
A B
0010 0010
0010 0011
0010 0019
0010 0056
0011 0010
0011 0011
0011 0019
0011 0040
0019 0010
0019 0058
这里,A 列中的数字可能存在于 B 列中。 解释 :用户将拥有多个范围在 10 -100 之间的数字。现在我们可以看到 0010 - 0011 和 0019 存在。因此,如果用户的列表包含 0010 和 0011,则会显示一条警告,表明不允许使用该对,反之亦然。
在 Java 中如何处理这个问题? 尽管搜索速度会快得多,但加载包含所有数据的哈希图似乎不是一个好的选择。
请建议。谢谢
Testcase:
num = 0010; //value from list which user will be passing
test(num){
if(num.equals("0019")||num.equals("0011")) //compairing with database
System.out.println("incompatible pair present");
}
上面的例子是一个非常简单的伪代码。实际问题会复杂得多。
I have a table with over 100 thousand data consisting of number pairs. A sample of which is shown below.
A B
0010 0010
0010 0011
0010 0019
0010 0056
0011 0010
0011 0011
0011 0019
0011 0040
0019 0010
0019 0058
Here the numbers in Column A has possible pairs present in column B.
Explanation : User will have several of these numbers ranging form 10 -100. Now as we can see for 0010 - 0011 and 0019 is present. So if the user has a list containing 0010 along with 0011 a warning will be shown that this pair is not allowed and vice versa.
How to approach this in Java?
Loading the hash map with all the data doesnot seem to be a good option although the search will be much faster.
please suggest. Thanks
Testcase:
num = 0010; //value from list which user will be passing
test(num){
if(num.equals("0019")||num.equals("0011")) //compairing with database
System.out.println("incompatible pair present");
}
The above example is a very simple pseudo code. The actual problem will me much more complex.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(7)
直到问题更清楚为止......
处理已存储在数据库中的大量数据,我给您一个建议:无论您想在这里做什么,请考虑使用 SQL 而不是 Java 来解决它。或者至少编写一个带有结果集的 SQL,之后可以很容易地用 Java 进行评估。
但直到问题还不清楚......
Until the question is more clear...
Handling large amounts of data which are already stored in a database let me give you a recommendation: Whatever you want to do here, consider solving it with SQL instead of Java. Or at least write a SQL with an resulting ResultSet which is easy to evaluate in Java afterwards.
But until the question is not that clear ...
您是否试图查找 A 值相同但 B 不同的条目?
Are you trying to find entries where A is the same value but B is different?
如果您担心堆空间不足,可以尝试使用持久缓存,例如 ehcache。我建议您在使用此解决方案之前检查实际消耗的内存
If you're worried of running out of heap space, you could try using a persistent cache like ehcache. I suggest you check the actual memory consumed before going in for this solution though
看起来你的问题仅限于一个非常小的域 - 为什么你不能实例化一个二维数组
bool
并将其设置为true
每当两个数字的索引创建一个不受支持的组合。使用示例:
您可以通过检查一次数据并设置此数组来从数据库实例化此数组。您只需将
0010
翻译为10
即可。索引 0-9 中会有垃圾,但您可以通过从9
中减去索引来“翻译”索引来消除它。这符合你的问题吗?
Seems like your problem is limited to a very small domain - why cant you instantiate an two dimensional array of
bool
and set it totrue
whenever the indexes of two numbers creates an unsupported combination.Example for usage:
You can instantiate this array from the database by going over the data once and setting this array. You just need to translate
0010
to10
. You will have junk in Indexes 0-9, but you can eliminate it by "translating" the index by subtracting it from9
.Does that hit your question?
如果我正确理解了您想要执行的操作...
在
t1(a,b)
上创建唯一索引。将用户的新对放入 try 块内的 INSERT 语句中。捕获键违规异常(将是SQLException
,可能是一个子类,具体取决于您的 RDBMS)并向用户解释这是一个禁止的对。If I have understood correctly what you want to do…
Create a unique index on
t1(a,b)
. Put the user's new pair in an INSERT statement inside a try block. Catch key violation exceptions (will be aSQLException
, possibly a subclass depending on your RDBMS) and explain to the user that is a forbidden pair.简单 - 绝对不是可扩展的解决方案 - 如果您的范围确实是 0000 - 9999。
只需有一个包含 999999 个条目的字节表。
每个条目由一个简单的 0(表示允许)或 1(表示不允许)组成。
通过逻辑连接两对数字(键 = 第一个 * 1000 + 第二个),可以在表中找到一个条目。
更具可扩展性的数据库解决方案是创建一个具有复合主键(对 1 和对 2)的表,只要存在指示不允许的对的条目即可。
Simple - definitely not scalable solution -- if your ranges really are 0000 - 9999.
Simply have a byte table with 999999 entries.
Each entry consists of a simple 0 for allowed or 1 for not allowed.
You find an entry in the table by logically concatenating the two pair numbers (key = first * 1000 + second).
The more scalable database solution is to create a table with a composite primary key (pair1 and pair2) the mere presence of an entry indicating a disallowed pair.
为了澄清这个问题:
您有一个表,其中每个记录包含两个数字,这些数字被声明为“不兼容”。
您有一个用户号码列表,并且想要检查该列表是否包含“不兼容的号码”。正确的?
这里有一个简单的 SQL(从评论中获取示例):
此 SQL 返回所有不兼容性。当结果集为空时,就不存在不兼容性,一切都很好。如果您只想检索此事实,则可以改为编写
SELECT 1 ...
。当然,必须动态构建 SQL 才能在 IN 子句中包含用户号码。
为了加快查询速度,您可以在这两列上创建一个(唯一)索引。所以数据库可以做索引范围扫描(唯一)。
如果该表尚不包含主键,那么您应该在两列上创建主键。
To clarify the question:
You have a table containing two numbers each record which are declared 'incompatible'.
You have a user list of numbers and you want to check if this list contains 'incompatible numbers'. Right?
Here you go with a simple SQL (took your example from comment):
This SQL returns all incompatibilities. When the resultset is empty then there are no incompatibilities and everything is fine. If you only want to retrieve this fact then you can write
SELECT 1 ...
instead.The SQL have to be build dynamically to contain the user's numbers in the
IN
clauses, of course.To speed up queries you can create an (unique) index over both columns. So the database can do a index range scan (unique).
If this table does not yet contain a primary key then you should create a primary key over both columns.