使用 Java 更快地检索 SQL 数据并搜索大数据

发布于 2024-11-14 11:35:37 字数 758 浏览 2 评论 0原文

我有一个包含超过 10 万条由数字对组成的数据的表。其示例如下所示。

   A          B
   0010     0010 
    0010     0011 
    0010     0019 
    0010     0056
    0011     0010 
    0011     0011 
    0011     0019 
    0011     0040
    0019     0010
    0019     0058 

这里,A 列中的数字可能存在于 B 列中。 解释 :用户将拥有多个范围在 10 -100 之间的数字。现在我们可以看到 0010 - 0011 和 0019 存在。因此,如果用户的列表包含 0010 和 0011,则会显示一条警告,表明不允许使用该对,反之亦然。

在 Java 中如何处理这个问题? 尽管搜索速度会快得多,但加载包含所有数据的哈希图似乎不是一个好的选择。

请建议。谢谢

Testcase:

num = 0010;  //value from list which user will be passing
test(num){
if(num.equals("0019")||num.equals("0011")) //compairing with database 
System.out.println("incompatible pair present");
}

上面的例子是一个非常简单的伪代码。实际问题会复杂得多。

I have a table with over 100 thousand data consisting of number pairs. A sample of which is shown below.

   A          B
   0010     0010 
    0010     0011 
    0010     0019 
    0010     0056
    0011     0010 
    0011     0011 
    0011     0019 
    0011     0040
    0019     0010
    0019     0058 

Here the numbers in Column A has possible pairs present in column B.
Explanation : User will have several of these numbers ranging form 10 -100. Now as we can see for 0010 - 0011 and 0019 is present. So if the user has a list containing 0010 along with 0011 a warning will be shown that this pair is not allowed and vice versa.

How to approach this in Java?
Loading the hash map with all the data doesnot seem to be a good option although the search will be much faster.

please suggest. Thanks

Testcase:

num = 0010;  //value from list which user will be passing
test(num){
if(num.equals("0019")||num.equals("0011")) //compairing with database 
System.out.println("incompatible pair present");
}

The above example is a very simple pseudo code. The actual problem will me much more complex.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(7

2024-11-21 11:35:37

直到问题更清楚为止......
处理已存储在数据库中的大量数据,我给您一个建议:无论您想在这里做什么,请考虑使用 SQL 而不是 Java 来解决它。或者至少编写一个带有结果集的 SQL,之后可以很容易地用 Java 进行评估。

但直到问题还不清楚......

Until the question is more clear...
Handling large amounts of data which are already stored in a database let me give you a recommendation: Whatever you want to do here, consider solving it with SQL instead of Java. Or at least write a SQL with an resulting ResultSet which is easy to evaluate in Java afterwards.

But until the question is not that clear ...

蓝眼泪 2024-11-21 11:35:37

您是否试图查找 A 值相同但 B 不同的条目?

SELECT t1.a, t1.b, t2.b 
FROM MyTable t1, MyTable t2 
WHERE t1.a = t2.b AND t1.b <> t2.b

Are you trying to find entries where A is the same value but B is different?

SELECT t1.a, t1.b, t2.b 
FROM MyTable t1, MyTable t2 
WHERE t1.a = t2.b AND t1.b <> t2.b
不弃不离 2024-11-21 11:35:37

如果您担心堆空间不足,可以尝试使用持久缓存,例如 ehcache。我建议您在使用此解决方案之前检查实际消耗的内存

If you're worried of running out of heap space, you could try using a persistent cache like ehcache. I suggest you check the actual memory consumed before going in for this solution though

早乙女 2024-11-21 11:35:37

看起来你的问题仅限于一个非常小的域 - 为什么你不能实例化一个二维数组 bool 并将其设置为 true 每当两个数字的索引创建一个不受支持的组合。

使用示例:

if (forbiden[10][11] || forbiden[11][10]) 
{
   throw new Exception("pairs of '10' and '11' are not allowed");
}

您可以通过检查一次数据并设置此数组来从数据库实例化此数组。您只需将 0010 翻译为 10 即可。索引 0-9 中会有垃圾,但您可以通过从 9 中减去索引来“翻译”索引来消除它。

这符合你的问题吗?

Seems like your problem is limited to a very small domain - why cant you instantiate an two dimensional array of bool and set it to true whenever the indexes of two numbers creates an unsupported combination.

Example for usage:

if (forbiden[10][11] || forbiden[11][10]) 
{
   throw new Exception("pairs of '10' and '11' are not allowed");
}

You can instantiate this array from the database by going over the data once and setting this array. You just need to translate 0010 to 10. You will have junk in Indexes 0-9, but you can eliminate it by "translating" the index by subtracting it from 9.

Does that hit your question?

飘落散花 2024-11-21 11:35:37

如果我正确理解了您想要执行的操作...

t1(a,b) 上创建唯一索引。将用户的新对放入 try 块内的 INSERT 语句中。捕获键违规异常(将是SQLException,可能是一个子类,具体取决于您的 RDBMS)并向用户解释这是一个禁止的对。

If I have understood correctly what you want to do…

Create a unique index on t1(a,b). Put the user's new pair in an INSERT statement inside a try block. Catch key violation exceptions (will be aSQLException, possibly a subclass depending on your RDBMS) and explain to the user that is a forbidden pair.

书信已泛黄 2024-11-21 11:35:37

简单 - 绝对不是可扩展的解决方案 - 如果您的范围确实是 0000 - 9999。

只需有一个包含 999999 个条目的字节表。
每个条目由一个简单的 0(表示允许)或 1(表示不允许)组成。
通过逻辑连接两对数字(键 = 第一个 * 1000 + 第二个),可以在表中找到一个条目。

更具可扩展性的数据库解决方案是创建一个具有复合主键(对 1 和对 2)的表,只要存在指示不允许的对的条目即可。

Simple - definitely not scalable solution -- if your ranges really are 0000 - 9999.

Simply have a byte table with 999999 entries.
Each entry consists of a simple 0 for allowed or 1 for not allowed.
You find an entry in the table by logically concatenating the two pair numbers (key = first * 1000 + second).

The more scalable database solution is to create a table with a composite primary key (pair1 and pair2) the mere presence of an entry indicating a disallowed pair.

油焖大侠 2024-11-21 11:35:37

为了澄清这个问题:
您有一个表,其中每个记录包含两个数字,这些数字被声明为“不兼容”。
您有一个用户号码列表,并且想要检查该列表是否包含“不兼容的号码”。正确的?

这里有一个简单的 SQL(从评论中获取示例):

SELECT *
FROM   incompatible
WHERE  A IN (1, 14, 67) AND B IN (1, 14, 67);

此 SQL 返回所有不兼容性。当结果集为空时,就不存在不兼容性,一切都很好。如果您只想检索此事实,则可以改为编写 SELECT 1 ...
当然,必须动态构建 SQL 才能在 IN 子句中包含用户号码。

为了加快查询速度,您可以在这两列上创建一个(唯一)索引。所以数据库可以做索引范围扫描(唯一)。
如果该表尚不包含主键,那么您应该在两列上创建主键。

To clarify the question:
You have a table containing two numbers each record which are declared 'incompatible'.
You have a user list of numbers and you want to check if this list contains 'incompatible numbers'. Right?

Here you go with a simple SQL (took your example from comment):

SELECT *
FROM   incompatible
WHERE  A IN (1, 14, 67) AND B IN (1, 14, 67);

This SQL returns all incompatibilities. When the resultset is empty then there are no incompatibilities and everything is fine. If you only want to retrieve this fact then you can write SELECT 1 ... instead.
The SQL have to be build dynamically to contain the user's numbers in the IN clauses, of course.

To speed up queries you can create an (unique) index over both columns. So the database can do a index range scan (unique).
If this table does not yet contain a primary key then you should create a primary key over both columns.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文