What kind of algorithm should I use to break down data?
I have a table with a large amount of data. I need to look up each entry and break it down into its parts. Here's a simplified numeric example. I have this table:
1 [1]
2 [1, 1]
4 [2, 2]
Now I want to break down 4. I look it up and see 2+2=4. Then I look up 2 and see that it breaks down into 1+1, so I know 2+1+1=4 and 1+1+1+1=4. For this problem, I should break it down (using the computed table) into 4 results (the 3 mentioned and 4*1=4).
I am not sure, but is this a graph problem, or some other type? I think I can solve it with recursion that breaks each entry down, but I want to learn whether there is a generally accepted way. This process will deal with large amounts of data, so I'll need to design it so that the breakdown can be distributed over multiple CPUs.
Any idea what type of problem this is or the logic to solve it?
Comments (2)
As far as I can understand your specific example, it could be recursion, it could be a graph problem, it could be several other things, or a combination. Not every programming problem can be sorted into a single neat category, and there are generally at least a half-dozen different valid approaches to any problem.
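To make the graph reading concrete, here is a minimal Python sketch under one possible interpretation of the question: each node is a multiset of parts, and each edge replaces one part with its table entry. The names `TABLE` and `breakdowns` are hypothetical, and treating the unexpanded value itself as one of the results is an assumption, not something the question confirms.

```python
# Lookup table from the question: each key maps to the parts it breaks into.
TABLE = {1: (1,), 2: (1, 1), 4: (2, 2)}


def breakdowns(value, table=TABLE):
    """Return every distinct multiset reachable from `value` via table lookups.

    Assumption: the unexpanded value itself counts as one result.
    """
    start = (value,)
    seen = {start}      # multisets already discovered (graph nodes)
    stack = [start]     # worklist of nodes still to expand
    while stack:
        current = stack.pop()
        for i, part in enumerate(current):
            parts = table.get(part)
            if parts is None:
                continue  # no entry: this part cannot be broken down further
            # Replace one part with its table entry; sort so that order
            # does not produce duplicate results.
            candidate = tuple(
                sorted(current[:i] + tuple(parts) + current[i + 1:], reverse=True)
            )
            if candidate not in seen:
                seen.add(candidate)
                stack.append(candidate)
    # Shortest breakdowns first, for stable output.
    return sorted(seen, key=lambda t: (len(t), t))


print(breakdowns(4))
```

The `seen` set is what keeps self-referencing entries such as `1 [1]` from looping forever; a plain recursive version would need an equivalent guard.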
For dealing with large amounts of data specifically, there are many different strategies that may be employed, depending on how the data needs to be accessed (sequentially? randomly by offset? randomly by key or some sort of search?), how frequently it will be updated, how much data there is relative to the sizes of the various levels of the storage hierarchy, etc.
And then there are multiple CPUs (parallel processing), where data synchronization becomes an important issue in addition to the other problems.
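One way to sidestep most synchronization, sketched below in Python, is to treat the table as read-only and expand each key as an independent task. The names `TABLE`, `expand`, and `expand_all` are hypothetical, and the sketch assumes the table has no cycles other than entries that map to themselves (like `1 [1]`).

```python
from concurrent.futures import ThreadPoolExecutor

# Lookup table from the question; treated as read-only by every worker.
TABLE = {1: (1,), 2: (1, 1), 4: (2, 2)}


def expand(args):
    """Fully expand one key into its leaf parts.

    Assumption: apart from self-entries, the table contains no cycles.
    """
    key, table = args
    leaves = []
    stack = [key]
    while stack:
        part = stack.pop()
        parts = table.get(part)
        # A part is a leaf if it has no entry or only maps to itself.
        if parts is None or tuple(parts) == (part,):
            leaves.append(part)
        else:
            stack.extend(parts)
    return key, sorted(leaves)


def expand_all(table, keys, executor_factory=ThreadPoolExecutor):
    """Expand each key as a separate task with no shared mutable state.

    Threads are the default only so the sketch runs anywhere; for CPU-bound
    work, pass concurrent.futures.ProcessPoolExecutor instead.
    """
    with executor_factory() as pool:
        return dict(pool.map(expand, [(k, table) for k in keys]))


print(expand_all(TABLE, [1, 2, 4]))
```

With `ProcessPoolExecutor`, call `expand_all` from under an `if __name__ == "__main__":` guard. Each task then receives its own pickled copy of the table, which avoids locking at the cost of copying; whether that trade-off holds for a very large table depends on the access patterns described above.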
Your example is really too vague: you present it not as a real scenario or problem to be solved, but as an algorithm.
You aren't asking how to do something; you're telling us what you want to do and asking for the name of that activity.
From your question it's clear that you know what you need to do. Your process should be