What is the best data structure I can use to store tabular data?
I have a list of more than 10,000 frames and a list of sources (coordinates), and I want to find which source appears on which frame. Each frame has a filter attribute, and a source is expected to appear on one or more frames of the same filter. If that is the case, I want to record only one occurrence of such an event.
Eventually I want to run a script that easily generates a web table. Below is an example of the table I want to generate.
Source | filter_1 | filter_2 | filter_3 | filter_4 |
-------------------------------------------------------
1      | image1   | image2   | image3   | image4   |
2      | image5   | image6   | image7   | image8   |
This is my code:
webtable = []
for frame in frames:
    for x, y in sources:
        if x_y_on_frame():
            webtable.append(
                {
                    'source': (x, y),
                    'ifilter': frame.filter.name,
                    'ifile': frame.filename,
                    'pFile': frame.pngfile,
                    'fFile': frame.fitsfile,
                }
            )
I need to check whether a combination of a source, i.e. (x, y), and ifilter already exists in webtable before I append the record. What is the best data structure to implement this?
Comments (3)
Assuming that x, y and ifilter can all be represented as strings or integers (or other hashable, immutable types), it would be even easier to store your information in a dictionary whose key is the tuple (x, y, ifilter). This requires a minimal amount of code and is still very efficient.
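The code that originally followed this answer is not preserved here, so the following is only a minimal sketch of the idea, not the answerer's own code. It assumes frames expose filter.name, filename, pngfile and fitsfile as in the question. The on_frame argument is a hypothetical stand-in for the question's x_y_on_frame() check, and make_frame and the sample data are made up for the demo.

```python
from types import SimpleNamespace


def build_webtable(frames, sources, on_frame):
    """Keep the first matching frame for every (x, y, filter name) key."""
    webtable = {}
    for frame in frames:
        for x, y in sources:
            key = (x, y, frame.filter.name)
            # Dict membership is a hash lookup, so earlier records are not scanned.
            if key not in webtable and on_frame(x, y, frame):
                webtable[key] = {
                    "ifile": frame.filename,
                    "pFile": frame.pngfile,
                    "fFile": frame.fitsfile,
                }
    return webtable


def make_frame(filter_name, stem):
    """Hypothetical helper that fakes a frame object for the demo."""
    return SimpleNamespace(
        filter=SimpleNamespace(name=filter_name),
        filename=stem,
        pngfile=stem + ".png",
        fitsfile=stem + ".fits",
    )


# Made-up data: two frames share filter_1, so only the first one is recorded.
frames = [
    make_frame("filter_1", "image1"),
    make_frame("filter_1", "image9"),
    make_frame("filter_2", "image2"),
]
sources = [(10, 20), (30, 40)]

# The predicate always says "yes" here; a real one would test the coordinates.
webtable = build_webtable(frames, sources, lambda x, y, frame: True)
```

Because the key already carries the source and the filter, the stored value only needs the file names.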
A Python dict would be just fine. If there is already an entry with the given ifilter, x and y, continue to the next item in sources.
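One way to read this suggestion, sketched under assumptions: nest the dict as source -> filter name -> file name, which also lines up with the rows of the table in the question. This layout is my own variation, not something the answer spells out. The on_frame predicate, make_frame and the demo data are hypothetical.

```python
from types import SimpleNamespace


def first_frame_per_filter(frames, sources, on_frame):
    """Map each source to {filter name: filename of the first matching frame}."""
    table = {}
    for frame in frames:
        name = frame.filter.name
        for source in sources:
            if name in table.get(source, {}):
                continue  # this source/filter pair is already recorded
            if on_frame(source, frame):
                table.setdefault(source, {})[name] = frame.filename
    return table


def to_rows(table, filter_names):
    """One row per source: the source, then one cell per filter ('' if none)."""
    return [
        [source] + [found.get(name, "") for name in filter_names]
        for source, found in table.items()
    ]


def make_frame(filter_name, filename):
    """Hypothetical helper that fakes a frame object for the demo."""
    return SimpleNamespace(filter=SimpleNamespace(name=filter_name), filename=filename)


# Made-up data: image9 is skipped because filter_1 is already filled in.
demo_frames = [
    make_frame("filter_1", "image1"),
    make_frame("filter_1", "image9"),
    make_frame("filter_2", "image2"),
]
demo_sources = [(10, 20)]

demo_table = first_frame_per_filter(demo_frames, demo_sources, lambda source, frame: True)
demo_rows = to_rows(demo_table, ["filter_1", "filter_2", "filter_3"])
```

Each row from to_rows can then be written out as one line of the web table, with an empty cell where no frame was found for that filter.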
Since you have a static set of keys for your data dictionaries, a namedtuple from the collections module would actually be better than the anonymous dictionary. Namedtuples have a lower overhead than dictionaries (since the duplicate keys don't have to be stored per item), but have the convenience of named access. You could define a namedtuple with one field per key. Then, rather than creating a dictionary like the one built inside your loop, you would create an instance of the namedtuple returned by the factory function. If you need to access an attached value, you just access it as an attribute of the instance.
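The snippets that went with this answer are missing here, so the following is a sketch of the three steps it describes. The type name Record and the sample values are made up. The field names are taken from the keys of the dictionary in the question.

```python
from collections import namedtuple

# Step 1: define the type once; the field names mirror the question's dict keys.
Record = namedtuple("Record", ["source", "ifilter", "ifile", "pFile", "fFile"])

# Step 2: create an instance instead of a dict literal (values are made up).
rec = Record(
    source=(10, 20),
    ifilter="filter_1",
    ifile="image1",
    pFile="image1.png",
    fFile="image1.fits",
)

# Step 3: read values by attribute name; positional indexing also works.
filter_name = rec.ifilter
coordinates = rec[0]

# A plain dict view is available if one is needed later, e.g. for templating.
as_dict = rec._asdict()
```

Note that a namedtuple instance is immutable, so a record cannot be modified in place after it is created. That fits this use case, where each record is written once.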