Efficient way to access an identifier mapping in Python
I am writing an app to do a file conversion, and part of that is replacing old account numbers with new account numbers.
Right now I have a CSV file mapping the old account numbers to the new ones, with around 30K records. I read it in and store it as a dict, and when writing the new file I grab the new account number from the dict by key.
My question is: what is the best way to do this if the CSV file grows to 100K+ records?
Would it be more efficient to convert the account mappings from a CSV to a sqlite database rather than storing them as a dict in memory?
Comments (1)
As long as the records all fit in memory, a dict will be the most efficient solution, since each lookup is a single hash-table access. It's also a lot easier to code. 100k records should be no problem on a modern computer.
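For illustration, here is a minimal sketch of the dict approach. It assumes the CSV has exactly two columns (old account, new account) and no header row; the function names, the sample account numbers, and the fall-back-to-original policy are all made up for the example and are not from the question.

```python
import csv
import io


def load_mapping(csv_file):
    """Build an old -> new account dict from two-column CSV rows."""
    reader = csv.reader(csv_file)
    return {old.strip(): new.strip() for old, new in reader}


def convert_account(mapping, old_account):
    # Assumed policy: keep the original number when no mapping exists.
    return mapping.get(old_account, old_account)


# Stand-in for the real file; in practice use open(path, newline="").
sample = io.StringIO("1001,A-9001\n1002,A-9002\n")
mapping = load_mapping(sample)
print(convert_account(mapping, "1001"))
```

The whole mapping is read once up front, and every lookup afterwards is a plain dict access, so the per-record cost stays roughly constant as the file grows.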
You are right that switching to an SQLite database is a good choice when the number of records gets very large.
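If the mapping ever does outgrow memory, a rough sketch of the SQLite variant could look like the following. The table name, column names, and helper functions are hypothetical; only the standard-library sqlite3 module is used, and an in-memory database stands in for a real file path.

```python
import sqlite3


def build_mapping_db(rows, db_path=":memory:"):
    """Create (or reuse) a lookup table keyed on the old account number."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS account_map ("
        "old_account TEXT PRIMARY KEY, new_account TEXT NOT NULL)"
    )
    # The PRIMARY KEY gives an index, so lookups do not scan the table.
    conn.executemany("INSERT OR REPLACE INTO account_map VALUES (?, ?)", rows)
    conn.commit()
    return conn


def lookup(conn, old_account):
    """Return the new account number, or None if there is no mapping."""
    row = conn.execute(
        "SELECT new_account FROM account_map WHERE old_account = ?",
        (old_account,),
    ).fetchone()
    return row[0] if row else None


conn = build_mapping_db([("1001", "A-9001"), ("1002", "A-9002")])
print(lookup(conn, "1001"))
```

Each lookup here is a query against the database, which is slower than a dict access but keeps memory use flat regardless of how many records the table holds.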