What is cheaper: converting to int or trimming strings in C++?
I am reading several files from the Linux /proc filesystem and have to insert the values into a database. I want this to be as efficient as possible. So which is cheaper:
i) converting them to int while they are held in memory, then converting them back to strings later when I build my INSERT statement,
ii) or keeping them as strings and just sanitizing the values (removing ':', spaces, etc.)?
iii) What should I take into account to make this decision?
I am already splitting the lines, because the order in which the values arrive is not good enough for me.
Thanks,
Pedro
Edit - Clarification
Sorry guys, my scenario is the following: I am measuring CPU, memory, network, disk, etc. every 10 seconds. We are developing our own database system, so I cannot count on anything more than plain INSERT statements.
I got interested in this optimization because of how frequently the data is parsed. The data is write-once: there will be no updates after it is written.
Comments (1)
You seem to be performing some archiving activity [write once, read probably at most once] (storing the data for later rare or infrequent use). If not, you should focus your optimization on how the data will be read, not on how it is written.
If this is the archiving case, inserting BLOBs (binary large objects, or a similar concept) into the DB may be more efficient.
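To make the BLOB idea a bit more concrete, here is a rough sketch of packing a batch of measurements into one binary buffer. The `Sample` struct, its fields, and the function names are made up for illustration, and whether your database can store such a buffer at all is an assumption.

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical record layout for one 10-second measurement.
struct Sample {
    std::int64_t timestamp;   // seconds since epoch
    std::int64_t mem_free_kb; // e.g. MemFree from /proc/meminfo
    std::int64_t cpu_user;    // e.g. user jiffies from /proc/stat
};
static_assert(std::is_trivially_copyable<Sample>::value,
              "Sample must be safe to copy with memcpy");

// Copy the raw bytes of all samples into one buffer (the "BLOB").
std::string pack_samples(const std::vector<Sample>& samples) {
    std::string blob(samples.size() * sizeof(Sample), '\0');
    if (!samples.empty())
        std::memcpy(&blob[0], samples.data(), blob.size());
    return blob;
}

// Reverse of pack_samples; trailing bytes that do not form a full
// Sample are ignored.
std::vector<Sample> unpack_samples(const std::string& blob) {
    std::vector<Sample> samples(blob.size() / sizeof(Sample));
    if (!samples.empty())
        std::memcpy(samples.data(), blob.data(),
                    samples.size() * sizeof(Sample));
    return samples;
}
```

Note that a raw memory dump like this ties the stored bytes to the writer's endianness and struct layout, so it only makes sense if the same kind of machine reads it back.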
Addition:
It will depend on how you read the data. Are you just listing the data for browsing later on, or will there be more complex queries based on the measured values?
For example, if you later run something like:
SELECT * from db.Log WHERE log.time > time1 and Max (Memory) < 5000
then it is best to keep each value in its original type (integers as integers, strings as strings, etc.) so that the main data processing is left to the DB server.
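If you do go the typed route, the parsing side could look roughly like the sketch below. It assumes lines shaped like those in /proc/meminfo ("Key:   value kB"). The function names and the `samples(ts, name, value)` table are invented for the example, not something from your system.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>

// Parse a line such as "MemFree:   123456 kB".
// On success, stores the key (without ':') and the integer value.
bool parse_proc_line(const std::string& line, std::string& key,
                     long long& value) {
    const std::string::size_type colon = line.find(':');
    if (colon == std::string::npos || colon == 0) return false;

    const char* begin = line.c_str() + colon + 1;
    char* end = nullptr;
    errno = 0;
    // strtoll skips leading whitespace and stops at the first non-digit.
    const long long parsed = std::strtoll(begin, &end, 10);
    if (end == begin || errno == ERANGE) return false; // no digits or overflow

    key = line.substr(0, colon);
    value = parsed;
    return true;
}

// Build an INSERT for a hypothetical table samples(ts, name, value).
// The key is not escaped here; that is only acceptable because it comes
// from /proc and not from user input.
std::string build_insert(long long ts, const std::string& key,
                         long long value) {
    return "INSERT INTO samples (ts, name, value) VALUES (" +
           std::to_string(ts) + ", '" + key + "', " +
           std::to_string(value) + ");";
}
```

At one sample every 10 seconds, the extra int-to-string conversion when building the statement is very unlikely to matter next to the cost of the INSERT itself, and the parse step doubles as validation of what you read.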