Inserting a block of data into a large file using Fortran

Posted 2024-09-26 00:39:53

I have some massive (4.6 million lines) data files that I'm trying to edit with Fortran. Each file consists of a series of headers, each followed by a table of numbers. Something like this:
p he4 blah 99 ggg
1.0e+01 2.0e+01 2.0e+01
2.0e+01 5.0e+01 2.0e+01
.
.
3.2e+-1 2.0e+01 1.0e+00
p he3 blafoo 99 ggg
1.1e+00 2.3e+01 2.0e+01

My task is to replace certain entries in one file with the corresponding entries from the other. The list of entries to replace is supplied separately.

I have written code that already works. My strategy is to read and echo the first file until I find a header that matches the replacement list, then find the same header in the second file and echo its entries, and finally switch back to echoing the first file. The only problem with this approach is that it is extremely slow. I looked into direct access, but the files don't have fixed record lengths. Does anyone have a better idea?

Cheers for the help,
Rich


Comments (2)

雨落星ぅ辰 2024-10-03 00:39:53

Are the headers in the files sorted in any way? If not, creating an index file of the headers in the second file should speed up the lookup. My Fortran is very rusty, but if you can sort the headers of the second file into an index file, each with a reference to the position of its full entry, you should be able to speed things up dramatically.
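The index idea above can be sketched as follows. This is only an illustration in Python rather than Fortran, and it rests on assumptions not stated in the thread: the helper names `is_header`, `build_index`, and `read_entry` are hypothetical, and a header is assumed to be any line whose first non-blank character is a letter. The file is scanned once to record the byte offset of each header, after which any entry can be fetched with a single seek.

```python
def is_header(line: bytes) -> bool:
    """Assumption: headers start with a letter, table rows with a digit, sign, or dot."""
    stripped = line.lstrip()
    return bool(stripped) and stripped[:1].isalpha()


def build_index(f):
    """Scan a binary file object once; map each header to the byte offset of its entry."""
    index = {}
    while True:
        pos = f.tell()          # offset of the line about to be read
        line = f.readline()
        if not line:            # end of file
            break
        if is_header(line):
            index[line.strip()] = pos
    return index


def read_entry(f, offset):
    """Return the header line and table rows of the entry that starts at offset."""
    f.seek(offset)
    lines = [f.readline()]      # the header itself
    while True:
        line = f.readline()
        if not line or is_header(line):
            break               # next entry or end of file
        lines.append(line)
    return lines
```

With the second file opened in binary mode (`open(path, "rb")`), `build_index` runs once and each later lookup costs one seek instead of a rescan. In Fortran 2003 and later, `access='stream'` together with `inquire(unit=..., pos=...)` offers a comparable way to record and return to a position, which sidesteps the fixed-record-length requirement of direct access.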

梦断已成空 2024-10-03 00:39:53

I am assuming that you are reading file 1 and writing the results to file 3.
File 2 contains the replacements.

Preprocess file 2 by loading each header and using a hash algorithm to create
an array with an integer hash representation of each header value in it, and a
pointer/subscript to the values to replace it with.

while there are lines left in file 1

    read an original line from file 1
    hash the original line to get the hash value

    if the hash value is in the hash array
         write the replacement to file 3
         skip the original table rows in file 1 up to the next header
    else
         write the original line to file 3

That ought to do the trick.
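A minimal sketch of the hash-lookup approach, again in Python rather than Fortran, where a `dict` plays the role of the hash array. The names `is_header`, `load_entries`, and `merge` are hypothetical. The sketch assumes that headers start with a letter, that the replacement entries fit in memory, and that `wanted` holds the separately supplied list of headers.

```python
def is_header(line: str) -> bool:
    """Assumption: headers start with a letter, table rows with a digit, sign, or dot."""
    stripped = line.strip()
    return bool(stripped) and stripped[0].isalpha()


def load_entries(lines, wanted):
    """One pass over file 2: keep only the entries whose header is in wanted."""
    entries = {}
    current = None
    for line in lines:
        if is_header(line):
            key = line.strip()
            if key in wanted:
                entries[key] = current = []
            else:
                current = None      # ignore rows of entries we do not need
        if current is not None:
            current.append(line)    # header line first, then its table rows
    return entries


def merge(original, replacements, out):
    """One pass over file 1: echo lines, swapping in replacement entries."""
    skipping = False
    for line in original:
        if is_header(line):
            block = replacements.get(line.strip())   # hash lookup
            skipping = block is not None
            if skipping:
                out.writelines(block)                # replacement header and rows
        if not skipping:
            out.write(line)                          # original line, unchanged
```

Each file is read exactly once, so the cost is linear in the combined size of the two files, rather than rescanning file 2 for every matching header.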
