Reading unformatted Fortran files with Python scipy.io
At the moment I am working on a Python module that is supposed to read a Fortran file. The first record has a fixed beginning:
recl = ['a20', 'i4', 'a20', 'a6', 'a1', 'i4', 'i4', 'a1', 'i4']
This is the smallest possible size. However, the length of recl is unknown. What is known is that further ('i4', 'i4') blocks are appended, so the next possible size is
recl = ['a20', 'i4', 'a20', 'a6', 'a1', 'i4', 'i4', 'a1', 'i4', "i4", "i4"]
and so on. My idea is the following:
def read_ff(fname):
    from scipy.io import FortranFile, FortranEOFError, FortranFormattingError
    f = FortranFile(fname, "r")
    # RECL = integer-value must be present for DIRECT files.
    # It is the length in bytes of each record in the file.
    recl = ["a8", "i4", "a6", "a1", "i4", "i4", "a1", "i4"]
    while True:  # unknown length of the recl due to varying electron configuration
        try:  # recl is extended until the record can be read
            record = f.read_record(*recl)
            break
        except ValueError:
            recl += "i4", "i4"
        except FortranEOFError:
            break
        except FortranFormattingError:
            break
    dim = f.read_ints("i4")  # nrow, ncol, nzc1
    matrix = f.read_record("f8").reshape((dim[2], dim[0], dim[1]))  # reshape((nzc1, nrows, ncols))
    return record, matrix
However, from the second iteration of the loop onward, f.read_record(*recl) no longer seems to be executed. Instead the ValueError is raised immediately and recl is extended in the except clause, so the loop runs indefinitely.
If the appropriate recl is supplied right at the beginning, the file is read. So I think the Fortran code used to create the file is irrelevant, but I can provide it if you like.
I look forward to your comments and advice.
For those of us with an affinity for chemistry: The first record describes the data set with a correction factor for the picture change error and looks like this: 'DFT/PBE ',[6],'cc-pVDZ ','point ','C',[4],[0], 'O',[1],[0],[2].
The entry of unknown length is the electron configuration with which the calculation was done. It is structured like this: "C": closed shell, followed by i, j: i/j electrons in symmetric/asymmetric orbitals; "O": open shell, followed by n: 0 for no electrons, n >= 1 for n * ("i4", "i4") in open shells.
Comments (1)
The first attempt to read a record advances the file pointer even if the attempt fails. You'll have to reset the file pointer if you are going to try again. Try something like this:
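A minimal sketch of that retry loop, under a few assumptions: the helper name `read_variable_record` and the `max_pairs` cap are hypothetical and not part of SciPy, and the position is saved with `tell()` before each attempt so the same idea works for a record that is not the first in the file. Byte strings can be given with NumPy's `S` code, which is equivalent to the `a` code used in the question.

```python
from scipy.io import FortranFile


def read_variable_record(f, base, max_pairs=20):
    """Read the next record of an open FortranFile `f`.

    `base` is the shortest possible layout; one ("i4", "i4") pair is
    appended after every failed attempt. `max_pairs` is an arbitrary
    safety cap so a corrupt file cannot cause an endless loop.
    """
    recl = list(base)  # copy, so the caller's list is not modified
    for _ in range(max_pairs + 1):
        # f._fp is a private attribute of FortranFile and may change
        # in future SciPy versions.
        start = f._fp.tell()
        try:
            return f.read_record(*recl)
        except ValueError:
            # The failed attempt already consumed the record header,
            # so move back to where this record begins before retrying.
            f._fp.seek(start)
            recl += ["i4", "i4"]
    raise ValueError("no layout with up to %d extra pairs fits" % max_pairs)
```

Called as `record = read_variable_record(f, ["a8", "i4", "a6", "a1", "i4", "i4", "a1", "i4"])`, this leaves the pointer just behind the record that was read, so a following `f.read_ints("i4")` continues with the next record.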
Note that the leading underscore of the attribute _fp indicates that it is intended to be a private attribute. There is no guarantee that this attribute won't change in future versions of SciPy, so use it at your own risk.
Alternatively, you could close and reopen the file:
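A minimal sketch of the reopen variant, assuming the variable-length record is the first record in the file. The helper name `find_first_record_layout` and the `max_pairs` cap are hypothetical. The helper only determines the layout, so the file can afterwards be opened once more and read from start to end with the public API alone.

```python
from scipy.io import FortranFile


def find_first_record_layout(fname, base, max_pairs=20):
    """Return the dtype list that fits the first record of `fname`.

    The file is reopened for every attempt, so each read starts at the
    beginning and no private attribute of FortranFile is touched.
    """
    recl = list(base)  # copy, so the caller's list is not modified
    for _ in range(max_pairs + 1):
        with FortranFile(fname, "r") as f:
            try:
                f.read_record(*recl)
                return recl
            except ValueError:
                # Layout too short: add one more ("i4", "i4") pair.
                recl += ["i4", "i4"]
    raise ValueError("no layout with up to %d extra pairs fits" % max_pairs)
```

The returned list can then be passed to `f.read_record(*recl)` on a freshly opened file, followed by the `read_ints` and `read_record("f8")` calls for the matrix. Reopening costs one extra open per attempt, which should rarely matter for a handful of retries.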