Downloading text files from z/OS with Python and ftplib.FTP

Posted on 2024-07-29 16:36:41

I'm trying to automate downloading of some text files from a z/os PDS, using Python and ftplib.

Since the host files are EBCDIC, I can't simply use FTP.retrbinary().

FTP.retrlines(), when used with open(file, 'w').writelines as its callback, doesn't, of course, provide EOLs.

So, for starters, I've come up with this piece of code which "looks OK to me", but as I'm a relative Python noob, can anyone suggest a better approach? Obviously, to keep this question simple, this isn't the final, bells-and-whistles thing.

Many thanks.

#!python.exe
from ftplib import FTP

class xfile (file):
    def writelineswitheol(self, sequence):
        for s in sequence:
            self.write(s+"\r\n")

sess = FTP("zos.server.to.be", "myid", "mypassword")
sess.sendcmd("site sbd=(IBM-1047,ISO8859-1)")
sess.cwd("'FOO.BAR.PDS'")
a = sess.nlst("RTB*")
for i in a:
    sess.retrlines("RETR "+i, xfile(i, 'w').writelineswitheol)
sess.quit()

Update: Python 3.0, platform is MingW under Windows XP.

z/os PDSs have a fixed record structure, rather than relying on line endings as record separators. However, the z/os FTP server, when transmitting in text mode, provides the record endings, which retrlines() strips off.

Closing update:

Here's my revised solution, which will be the basis for ongoing development (removing built-in passwords, for example):

import ftplib
import os

sess = ftplib.FTP("undisclosed.server.com", "userid", "password")
# Ask the server to translate between EBCDIC (IBM-1047) and ISO8859-1.
sess.sendcmd("site sbd=(IBM-1047,ISO8859-1)")
for dir in ["ASM", "ASML", "ASMM", "C", "CPP", "DLLA", "DLLC", "DLMC", "GEN", "HDR", "MAC"]:
    sess.cwd("'ZLTALM.PREP.%s'" % dir)
    try:
        filelist = sess.nlst()
    except ftplib.error_perm as x:
        # Ignore 550 replies and move on to the next PDS; re-raise anything else.
        if (x.args[0][:3] != '550'):
            raise
    else:
        try:
            os.mkdir(dir)
        except OSError:
            # Skip this PDS if the local directory cannot be created.
            continue
        for hostfile in filelist:
            lines = []
            sess.retrlines("RETR "+hostfile, lines.append)
            # retrlines() strips line endings, so add one back per line.
            with open("%s/%s" % (dir, hostfile), 'w') as pcfile:
                for line in lines:
                    pcfile.write(line+"\n")
        print ("Done: " + dir)
sess.quit()

My thanks to both John and Vinay
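
The 550 handling in the revised solution can be pulled out into a small helper so the download loop stays flat. This is only a sketch: `list_members` is a hypothetical name, and treating every 550 reply as "nothing to download" is an assumption carried over from the code above, not something the server guarantees.

```python
import ftplib


def list_members(ftp):
    """Return the names in the current directory, or [] on a 550 reply.

    `ftp` is any object with an nlst() method, e.g. a connected ftplib.FTP.
    """
    try:
        return ftp.nlst()
    except ftplib.error_perm as exc:
        # Assumption: a 550 reply here means there is nothing to list.
        if str(exc)[:3] == "550":
            return []
        # Any other permanent error is unexpected, so let it propagate.
        raise
```

With this in place the loop body becomes `for hostfile in list_members(sess): ...` with no try/else nesting.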

背叛残局 2024-08-05 16:36:41

Just came across this question as I was trying to figure out how to recursively download datasets from z/OS. I've been using a simple python script for years now to download ebcdic files from the mainframe. It effectively just does this:

def writeline(line):
    file.write(line + "\n")

file = open(filename, "w")
ftp.retrlines("retr " + filename, writeline)
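
The snippet above relies on a module-level `file` variable and never closes it. A self-contained variant might look like the following sketch, where `download_text` is a hypothetical helper name and `ftp` is assumed to be an already connected `ftplib.FTP` session in text mode.

```python
def download_text(ftp, remote_name, local_path, eol="\n"):
    """Fetch remote_name with retrlines() and write it to local_path."""
    with open(local_path, "w") as out:

        def write_line(line):
            # retrlines() hands over each line without its ending,
            # so put one back before writing.
            out.write(line + eol)

        ftp.retrlines("RETR " + remote_name, write_line)
```

The `with` block closes the output file even if the transfer raises part-way through.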

固执像三岁 2024-08-05 16:36:41

You should be able to download the file as a binary (using retrbinary) and use the codecs module to convert from EBCDIC to whatever output encoding you want. You should know the specific EBCDIC code page being used on the z/OS system (e.g. cp500). If the files are small, you could even do something like (for a conversion to UTF-8):

with open(ebcdic_filename, "rb") as file:
    data = file.read()
# Decode the EBCDIC bytes, then re-encode them as UTF-8.
converted = data.decode("cp500").encode("utf8")
with open(utf8_filename, "wb") as file:
    file.write(converted)
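
One caveat with the binary route: the question notes that PDS members have a fixed record structure, so a binary transfer may arrive with no line endings at all. The sketch below assumes fixed-length records with a record length of 80 and the cp500 code page; both values are assumptions that need checking against the actual dataset attributes. `fetch_binary` and `decode_fixed_records` are hypothetical helper names.

```python
import io


def fetch_binary(ftp, remote_name):
    """Download remote_name in binary mode and return the raw bytes."""
    buffer = io.BytesIO()
    ftp.retrbinary("RETR " + remote_name, buffer.write)
    return buffer.getvalue()


def decode_fixed_records(raw, lrecl=80, codepage="cp500"):
    """Split raw EBCDIC bytes into fixed-length records and decode them."""
    text = raw.decode(codepage)
    # The code page is single-byte, so character offsets equal byte offsets.
    records = [text[i:i + lrecl] for i in range(0, len(text), lrecl)]
    # Fixed-length records are padded with blanks; drop the padding.
    return [record.rstrip(" ") for record in records]
```

Writing the result out is then `"\n".join(records)` plus a final newline, in whatever output encoding is wanted.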

Update: If you need to use retrlines to get the lines and your lines are coming back in the correct encoding, your approach will not work, because the callback is called once for each line. So in the callback, sequence will be the line, and your for loop will write individual characters in the line to the output, each on its own line. So you probably want to do self.write(sequence + "\r\n") rather than the for loop. It still doesn't feel especially right to subclass file just to add this utility method, though - it probably needs to be in a different class in your bells-and-whistles version.
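
The difference is easy to see without an FTP server by feeding both callback styles a single line. In this sketch `per_character_callback` mirrors the loop from the question and `per_line_callback` is the suggested fix; both names are made up for the illustration, and an `io.StringIO` stands in for the output file.

```python
import io


def per_character_callback(out):
    """Callback in the style of the question: loops over its argument."""
    def write(sequence):
        # `sequence` is one str, so this iterates over its characters.
        for s in sequence:
            out.write(s + "\n")
    return write


def per_line_callback(out):
    """Callback that treats its argument as one complete line."""
    def write(line):
        out.write(line + "\n")
    return write


buggy, fixed = io.StringIO(), io.StringIO()
per_character_callback(buggy)("AB")
per_line_callback(fixed)("AB")
print(repr(buggy.getvalue()))  # 'A\nB\n'
print(repr(fixed.getvalue()))  # 'AB\n'
```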

尬尬 2024-08-05 16:36:41

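Your writelineswitheol method appends '\r\n' instead of '\n' and then writes the result to a file opened in text mode. The effect, no matter what platform you are running on, will be an unwanted '\r'. Just append '\n' and you will get the appropriate line ending.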
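
To make the stray '\r' visible on any platform, the sketch below writes through a text-mode file with `newline="\r\n"`, which reproduces the newline translation that text mode applies by default on Windows. `written_bytes` is a hypothetical helper used only for this demonstration.

```python
import os
import tempfile


def written_bytes(text, newline):
    """Write text via a text-mode file and return the bytes that hit disk."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # newline="\r\n" turns every "\n" written into "\r\n".
        with open(path, "w", newline=newline) as out:
            out.write(text)
        with open(path, "rb") as raw:
            return raw.read()
    finally:
        os.remove(path)


print(written_bytes("RECORD\r\n", "\r\n"))  # b'RECORD\r\r\n'
print(written_bytes("RECORD\n", "\r\n"))    # b'RECORD\r\n'
```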

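Proper error handling should not be relegated to a "bells and whistles" version. You should set up your callback so that your file open() is in a try/except and retains a reference to the output file handle, your write call is in a try/except, and you have a callback_obj.close() method which you use when retrlines() returns to explicitly call file_handle.close() (in a try/except). That way you get explicit error handling, e.g. messages like "can't (open|write to|close) file X because Y", and you don't have to think about when your files are going to be implicitly closed or whether you risk running out of file handles.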
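
A callback object along those lines could be sketched as follows. `LineWriter` is a hypothetical name, and raising `RuntimeError` with the suggested message is one choice among several; logging and continuing would be another.

```python
class LineWriter:
    """Callback object for retrlines() with explicit error reporting."""

    def __init__(self, path):
        self.path = path
        try:
            self.handle = open(path, "w")
        except OSError as exc:
            raise RuntimeError(
                "can't open file %s because %s" % (path, exc)) from exc

    def __call__(self, line):
        # retrlines() calls this once per line, without the line ending.
        try:
            self.handle.write(line + "\n")
        except OSError as exc:
            raise RuntimeError(
                "can't write to file %s because %s" % (self.path, exc)) from exc

    def close(self):
        try:
            self.handle.close()
        except OSError as exc:
            raise RuntimeError(
                "can't close file %s because %s" % (self.path, exc)) from exc
```

Typical use would be to create the writer, pass it to `retrlines()` inside a `try`, and call `writer.close()` in the matching `finally`.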

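Python 3.x ftplib.FTP.retrlines() should give you str objects, which are in effect Unicode strings, and you will need to encode them before you write them, unless the default encoding is latin1, which would be rather unusual for a Windows box. You should have test files containing (1) all 256 possible bytes and (2) all bytes that are valid in the expected EBCDIC code page.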
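
In Python 3 a text-mode file encodes str objects as they are written, so one way to avoid depending on the platform default is to pass `encoding` to `open()` explicitly. The sketch below does that and also shows a simple all-256-bytes check; the helper names are hypothetical, and cp500 is used only because it is the example code page mentioned elsewhere in this thread.

```python
def write_lines(path, lines, encoding="latin-1"):
    """Write str lines with an explicit encoding and fixed '\\n' endings."""
    with open(path, "w", encoding=encoding, newline="\n") as out:
        for line in lines:
            out.write(line + "\n")


def round_trips_all_bytes(codepage="cp500"):
    """Check that every byte value survives decode followed by encode."""
    data = bytes(range(256))
    return data.decode(codepage).encode(codepage) == data
```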

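[a few "hygiene" remarks]

You should consider upgrading your Python from 3.0 (a "proof of concept" release) to 3.1.
To make your code easier to follow, use "i" as an identifier only for a sequence index, and only if you irredeemably acquired the habit from FORTRAN 3 or more decades ago :-)
Two of the problems found so far (appending a line terminator to each character, and using the wrong line terminator) would have shown up the first time you tested it.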

我不吻晚风 2024-08-05 16:36:41

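When you download a file from z/OS with ftplib's retrlines, the lines come back without a trailing '\n'.

This differs from the Windows ftp command "get xxx".

We can rewrite the function "retrlines" in ftplib.py as "retrlines_zos".

Just copy the whole code of retrlines and change the "callback" line to:

...

callback(line + "\n")

...

I tested it, and it works.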
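
Copying the library source is one route. Wrapping the callback instead may be enough and avoids carrying a private copy of ftplib code. In this sketch `ZosFTP` is a hypothetical class name; only the method name `retrlines_zos` comes from the answer above.

```python
import ftplib


class ZosFTP(ftplib.FTP):
    """ftplib.FTP with a retrlines variant that restores line endings."""

    def retrlines_zos(self, cmd, callback):
        # Delegate to the stock retrlines() and append "\n" to each line
        # before handing it to the caller's callback.
        return self.retrlines(cmd, lambda line: callback(line + "\n"))
```

With this, `sess.retrlines_zos("RETR " + name, outfile.write)` writes complete lines directly.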

丑疤怪 2024-08-05 16:36:41

You want a lambda function and a callback. Like so:

def writeLineCallback(line, file):
    file.write(line + "\n")

ftpcommand = "RETR '{}'".format(zOsFile)
filename = "newfilename"
with open(filename, 'w') as file:
    callback_lambda = lambda x: writeLineCallback(x, file)
    ftp.retrlines(ftpcommand, callback_lambda)

This will download file 'zOsFile' and write it to 'newfilename'
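
The same binding of the output file to the callback can also be done with `functools.partial` from the standard library instead of a lambda. A small sketch, with `write_line_callback` and `make_callback` as illustrative names:

```python
import functools


def write_line_callback(line, file):
    """Write one line received from retrlines() plus a line ending."""
    file.write(line + "\n")


def make_callback(file):
    """Return a one-argument callback with `file` already bound."""
    return functools.partial(write_line_callback, file=file)
```

`ftp.retrlines(ftpcommand, make_callback(file))` then behaves like the lambda version.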
