Hadoop FTPFileSystem fails to list files and throws SocketTimeoutException

Posted 2025-02-09 06:37:36


I am using Apache Hadoop FTPFileSystem version 3.2.0 to list and read files from an FTP Server.

Here is my test code:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FtpListTest {
    public static void main(String[] args) throws IOException {
        String host = "some-host";
        int port = 21;
        Configuration conf = new Configuration(false);
        conf.set("fs.ftp.host", host);
        conf.setInt("fs.ftp.host.port", port);
        conf.set("fs.ftp.user." + host, "username");
        conf.set("fs.ftp.password." + host, "password");
        conf.set("fs.ftp.data.connection.mode", "PASSIVE_LOCAL_DATA_CONNECTION_MODE");
        conf.set("fs.ftp.impl", "org.apache.hadoop.fs.ftp.FTPFileSystem");

        String fsURL = String.format("ftp://%s:%d", host, port);
        conf.set("fs.default.name", fsURL); // deprecated key; fs.defaultFS is the current name
        FileSystem fs = FileSystem.newInstance(conf);
        Path somePath = new Path("actual/path");
        fs.getFileStatus(somePath).isDirectory(); // returns true
        fs.listStatus(somePath); // keeps spinning, then throws SocketTimeoutException
    }
}

After some debugging, I found that the deadlock (or delay) happens inside org.apache.commons.net.ftp.FTPClient.initiateListParsing(FTPFileEntryParser, String), specifically at the call engine.readServerList(socket.getInputStream(), getControlEncoding()), shown below:

private FTPListParseEngine initiateListParsing(
        FTPFileEntryParser parser, String pathname)
throws IOException
{
    Socket socket = _openDataConnection_(FTPCmd.LIST, getListArguments(pathname));

    FTPListParseEngine engine = new FTPListParseEngine(parser, __configuration);
    if (socket == null)
    {
        return engine;
    }

    try {
        engine.readServerList(socket.getInputStream(), getControlEncoding());
    }
    finally {
        Util.closeQuietly(socket);
    }

    completePendingCommand();
    return engine;
}
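
To take Hadoop out of the picture, the same listing path can be exercised with Commons Net directly. Below is a minimal sketch (host, credentials, path and the class name are placeholders, and the data-socket timeout is only there to bound the hang):

import java.io.IOException;

import org.apache.commons.net.ftp.FTPClient;
import org.apache.commons.net.ftp.FTPFile;
import org.apache.commons.net.ftp.FTPReply;

public class RawFtpListTest {
    public static void main(String[] args) throws IOException {
        FTPClient client = new FTPClient();
        client.setConnectTimeout(10_000);                 // control-connection timeout
        client.connect("some-host", 21);
        if (!FTPReply.isPositiveCompletion(client.getReplyCode())) {
            client.disconnect();
            throw new IOException("FTP server refused connection");
        }
        client.login("username", "password");
        client.enterLocalPassiveMode();                   // same mode as the Hadoop config above
        client.setDataTimeout(10_000);                    // bound the LIST data-socket read
        FTPFile[] files = client.listFiles("actual/path"); // this is where readServerList blocks
        System.out.println("listed " + files.length + " entries");
        client.logout();
        client.disconnect();
    }
}

If this bare client also hangs inside listFiles, the problem most likely sits between the client and the server's passive data ports (for example a firewall or NAT dropping the data connection) rather than in FTPFileSystem itself.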

The method call stays blocked until it finally throws a SocketTimeoutException, even though with FileZilla, using the same credentials and settings, I can list and read files smoothly and much faster.

The credentials and properties I am using are correct, since the initial connection works and the fs.getFileStatus(somePath).isDirectory(); call returns the correct value.

Is there a property I can add to make things faster, or is this a bug in Apache Hadoop FTPFileSystem version 3.2.0?

Comments (1)

祁梦 2025-02-16 06:37:36


You may need to change the transfer and/or connection mode to one of the following:


conf.set("fs.ftp.transfer.mode", "COMPRESSED_TRANSFER_MODE");
// OR
conf.set("fs.ftp.transfer.mode", "STREAM_TRANSFER_MODE");

// AND

conf.set("fs.ftp.data.connection.mode", "PASSIVE_LOCAL_DATA_CONNECTION_MODE");
// OR
conf.set("fs.ftp.data.connection.mode", "PASSIVE_REMOTE_DATA_CONNECTION_MODE");