Java: Pause a thread and get the position in a file

Posted on 2024-11-02 20:20:31

I'm writing a multithreaded Java application that I want to be able to pause and resume.
Each thread reads a file line by line, looking for lines that match a pattern. After a pause, it has to continue from the place where it stopped. To read the file I use a BufferedReader in combination with an InputStreamReader and a FileInputStream.

fip = new FileInputStream(new File(file)); // file: path of the file to read
fileBuffer = new BufferedReader(new InputStreamReader(fip));

I use this FileInputStream because I need its file pointer to get the position in the file.
While processing the lines, the thread writes the matching ones to a MySQL database. To share MySQL connections between the threads I use a ConnectionPool, which makes sure that each connection is used by only one thread at a time.

The problem is that when I pause the threads and resume them, a few matching lines just disappear. I also tried subtracting the buffer size from the offset, but the problem remains.

What is a decent way to solve this problem, or what am I doing wrong?

Some more details:

The loop

    // Regex engine
    RunAutomaton ra = new RunAutomaton(this.conf.getAuto(), true);
    lw = new LogWriter();

    while ((line = fileBuffer.readLine()) != null) {
        if (line.length() > 0) {
            if (ra.run(line)) {
                // Write to LogWriter
                lw.write(line, this.file.getName());
                lw.execute();
            }
        }
        // Loop when paused.
        while (pause) { }
    }

Calculating the position in the file

// Get the position in the file
public long getFilePosition() throws IOException {
    long position = fip.getChannel().position() - bufferSize + fileBuffer.getNextChar();
    return position;
}

Putting it into the database

            // Get the connector
            ConnectionPoolManager cpl = ConnectionPoolManager.getManager();
            Connector con = null;
            while(con == null)
                con = cpl.getConnectionFromPool();
            // Insert the query
            con.executeUpdate(this.sql.toString());
            cpl.returnConnectionToPool(con);


Comments (3)

笑咖 2024-11-09 20:20:31

Here's an example of what I believe you're looking for. You didn't show much of your implementation so it's hard to debug what might be causing gaps for you. Note that the position of the FileInputStream is going to be a multiple of 8192 because the BufferedReader is using a buffer of that size. If you want to use multiple threads to read the same file you might find this answer helpful.

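import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStreamReader;

public class ReaderThread extends Thread {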
    private final FileInputStream fip;
    private final BufferedReader fileBuffer;
    private volatile boolean paused;

    public ReaderThread(File file) throws FileNotFoundException {
        fip = new FileInputStream(file);
        fileBuffer = new BufferedReader(new InputStreamReader(fip));
    }

    public void setPaused(boolean paused) {
        this.paused = paused;
    }

    public long getFilePos() throws IOException {
        return fip.getChannel().position();
    }

    public void run() {
        try {
            String line;
            while ((line = fileBuffer.readLine()) != null) {
                // process your line here
                System.out.println(line);

                while (paused) {
                    sleep(10);
                }
            }
        } catch (IOException e) {
            // handle I/O errors
        } catch (InterruptedException e) {
            // handle interrupt
        }
    }
}
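
To make the read-ahead effect mentioned above visible, here is a minimal sketch (not part of the original answer). The class name `ReadAheadDemo`, the temporary file, and its contents are made up for illustration. It reads a single short line and then prints the channel position, which should already be far beyond the end of that line because the readers pull in a whole buffer at once.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, only meant to illustrate the read-ahead.
class ReadAheadDemo {

    /** Reads exactly one line, then reports where the underlying channel is. */
    static long positionAfterFirstLine(Path file) throws IOException {
        try (FileInputStream fip = new FileInputStream(file.toFile());
             BufferedReader reader = new BufferedReader(
                     new InputStreamReader(fip, StandardCharsets.US_ASCII))) {
            reader.readLine();
            // Position of the file channel, not of the last line handed out.
            return fip.getChannel().position();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("readahead", ".log");
        try {
            // Build a file that is clearly larger than one reader buffer.
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 5000; i++) {
                sb.append("line ").append(i).append('\n');
            }
            Files.write(tmp, sb.toString().getBytes(StandardCharsets.US_ASCII));

            // The first line ("line 0" plus newline) is only 7 bytes long.
            System.out.println("channel position after one line: "
                    + positionAfterFirstLine(tmp));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The gap between the 7 bytes actually consumed and the reported position is the data still sitting unread in the buffers, which is why the raw channel position cannot be used directly as a resume offset.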

南…巷孤猫 2024-11-09 20:20:31

I think the root of the problem is that you shouldn't be subtracting bufferSize. Rather, you should be subtracting the number of unread characters in the buffer, and I don't think there's a way to get that.

The easiest solution I can think of is to create a custom subclass of FilterReader that keeps track of the number of characters read. Then stack the streams as follows:

FileReader 
< BufferedReader 
< custom filter reader
< BufferedReader(sz == 1)

The final BufferedReader is there so that you can use readLine ... but you need to set the buffer size to 1 so that the character count from your filter matches the position that the application has reached.

Alternatively, you could implement your own readLine() method in the custom filter reader.
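
A minimal sketch of the stacking described above might look like the following. The names `CountingReader`, `getCount`, and `CountingReaderDemo` are invented for this illustration, and a `StringReader` stands in for the `FileReader` so the example runs on its own. Note two assumptions: the counter tracks characters, which equals the byte offset only for single-byte encodings, and with `\r\n` line endings the trailing `\n` is consumed lazily by `BufferedReader`, so the count can lag by one right after a line.

```java
import java.io.BufferedReader;
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical FilterReader subclass that counts the characters passing through it.
class CountingReader extends FilterReader {
    // volatile so another thread can read the value; only the reading thread writes it.
    private volatile long count;

    public CountingReader(Reader in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int c = super.read();
        if (c != -1) {
            count++;
        }
        return c;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            count += n;
        }
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        count += skipped;
        return skipped;
    }

    /** Number of characters handed to the consumer so far. */
    public long getCount() {
        return count;
    }
}

class CountingReaderDemo {
    public static void main(String[] args) throws IOException {
        // Inner BufferedReader: efficient bulk reads from the source.
        Reader source = new BufferedReader(new StringReader("first\nsecond\n"));
        CountingReader counter = new CountingReader(source);
        // Outer BufferedReader with buffer size 1: provides readLine()
        // without reading ahead past the current line.
        BufferedReader lines = new BufferedReader(counter, 1);

        String line;
        while ((line = lines.readLine()) != null) {
            System.out.println(line + " -> characters consumed: " + counter.getCount());
        }
    }
}
```

The outer reader pulls one character at a time through the filter, so this trades some speed for an exact count; the inner `BufferedReader` keeps the actual I/O efficient.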

紅太極 2024-11-09 20:20:31

After a few days of searching I found out that subtracting the buffer size and adding the position in the buffer indeed wasn't the right way to do it. The position was never right and I was always missing some lines.
While looking for a new approach I decided against counting characters, because there are simply too many of them and counting would hurt performance a lot. But I found something else: software engineer Mark S. Kolich created a class, JumpToLine, which uses the Apache IO library to jump to a given line. It can also report the last line it has read, so this is really what I need.
There are some examples on his homepage for those interested.
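
The JumpToLine class itself is not reproduced here. As a rough stand-in that uses only the JDK, the same line-based idea can be sketched as follows: remember how many lines were already processed and skip that many when resuming. The class `LineResume` and the method `processFrom` are hypothetical names, and the sketch assumes the file does not change between pause and resume.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper: resumes by line number instead of byte offset.
class LineResume {

    /**
     * Skips the first startLine lines, then collects up to maxLines lines into out.
     * Returns the index of the next unprocessed line, to be used as startLine
     * for the following call.
     */
    static long processFrom(Path file, long startLine, long maxLines, List<String> out)
            throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            // Skip the lines that were handled before the pause.
            for (long i = 0; i < startLine; i++) {
                if (reader.readLine() == null) {
                    return i; // file is shorter than expected
                }
            }
            long lineNo = startLine;
            String line;
            while (lineNo - startLine < maxLines && (line = reader.readLine()) != null) {
                out.add(line); // stand-in for the real matching and database write
                lineNo++;
            }
            return lineNo;
        }
    }
}
```

A caller would store the returned value when pausing and pass it back as `startLine` on resume. Skipping re-reads the file from the beginning, so resuming costs time proportional to the number of lines already processed, but no line is lost or handled twice.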
