Java File I/O help

Posted on 2024-11-04 11:58:00

I have a problem with my code. I need to perform several operations on a log file with this structure:

190.12.1.100 2011-03-02 12:12 test.html  
190.12.1.100 2011-03-03 13:18 data.html  
128.33.100.1 2011-03-03 15:25 test.html  
128.33.100.1 2011-03-04 18:30 info.html

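I need to get the number of visits per month, the number of visits per page, and the number of unique visitors based on the IP. That part is not the question; I managed to get all three operations working. The problem is that only the first operation I choose runs correctly, while the others return only zeros afterwards, as if the file were empty, so I am guessing I made a mistake with the I/O somewhere. Here's the code: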

import java.io.*;
import java.util.*;

public class WebServerAnalyzer {

private Map<String, Integer> hm1;
private Map<String, Integer> hm2;
private int[] months;
private Scanner input;

public WebServerAnalyzer() throws IOException {
  hm1 = new HashMap<String, Integer>();
  hm2 = new HashMap<String, Integer>();
  months = new int[12];
  for (int i = 0; i < 12; i++) {
      months[i] = 0;
  }
  File file = new File("webserver.log");
  try {
      input = new Scanner(file);
  } catch (FileNotFoundException fne) {
      input = null;
  }
}

public String nextLine() {
  String line = null;
  if (input != null && input.hasNextLine()) {
    line = input.nextLine();
  }
  return line;
}

public int getMonth(String line) {
  StringTokenizer tok = new StringTokenizer(line);
  if (tok.countTokens() == 4) {
    String ip = tok.nextToken();
    String date = tok.nextToken();
    String hour = tok.nextToken();
    String page = tok.nextToken();
    StringTokenizer dtok = new StringTokenizer(date, "-");
    if (dtok.countTokens() == 3) {
      String year = dtok.nextToken();
      String month = dtok.nextToken();
      String day = dtok.nextToken();
      int m = Integer.parseInt(month);
        return m;
    }
  }
  return -1;
}

public String getIP(String line) {
  StringTokenizer tok = new StringTokenizer(line);
  if (tok.countTokens() == 4) {
    String ip = tok.nextToken();
    String date = tok.nextToken();
    String hour = tok.nextToken();
    String page = tok.nextToken();
    StringTokenizer dtok = new StringTokenizer(date, "-");
      return ip;
  }
  return null;
}

public String getPage(String line) {
  StringTokenizer tok = new StringTokenizer(line);
  if (tok.countTokens() == 4) {
    String ip = tok.nextToken();
    String date = tok.nextToken();
    String hour = tok.nextToken();
    String page = tok.nextToken();
    StringTokenizer dtok = new StringTokenizer(date, "-");
      return page;
  }
  return null;
}

public void visitsPerMonth() {
  String line = null;
  do {
    line = nextLine();
    if (line != null) {
      int m = getMonth(line);
      if (m != -1) {
        months[m - 1]++;
      }
    }
  } while (line != null);

  // Print the result
  String[] monthName = {"JAN ", "FEB ", "MAR ",
      "APR ", "MAY ", "JUN ", "JUL ", "AUG ", "SEP ",
      "OCT ", "NOV ", "DEC "};
  for (int i = 0; i < 12; i++) {
    System.out.println(monthName[i] + months[i]);
  }
}

public int count() throws IOException {
  InputStream is = new BufferedInputStream(new FileInputStream("webserver.log"));
  try {
    byte[] c = new byte[1024];
    int count = 0;
    int readChars = 0;
    while ((readChars = is.read(c)) != -1) {
      for (int i = 0; i < readChars; ++i) {
        if (c[i] == '\n')
          ++count;
      }
    }
    return count;
  } finally {
    is.close();
  }
}


public void UniqueIP() throws IOException{
  String line = null;
  for (int x = 0; x <count(); x++){
    line = nextLine();
    if (line != null) {
      if(hm1.containsKey(getIP(line)) == false) {
        hm1.put(getIP(line), 1);
      } else {
        hm1.put(getIP(line), hm1.get(getIP(line)) +1 );
      }
    }
  }

  Set set = hm1.entrySet();
  Iterator i = set.iterator();
  System.out.println("\nNumber of unique visitors: " + hm1.size());
  while(i.hasNext()) {
    Map.Entry me = (Map.Entry)i.next();
    System.out.print(me.getKey() + " - ");
    System.out.println(me.getValue() + " visits");
  }
}

public void pageVisits() throws IOException{
  String line = null;
  for (int x = 0; x <count(); x++){
    line = nextLine();
    if (line != null) {
      if(hm2.containsKey(getPage(line)) == false)
        hm2.put(getPage(line), 1);
      else
        hm2.put(getPage(line), hm2.get(getPage(line)) +1 );
    }
  }
  Set set = hm2.entrySet();
  Iterator i = set.iterator();
  System.out.println("\nNumber of pages visited: " + hm2.size());
  while(i.hasNext()) {
    Map.Entry me = (Map.Entry)i.next();
    System.out.print(me.getKey() + " - ");
    System.out.println(me.getValue() + " visits");
  }
}
}

Any help figuring out the problem would be much appreciated, as I am quite stuck.

逆光飞翔i 2024-11-11 11:58:00

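I haven't read the code thoroughly yet, but my guess is that you are not setting the read position back to the beginning of the file when you start a new operation, so nextLine() returns null.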

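You should create a new Scanner for each operation and close it afterwards. As far as I know, Scanner does not provide a method to go back to the first byte.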
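A minimal sketch of this effect, assuming an in-memory string in place of webserver.log so that it runs on its own (the class and method names are made up for illustration). A shared Scanner is exhausted after the first pass, while a fresh Scanner starts from the beginning again. With the real file, each pass would open `new Scanner(new File("webserver.log"))` instead.

```java
import java.util.Scanner;

class FreshScannerPerPass {
    // Sample data standing in for webserver.log (an assumption for this sketch).
    static final String LOG =
        "190.12.1.100 2011-03-02 12:12 test.html\n" +
        "190.12.1.100 2011-03-03 13:18 data.html\n" +
        "128.33.100.1 2011-03-03 15:25 test.html\n" +
        "128.33.100.1 2011-03-04 18:30 info.html\n";

    // Counts the lines that are still unread in the given scanner.
    static int countRemainingLines(Scanner in) {
        int n = 0;
        while (in.hasNextLine()) {
            in.nextLine();
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        Scanner shared = new Scanner(LOG);
        System.out.println(countRemainingLines(shared)); // first pass consumes all 4 lines
        System.out.println(countRemainingLines(shared)); // second pass finds nothing: 0
        shared.close();

        // A fresh Scanner per pass starts at the beginning again.
        try (Scanner fresh = new Scanner(LOG)) {
            System.out.println(countRemainingLines(fresh)); // 4
        }
    }
}
```

This mirrors the question: visitsPerMonth() drains the shared Scanner, so the later operations only ever see null from nextLine().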

At the moment I can think of three alternatives:

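Use a BufferedReader and call reset() for each new operation. This returns the reader to the position of the last mark(), so call mark() at the start of the file with a large enough read-ahead limit first; without any mark, reset() throws an IOException.
Read the file content once and iterate over the lines in memory, i.e. put all lines into a List and start from the first line for each operation.
Read the file once, parse each line, and build an appropriate data structure that contains the data you need. For example, you could use a nested TreeMap that stores the number of visits per IP address per page for each date. You could then select the appropriate submaps by date, page, and IP address.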
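The third alternative could look roughly like the sketch below. The class name `VisitIndex`, the key order (date, then page, then IP), and the use of plain strings as keys are all assumptions made for illustration; the answer does not spell out the exact type.

```java
import java.util.Map;
import java.util.TreeMap;

class VisitIndex {
    // date -> page -> IP -> number of visits (this nesting order is an assumption)
    final TreeMap<String, Map<String, Map<String, Integer>>> visits = new TreeMap<>();

    // Parses one "IP date time page" line and records it; malformed lines are skipped.
    void add(String line) {
        String[] f = line.trim().split("\\s+");
        if (f.length != 4) {
            return;
        }
        visits.computeIfAbsent(f[1], d -> new TreeMap<>())
              .computeIfAbsent(f[3], p -> new TreeMap<>())
              .merge(f[0], 1, Integer::sum);
    }

    // Total visits in [fromDate, toDate); yyyy-MM-dd strings sort chronologically.
    int visitsBetween(String fromDate, String toDate) {
        int total = 0;
        for (Map<String, Map<String, Integer>> pages : visits.subMap(fromDate, toDate).values()) {
            for (Map<String, Integer> ips : pages.values()) {
                for (int n : ips.values()) {
                    total += n;
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        VisitIndex index = new VisitIndex();
        index.add("190.12.1.100 2011-03-02 12:12 test.html");
        index.add("190.12.1.100 2011-03-03 13:18 data.html");
        index.add("128.33.100.1 2011-03-03 15:25 test.html");
        index.add("128.33.100.1 2011-03-04 18:30 info.html");
        System.out.println(index.visitsBetween("2011-03-01", "2011-04-01")); // 4
        System.out.println(index.visits.get("2011-03-03").keySet()); // [data.html, test.html]
    }
}
```

Because the outer map is sorted by date, a month is simply a subMap over a date range, and the file only has to be read once to answer all three questions.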


江湖彼岸 2024-11-11 11:58:00

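The reset method of BufferedReader that Thomas recommended only works after a call to mark, and only if the read-ahead limit passed to mark is large enough to cover everything read since.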
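A small sketch of how mark and reset behave, assuming a StringReader as a stand-in for the log file (the class and method names are hypothetical). It marks the start with a limit larger than the whole input, reads everything, resets, and reads again; it also shows that reset() on a reader that was never marked fails with an IOException.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

class MarkResetDemo {
    // Reads to the end of the stream and returns the number of lines seen.
    static int countLines(BufferedReader br) throws IOException {
        int n = 0;
        while (br.readLine() != null) {
            n++;
        }
        return n;
    }

    // Marks the start, counts all lines, resets, and counts them a second time.
    static int[] countTwice(String text) {
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            br.mark(text.length() + 1); // limit must exceed the characters read before reset()
            int first = countLines(br);
            br.reset();                 // back to the marked position
            int second = countLines(br);
            return new int[] {first, second};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns true if reset() succeeds on a reader that was never marked.
    static boolean resetWithoutMarkWorks(String text) {
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            countLines(br);
            br.reset();
            return true;
        } catch (IOException e) {
            return false; // reset() rejects a stream that has never been marked
        }
    }

    public static void main(String[] args) {
        String log = "a 2011-03-02 12:12 x.html\nb 2011-03-03 13:18 y.html\n";
        int[] counts = countTwice(log);
        System.out.println(counts[0] + " " + counts[1]); // 2 2
        System.out.println(resetWithoutMarkWorks(log));  // false
    }
}
```

For a real file the read-ahead limit would have to cover the whole file, which can make the reader buffer that many characters, so this approach is close to reading the file into memory anyway.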

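I would recommend reading through the file once and updating your maps and month array for each line. By the way, you don't need a Scanner just to read lines; BufferedReader has a readLine method itself.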

BufferedReader br = ...;
String line;
while (null != (line = br.readLine())) {
    String ip = getIP(line);
    String page = getPage(line);
    int month = getMonth(line);
    // update hashmaps and arrays
}
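The loop above could be fleshed out along the following lines. This is only a sketch under a few assumptions: it parses each line inline with split instead of calling the question's getIP, getPage, and getMonth helpers, it uses TreeMap so the printed order is stable, and it reads from a StringReader so that it runs without a file. The class name `SinglePassAnalyzer` is made up.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;

class SinglePassAnalyzer {
    final Map<String, Integer> visitsPerIp = new TreeMap<>();
    final Map<String, Integer> visitsPerPage = new TreeMap<>();
    final int[] months = new int[12];

    // Reads the source once and updates all three statistics for every valid line.
    void analyze(Reader source) {
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                String[] f = line.trim().split("\\s+");
                if (f.length != 4) continue;           // expect: IP date time page
                String[] date = f[1].split("-");
                if (date.length != 3) continue;        // expect: yyyy-MM-dd
                int month;
                try {
                    month = Integer.parseInt(date[1]);
                } catch (NumberFormatException e) {
                    continue;                          // skip lines with a non-numeric month
                }
                if (month < 1 || month > 12) continue;
                months[month - 1]++;
                visitsPerIp.merge(f[0], 1, Integer::sum);
                visitsPerPage.merge(f[3], 1, Integer::sum);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String log =
            "190.12.1.100 2011-03-02 12:12 test.html\n" +
            "190.12.1.100 2011-03-03 13:18 data.html\n" +
            "128.33.100.1 2011-03-03 15:25 test.html\n" +
            "128.33.100.1 2011-03-04 18:30 info.html\n";
        SinglePassAnalyzer a = new SinglePassAnalyzer();
        a.analyze(new StringReader(log)); // for the real file: new FileReader("webserver.log")
        System.out.println("MAR " + a.months[2]);                       // MAR 4
        System.out.println("Unique visitors: " + a.visitsPerIp.size()); // Unique visitors: 2
        System.out.println(a.visitsPerPage); // {data.html=1, info.html=1, test.html=2}
    }
}
```

Since every statistic is filled in during the same pass, the order in which the results are printed afterwards no longer matters.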
