Java File I/O Help

Posted on 2024-11-04 11:58:00

I have a problem with my code. I need to perform several operations on a log file with this structure:

190.12.1.100 2011-03-02 12:12 test.html  
190.12.1.100 2011-03-03 13:18 data.html  
128.33.100.1 2011-03-03 15:25 test.html  
128.33.100.1 2011-03-04 18:30 info.html

I need to get the number of visits per month, the number of visits per page, and the number of unique visitors based on the IP. That part is not the question; I managed to get all three operations working. The problem is that only the first operation I choose runs correctly, while the others return values of 0 afterwards, as if the file were empty, so I am guessing I made a mistake with the I/O somewhere. Here's the code:

import java.io.*;
import java.util.*;

public class WebServerAnalyzer {

private Map<String, Integer> hm1;
private Map<String, Integer> hm2;
private int[] months;
private Scanner input;

public WebServerAnalyzer() throws IOException {
  hm1 = new HashMap<String, Integer>();
  hm2 = new HashMap<String, Integer>();
  months = new int[12];
  for (int i = 0; i < 12; i++) {
      months[i] = 0;
  }
  File file = new File("webserver.log");
  try {
      input = new Scanner(file);
  } catch (FileNotFoundException fne) {
      input = null;
  }
}

public String nextLine() {
  String line = null;
  if (input != null && input.hasNextLine()) {
    line = input.nextLine();
  }
  return line;
}

public int getMonth(String line) {
  StringTokenizer tok = new StringTokenizer(line);
  if (tok.countTokens() == 4) {
    String ip = tok.nextToken();
    String date = tok.nextToken();
    String hour = tok.nextToken();
    String page = tok.nextToken();
    StringTokenizer dtok = new StringTokenizer(date, "-");
    if (dtok.countTokens() == 3) {
      String year = dtok.nextToken();
      String month = dtok.nextToken();
      String day = dtok.nextToken();
      int m = Integer.parseInt(month);
        return m;
    }
  }
  return -1;
}

public String getIP(String line) {
  StringTokenizer tok = new StringTokenizer(line);
  if (tok.countTokens() == 4) {
    String ip = tok.nextToken();
    String date = tok.nextToken();
    String hour = tok.nextToken();
    String page = tok.nextToken();
    StringTokenizer dtok = new StringTokenizer(date, "-");
      return ip;
  }
  return null;
}

public String getPage(String line) {
  StringTokenizer tok = new StringTokenizer(line);
  if (tok.countTokens() == 4) {
    String ip = tok.nextToken();
    String date = tok.nextToken();
    String hour = tok.nextToken();
    String page = tok.nextToken();
    StringTokenizer dtok = new StringTokenizer(date, "-");
      return page;
  }
  return null;
}

public void visitsPerMonth() {
  String line = null;
  do {
    line = nextLine();
    if (line != null) {
      int m = getMonth(line);
      if (m != -1) {
        months[m - 1]++;
      }
    }
  } while (line != null);

  // Print the result
  String[] monthName = {"JAN ", "FEB ", "MAR ",
      "APR ", "MAY ", "JUN ", "JUL ", "AUG ", "SEP ",
      "OCT ", "NOV ", "DEC "};
  for (int i = 0; i < 12; i++) {
    System.out.println(monthName[i] + months[i]);
  }
}

public int count() throws IOException {
  InputStream is = new BufferedInputStream(new FileInputStream("webserver.log"));
  try {
    byte[] c = new byte[1024];
    int count = 0;
    int readChars = 0;
    while ((readChars = is.read(c)) != -1) {
      for (int i = 0; i < readChars; ++i) {
        if (c[i] == '\n')
          ++count;
      }
    }
    return count;
  } finally {
    is.close();
  }
}


public void UniqueIP() throws IOException{
  String line = null;
  for (int x = 0; x <count(); x++){
    line = nextLine();
    if (line != null) {
      if(hm1.containsKey(getIP(line)) == false) {
        hm1.put(getIP(line), 1);
      } else {
        hm1.put(getIP(line), hm1.get(getIP(line)) +1 );
      }
    }
  }

  Set set = hm1.entrySet();
  Iterator i = set.iterator();
  System.out.println("\nNumber of unique visitors: " + hm1.size());
  while(i.hasNext()) {
    Map.Entry me = (Map.Entry)i.next();
    System.out.print(me.getKey() + " - ");
    System.out.println(me.getValue() + " visits");
  }
}

public void pageVisits() throws IOException{
  String line = null;
  for (int x = 0; x <count(); x++){
    line = nextLine();
    if (line != null) {
      if(hm2.containsKey(getPage(line)) == false)
        hm2.put(getPage(line), 1);
      else
        hm2.put(getPage(line), hm2.get(getPage(line)) +1 );
    }
  }
  Set set = hm2.entrySet();
  Iterator i = set.iterator();
  System.out.println("\nNumber of pages visited: " + hm2.size());
  while(i.hasNext()) {
    Map.Entry me = (Map.Entry)i.next();
    System.out.print(me.getKey() + " - ");
    System.out.println(me.getValue() + " visits");
  }
}

} // end of class WebServerAnalyzer

Any help figuring out the problem would be much appreciated as I am quite stuck.

Comments (2)

逆光飞翔i 2024-11-11 11:58:00

I haven't read the code thoroughly yet, but I guess you're not setting the read position back to the beginning of the file when you start a new operation. The shared Scanner is already at the end of the file after the first operation, so nextLine() returns null.

You should create a new Scanner for each operation and close it afterwards. AFAIK Scanner doesn't provide a method to go back to the first byte.
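A minimal sketch of the "new Scanner per operation" idea, assuming the four-field log format from the question. The class name FreshScannerPerPass and the method countByField are made up for illustration and are not part of the original code:

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

class FreshScannerPerPass {

    // Counts how often each value of one field occurs (0 = IP, 3 = page).
    // Every call opens its own Scanner, so every pass starts at the first line.
    static Map<String, Integer> countByField(Path log, int fieldIndex) throws IOException {
        Map<String, Integer> counts = new HashMap<>();
        // try-with-resources closes the Scanner (and the file) when the pass ends
        try (Scanner in = new Scanner(log)) {
            while (in.hasNextLine()) {
                String[] fields = in.nextLine().trim().split("\\s+");
                if (fields.length == 4) { // skip blank or malformed lines
                    counts.merge(fields[fieldIndex], 1, Integer::sum);
                }
            }
        }
        return counts;
    }
}
```

Calling countByField(path, 0) and then countByField(path, 3) gives the per-IP and per-page counts independently, because neither call depends on where a previous pass stopped reading.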

Currently I could also think of 3 alternatives:

  1. Use a BufferedReader, call mark() right after opening it, and call reset() before each new operation. This returns the reader to the marked position, provided the read-ahead limit passed to mark() is large enough to cover the whole file.

  2. Read the file contents once and iterate over the lines in memory, i.e. put all lines into a List<String> and start each operation from the first element.

  3. Read the file once, parse each line and construct an appropriate data structure that contains the data you need. For example, you could use a TreeMap<Date, Map<Page, Map<IPAddress, List<Visit>>>>, i.e. you'd store the visits per IP address per page for each date. You could then select the appropriate submaps by date, page and IP address.
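Alternative 2 might look roughly like the sketch below. It assumes the log is small enough to fit in memory and that every line has the four fields shown in the question. InMemoryLog and its methods are hypothetical names, not from the original code:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class InMemoryLog {
    private final List<String> lines;

    InMemoryLog(List<String> lines) {
        this.lines = lines;
    }

    // Reads the whole file once; all later operations work on the list.
    static InMemoryLog load(Path log) throws IOException {
        return new InMemoryLog(Files.readAllLines(log));
    }

    // Index 0 is January. Lines without four fields or a valid month are skipped.
    int[] visitsPerMonth() {
        int[] months = new int[12];
        for (String line : lines) {
            String[] fields = line.trim().split("\\s+");
            if (fields.length != 4) continue;
            String[] date = fields[1].split("-");
            if (date.length != 3) continue;
            int month = Integer.parseInt(date[1]); // assumes a numeric month
            if (month >= 1 && month <= 12) months[month - 1]++;
        }
        return months;
    }

    // Each operation starts from the first list element again.
    Set<String> uniqueIps() {
        Set<String> ips = new HashSet<>();
        for (String line : lines) {
            String[] fields = line.trim().split("\\s+");
            if (fields.length == 4) ips.add(fields[0]);
        }
        return ips;
    }
}
```

Since a List can be iterated any number of times, the order in which the operations are called no longer matters.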

江湖彼岸 2024-11-11 11:58:00

The reset method of BufferedReader that Thomas recommended only works if you called mark beforehand with a large enough read-ahead limit; on a reader that was never marked, reset throws an IOException.
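The mark/reset behaviour can be checked with a small experiment. The sketch below uses a StringReader in place of the log file so it is self-contained. MarkResetDemo and its method names are invented for this example:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class MarkResetDemo {

    // Reads all lines twice from one reader by marking the start and resetting.
    // readAheadLimit must exceed the number of characters read before reset(),
    // otherwise the mark may be invalidated and reset() throws.
    static int[] countLinesTwice(Reader source, int readAheadLimit) throws IOException {
        try (BufferedReader br = new BufferedReader(source)) {
            br.mark(readAheadLimit);
            int first = countLines(br);
            br.reset(); // back to the marked position
            int second = countLines(br);
            return new int[] {first, second};
        }
    }

    static int countLines(BufferedReader br) throws IOException {
        int n = 0;
        while (br.readLine() != null) {
            n++;
        }
        return n;
    }

    // Returns true if reset() on a never-marked reader throws an IOException.
    static boolean resetWithoutMarkFails(Reader source) {
        try (BufferedReader br = new BufferedReader(source)) {
            br.readLine();
            br.reset();
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        String data = "a\nb\nc\n";
        int[] counts = countLinesTwice(new StringReader(data), data.length() + 1);
        System.out.println(counts[0] + " " + counts[1]);
        System.out.println(resetWithoutMarkFails(new StringReader(data)));
    }
}
```

Note that a read-ahead limit larger than the reader's buffer makes BufferedReader allocate a buffer of at least that size, so marking a whole large file costs about as much memory as reading it into a list.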

I would recommend reading through the file once and updating your maps and month array for each line. BTW, you don't need a Scanner just to read lines; BufferedReader has a readLine method itself.

BufferedReader br = ...;
String line;
while (null != (line = br.readLine())) {
    String ip = getIP(line);
    String page = getPage(line);
    int month = getMonth(line);
    // update hashmaps and arrays
}
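For illustration, the loop above could be fleshed out as follows. This is only a sketch: it parses each line inline instead of calling the question's getIP/getPage/getMonth helpers, and SinglePassAnalyzer and its field names are hypothetical:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.HashMap;
import java.util.Map;

class SinglePassAnalyzer {
    final Map<String, Integer> visitsByIp = new HashMap<>();
    final Map<String, Integer> visitsByPage = new HashMap<>();
    final int[] visitsByMonth = new int[12]; // index 0 is January

    // Reads the source once and updates all three statistics per line.
    void analyze(Reader source) throws IOException {
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                String[] fields = line.trim().split("\\s+");
                if (fields.length != 4) continue; // blank or malformed line
                String[] date = fields[1].split("-");
                if (date.length != 3) continue;
                int month;
                try {
                    month = Integer.parseInt(date[1]);
                } catch (NumberFormatException e) {
                    continue; // non-numeric month, skip the line
                }
                if (month < 1 || month > 12) continue;
                visitsByMonth[month - 1]++;
                visitsByIp.merge(fields[0], 1, Integer::sum);
                visitsByPage.merge(fields[3], 1, Integer::sum);
            }
        }
    }
}
```

It could be driven with something like new SinglePassAnalyzer().analyze(Files.newBufferedReader(Paths.get("webserver.log"))); after that, the three print routines only read the maps and the array and never touch the file again.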