Java file I/O help
I have a problem with my code. I need to perform several operations on a log file with this structure:
190.12.1.100 2011-03-02 12:12 test.html
190.12.1.100 2011-03-03 13:18 data.html
128.33.100.1 2011-03-03 15:25 test.html
128.33.100.1 2011-03-04 18:30 info.html
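I need to get the number of visits per month, the number of visits per page, and the number of unique visitors based on IP. That part is not the problem; I managed to get all three operations working. The problem is that only the first operation I choose runs correctly, while every operation after it returns only 0 values, as if the file were empty, so I guess I made a mistake with the I/O somewhere. Here is the code: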
import java.io.*;
import java.util.*;

public class WebServerAnalyzer {
    private Map<String, Integer> hm1;
    private Map<String, Integer> hm2;
    private int[] months;
    private Scanner input;

    public WebServerAnalyzer() throws IOException {
        hm1 = new HashMap<String, Integer>();
        hm2 = new HashMap<String, Integer>();
        months = new int[12];
        for (int i = 0; i < 12; i++) {
            months[i] = 0;
        }
        File file = new File("webserver.log");
        try {
            input = new Scanner(file);
        } catch (FileNotFoundException fne) {
            input = null;
        }
    }

    public String nextLine() {
        String line = null;
        if (input != null && input.hasNextLine()) {
            line = input.nextLine();
        }
        return line;
    }

    public int getMonth(String line) {
        StringTokenizer tok = new StringTokenizer(line);
        if (tok.countTokens() == 4) {
            String ip = tok.nextToken();
            String date = tok.nextToken();
            String hour = tok.nextToken();
            String page = tok.nextToken();
            StringTokenizer dtok = new StringTokenizer(date, "-");
            if (dtok.countTokens() == 3) {
                String year = dtok.nextToken();
                String month = dtok.nextToken();
                String day = dtok.nextToken();
                int m = Integer.parseInt(month);
                return m;
            }
        }
        return -1;
    }

    public String getIP(String line) {
        StringTokenizer tok = new StringTokenizer(line);
        if (tok.countTokens() == 4) {
            String ip = tok.nextToken();
            String date = tok.nextToken();
            String hour = tok.nextToken();
            String page = tok.nextToken();
            StringTokenizer dtok = new StringTokenizer(date, "-");
            return ip;
        }
        return null;
    }

    public String getPage(String line) {
        StringTokenizer tok = new StringTokenizer(line);
        if (tok.countTokens() == 4) {
            String ip = tok.nextToken();
            String date = tok.nextToken();
            String hour = tok.nextToken();
            String page = tok.nextToken();
            StringTokenizer dtok = new StringTokenizer(date, "-");
            return page;
        }
        return null;
    }

    public void visitsPerMonth() {
        String line = null;
        do {
            line = nextLine();
            if (line != null) {
                int m = getMonth(line);
                if (m != -1) {
                    months[m - 1]++;
                }
            }
        } while (line != null);
        // Print the result
        String[] monthName = {"JAN ", "FEB ", "MAR ",
            "APR ", "MAY ", "JUN ", "JUL ", "AUG ", "SEP ",
            "OCT ", "NOV ", "DEC "};
        for (int i = 0; i < 12; i++) {
            System.out.println(monthName[i] + months[i]);
        }
    }

    public int count() throws IOException {
        InputStream is = new BufferedInputStream(new FileInputStream("webserver.log"));
        try {
            byte[] c = new byte[1024];
            int count = 0;
            int readChars = 0;
            while ((readChars = is.read(c)) != -1) {
                for (int i = 0; i < readChars; ++i) {
                    if (c[i] == '\n')
                        ++count;
                }
            }
            return count;
        } finally {
            is.close();
        }
    }

    public void UniqueIP() throws IOException {
        String line = null;
        for (int x = 0; x < count(); x++) {
            line = nextLine();
            if (line != null) {
                if (hm1.containsKey(getIP(line)) == false) {
                    hm1.put(getIP(line), 1);
                } else {
                    hm1.put(getIP(line), hm1.get(getIP(line)) + 1);
                }
            }
        }
        Set set = hm1.entrySet();
        Iterator i = set.iterator();
        System.out.println("\nNumber of unique visitors: " + hm1.size());
        while (i.hasNext()) {
            Map.Entry me = (Map.Entry) i.next();
            System.out.print(me.getKey() + " - ");
            System.out.println(me.getValue() + " visits");
        }
    }

    public void pageVisits() throws IOException {
        String line = null;
        for (int x = 0; x < count(); x++) {
            line = nextLine();
            if (line != null) {
                if (hm2.containsKey(getPage(line)) == false)
                    hm2.put(getPage(line), 1);
                else
                    hm2.put(getPage(line), hm2.get(getPage(line)) + 1);
            }
        }
        Set set = hm2.entrySet();
        Iterator i = set.iterator();
        System.out.println("\nNumber of pages visited: " + hm2.size());
        while (i.hasNext()) {
            Map.Entry me = (Map.Entry) i.next();
            System.out.print(me.getKey() + " - ");
            System.out.println(me.getValue() + " visits");
        }
    }
}
Any help solving this would be much appreciated, as I am stuck.
Comments (2)
I haven't read the code thoroughly yet, but I guess you're not setting the read position back to the beginning of the file when you start a new operation, so nextLine() returns null.
You should create a new Scanner for each operation and close it afterwards. AFAIK Scanner doesn't provide a method to go back to the first byte.
I can also think of 3 alternatives:
1. Use a BufferedReader and call reset() for each new operation. This takes the reader back to the start only if you called mark() there first with a large enough read-ahead limit; without a prior mark(), reset() throws an IOException.
2. Read the file contents once and iterate over the lines in memory, i.e. put all lines into a List<String> and start again from the first line for each operation.
3. Read the file once, parse each line and construct an appropriate data structure that contains the data you need. For example, you could use a TreeMap<Date, Map<Page, Map<IPAddress, List<Visit>>>>, i.e. you'd store the visits per IP address per page for each date. You could then select the appropriate submaps by date, page and IP address.
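
The "read once, iterate in memory" alternative above could look roughly like the sketch below. The class name LogLines and the method countByField are made up for illustration, and it assumes the four-field, whitespace-separated line format from the question; it is not the asker's code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: holds the log lines in memory so every report
// can iterate them from the start, in any order.
class LogLines {
    private final List<String> lines;

    LogLines(List<String> lines) {
        this.lines = lines;
    }

    // Reads the whole file a single time; assumes the log fits in memory.
    static LogLines load(Path path) throws IOException {
        return new LogLines(Files.readAllLines(path));
    }

    // Counts how often each value occurs in the field at the given index
    // (0 = IP, 3 = page). Lines without exactly four fields are skipped.
    Map<String, Integer> countByField(int index) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length == 4) {
                counts.merge(parts[index], 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

With this shape, LogLines.load(Paths.get("webserver.log")).countByField(0) would give the visits per IP and countByField(3) the visits per page, and neither call exhausts anything the other one needs.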

The reset method of BufferedReader that Thomas recommended only works if you called mark first with a read-ahead limit large enough to cover everything read since then.
I would recommend reading through the file once and updating your maps and month array for each line. BTW, you don't need a Scanner just to read lines; BufferedReader has a readLine method itself.
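
A single pass with BufferedReader.readLine, as suggested here, might be sketched as follows. The class LogStats and its field names are hypothetical, and unlike the question's code this sketch skips a malformed line for all three counters at once.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.HashMap;
import java.util.Map;

// Hypothetical class: gathers all three statistics in one pass over the log.
class LogStats {
    final int[] months = new int[12];
    final Map<String, Integer> visitsPerIp = new HashMap<>();
    final Map<String, Integer> visitsPerPage = new HashMap<>();

    // Reads every line exactly once and closes the reader when done.
    static LogStats read(Reader source) throws IOException {
        LogStats stats = new LogStats();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                stats.record(line);
            }
        }
        return stats;
    }

    // Expects "ip yyyy-MM-dd HH:mm page"; anything else is ignored.
    private void record(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 4) {
            return;
        }
        String[] date = parts[1].split("-");
        if (date.length != 3) {
            return;
        }
        int month;
        try {
            month = Integer.parseInt(date[1]);
        } catch (NumberFormatException e) {
            return;
        }
        if (month < 1 || month > 12) {
            return;
        }
        months[month - 1]++;
        visitsPerIp.merge(parts[0], 1, Integer::sum);
        visitsPerPage.merge(parts[3], 1, Integer::sum);
    }
}
```

Calling LogStats.read(new FileReader("webserver.log")) once would then fill months, visitsPerIp and visitsPerPage together; visitsPerIp.size() is the number of unique visitors, and the reports can be printed in any order afterwards.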