Assigning numbers to entries in a file in C#
Improved Formatting
So, none of the previous approaches I used helped me cluster my log file data :(
Now I'm going to try an indexing approach, for which I need to index each log file entry based on the keyword that appears in the URL field.
Example:
192.162.1.4 [3/May/2009 00:34:45] "GET /books/casual/4534.pdf" 200 454353 "http://ljdhjg.com" "Mozillablahblah"
190.100.1.4 [3/May/2009 00:37:45] "GET /resources/help.pdf" 200 4353 "http://ljdhjg.com" "Mozillablahblah"
192.162.1.4 [3/May/2009 00:40:45] "GET /books/serious/44.pdf" 200 234353 "http://ljdhjg.com" "Mozillablahblah"
...and I have thousands more entries like this.
Now every "books" entry needs to be assigned a number, 1 (say), and next, every "resources" entry needs to be assigned 2. How do I go about accomplishing this in C#? I mean, I know the logic:
extract the keyword, assign a number, compare the keyword array with each line of the file, and if there is a match, assign. But since I'm new to C#, I don't really know how to code that logic. So... help?
Comments (1)
You can try this ad hoc approach to assign the numbers (I am assuming this means prefixing the index to the log entry).
GetIndexedEntry() is called with logEntry, the string representing each entry in the log file. For a logEntry of
192.162.1.4 [3/May/2009 00:34:45] "GET /books/casual/4534.pdf" 200 454353 "http://ljdhjg.com" "Mozillablahblah"
the indexedLogEntry would be
1 : 192.162.1.4 [3/May/2009 00:34:45] "GET /books/casual/4534.pdf" 200 454353 "http://ljdhjg.com" "Mozillablahblah"
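The body of GetIndexedEntry() is not shown here, so the following is only a minimal sketch of what such an ad hoc method might look like. It assumes the request is the first quoted field of the entry and that the keyword is the first path segment of the URL. The class name LogIndexer, the helper GetIndex(), the keyword table, and the fallback index 0 for unknown keywords are all hypothetical choices made for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class LogIndexer
{
    // Hypothetical keyword-to-number table; extend it with the keywords in your own logs.
    private static readonly Dictionary<string, int> KeywordIndex =
        new Dictionary<string, int>
        {
            { "books", 1 },
            { "resources", 2 }
        };

    // Returns the entry prefixed with "<index> : ".
    // Entries without a known keyword get index 0 (an assumption of this sketch).
    public static string GetIndexedEntry(string logEntry)
    {
        return GetIndex(logEntry) + " : " + logEntry;
    }

    public static int GetIndex(string logEntry)
    {
        // The request is the first quoted field, e.g. "GET /books/casual/4534.pdf".
        int open = logEntry.IndexOf('"');
        int close = open < 0 ? -1 : logEntry.IndexOf('"', open + 1);
        if (close < 0) return 0;

        string request = logEntry.Substring(open + 1, close - open - 1);
        string[] parts = request.Split(' ');
        if (parts.Length < 2) return 0;

        // The keyword is taken to be the first path segment of the URL.
        string[] segments = parts[1].Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries);
        if (segments.Length == 0) return 0;

        int index;
        return KeywordIndex.TryGetValue(segments[0], out index) ? index : 0;
    }
}
```

To process a whole file, you could loop over File.ReadLines(path) and pass each line to LogIndexer.GetIndexedEntry(). A fixed table like this keeps the numbering stable between runs, at the cost of having to list the keywords up front.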
A more elegant approach is possible using regular expressions.
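As a rough illustration of the regular-expression route, here is a sketch that captures the first path segment of the requested URL and assigns numbers in order of first appearance, so no keyword list is needed in advance. The class name RegexLogIndexer, the pattern, and the index 0 for lines that do not match are assumptions of this sketch, not part of the approach described above.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public class RegexLogIndexer
{
    // Captures the first directory of the requested URL,
    // e.g. "books" in "GET /books/casual/4534.pdf".
    private static readonly Regex KeywordPattern =
        new Regex("\"[A-Z]+ /([^/\\s\"]+)/");

    // Keywords seen so far, each mapped to the number it was given.
    private readonly Dictionary<string, int> _numbers = new Dictionary<string, int>();

    public string GetIndexedEntry(string logEntry)
    {
        Match match = KeywordPattern.Match(logEntry);
        if (!match.Success)
        {
            // No directory keyword found; 0 is a placeholder chosen for this sketch.
            return "0 : " + logEntry;
        }

        string keyword = match.Groups[1].Value;
        int number;
        if (!_numbers.TryGetValue(keyword, out number))
        {
            // First time this keyword appears: give it the next free number.
            number = _numbers.Count + 1;
            _numbers.Add(keyword, number);
        }
        return number + " : " + logEntry;
    }
}
```

With the three sample entries in the order shown in the question, "books" would be numbered 1 and "resources" 2. Because the numbers depend on the order in which keywords first appear, they can differ between log files; use a fixed table instead if the numbering must be stable.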