C# programming: how do I avoid splitting on the spaces in a directory path when using regular expressions?
I have a program that uses tokenizing and a regular expression to split a log file line on both spaces (' ') and commas (",").
However, the directory path inside each log line also contains spaces, so it gets split as well. Could someone please suggest a regular expression that I could use? Thanks!
*Please note that the SPACES and COMMAS are there because of the date, time and other contents that have to be tokenized! DO NOT ASSUME THAT I PLACED THE SPACES FOR FUN and start giving negative votes, like someone did.
One such string line of the log text file would be:
Thu Mar 02 1995 21:31:00,2245107,m...,r/rrwxrwxrwx,0,0,8349-128-3,C:/Program Files/AccessData/AccessData Forensic Toolkit/Program/wordnet/Adj.dat
The output of the program would be:
Thu
Mar
02
1995
21:31:00
2245107
m...
r/rrwxrwxrwx
0
0
8349-128-3
C:/Program
Files/AccessData/AccessData
Forensic
Toolkit/Program/wordnet/Adj.dat
So "C:/Program Files/AccessData/AccessData Forensic Toolkit/Program/wordnet/Adj.dat" is broken apart because the regular expression also splits on the spaces inside the path.
The program code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Diagnostics;
using System.IO;
using System.Text.RegularExpressions;

namespace Testing
{
    class Program
    {
        static void Main(string[] args)
        {
            String value = "Thu Mar 02 1995 21:31:00,2245107,m...,r/rrwxrwxrwx,0,0,8349-128-3,C:/Program Files/AccessData/AccessData Forensic Toolkit/Program/wordnet/Adj.dat";
            //
            // Split the string on every whitespace character and every comma.
            // ... The return value from Split is a string[] array.
            //
            //foreach (String r in lines)
            //{
            String rex = @"[\s,]";
            String[] token = Regex.Split(value, rex);
            foreach (String line in token)
            {
                Console.WriteLine(line);
            }
            //}
        }
    }
}
Comments (2)
Don't split on spaces; they are part of the values. Split on commas only.
If you want the components of the date as separate values, you can split that on spaces:
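A minimal sketch of that two-step approach, assuming the date is always the first comma-separated field. The class and method names (`LogLineSplitter`, `Tokenize`) are hypothetical and only serve the illustration:

```csharp
using System;
using System.Collections.Generic;

public static class LogLineSplitter
{
    // Hypothetical helper: split on commas only, then break just the
    // first field (assumed to be the date/time) apart on spaces.
    public static List<string> Tokenize(string line)
    {
        string[] fields = line.Split(',');
        var tokens = new List<string>();

        // Only the first field is split on spaces.
        tokens.AddRange(fields[0].Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries));

        // Every other field is kept whole, including the spaces in the path.
        for (int i = 1; i < fields.Length; i++)
        {
            tokens.Add(fields[i]);
        }
        return tokens;
    }

    public static void Main()
    {
        string value = "Thu Mar 02 1995 21:31:00,2245107,m...,r/rrwxrwxrwx,0,0,8349-128-3,C:/Program Files/AccessData/AccessData Forensic Toolkit/Program/wordnet/Adj.dat";
        foreach (string token in Tokenize(value))
        {
            Console.WriteLine(token);
        }
    }
}
```

With the sample line from the question, the path comes out as a single token because spaces are only treated as separators inside the first field.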
If you have to do it in a single regex, and if the only place where you do want to split on spaces is the first item (i.e. the date string), then it can be done with one pattern.
That regex splits either on a comma or on whitespace, but on whitespace only if no comma occurs anywhere before it in the string.
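One pattern that fits this description is sketched below. It is an assumption, not necessarily the exact pattern the answer used, and it relies on .NET allowing variable-length lookbehind. The class and method names are hypothetical:

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookbehindSplitDemo
{
    // Assumed pattern (illustration only):
    //   (?<=^[^,]*)\s+   whitespace preceded only by non-comma characters
    //                    back to the start of the string
    //   |,               or a comma
    private const string Pattern = @"(?<=^[^,]*)\s+|,";

    public static string[] Tokenize(string line)
    {
        return Regex.Split(line, Pattern);
    }

    public static void Main()
    {
        string value = "Thu Mar 02 1995 21:31:00,2245107,m...,r/rrwxrwxrwx,0,0,8349-128-3,C:/Program Files/AccessData/AccessData Forensic Toolkit/Program/wordnet/Adj.dat";
        foreach (string token in Tokenize(value))
        {
            Console.WriteLine(token);
        }
    }
}
```

Because the lookbehind fails as soon as a comma has been seen, spaces are only split inside the first field, and the path stays in one piece.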
Explanation:
\s+ # then match one or more whitespace characters
Be aware, though, that file names might contain commas, too (they are legal there; whether they actually appear in your data is something only you can judge).
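If commas in file names are a concern, one possible guard is to cap the number of fields when splitting, so that everything after the seventh comma stays in the last field. This is only a sketch and assumes every record has exactly eight comma-separated fields with the path last, as in the sample line. `LimitedSplitDemo` and `FieldCount` are hypothetical names:

```csharp
using System;

public static class LimitedSplitDemo
{
    // Assumption: eight comma-separated fields per record, path last.
    private const int FieldCount = 8;

    public static string[] SplitFields(string line)
    {
        // The count argument stops splitting once FieldCount substrings
        // exist, so later commas remain part of the final field.
        return line.Split(new[] { ',' }, FieldCount);
    }

    public static void Main()
    {
        string value = "Thu Mar 02 1995 21:31:00,2245107,m...,r/rrwxrwxrwx,0,0,8349-128-3,C:/Some Folder/name, with comma.dat";
        foreach (string field in SplitFields(value))
        {
            Console.WriteLine(field);
        }
    }
}
```

The date field can then be split on spaces separately. The path in `Main` is made-up sample data that only demonstrates the comma case.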