C#: How to avoid filtering out the spaces in a directory path with a regular expression?

Posted on 2024-10-04 02:44:53

I have a program that tokenizes a log file line with a regular expression, splitting on both spaces (' ') and commas (',').

However, the directory path in the log line also contains spaces, so the path gets split as well. Could someone offer some advice on a regular expression I could use? Thanks!

*Please note that the spaces and commas are there because of the date, time and contents that have to be tokenized. Do not assume that I placed the spaces for fun and start downvoting, as someone did.

One such line of the log text file is:

Thu Mar 02 1995 21:31:00,2245107,m...,r/rrwxrwxrwx,0,0,8349-128-3,C:/Program Files/AccessData/AccessData Forensic Toolkit/Program/wordnet/Adj.dat

The program's output is:

Thu
Mar
02
1995
21:31:00
2245107
m...
r/rrwxrwxrwx
0
0
8349-128-3
C:/Program
Files/AccessData/AccessData
Forensic
Toolkit/Program/wordnet/Adj.dat

Therefore the path "C:/Program Files/AccessData/AccessData Forensic Toolkit/Program/wordnet/Adj.dat" is split apart, because the regular expression also matches the spaces inside it.

The program code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Diagnostics;
using System.IO;
using System.Text.RegularExpressions;

namespace Testing
{
    class Program
    {
        static void Main(string[] args)
        {
            // One line of the log file, kept on a single line so the literal compiles.
            String value = "Thu Mar 02 1995 21:31:00,2245107,m...,r/rrwxrwxrwx,0,0,8349-128-3,C:/Program Files/AccessData/AccessData Forensic Toolkit/Program/wordnet/Adj.dat";

            // Split on every whitespace character or comma.
            // The return value from Regex.Split is a string[] array.
            // This pattern is what also breaks the path apart at its spaces.
            String rex = @"[\s,]";

            String[] token = Regex.Split(value, rex);

            foreach (String line in token)
            {
                Console.WriteLine(line);
            }
        }
    }
}


Comments (2)

つ低調成傷 2024-10-11 02:45:09

Don't split on spaces, they are part of the values.

string value = "Thu Mar 02 1995 21:31:00,2245107,m...,r/rrwxrwxrwx,0,0,8349-128-3,C:/Program Files/AccessData/AccessData Forensic Toolkit/Program/wordnet/Adj.dat";
string[] token = value.Split(',');
foreach (String line in token) {
  Console.WriteLine(line);
}

If you want the components of the date as separate values, you can split that on spaces:

string[] dateComponents = token[0].Split(' ');
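
The two steps above can be combined to reproduce the token list from the question while keeping the path whole. This is only a sketch: the class name `LogLineTokenizer` and the method `Tokenize` are hypothetical, and it assumes the first comma-separated field is always the date and time.

```csharp
using System;
using System.Collections.Generic;

static class LogLineTokenizer
{
    // Hypothetical helper: splits a log line on commas, then splits only the
    // first field (assumed to be the date/time) on spaces.
    public static List<string> Tokenize(string line)
    {
        string[] fields = line.Split(',');
        var tokens = new List<string>();

        // First field, e.g. "Thu Mar 02 1995 21:31:00", becomes separate tokens.
        tokens.AddRange(fields[0].Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries));

        // Every remaining field is kept whole, spaces included.
        for (int i = 1; i < fields.Length; i++)
        {
            tokens.Add(fields[i]);
        }

        return tokens;
    }

    static void Main()
    {
        string value = "Thu Mar 02 1995 21:31:00,2245107,m...,r/rrwxrwxrwx,0,0,8349-128-3,C:/Program Files/AccessData/AccessData Forensic Toolkit/Program/wordnet/Adj.dat";

        foreach (string token in Tokenize(value))
        {
            Console.WriteLine(token);
        }
    }
}
```

For the sample line this yields 12 tokens: five from the date and time, six from the middle fields, and the path as a single last token.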

绻影浮沉 2024-10-11 02:45:09

If you have to do it in a single regex, and if the only place where you do want to split on spaces is the first item (i.e. the date string), then you can do

splitArray = Regex.Split(subjectString, @",|(?<=^[^,]*)\s+");

This regex splits either on a comma or on whitespace, but on whitespace only if no comma precedes it anywhere in the string.
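
As a quick check, the pattern can be run against the sample line from the question. This is a minimal sketch; the class name `SingleRegexSplit` and its `Split` wrapper are hypothetical, and only `Regex.Split` and the pattern itself come from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

static class SingleRegexSplit
{
    // Pattern from the answer: a comma, or whitespace that is preceded
    // only by non-comma characters back to the start of the string.
    public const string Pattern = @",|(?<=^[^,]*)\s+";

    // Hypothetical wrapper around Regex.Split for the pattern above.
    public static string[] Split(string line)
    {
        return Regex.Split(line, Pattern);
    }

    static void Main()
    {
        string subjectString = "Thu Mar 02 1995 21:31:00,2245107,m...,r/rrwxrwxrwx,0,0,8349-128-3,C:/Program Files/AccessData/AccessData Forensic Toolkit/Program/wordnet/Adj.dat";

        foreach (string token in Split(subjectString))
        {
            Console.WriteLine(token);
        }
    }
}
```

The spaces inside the path are left alone because commas precede them, so the lookbehind fails at those positions.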

Explanation:

,       # match a ,
|       # or
(?<=    # assert that it is possible to match the following before the current position:
 ^      # start of string
 [^,]*  # any number of characters except commas
)       # end of positive lookbehind assertion
\s+     # then match one or more whitespace characters

Be aware, though, that filenames might contain commas, too (they are legal there; whether they actually appear in your data is something only you can judge).
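
If commas in filenames turn out to be a concern, one possible workaround is to cap the number of fields so that everything after the seventh comma stays in the last field. The sketch below is not part of the answer: it swaps the regex for `String.Split` with a count, and it assumes every line has exactly eight fields with the path last, which is inferred from the single sample line only. The class name `PathSafeSplit`, the method `SplitFields`, and the sample path are hypothetical.

```csharp
using System;

static class PathSafeSplit
{
    // Assumption: each line has exactly 8 comma-separated fields, path last.
    const int FieldCount = 8;

    // Hypothetical helper: splits on commas but returns at most FieldCount
    // fields, so any further commas remain inside the final (path) field.
    public static string[] SplitFields(string line)
    {
        return line.Split(new[] { ',' }, FieldCount);
    }

    static void Main()
    {
        // Made-up line whose file name contains a comma.
        string value = "Thu Mar 02 1995 21:31:00,2245107,m...,r/rrwxrwxrwx,0,0,8349-128-3,C:/Data/report,final.dat";

        foreach (string field in SplitFields(value))
        {
            Console.WriteLine(field);
        }
    }
}
```

The date field can then be split on spaces separately, as in the other answer.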
