Splitting a string into parts

Posted on 2024-08-15 15:32:00

I need an easy way to split strings into named fields. For example, I have this string:

":JStoker!stoker@jcs.me.uk PRIVMSG #channel :test message"

and I need to split that into:

string nickname = "JStoker"
string ident = "stoker"
string host = "jcs.me.uk"
string channel = "#channel"
string message = "test message"

and I need to do it in such a way that if I get a string like

":irc.testnet.com PRIVMSG #channel :test message"

for instance, I get something like

string nickname = "irc.testnet.com"
string ident = ""
string host = ""
string channel = "#channel"
string message = "test message"

from the same code, without throwing an error. The strings I'm parsing change all the time; if you're familiar with it, this is raw IRC data. I just need to know how to parse the data efficiently.

This could possibly be done with a regex, but I'm not sure. Please help, with code examples if possible.

Comments (3)

情话墙 2024-08-22 15:32:01

What I do for IRC message splitting (in simple terms, as I don't remember the exact C# code) is:

  • Remove the first :
  • Split once on the next :, which gives you two elements: the last "message" parameter and everything else
  • Split the "everything else" on spaces, which gives you all the other parameters
  • Use a simple method to parse the nick string into its parts (two more splits should do it)

To me, this method is more apt than creating a regex for it, though I am unsure about the performance difference (I'd be willing to bet it doesn't really matter either way if you're just writing a client).
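
The split-based steps above could be sketched roughly as follows. This is only an illustration, not the answerer's actual code: the `IrcMessage` and `IrcSplitParser` names are made up for the example, and it assumes the line terminator has already been stripped. It also cuts at the first " :" (space plus colon) rather than a bare ":", which is an assumption made here so that a colon inside the prefix or a channel name is not mistaken for the start of the message.

```csharp
using System;

// Hypothetical container for the parsed fields (names are illustrative only).
public sealed class IrcMessage
{
    public string Nickname = "", Ident = "", Host = "", Command = "", Message = "";
    public string[] Params = new string[0];   // middle parameters, e.g. the channel
}

public static class IrcSplitParser
{
    public static IrcMessage Parse(string raw)
    {
        var msg = new IrcMessage();

        // Step 1: remove the first ':' (present only when the line has a prefix).
        bool hasPrefix = raw.StartsWith(":", StringComparison.Ordinal);
        string line = hasPrefix ? raw.Substring(1) : raw;

        // Step 2: cut at the first " :" to separate the trailing "message" parameter.
        int cut = line.IndexOf(" :", StringComparison.Ordinal);
        string head = cut >= 0 ? line.Substring(0, cut) : line;
        if (cut >= 0) msg.Message = line.Substring(cut + 2);

        // Step 3: split everything else on spaces.
        string[] words = head.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        int i = 0;
        string prefix = hasPrefix && words.Length > 0 ? words[i++] : "";
        if (i < words.Length) msg.Command = words[i++];
        msg.Params = new string[words.Length - i];
        Array.Copy(words, i, msg.Params, 0, msg.Params.Length);

        // Step 4: two more splits turn "nick!ident@host" into its parts.
        string[] nickAndRest = prefix.Split(new[] { '!' }, 2);
        msg.Nickname = nickAndRest[0];
        if (nickAndRest.Length == 2)
        {
            string[] identAndHost = nickAndRest[1].Split(new[] { '@' }, 2);
            msg.Ident = identAndHost[0];
            if (identAndHost.Length == 2) msg.Host = identAndHost[1];
        }
        return msg;
    }
}
```

For a PRIVMSG line, the channel would be `Params[0]`. A prefix without "!" (such as a server name) ends up whole in `Nickname`, with `Ident` and `Host` left empty, which is the behaviour the question asks for.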

Alternatively, you could do this:

  • Split the string on spaces
  • Walk through the resulting array and check whether an element after the first starts with : (the first element is the prefix, which also starts with :). If one does, join it and the following elements with spaces to get the full message string.

I'm not sure which is faster, but I believe the second is less elegant.
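
A minimal sketch of this second approach, again with a made-up name (`IrcTokenWalker`) and assuming the line terminator has already been removed. The first token is left untouched, so a leading ":" on it still tells the caller that the line carried a prefix.

```csharp
using System;
using System.Collections.Generic;

public static class IrcTokenWalker
{
    // Returns the space-separated tokens of a raw line, with the trailing
    // ':'-introduced parameter re-joined into a single token.
    public static List<string> Tokenize(string raw)
    {
        string[] parts = raw.Split(' ');
        var tokens = new List<string>();
        for (int i = 0; i < parts.Length; i++)
        {
            // Index 0 may be the ":prefix"; only a later ':' starts the trailing text.
            if (i > 0 && parts[i].StartsWith(":", StringComparison.Ordinal))
            {
                // Join this element and everything after it, then drop the ':'.
                tokens.Add(string.Join(" ", parts, i, parts.Length - i).Substring(1));
                break;
            }
            if (parts[i].Length > 0) tokens.Add(parts[i]);
        }
        return tokens;
    }
}
```

Because the split keeps empty entries, joining the tail with single spaces restores the message exactly as it was sent, including any runs of spaces inside it.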

These should work no matter which command you receive (and so can be used for generic parsing), but keep in mind that not all commands have an element that starts with :. For instance, the NICK command takes only a single word, which usually does not come prefixed with :, while other commands have several single-word parameters before the : (the USER command has three).

白首有我共你 2024-08-22 15:32:00

Yes, a regular expression like this should do it:

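^:([^\s!]+)(?:!([^\s@]+)@(\S+))? PRIVMSG (\S+) :(.*)$

Example:

using System.Text.RegularExpressions;

// If the line does not match, or the optional "!ident@host" part is absent,
// the corresponding Value is "" and no exception is thrown.
Match m = Regex.Match(input, @"^:([^\s!]+)(?:!([^\s@]+)@(\S+))? PRIVMSG (\S+) :(.*)$");
string nickname = m.Groups[1].Value;
string ident = m.Groups[2].Value;
string host = m.Groups[3].Value;
string channel = m.Groups[4].Value;
string message = m.Groups[5].Value;

Note: \w (letters, digits and _) is too narrow for these fields: it does not match the dots in a server name such as irc.testnet.com, nor characters such as - that nicknames and channel names may contain. The pattern therefore uses negated character classes; tighten them if you know exactly which characters each identifier may contain.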
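
As a rough sketch, a regex of this kind could be wrapped in a try-style helper so that callers can tell a non-PRIVMSG line apart from an empty field. The `PrivmsgRegexSketch` and `TryParse` names are invented for this example. The pattern uses negated character classes, an assumption chosen so that a dotted server name also matches, and the input is assumed to have its line terminator already stripped.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PrivmsgRegexSketch
{
    // Prefix is "nick" or "nick!ident@host"; the "!ident@host" part is optional.
    private static readonly Regex Pattern = new Regex(
        @"^:([^\s!]+)(?:!([^\s@]+)@(\S+))? PRIVMSG (\S+) :(.*)$",
        RegexOptions.Compiled);

    // Returns false (with all outputs empty) when the line is not a PRIVMSG.
    public static bool TryParse(string input, out string nickname, out string ident,
                                out string host, out string channel, out string message)
    {
        Match m = Pattern.Match(input);
        nickname = m.Groups[1].Value;   // "" when the match failed
        ident = m.Groups[2].Value;      // "" when the optional part did not take part
        host = m.Groups[3].Value;
        channel = m.Groups[4].Value;
        message = m.Groups[5].Value;
        return m.Success;
    }
}
```

Both strings from the question go through the same call; for the server-prefixed one, `ident` and `host` simply come back empty.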

走过海棠暮 2024-08-22 15:32:00

/^\:([^ \!]+)(?:\!([^ \@]+)\@([^ ]+))? PRIVMSG ([^ ]+) \:(.+)$/

$nick = $1
$ident = $2
$host = $3
$chan = $4
$message = $5

I escaped all the punctuation characters because which ones are special depends on the regexp engine. Unescape the ones that aren't special in whatever engine you use.
