Finding specific words that may contain RTF tags inside them
I am currently writing a program that formats certain words in a Rich Text document by putting RTF tags around them.
As the program allows formatting rules to overlap (e.g. colouring the phrase "please help me" all in yellow and colouring "please" in blue), it has difficulty finding matches that have already been formatted by a previous rule: in that example, "please help me" becomes "\cf1 please\cf0 help me", which no longer matches the other rule.
I have been getting around this by using regular expressions and putting an expression that matches any RTF tag after each character in the phrase (as the rules are defined by the user, and I don't know the specific point of overlap), like this:
line = Regex.Replace(line, @"\bP(?:\\[^ ]* )*l(?:\\[^ ]* )*e(?:\\[^ ]* )*a(?:\\[^ ]* )*s(?:\\[^ ]* )*e(?:\\[^ ]* )*", Evaluator);
Each '(?:\\[^ ]* )*' group in the above pattern matches any run of RTF tags that follows the preceding character, so the word is found regardless of where tags have been inserted. However, doing this for every rule slows down the code drastically and requires me to generate a regular expression for each rule, which may not work as expected depending on the rule.
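For reference, generating such a pattern from a user-defined phrase might look roughly like the sketch below. The class and method names (TagTolerantPattern, Build) are made up for illustration and are not part of my actual program; the tag sub-expression is the same one used in the pattern above, and it assumes every tag has the form backslash, non-space characters, one space.

```csharp
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper: builds the tag-tolerant pattern for one rule.
static class TagTolerantPattern
{
    // Zero or more tags of the assumed form "\word " (backslash, non-spaces, one space).
    private const string TagRun = @"(?:\\[^ ]* )*";

    public static string Build(string phrase)
    {
        var sb = new StringBuilder(@"\b");
        foreach (char c in phrase)
        {
            // Escape each character so regex metacharacters in a
            // user-defined phrase are matched literally.
            sb.Append(Regex.Escape(c.ToString()));
            // Allow any run of tags after this character.
            sb.Append(TagRun);
        }
        return sb.ToString();
    }
}
```

TagTolerantPattern.Build("Please") produces exactly the hand-written pattern shown above, and that pattern matches the "Please" in @"\cf1 Pl\cf2 ease\cf0 help me". Note that the trailing tag group also swallows any tag that directly follows the word.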
Err, sorry for the wall of text, I'll get to the point now. Does anyone know a more efficient way of finding a word that has RTF tags inside it at an unknown point?
Comments (1)
One possibility is to never allow overlap of similar tags - if you need to start formatting the text in red but it's already being formatted in blue, close the blue tag before you start the red tag. Otherwise I'd imagine that the formatting rules could get quite ambiguous.
Edit
If you absolutely need to have overlap, then create a stack of formatting. End the old tag as I said above, but save it to the stack. Start the new tag, and when your new tag is done, if the old formatting was not closed (and is still in the stack) start the old formatting again. Whenever you close a tag you remove it from the stack.
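A minimal sketch of that stack idea, assuming colours are the only formatting involved and that "\cf0 " can be treated as the closing tag (as in the question's example). ColourFormatStack and its methods are hypothetical names, not part of any RTF library, and the sketch only handles ranges that nest properly (e.g. "please" inside "please help me").

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: tracks which colour tags are currently open.
sealed class ColourFormatStack
{
    private readonly Stack<int> _active = new Stack<int>();

    // Start a new colour. If another colour is active, end it first,
    // but leave it on the stack so it can be resumed later.
    public string Open(int colourIndex)
    {
        var sb = new StringBuilder();
        if (_active.Count > 0)
            sb.Append(@"\cf0 ");
        _active.Push(colourIndex);
        sb.Append(@"\cf").Append(colourIndex).Append(' ');
        return sb.ToString();
    }

    // Close the current colour, remove it from the stack, and restart
    // the colour beneath it if one is still open.
    public string Close()
    {
        if (_active.Count == 0)
            return string.Empty;
        _active.Pop();
        var sb = new StringBuilder(@"\cf0 ");
        if (_active.Count > 0)
            sb.Append(@"\cf").Append(_active.Peek()).Append(' ');
        return sb.ToString();
    }
}
```

With colour 2 as yellow and colour 1 as blue (assumed colour table indices), the example from the question could be emitted as f.Open(2) + f.Open(1) + "please" + f.Close() + " help me" + f.Close(). Because the outer colour is explicitly closed and restarted around the inner one, no two colour ranges are ever open at the same time in the output.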