Developing a regular expression for my needs

Posted 2025-01-05 06:08:40 · 1952 words · 0 views · 0 comments

I'm really bad with regular expressions and find them too complex. However, I need to use them to do some string manipulation in classic ASP.

Input String :

"James John Junior 

S.D. Industrial Corpn  
D-2341, Focal Point, Phase 4-a, 
Sarsona, Penns
Japan
Phone : 92-161-4633248 Fax : 92-161-253214
email : [email protected]"

Desired Output string:

"JXXXX JXXX JXXXXX 

S.X. IXXXXXXXXX CXXXX  
D-XXXX, FXXXX PXXXX, PXXXX 4-X, 
SXXXXXX, PXXXX
JXXXX
PXXXX : 9X-XXX-XXXXXXX Fax : 9X-XXX-XXXXXX
eXXXX : [email protected]"

Note: We need to split the original string into words on single spaces. Then, within each word, we need to replace every letter (lower and upper case) and digit, except the first character of the word, with an "X".

I know it's sort of difficult, but I would think a seasoned RegEx expert could nail this pretty easily. No?

Edit:

I've made some progress. I found a function (http://www.addedbytes.com/lab/vbscript-regular-expressions/) that sort of does the job, but it needs a little refinement, if anyone can help.

Function ereg_replace(strOriginalString, strPattern, strReplacement, varIgnoreCase)
    ' Function replaces pattern with replacement
    ' varIgnoreCase must be TRUE (match is case insensitive) or FALSE (match is case sensitive)
    Dim objRegExp : Set objRegExp = New RegExp
    With objRegExp
        .Pattern = strPattern
        .IgnoreCase = varIgnoreCase
        .Global = True
    End With
    ereg_replace = objRegExp.Replace(strOriginalString, strReplacement)
    Set objRegExp = Nothing
End Function

I'm calling it like so -

orgstr = ereg_replace(orgstr, "\w", "X", True)

However, the result looks like -

XXXXX XXXXXXXX

XXXXXXXX XXXXXXXX XXX.
XX, XXXXX XXXX, XXXXXX XXXXXX, XXXXXXX XXXXXXX, XXXXXXXXX
XXXXX : XXX-XXX-XXXX
XXX :
XXXXX : [email protected]

I'd like this to show the first character in every word. Any help out there?

Comments (4)

小霸王臭丫头 2025-01-12 06:08:40

This approach gets close:

Function AnonymiseWord(m, p, s)

   AnonymiseWord = Left(m, 1) & String(Len(m) - 1, "X")

End Function 


Function AnonymiseText(input)

    Dim rgx: Set rgx = new RegExp
    rgx.Global = True
    rgx.Pattern = "\b\w+?\b"

    AnonymiseText = rgx.Replace(input, GetRef("AnonymiseWord"))

End Function

This might get you close enough to what you need. Otherwise, the basic approach is sound, but you may need to fiddle with the pattern so that it matches exactly the stretches of text you want to put through AnonymiseWord.
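As one possible way of fiddling with the pattern, here is a minimal sketch that matches whole whitespace-delimited tokens with \S+ instead of \b\w+?\b, so that "D-2341," or "S.D." goes through the callback as a single word. The names MaskToken and MaskText are hypothetical, and the choice of [A-Za-z0-9] as the set of characters to mask is an assumption based on the rule stated in the question. It relies on the same GetRef callback technique as above.

```vbscript
' Callback for RegExp.Replace: m = matched token, p = position, s = source string.
' Keeps the first character and masks every ASCII letter/digit after it.
Function MaskToken(m, p, s)
    Dim inner: Set inner = New RegExp
    inner.Global = True
    inner.Pattern = "[A-Za-z0-9]"   ' assumption: only ASCII letters and digits are masked
    MaskToken = Left(m, 1) & inner.Replace(Mid(m, 2), "X")
End Function

' Hypothetical wrapper: runs MaskToken over every run of non-whitespace characters.
Function MaskText(text)
    Dim rgx: Set rgx = New RegExp
    rgx.Global = True
    rgx.Pattern = "\S+"             ' one token = one whitespace-delimited word
    MaskText = rgx.Replace(text, GetRef("MaskToken"))
End Function
```

Because whitespace is never part of a match, spaces and line breaks pass through untouched. Note that every word is treated the same way, so "Fax" would come out as "FXX" rather than staying as written in the desired output.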

成熟的代价 2025-01-12 06:08:40

Well, in .NET it would be easy:

resultString = Regex.Replace(subjectString, 
    @"(?<=         # Assert that there is before the current position...
     \b            # a word boundary
     \w            # one alphanumeric character (= first letter/digit/underscore)
     [\w.@-]*      # any number of alnum characters or ., @ or -
    )              # End of lookbehind
    [\p{L}\p{N}]   # Match any letter or digit to be replaced", 
    "X", RegexOptions.Multiline | RegexOptions.IgnorePatternWhitespace);

The result, though, would be slightly different than what you wrote:

"JXXXX JXXX JXXXXX 

S.X. IXXXXXXXXX CXXXX  
D-XXXX, FXXXX PXXXX, PXXXX 4-X, 
SXXXXXX, PXXXX
JXXXX
PXXXX : 9X-XXX-XXXXXXX FXX : 9X-XXX-XXXXXX
eXXXX : [email protected]"

(observe that Fax has also been changed to FXX)

Without .NET, you could try something like

orgstr = ereg_replace(orgstr, "\b(\w)[\w.@-]*", "$1XXXX", True) ' VBScript's RegExp uses $1 (not \1) in the replacement; string literals need no doubled backslashes

which would give you

"JXXXX JXXXX JXXXX 

SXXXX IXXXX CXXXX  
DXXXX, FXXXX PXXXX, PXXXX 4XXXX, 
SXXXX, PXXXX
JXXXX
PXXXX : 9XXXX FXXXX : 9XXXX
eXXXX : sXXXX"

You won't get it better than that with a single regex.

满地尘埃落定 2025-01-12 06:08:40

I have no idea about classic ASP, but if it does support (negative) lookbehinds and the only problem is the quantifier in the lookbehind, then why not turn it around and do it this way:

(?<!^)(?<!\s)[a-zA-Z0-9]

and replace with "X".

This means: replace every letter and digit unless it is preceded by whitespace or sits at the start of the string/line.

See it here on Regexr

一腔孤↑勇 2025-01-12 06:08:40

Although I love regular expressions, you could do it without them, especially because VBScript does not support lookbehind.

Dim mystring, myArray, newString, i, j
Const forbiddenChars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
myString = "James John Junior   S.D. Industrial Corpn   D-2341, Focal Point, Phase 4-a,  Sarsona, Penns Japan Phone : 92-161-4633248 Fax : 92-161-253214 email : [email protected]"
myArray = split(myString, " ")

For i = lbound(myArray) to ubound(myArray)
    newString = left(myArray(i), 1)
    For j = 2 to len(myArray(i))
        If instr(forbiddenChars, mid(myArray(i), j, 1)) > 0 Then
            newString = newString & "X"
        else
            newString = newString & mid(myArray(i), j, 1)
        End If
    Next
    myArray(i) = newString
Next

myString = join(myArray, " ")

It doesn't cope with the VbNewLine character, but you will get the idea. For example, you can do an extra split on the VbNewLine character, iterate through all the elements, and split each element on the space.
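The extra split could look roughly like the sketch below. The function names MaskWord and MaskLines are hypothetical, and it assumes the input separates lines with vbNewLine (CR+LF); text that uses bare line feeds would need a different delimiter.

```vbscript
' Hypothetical helper: keeps the first character of a word and replaces
' every later letter or digit with "X"; other characters are copied as-is.
Function MaskWord(word)
    Const maskable = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
    Dim j, ch, result
    result = Left(word, 1)
    For j = 2 To Len(word)
        ch = Mid(word, j, 1)
        If InStr(maskable, ch) > 0 Then
            result = result & "X"
        Else
            result = result & ch
        End If
    Next
    MaskWord = result
End Function

' Hypothetical wrapper: splits on vbNewLine first, then on single spaces,
' masks each word, and joins everything back with the same delimiters.
Function MaskLines(text)
    Dim lines, words, i, k
    lines = Split(text, vbNewLine)      ' assumption: lines end in CR+LF
    For i = LBound(lines) To UBound(lines)
        words = Split(lines(i), " ")
        For k = LBound(words) To UBound(words)
            words(k) = MaskWord(words(k))
        Next
        lines(i) = Join(words, " ")
    Next
    MaskLines = Join(lines, vbNewLine)
End Function
```

Empty lines and runs of consecutive spaces survive the round trip, because Split yields empty elements for them and Join puts the delimiters back.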
