Oracle REGEXP_SUBSTR to ignore an optional group
I am working with REGEXP_SUBSTR to get the following results in Oracle:
Test 1. Input: JOHN 10BC STUDENT Desired Output: JOHN
Test 2. Input: JOHN STUDENT Desired Output: JOHN
Test 3. Input: JOHN 10BC STU Desired Output: JOHN
Test 4. Input: JOHN 10BC TEACHER Desired Output: NULL
Test 5. Input: JOHN TEACHER Desired Output: NULL
Test 6. Input: MR JOHN 08BC STU Desired Output: MR JOHN
Test 7. Input: MR JOHN STUDENT Desired Output: MR JOHN
Test 8. Input: MR JOHN 07BC TEACHER Desired Output: NULL
Test 9. Input: MR STUART 06BC STUDENT Desired Output: MR STUART
Test 10. Input: MR STUART LEE 05BC STUDENT Desired Output: MR STUART LEE
Query:
Test 1:
select REGEXP_SUBSTR('JOHN 10BC STUDENT','(.*)(\s+.*BC)?\sSTU(DENT)?',1,1,'i',1) from dual;
Output: Failed. Returned JOHN 10BC instead of JOHN.
Test 2:
select REGEXP_SUBSTR('JOHN STUDENT','(.*)(\s+.*BC)?\sSTU(DENT)?',1,1,'i',1) from dual;
Output: Passed. Returned JOHN
I then modified the query by removing the ? after the second group:
Test 1:
select REGEXP_SUBSTR('JOHN 10BC STUDENT','(.*)(\s+.*BC)\sSTU(DENT)?',1,1,'i',1) from dual;
Output: Passed. Returned JOHN.
Test 2:
select REGEXP_SUBSTR('JOHN STUDENT','(.*)(\s+.*BC)\sSTU(DENT)?',1,1,'i',1) from dual;
Output: Failed. Returned NULL instead of JOHN.
How can I ignore the optional middle word ending in BC and always return the words before it as the substring for STU/STUDENT rows, using REGEXP_SUBSTR? Thanks in advance!
Comments (3)
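Use a non-greedy pattern for the any-character match: add a question mark after .* (giving .*?) to prevent it from grabbing the optional part (\s+.*BC)?. With the greedy (.*), the first group consumes JOHN 10BC and the optional group simply matches nothing, which is why Test 1 returned JOHN 10BC.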
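The change described above can be sketched as follows. This is only a minimal illustration: the inline sample rows, the CTE name t, and the aliases val and first_part are assumptions for the example, not the query from the original answer, and the results are worth checking on your own Oracle version.

```sql
-- Sketch: the question's pattern with the first group made non-greedy (.*?).
-- The CTE t, its column val and the alias first_part are hypothetical names.
with t (val) as (
  select 'JOHN 10BC STUDENT' from dual union all
  select 'JOHN STUDENT'      from dual union all
  select 'JOHN 10BC STU'     from dual union all
  select 'JOHN 10BC TEACHER' from dual
)
select val,
       -- arguments: start position 1, occurrence 1, case-insensitive, return group 1
       regexp_substr(val, '(.*?)(\s+.*BC)?\sSTU(DENT)?', 1, 1, 'i', 1) as first_part
from   t;
```

The only difference from the question's first query is the ? after .* in the first group; the remaining arguments are unchanged.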
UPD. To allow multiple words as the first part, you may use a regex built from the following parts.
Explanation:
(.+?) - matches any character one or more times, taking the shortest match (which allows the following optional group to be matched).
(\s+\S*BC)? - an optional group of zero or more non-whitespace (\S) characters followed by BC. The group must be separated from the preceding non-whitespace character by one or more whitespace characters. Using \S* instead of .* prevents multiple words from being included in this group.
\s+STU(DENT)?$ - the last part must be the word STUDENT or its abbreviation STU, and it must be the last word in the line ($ right after it).
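Putting the three parts from the explanation together gives the pattern (.+?)(\s+\S*BC)?\s+STU(DENT)?$. A minimal sketch of how it might be applied is below; the sample rows come from the question's test cases, while the CTE name, the column name val and the alias name_part are assumptions, since the original query did not survive in this copy.

```sql
-- Sketch: the three explained parts concatenated into one pattern.
-- CTE/column/alias names are hypothetical; rows are taken from the question.
with test_rows (val) as (
  select 'JOHN 10BC STUDENT'          from dual union all
  select 'MR JOHN 08BC STU'           from dual union all
  select 'MR JOHN STUDENT'            from dual union all
  select 'MR JOHN 07BC TEACHER'       from dual union all
  select 'MR STUART LEE 05BC STUDENT' from dual
)
select val,
       -- group 1 is the leading name part; rows not ending in STU/STUDENT give NULL
       regexp_substr(val, '(.+?)(\s+\S*BC)?\s+STU(DENT)?$', 1, 1, 'i', 1) as name_part
from   test_rows;
```

Because the pattern ends with $, a row such as MR JOHN 07BC TEACHER cannot match at all, so REGEXP_SUBSTR returns NULL for it.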
Do you really need to check what's in the middle? Perhaps you could ignore that and work with the rest of the requirement: check for STU or STUDENT (as your sample data suggest) and fetch data accordingly.
Here are two options (result_1 and result_2); see if either of them helps.
result_1 searches col for the STU string; if it is found, it returns the first word (from the beginning of col up to the first space character).
result_2 checks whether the last word (this time using regexp_substr, anchored to the end $ of col) is "STU" or "STUDENT" and, if so, returns the first word (anchored to the beginning ^ of col).
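The two options described above could look roughly like this. It is a reconstruction from the description only, not the answer's original query: the sample rows and the exact expressions are assumptions, and only the names col, result_1 and result_2 come from the text.

```sql
-- Sketch of the two described options; expressions are assumed, not the original code.
with test (col) as (
  select 'JOHN 10BC STUDENT' from dual union all
  select 'JOHN STUDENT'      from dual union all
  select 'JOHN 10BC TEACHER' from dual
)
select col,
       -- result_1: if STU occurs anywhere in col, return the text before the first space
       case when instr(col, 'STU') > 0
            then substr(col, 1, instr(col, ' ') - 1)
       end as result_1,
       -- result_2: if the last word is STU or STUDENT, return the first word
       case when regexp_substr(col, '\w+$') in ('STU', 'STUDENT')
            then regexp_substr(col, '^\w+')
       end as result_2
from   test;
```

Note that both options, as described, return only the first word, so an input like MR JOHN STUDENT would yield MR rather than MR JOHN. result_1 also looks for STU anywhere in the string, so a name such as STUART would satisfy its check even when the row ends in TEACHER.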
You do not need to use (slow) regular expressions and can use (much faster) simple string functions (using astentx's data).
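One way such a string-function approach might be written is sketched below. The answer's actual query is missing from this copy, so everything here (the CTE names, the columns val, head and name_part, and the LIKE/INSTR/SUBSTR logic) is an assumption meant only to illustrate the idea. It also assumes words are separated by single spaces with no trailing whitespace.

```sql
-- Sketch: the same extraction using only LIKE, INSTR and SUBSTR (no regex).
-- All names and the exact logic are hypothetical, not the original answer's code.
with test (val) as (
  select 'JOHN 10BC STUDENT'    from dual union all
  select 'MR JOHN 08BC STU'     from dual union all
  select 'MR JOHN STUDENT'      from dual union all
  select 'MR JOHN 07BC TEACHER' from dual
),
heads (val, head) as (
  -- keep the text before the last space only when the last word is STU or STUDENT
  select val,
         case
           when upper(val) like '% STU' or upper(val) like '% STUDENT'
           then substr(val, 1, instr(val, ' ', -1) - 1)
         end
  from   test
)
select val,
       case
         -- drop a trailing word ending in BC when another word precedes it
         when upper(head) like '% %BC'
         then substr(head, 1, instr(head, ' ', -1) - 1)
         else head
       end as name_part
from   heads;
```

INSTR with a negative start position searches backwards from the end of the string, which is how the last space is located in both steps. Rows that do not end in STU or STUDENT get a NULL head and therefore a NULL result.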