SQL中从字符串中提取子字符串

发布于 2024-11-03 08:12:04 字数 523 浏览 1 评论 0原文

我需要提取由 ***[some text] 字符串包围的文本,如下例所示:

some text
some text
***[some text]
THIS SHOULD BE EXTRACTED
***[some text]
some text
some text
some text
some text
some text
***[some text]
THIS SHOULD BE EXTRACTED TOO
***[some text]
some text

输出应该是:

THIS SHOULD BE EXTRACTED
THIS SHOULD BE EXTRACTED TOO

我尝试了 PATINDEX 像这里一样,但找不到提取字符串的方法。

PATINDEX('%[*][*][*][[]%]%%[*][*][*][[]%]%',@Text)

我期待听到任何建议。

I need to extract a text that is surrounded by ***[some text] strings, like in the following example:

some text
some text
***[some text]
THIS SHOULD BE EXTRACTED
***[some text]
some text
some text
some text
some text
some text
***[some text]
THIS SHOULD BE EXTRACTED TOO
***[some text]
some text

the output should be:

THIS SHOULD BE EXTRACTED
THIS SHOULD BE EXTRACTED TOO

I tried PATINDEX like here, but couln't find the way to extract the string.

PATINDEX('%[*][*][*][[]%]%%[*][*][*][[]%]%',@Text)

I am looking forward to hearing any suggestions.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

羞稚 2024-11-10 08:12:04

对于评论中提出的更简单的情况,您可以这样做

;WITH T(C) AS
(
 SELECT '
    some text
    some text
    ***[some text 1]
    THIS SHOULD BE EXTRACTED
    ***[some text 2]
    some text
    some text
    some text
    some text
    some text
    ***[some text 1]
    THIS SHOULD BE EXTRACTED TOO
    ***[some text 2]
    some text'
)
SELECT col.value('.','varchar(max)')
FROM T
CROSS APPLY (SELECT CAST('<a keep="false">' + 
                        REPLACE(
                            REPLACE(C,'***[some text 2]','</a><a keep="false">'),
                        '***[some text 1]','</a><a keep="true">') + 
                    '</a>' AS xml) as xcol) x
CROSS APPLY xcol.nodes('/a[@keep="true"]') tab(col)

For the somewhat easier case raised in the comments you could do

;WITH T(C) AS
(
 SELECT '
    some text
    some text
    ***[some text 1]
    THIS SHOULD BE EXTRACTED
    ***[some text 2]
    some text
    some text
    some text
    some text
    some text
    ***[some text 1]
    THIS SHOULD BE EXTRACTED TOO
    ***[some text 2]
    some text'
)
SELECT col.value('.','varchar(max)')
FROM T
CROSS APPLY (SELECT CAST('<a keep="false">' + 
                        REPLACE(
                            REPLACE(C,'***[some text 2]','</a><a keep="false">'),
                        '***[some text 1]','</a><a keep="true">') + 
                    '</a>' AS xml) as xcol) x
CROSS APPLY xcol.nodes('/a[@keep="true"]') tab(col)
栀梦 2024-11-10 08:12:04

不是正则表达式解决方案,我仍然是 SQL 新手,因此可能不是最佳选择,但您应该能够使用

CHARINDEX 进行 WHILE 循环解析 * ** 然后以此为起点
CHARINDEXLF 使用它作为
的起点
SUBSTRING,结束点是下一个 ***
CHARINDEX
将子字符串连接到输出,移过结尾的 *** 并循环查找下一个。

我会尝试一下,看看是否可以添加一个示例。
编辑 - 这可能需要更多的错误检查

declare @inText nvarchar(2000) = 'some text 
some text 
***[some text] 
THIS SHOULD BE EXTRACTED 
***[some text] 
some text 
some text 
some text 
some text 
some text 
***[some text] 
THIS SHOULD BE EXTRACTED TOO 
***[some text] 
some text '

declare @delim1 nvarchar(50) = '***'
declare @delim2 char = char(10)
declare @output nvarchar(1000) = ''
declare @position int
declare @positionEnd int

set @position = CHARINDEX(@delim1,@inText)
while (@position != 0 and @position is not null)
BEGIN
  set @position = CHARINDEX(@delim2,@inText,@position)
  set @positionEnd = CHARINDEX(@delim1,@inText,@position)
  set @output = @output + SUBSTRING(@inText,@position,@positionEnd-@position)
  set @position = CHARINDEX(@delim1,@inText,@positionEnd+LEN(@delim1))
END
select @output

Not a regex solution and I'm still a SQL novice so may not be optimal but you should be able to parse with a WHILE loop using

CHARINDEX for the *** then using that as a starting point to
CHARINDEX to the LF Use that as the starting point for a
SUBSTRING with the ending point being a CHARINDEX of the next ***
concatenate the Substring to your output, move past the ending *** and loop to find the next one.

I'll play with it some and see if I can add an example.
EDIT - This probably needs more error checking

declare @inText nvarchar(2000) = 'some text 
some text 
***[some text] 
THIS SHOULD BE EXTRACTED 
***[some text] 
some text 
some text 
some text 
some text 
some text 
***[some text] 
THIS SHOULD BE EXTRACTED TOO 
***[some text] 
some text '

declare @delim1 nvarchar(50) = '***'
declare @delim2 char = char(10)
declare @output nvarchar(1000) = ''
declare @position int
declare @positionEnd int

set @position = CHARINDEX(@delim1,@inText)
while (@position != 0 and @position is not null)
BEGIN
  set @position = CHARINDEX(@delim2,@inText,@position)
  set @positionEnd = CHARINDEX(@delim1,@inText,@position)
  set @output = @output + SUBSTRING(@inText,@position,@positionEnd-@position)
  set @position = CHARINDEX(@delim1,@inText,@positionEnd+LEN(@delim1))
END
select @output
十雾 2024-11-10 08:12:04

你可以在我的博客中找到这个:
http://sql-tricks.blogspot.com/2011 /04/extract-strings-with-delimiters.html
这是纯粹的解决方案,没有额外的修改,只有分隔符序列应该被标记。

You can find this in my blog:
http://sql-tricks.blogspot.com/2011/04/extract-strings-with-delimiters.html
It is pure solution with no additional modification, only delimiters sequences should be decalred.

仅一夜美梦 2024-11-10 08:12:04

我可能是错的,但我不认为有一种干净的方法可以直接在 SQL 中执行此操作。我将使用 CLR 存储过程 并使用 C# 或中的正则表达式您选择的 .NET 语言。

请参阅本文(或本文)使用相关示例正则表达式。

I may be wrong but I don't think there's a clean way to do this directly in SQL. I would use a CLR stored procedure and use regular expressions from C# or your .NET language of choice.

See this article (or this article) for a relevant example using regexes.

乖不如嘢 2024-11-10 08:12:04

我相信您可以按照 http://www.codeproject.com/KB/mcpp/xpregex.aspx?q=use+sql+function+to+parse+text 解析您的 nvarchar 字段。我很久以前写过类似的东西。

I believe you can use the xp_regex_match as described in http://www.codeproject.com/KB/mcpp/xpregex.aspx?q=use+sql+function+to+parse+text to parse your nvarchar field. I wrote something similar quite a while back.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文