SQL Server: parsing an XML string stored in a varchar column
I have a varchar column in a table that is used to store xml data. Yeah I know there is an xml data type that I should be using, but I think this was set up before the xml data type was available so a varchar is what I have to use for now. :)
The data stored looks similar to the following:
<xml filename="100100_456_484351864768.zip"
event_dt="10/5/2009 11:42:52 AM">
<info user="TestUser" />
</xml>
I need to parse the filename to get the digits between the two underscores which in this case would be "456". The first part of the file name "shouldn't" change in length, but the middle number will. I need a solution that would work if the first part does change in length (you know it will change because "shouldn't change" always seems to mean it will change).
For now, I'm using XQuery to pull out the filename because I figured this is probably better than straight string manipulation. I cast the string to xml to do this, but I'm not an XQuery expert, so of course I'm running into issues. I found an XQuery function (substring-before), but was unable to get it to work (I'm not even sure that function works in SQL Server). There might be an XQuery function that does this easily, but if there is, I am unaware of it.
So, I get the filename from the table with a query similar to the following:
select CAST(parms as xml).query('data(/xml/@filename)') as p
from Table1
From this I'd assume that I'd be able to CAST this back to a string then do some instring or charindex function to figure out where the underscores are so that I can encapsulate all of that in a substring function to pick out the part I need. Without going too far into this I am pretty sure that I can eventually get it done this way, but I know that there has to be an easier way. This way would make a huge unreadable field in the SQL Statement which even if I moved it to a function would still be confusing to try to figure out what is going on.
I'm sure there is an easier way than this, since it seems to be simple string manipulation. Perhaps someone can point me in the right direction. Thanks
Comments (3)
You can use XQuery for this - just change your statement to:
That gives you a VARCHAR(260) long enough to hold any valid file name and path - now you have a string and can work on it with SUBSTRING etc.
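A minimal sketch of such a statement, assuming the xml value() method is used with a varchar(260) target type. The column name parms comes from the question; the table variable @Table1 and its sample row are illustrative stand-ins for the real Table1.

```sql
-- Illustrative stand-in for Table1 from the question.
DECLARE @Table1 TABLE (parms varchar(1000));
INSERT INTO @Table1 (parms) VALUES
('<xml filename="100100_456_484351864768.zip" event_dt="10/5/2009 11:42:52 AM"><info user="TestUser" /></xml>');

-- value() needs a path that yields a single item, hence the [1],
-- and it returns the attribute as an ordinary varchar(260).
SELECT CAST(parms AS xml).value('(/xml/@filename)[1]', 'varchar(260)') AS filename
FROM @Table1;
```

Unlike query(), which returns an xml fragment, value() returns a scalar of the requested SQL type, so no second CAST is needed before applying string functions.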
Marc
The straightforward way to do this is with SUBSTRING and CHARINDEX. Assuming (wise or not) that the first part of the filename doesn't change length, but that you still want to use XQuery to locate the filename, here's a short repro that does what you want:
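A minimal sketch of such a repro, assuming the first part of the filename is six characters long, so that the middle number starts at position 8. The table variable @t, the CTE name T, and the sample row are illustrative.

```sql
DECLARE @t TABLE (parms varchar(1000));
INSERT INTO @t (parms) VALUES
('<xml filename="100100_456_484351864768.zip" event_dt="10/5/2009 11:42:52 AM"><info user="TestUser" /></xml>');

-- Step 1: use XQuery to pull the filename out as a plain string.
WITH T (fName) AS (
    SELECT CAST(parms AS xml).value('(/xml/@filename)[1]', 'varchar(260)')
    FROM @t
)
-- Step 2: start at position 8 and read up to the next underscore.
SELECT SUBSTRING(fName, 8, CHARINDEX('_', fName, 8) - 8) AS myNum
FROM T;
-- For the sample row this returns 456.
```

Putting the XQuery step in a common table expression keeps the SUBSTRING/CHARINDEX expression short enough to read.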
There are sneaky solutions that use other string functions like REPLACE and PARSENAME or REVERSE, but none is likely to be more efficient or readable. One possibility to consider is writing a CLR routine that brings regular expression handling into SQL.
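As an illustration of the REPLACE and PARSENAME idea, here is a sketch that assumes the filename has already been extracted into a variable (@fName is a hypothetical name) and that, once underscores are turned into dots, it has no more than four dot-separated parts.

```sql
DECLARE @fName varchar(260);
SET @fName = '100100_456_484351864768.zip';

-- REPLACE turns the name into '100100.456.484351864768.zip'.
-- PARSENAME numbers parts from the right (1 = 'zip'), so part 3 is the middle number.
SELECT PARSENAME(REPLACE(@fName, '_', '.'), 3) AS myNum;
-- For this value the result is 456.
```

This only works while the name has exactly this shape: PARSENAME returns NULL when the string has more than four parts, so an extra dot or underscore in the filename breaks it.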
By the way, if your xml is always this simple, there's no particular reason I can see to use XQuery at all. Here are two queries that will extract the number you want. The second is safer if you don't have control over extra white space in your xml string or over the possibility that the first part of the file name will change length:
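A sketch of two such queries working directly on the varchar column, with no XQuery. The table variable @t and the sample row are illustrative; the fixed offset 23 in the first query assumes the string begins exactly with <xml filename=" followed by a six-character first part.

```sql
DECLARE @t TABLE (parms varchar(1000));
INSERT INTO @t (parms) VALUES
('<xml filename="100100_456_484351864768.zip" event_dt="10/5/2009 11:42:52 AM"><info user="TestUser" /></xml>');

-- Query 1: assumes the number always starts at character 23 of the raw string.
SELECT SUBSTRING(parms, 23, CHARINDEX('_', parms, 23) - 23) AS myNum
FROM @t;

-- Query 2: finds the first underscore, then reads up to the next one.
SELECT SUBSTRING(
           parms,
           CHARINDEX('_', parms) + 1,
           CHARINDEX('_', parms, CHARINDEX('_', parms) + 1) - CHARINDEX('_', parms) - 1
       ) AS myNum
FROM @t;
-- Both queries return 456 for the sample row.
```

The second query still assumes that the first underscore in the whole string belongs to the filename. If an attribute such as event_dt were to appear before filename, it would pick up the wrong underscore; extracting the filename with value() first avoids that.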
Unfortunately, SQL Server is not a conformant XQuery implementation - rather, it's a fairly limited subset of a draft version of the XQuery spec. Not only does it lack fn:substring-before, it also lacks fn:index-of (which would let you do it yourself using fn:substring) and fn:string-to-codepoints. So, as far as I can tell, you're stuck with SQL here.