如何递归解析SQL语句中的数据
我正在尝试清理数据库中“名称”字段中的数据,并将该数据拆分为名字、中间名和名称。姓。目前,我正在使用 Case
语句来查找文本内的各种触发器,以帮助我以某种方式格式化输出。
然而,我开始注意到我将测试嵌套在其他测试中,并且需要弄清楚如何递归地处理数据。请参阅此示例,了解我如何提取名字。
Case
When CharIndex(' ',LTrim(RTrim(Name))) in (0,1) Then '' --'empty or LName'
When Left(Name,3) IN ('MR ','MS ', 'DR ','MRS') Then --'Prefix Titles'
Case --'If we found a prefix, run the same "tests" with the prefix removed'
When CharIndex(' ',LTrim(RTrim(Right(Name,Len(Name)-CharIndex(' ',Name)))))
in (0,1) Then ''
When SubString(LTrim(RTrim(Right(Name,Len(Name)-CharIndex(' ',Name)))),3,1)
= '&' Then SubString(LTrim(RTrim(Right(Name,Len(Name)-CharIndex(' ',
Name)))),1,5)
Else Left(LTrim(RTrim(Right(Name,Len(Name)-CharIndex(' ',Name)))),
CHarIndex(' ',LTrim(RTrim(Right(Name,Len(Name)-CharIndex(' ',Name)))))-1)
End
When SubString(LTrim(RTrim(Name)),3,1) = '&' Then
SubString(LTrim(RTrim(Name)),1,5) --'Look for initials e.g. J & A Smith'
Else Left(LTrim(RTrim(Name)),CHarIndex(' ',LTrim(RTrim(Name)))-1)
End
因此,为了让它在更复杂的情况下工作(例如MR JOHN A SMITH JR
),我需要递归测试。在命令式编程中,如果我有一个名为 GetFirstName
的函数,我会执行类似的操作:
GetFirstName('MR JOHN A SMITH JR')
//GetFirstName identfies 'MR' and within the function it calls:
||
\\
==> GetFirstName('JOHN A SMITH JR')
//GetFirstName identifies 'JR' and within the function it calls:
||
\\
==> GetFirstName('JOHN A SMITH')
//Finally, it returns 'JOHN'
理想情况下,用直接 SQL 执行此操作会很棒,但我不确定这是否可行。如果我不使用直接 SQL,我还有哪些替代方案? (我使用的是 SQL Server 2005)
I'm trying to clean up data in a "Name" field in our database and split that data into FirstName, MiddleName, & LastName. Currently, I'm using a Case
statement to look for various triggers inside the text to help me format the output in a certain way.
However, I'm starting to notice that I'm nesting tests inside other tests and need to figure out how to process data recursively. See this example of how I'm extracting the FirstName.
Case
When CharIndex(' ',LTrim(RTrim(Name))) in (0,1) Then '' --'empty or LName'
When Left(Name,3) IN ('MR ','MS ', 'DR ','MRS') Then --'Prefix Titles'
Case --'If we found a prefix, run the same "tests" with the prefix removed'
When CharIndex(' ',LTrim(RTrim(Right(Name,Len(Name)-CharIndex(' ',Name)))))
in (0,1) Then ''
When SubString(LTrim(RTrim(Right(Name,Len(Name)-CharIndex(' ',Name)))),3,1)
= '&' Then SubString(LTrim(RTrim(Right(Name,Len(Name)-CharIndex(' ',
Name)))),1,5)
Else Left(LTrim(RTrim(Right(Name,Len(Name)-CharIndex(' ',Name)))),
CHarIndex(' ',LTrim(RTrim(Right(Name,Len(Name)-CharIndex(' ',Name)))))-1)
End
When SubString(LTrim(RTrim(Name)),3,1) = '&' Then
SubString(LTrim(RTrim(Name)),1,5) --'Look for initials e.g. J & A Smith'
Else Left(LTrim(RTrim(Name)),CHarIndex(' ',LTrim(RTrim(Name)))-1)
End
So, in order to get this to work in more complicated situations (e.g. MR JOHN A SMITH JR
), I would need to test recursively. In imperative programming, I would do something like this if I had a function called GetFirstName
:
GetFirstName('MR JOHN A SMITH JR')
//GetFirstName identfies 'MR' and within the function it calls:
||
\\
==> GetFirstName('JOHN A SMITH JR')
//GetFirstName identifies 'JR' and within the function it calls:
||
\\
==> GetFirstName('JOHN A SMITH')
//Finally, it returns 'JOHN'
Ideally, it would be great to do this in straight SQL, but I'm not sure that is possible. What alternatives would I have if I'm not using straight SQL? (I'm using SQL Server 2005)
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
我认为用直接的
SQL
来完成这件事并不容易/干净。您可以使用正则表达式,但是您必须编写自己的CLR
函数来提供正则表达式功能。I don't think it is something easy/clean to accomplish in straight
SQL
. You can use regular expressions though however you have to write your ownCLR
function to provide regex functionality.这听起来像是一项一次性活动。您确定不能在使用临时表的多个语句中执行此操作吗?如果它是一次性活动,那么您可能对代码调试的准确性和简单性的需求比对性能的需求更高。
考虑这样的事情:
只需继续添加步骤并编辑 [WorkingName] 字段即可。完成一次扫描后,只需执行...
...即可准备进行另一次扫描。
This sounds like a one-time activity. Are you sure you can't do this in several statements using temp tables? If it is a one-time activity, then likely you have a higher need for accuracy and simplicity of code debugging than performance.
Consider something like this:
Just keep adding steps and editing the [WorkingName] field. When you're done with one pass, just do...
...and you're ready to do another sweep.