如何处理嵌入列中的列表和范围?
我无法控制我要处理的数据的格式。当然,我可以使用脚本语言来处理数据库之外的以下问题,但我想避免这种情况,因为我正在处理的数据量很大,而且我想消除手动的必要性步骤。
简而言之,我有一张清单。列表可能包含单个 3 位数字符串、多个 3 位数字符串、一系列 3 位数字符串(例如 012-018
)或多个 3 位数字符串和范围3 位数字的字符串。例如:
drop table list;
drop table lists;
create table lists (id varchar, vals varchar);
insert into lists values('A', '001,003-005');
insert into lists values('B', '008-007');
insert into lists values('C', '010, 011, 012');
insert into lists values('D', '011-013, 016-018, 020');
我知道、我知道。
我想把它变成下表:
create table list (id varchar, val varchar);
A 001 A 003 A 004 A 005 B 008 B 007 C 010 C 011 C 012 D 011 D 012 D 013 D 016 D 017 D 018 D 020
有没有办法在 SQL 中做到这一点?
I have no control over the format of the data I am trying to process. I could, of course, use a scripting language to deal with the following problem outside of the database, but I would like to avoid that because of the amount of data I am dealing with and because I'd like to eliminate the necessity of manual steps.
In short, I have a table of lists. A list might consist of a single 3-digit string, more than one 3-digit strings, a range of 3-digit strings, e.g. 012-018
, or a number of 3-digit strings and ranges of 3-digit strings. For example:
drop table list;
drop table lists;
create table lists (id varchar, vals varchar);
insert into lists values('A', '001,003-005');
insert into lists values('B', '008-007');
insert into lists values('C', '010, 011, 012');
insert into lists values('D', '011-013, 016-018, 020');
I know, I know.
I would like to turn this into the following table:
create table list (id varchar, val varchar);
A 001 A 003 A 004 A 005 B 008 B 007 C 010 C 011 C 012 D 011 D 012 D 013 D 016 D 017 D 018 D 020
Is there any way to do this in SQL?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
由于您没有使用特定的 RDBMS 标记您的问题,因此我必须笼统地回答。
SQL 本身不提供您正在寻找的基本操作,这基本上是字符串拆分。这意味着您必须自己编写,或者使用已在线发布的许多之一。
不过,您的数据范围使问题变得有点复杂。这意味着您的过程将如下所示:
','
分割。'-'
分割(对于非范围元素,应该返回一个结果) 。'-'
上)产生一个结果,则它是一条记录,您可以将其插入到最终目标中。如果它产生两个结果,那么它是一个范围,您必须从开始到结束进行迭代(使用该拆分的元素 1 和 2)并将记录插入到最终目标评论后编辑
不幸的是,我对 PROC SQL 或 SAS 不熟悉,所以我无法为此提供具体的解决方案。我可以在下面发布一些关于 SQL Server T-SQL 的内容,希望能够帮助您入门。
Since you haven't tagged your question with a specific RDBMS, I'll have to answer generally.
SQL itself doesn't provide the basic operation that you're looking for, which is basically a string split. This means that you'll have to write your own, or use one of the many that have been published online.
You've complicated matters a bit, though, with the ranges that you have in your data. This means that your procedure is going to look something like this:
','
.'-'
(which, for non-range elements, should return you a single result).'-'
) yields one result, it's a single record that you can insert into your final destination. If it yields two results, then it's a range and you'll have to iterate from the start to the finish (using elements 1 and 2 of that split) and insert records into your final destinationEdit after comment
Unfortunately, I don't have any familiarity with PROC SQL or SAS, so I can't provide a specific solution for that. I can post something below in SQL Server T-SQL, which should hopefully get you started.