如何使用 TSQL for SQL Server 从文本字段中提取日期

发布于 2025-01-11 02:32:11 字数 452 浏览 1 评论 0原文

我正在尝试从文本字段中提取日期。我正在使用 PADINDEX 来执行此操作。我的结果非常不一致。无论如何,我可以通过 SQL 来做到这一点吗?我没有可以为我执行此操作的应用程序。我正在尝试将此作为所需的报告。

文本字段的日期并不总是列为 MM/DD/YYYY,有时会列为 M/DD/YYYY。另外,

这是我正在使用的查询:

select
substring(ar.finding_text,patindex('%[0-9]%/[0-9][0-9]/[0-9][0-9][0-9][0-9]%',ar.finding_text),10)

FROM [ARKPPDB].[PowerPath].[dbo].[accession_2] a
LEFT OUTER JOIN acc_results ar on a.id = ar.acc_id

I'm trying to extract date from a text field. I'm using PADINDEX to do this. My results are very inconsistent. Is there anyway I can do this through SQL. I don't have an application to do this for me. I am trying to get this for a report that is needed.

The text field has dates that are not always listed as MM/DD/YYYY sometimes its listed as M/DD/YYYY. Also,

Here is the query I am using:

select
substring(ar.finding_text,patindex('%[0-9]%/[0-9][0-9]/[0-9][0-9][0-9][0-9]%',ar.finding_text),10)

FROM [ARKPPDB].[PowerPath].[dbo].[accession_2] a
LEFT OUTER JOIN acc_results ar on a.id = ar.acc_id

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

尘世孤行 2025-01-18 02:32:12

我得到了这个工作。感谢大家的帮助。

select 
*,
LTRIM(RTRIM(REPLACE(REPLACE(IIF(LTRIM(RTRIM(RIGHT(SUBSTRING([Time Client 
Called], CHARINDEX('on',[Time Client Called]), 15),13))) is null, null, 
LTRIM(RTRIM(RIGHT(SUBSTRING([Time Client Called], CHARINDEX('on',[Time 
Client Called]), 15),13)))), 'a', ''),'t','')))  as "Date Client Called",

IIF(LTRIM(RTRIM(RIGHT(SUBSTRING([Time Client Called], CHARINDEX('at',[Time 
Client Called]), 11),8))) is null, null, LTRIM(RTRIM(RIGHT(SUBSTRING([Time 
Client Called], CHARINDEX('at',[Time Client Called]), 11),8))))  as "Time 
Called"

into #tmpCalls
FROM #tmpPattern

I got this working. Thank you all for your assistance.

select 
*,
LTRIM(RTRIM(REPLACE(REPLACE(IIF(LTRIM(RTRIM(RIGHT(SUBSTRING([Time Client 
Called], CHARINDEX('on',[Time Client Called]), 15),13))) is null, null, 
LTRIM(RTRIM(RIGHT(SUBSTRING([Time Client Called], CHARINDEX('on',[Time 
Client Called]), 15),13)))), 'a', ''),'t','')))  as "Date Client Called",

IIF(LTRIM(RTRIM(RIGHT(SUBSTRING([Time Client Called], CHARINDEX('at',[Time 
Client Called]), 11),8))) is null, null, LTRIM(RTRIM(RIGHT(SUBSTRING([Time 
Client Called], CHARINDEX('at',[Time Client Called]), 11),8))))  as "Time 
Called"

into #tmpCalls
FROM #tmpPattern
吃→可爱长大的 2025-01-18 02:32:12

我从另一个线程中获得了有关这篇文章的更多信息,包括数据实际外观的提示。这是我创建的响应的相同代码部分,包括测试数据(如果其他人想玩的话)。

   DROP TABLE IF EXISTS #TestTable;
GO
--===== Create and populate the test table from the data provided.
     -- This is NOT a part of the solution. We're just creating test data here.
 SELECT *
   INTO #TestTable
   FROM (VALUES
         ('Physician/Physician’s office called on 2/1/2022 at 3:27 PM Central.')
        ,('Physician/Physician’s office called on 2/1/2022 at 3:34 PM Central.')
        ,('Physician/Physician’s office called on 2/1/2022 at 2:47 PM Central.')
        ,('Physician/Physician’s office called on 2/1/2022 at 4:17 PM Central.')
        ,('Physician/Physician’s office called on 2/1/2022 at 2:52 PM Central.')
        ,('Physician/Physician’s office called on 2/1/2022 at 2:51 PM Central.')
        ,('Physician/Physician’s office called on 2/1/2022 at 4:17 PM Central.')
        ,('Physician/Physician’s office called on 2/1/2022 at 4:34 PM Central.')
        ,('Physician/Physician’s office called on 2/1/2022 at 11:49 PM Eastern.')
        ,('Physician/Physician’s office called on 2/1/2022 at 11:27 AM Eastern.')
        ,('Physician/Physician’s office called on 2/1/2022 at 11:34 AM Eastern.')
        ,('Physician/Physician’s office called on 2/1/2022 at 10:47 AM Central.')
        ,('Physician/Physician’s office called on 2/1/2022 at 9:17 AM Mountain.')
        ,('Physician/Physician’s office called on 2/1/2022 at 10:52 AM Central.')
        ,('Physician/Physician’s office called on 2/1/2022 at 10:51 AM Central.')
        ,('Physician/Physician’s office called on 2/1/2022 at 9:17 AM Pacific.')
        ,('Physician/Physician’s office called on 2/1/2022 at 9:34 AM Pacific.')
        ,('Physician/Physician’s office called on 2/1/2022 at 11:49 AM Somewhere over the rainbow.')
        )v(SomeString)
;
--===== Let's see what we've got
 SELECT * FROM #TestTable
;
--===== Solve the given problem.
     -- So long as the following pattern is true in the source data, this should always work.
     --     WhoCalled by the string ' called on ' to determine the caller followed by...
     --     by a date string followed by the string ' at ' followed by a meridian time (Has AM/PM) for the CallDT followed by...
     --     a time zone name string for the TimeZone.
     -- The result is available as a DATETIME2(0) with no decimal seconds.
 SELECT SomeString
        ,WhoCalled  = LEFT(SomeString,ca1.CalledOn-1)
        ,CallDT     = CONVERT(DATETIME2(0),REPLACE(SUBSTRING(SomeString,ca1.CalledOn+10,ca2.Meridian-CalledOn-9),'at',''))
        ,TimeZone   = TRIM(' .' FROM SUBSTRING(SomeString,ca2.Meridian+2,500))  
   FROM #TestTable
  CROSS APPLY (VALUES(CHARINDEX(' called on ',SomeString)))ca1(CalledOn)
  CROSS APPLY (VALUES(PATINDEX('% [AP]M %',SomeString)+2))ca2(Meridian)
;
GO

I got a little more information about this post from another thread including a hint of what the data actually looks like. Here's the same code part of the response I created including the test data if someone else wants to play.

   DROP TABLE IF EXISTS #TestTable;
GO
--===== Create and populate the test table from the data provided.
     -- This is NOT a part of the solution. We're just creating test data here.
 SELECT *
   INTO #TestTable
   FROM (VALUES
         ('Physician/Physician’s office called on 2/1/2022 at 3:27 PM Central.')
        ,('Physician/Physician’s office called on 2/1/2022 at 3:34 PM Central.')
        ,('Physician/Physician’s office called on 2/1/2022 at 2:47 PM Central.')
        ,('Physician/Physician’s office called on 2/1/2022 at 4:17 PM Central.')
        ,('Physician/Physician’s office called on 2/1/2022 at 2:52 PM Central.')
        ,('Physician/Physician’s office called on 2/1/2022 at 2:51 PM Central.')
        ,('Physician/Physician’s office called on 2/1/2022 at 4:17 PM Central.')
        ,('Physician/Physician’s office called on 2/1/2022 at 4:34 PM Central.')
        ,('Physician/Physician’s office called on 2/1/2022 at 11:49 PM Eastern.')
        ,('Physician/Physician’s office called on 2/1/2022 at 11:27 AM Eastern.')
        ,('Physician/Physician’s office called on 2/1/2022 at 11:34 AM Eastern.')
        ,('Physician/Physician’s office called on 2/1/2022 at 10:47 AM Central.')
        ,('Physician/Physician’s office called on 2/1/2022 at 9:17 AM Mountain.')
        ,('Physician/Physician’s office called on 2/1/2022 at 10:52 AM Central.')
        ,('Physician/Physician’s office called on 2/1/2022 at 10:51 AM Central.')
        ,('Physician/Physician’s office called on 2/1/2022 at 9:17 AM Pacific.')
        ,('Physician/Physician’s office called on 2/1/2022 at 9:34 AM Pacific.')
        ,('Physician/Physician’s office called on 2/1/2022 at 11:49 AM Somewhere over the rainbow.')
        )v(SomeString)
;
--===== Let's see what we've got
 SELECT * FROM #TestTable
;
--===== Solve the given problem.
     -- So long as the following pattern is true in the source data, this should always work.
     --     WhoCalled by the string ' called on ' to determine the caller followed by...
     --     by a date string followed by the string ' at ' followed by a meridian time (Has AM/PM) for the CallDT followed by...
     --     a time zone name string for the TimeZone.
     -- The result is available as a DATETIME2(0) with no decimal seconds.
 SELECT SomeString
        ,WhoCalled  = LEFT(SomeString,ca1.CalledOn-1)
        ,CallDT     = CONVERT(DATETIME2(0),REPLACE(SUBSTRING(SomeString,ca1.CalledOn+10,ca2.Meridian-CalledOn-9),'at',''))
        ,TimeZone   = TRIM(' .' FROM SUBSTRING(SomeString,ca2.Meridian+2,500))  
   FROM #TestTable
  CROSS APPLY (VALUES(CHARINDEX(' called on ',SomeString)))ca1(CalledOn)
  CROSS APPLY (VALUES(PATINDEX('% [AP]M %',SomeString)+2))ca2(Meridian)
;
GO
许仙没带伞 2025-01-18 02:32:12

您可以添加这样的模式“优先级”:

DECLARE @Patterns TABLE(Pattern VARCHAR(100),PatternLength INT)

INSERT INTO @Patterns(Pattern, PatternLength)
VALUES('%[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9] [0-9][0-9]:[0-9][0-9] [AP]M%',19),
    ('%[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9] [0-9]:[0-9][0-9] [AP]M%',18),
    ('%[0-9][0-9]/[0-9]/[0-9][0-9][0-9][0-9] [0-9][0-9]:[0-9][0-9] [AP]M%',18),
    ('%[0-9][0-9]/[0-9]/[0-9][0-9][0-9][0-9] [0-9]:[0-9][0-9] [AP]M%',17),
    ('%[0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9] [0-9][0-9]:[0-9][0-9] [AP]M%',18),
    ('%[0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9] [0-9]:[0-9][0-9] [AP]M%',17),
    ('%[0-9]/[0-9]/[0-9][0-9][0-9][0-9] [0-9][0-9]:[0-9][0-9] [AP]M%',17),
    ('%[0-9]/[0-9]/[0-9][0-9][0-9][0-9] [0-9]:[0-9][0-9] [AP]M%',16)

SELECT IIF(pat1.Pattern is null or LEN(replace(upper(ar.finding_text),' AT ',' ')) < patindex(pat1.Pattern,replace(upper(ar.finding_text),' AT ',' '))+pat1.PatternLength, null,substring(replace(UPPER(ar.finding_text),' AT ',' '),patindex(pat1.Pattern,replace(upper(ar.finding_text),' AT ',' ')),pat1.PatternLength))
FROM [ARKPPDB].[PowerPath].[dbo].[accession_2] a
LEFT OUTER JOIN acc_results ar on a.id = ar.acc_id
LEFT OUTER JOIN @Patterns pat1 on patindex(pat1.Pattern,replace(upper(ar.finding_text),' AT ',' ')) > 0
LEFT OUTER JOIN @Patterns pat2 on patindex(pat2.Pattern,replace(upper(ar.finding_text),' AT ',' ')) > 0 and pat2.PatternLength > pat1.PatternLength
WHERE pat2.Pattern IS NULL

因此,一些示例文本:

医生/医生办公室于 2022 年 2 月 1 日下午 3:27 中部地区致电

这应该返回“2/1/2022 3:27 PM”,您应该能够将其转换为 DATETIME。我没有打扰中央。您展示的每个示例都包含中部时间,因此您可以假设现在是中部时间。

You could add a pattern "priority" like this:

DECLARE @Patterns TABLE(Pattern VARCHAR(100),PatternLength INT)

INSERT INTO @Patterns(Pattern, PatternLength)
VALUES('%[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9] [0-9][0-9]:[0-9][0-9] [AP]M%',19),
    ('%[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9] [0-9]:[0-9][0-9] [AP]M%',18),
    ('%[0-9][0-9]/[0-9]/[0-9][0-9][0-9][0-9] [0-9][0-9]:[0-9][0-9] [AP]M%',18),
    ('%[0-9][0-9]/[0-9]/[0-9][0-9][0-9][0-9] [0-9]:[0-9][0-9] [AP]M%',17),
    ('%[0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9] [0-9][0-9]:[0-9][0-9] [AP]M%',18),
    ('%[0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9] [0-9]:[0-9][0-9] [AP]M%',17),
    ('%[0-9]/[0-9]/[0-9][0-9][0-9][0-9] [0-9][0-9]:[0-9][0-9] [AP]M%',17),
    ('%[0-9]/[0-9]/[0-9][0-9][0-9][0-9] [0-9]:[0-9][0-9] [AP]M%',16)

SELECT IIF(pat1.Pattern is null or LEN(replace(upper(ar.finding_text),' AT ',' ')) < patindex(pat1.Pattern,replace(upper(ar.finding_text),' AT ',' '))+pat1.PatternLength, null,substring(replace(UPPER(ar.finding_text),' AT ',' '),patindex(pat1.Pattern,replace(upper(ar.finding_text),' AT ',' ')),pat1.PatternLength))
FROM [ARKPPDB].[PowerPath].[dbo].[accession_2] a
LEFT OUTER JOIN acc_results ar on a.id = ar.acc_id
LEFT OUTER JOIN @Patterns pat1 on patindex(pat1.Pattern,replace(upper(ar.finding_text),' AT ',' ')) > 0
LEFT OUTER JOIN @Patterns pat2 on patindex(pat2.Pattern,replace(upper(ar.finding_text),' AT ',' ')) > 0 and pat2.PatternLength > pat1.PatternLength
WHERE pat2.Pattern IS NULL

So, some example text:

Physician/Physician’s office called on 2/1/2022 at 3:27 PM Central

This should return '2/1/2022 3:27 PM' which you should be able to convert to a DATETIME. I did not bother with the Central. Every example you showed included Central, so you can probably just assume it is Central time.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文