How to automatically detect and parse date formats in Julia?

Posted on 2025-02-09 01:08:24

I am trying to build an algorithm that automatically detects date-time formats and parses them accordingly, and I would like some advice on improving it.

In the following code I took a simple approach: build all the possible date formats, then iterate over them and try each one against the string. As soon as one matches, the date is parsed with it.

Code:

using Dates

# Build every day/month/year ordering for each supported separator.
# In Julia's Dates format strings, lowercase "mm" is the month ("MM" would be minutes).
function createDateFormats()
    seps = [",", ".", "-", "/", ":"]
    orders = [("dd", "mm", "yyyy"), ("mm", "dd", "yyyy"),
              ("yyyy", "mm", "dd"), ("yyyy", "dd", "mm"),
              ("mm", "yyyy", "dd"), ("dd", "yyyy", "mm")]
    return [join(order, sep) for sep in seps for order in orders]
end

# Named parseDates rather than parse so that it does not shadow Base.parse.
function parseDates(x)
    for fmt in createDateFormats()
        try
            val = Date.(x, fmt)                    # throws if a string does not fit fmt
            yearCol = Dates.year.(val)
            monthCol = Dates.month.(val)           # the broadcast dot was missing here
            dayCol = Dates.day.(val)
            dayofweekCol = Dates.dayofweek.(val)   # computed from the dates, not from dayCol
            return yearCol, monthCol, dayCol, dayofweekCol
        catch
            continue                               # try the next candidate format
        end
    end
    # Reached only when no candidate format parsed the input.
    throw(ArgumentError("Invalid date object"))
end

However, this is quite limited and inefficient. Also, once times are involved, the complexity increases further. Does anyone have a better approach for this kind of operation?
Thanks, I would appreciate any suggestions and advice.

Comments (1)

踏月而来 2025-02-16 01:08:24

Here is a way to do it that also outlines the ambiguity:

using Dates

const text = "2022-3-5 is one and 2021/12.14 5:30:00 is another"
const datetimeregex = 
   Regex(raw"(\d+)[:.\-/,](\d+)[:.\-,](\d+)(?:\s?T?\s?(\d+):(\d+):(\d+))?")

dt = [DateTime([parse(Int, s) for s in m.captures if s isa AbstractString]...)
                            for m in eachmatch(datetimeregex, text)]

@show dt # [DateTime("2022-03-05T00:00:00"), DateTime("2021-12-14T05:30:00")]

Now, what happens if the year has only two digits? What happens if the order is mm, dd, yy versus dd, mm, yy? How do you tell whether 10-11-2021 is October 11 or November 10? In those cases you have to know which convention was used, or some dates will be parsed incorrectly.
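One way to make that ambiguity visible instead of silently picking an order is to try each candidate convention and parse only when exactly one of them fits. The following is a minimal sketch, not a definitive implementation: the names `CANDIDATE_FORMATS`, `matchingformats` and `parsedate` are hypothetical, and the three candidate formats are an assumption about the data. It relies only on `tryparse(Date, str, DateFormat(...))` from the `Dates` standard library, which returns `nothing` when the string does not produce a valid date.

```julia
using Dates

# Hypothetical candidate conventions (an assumption); extend or reorder to suit your data.
const CANDIDATE_FORMATS = ["d-m-y", "m-d-y", "y-m-d"]

# Return a (format, date) tuple for every candidate that yields a valid Date.
function matchingformats(s::AbstractString; formats = CANDIDATE_FORMATS)
    hits = Tuple{String,Date}[]
    for f in formats
        d = tryparse(Date, s, DateFormat(f))  # `nothing` when the string does not fit f
        d === nothing || push!(hits, (f, d))
    end
    return hits
end

# Parse only when exactly one convention fits; otherwise refuse to guess.
function parsedate(s::AbstractString; formats = CANDIDATE_FORMATS)
    hits = matchingformats(s; formats = formats)
    isempty(hits) && throw(ArgumentError("no candidate format matches \"$s\""))
    length(hits) > 1 &&
        throw(ArgumentError("\"$s\" is ambiguous: " * join(first.(hits), ", ")))
    return hits[1][2]
end
```

With these candidates, `parsedate("25-12-2021")` and `parsedate("2021-12-14")` each fit a single convention, while `parsedate("10-11-2021")` throws because both `d-m-y` and `m-d-y` produce valid dates. If the convention is known in advance, passing it explicitly (for example `formats = ["m-d-y"]`) removes the ambiguity.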
