Retrieving a date from a complex string in Python

Posted on 2024-11-29 20:28:40

I'm trying to get a single datetime out of two strings using datetime.strptime.

The time is pretty easy (e.g. 8:53PM), so I can do something like:

theTime = datetime.strptime(givenTime, "%I:%M%p")

However, the date string contains more than just a date: it's a link in a format similar to http://site.com/?year=2011&month=10&day=5&hour=11. I know that I could do something like:

theDate = datetime.strptime(givenURL, "http://site.com/?year=%Y&month=%m&day=%d&hour=%H")

but I don't want to get that hour from the link since it's being retrieved elsewhere. Is there a way to put a dummy symbol (like %x or something) to serve as a flexible space for that last variable?

In the end, I envision having a single line similar to:

theDateTime = datetime.strptime(givenURL + givenTime, "http://site.com/?year=%Y&month=%m&day=%d&hour=%x%I:%M%p")

(although, obviously, the %x wouldn't be used). Any ideas?


哑剧 2024-12-06 20:28:40

If you simply want to skip the hour in the URL, you can use split, for example:

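from datetime import datetime

givenURL = 'http://site.com/?year=2011&month=10&day=5&hour=11'
pattern = "http://site.com/?year=%Y&month=%m&day=%d"
# Drop everything from '&hour=' onward so only the date part is parsed
theDate = datetime.strptime(givenURL.split('&hour=')[0], pattern)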

I'm not sure I understood you correctly, but to get the combined datetime in one call:

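givenURL = 'http://site.com/?year=2011&month=10&day=5&hour=11'
givenTime = '8:53PM'  # example value taken from the question
datePattern = "http://site.com/?year=%Y&month=%m&day=%d"
timePattern = "&time=%I:%M%p"

# Replace the URL's hour with the separately retrieved time before parsing
theDateTime = datetime.strptime(givenURL.split('&hour=')[0] + '&time=' + givenTime, datePattern + timePattern)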
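As a variation on the split approach, the query string can be parsed with the standard library's urllib.parse instead of being matched literally, so the hour parameter is simply never read. This is only a sketch and not part of the original answer: the helper name combine_url_and_time is hypothetical, and it assumes the URL always carries year, month, and day parameters.

```python
from datetime import date, datetime
from urllib.parse import parse_qs, urlparse


def combine_url_and_time(given_url, given_time):
    """Build a datetime from the URL's year/month/day and a separate 12-hour time.

    Hypothetical helper for illustration; the URL's own hour parameter is ignored.
    """
    params = parse_qs(urlparse(given_url).query)
    # parse_qs maps each key to a list of values; take the first of each.
    year, month, day = (int(params[key][0]) for key in ("year", "month", "day"))
    # strptime fills in 1900-01-01 for the missing date; only the time part is kept.
    parsed_time = datetime.strptime(given_time, "%I:%M%p").time()
    return datetime.combine(date(year, month, day), parsed_time)


result = combine_url_and_time(
    "http://site.com/?year=2011&month=10&day=5&hour=11", "8:53PM"
)
print(result)  # 2011-10-05 20:53:00
```

Because the parameters are looked up by name, this version does not depend on their order in the URL or on the host name, at the cost of a few more lines than a single strptime call.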


萤火眠眠 2024-12-06 20:28:40

import datetime
import re

givenURL  = 'http://site.com/?year=2011&month=10&day=5&hour=11'
givenTime = '08:53PM'

print(' givenURL == ' + givenURL)
print('givenTime == ' + givenTime)

# Capture year, month and day; the hour is matched but deliberately not captured
regx = re.compile(r'year=(\d\d\d\d)&month=(\d\d?)&day=(\d\d?)&hour=\d\d?')
print('\nmap(int,regx.search(givenURL).groups()) ==', list(map(int,regx.search(givenURL).groups())))

theDate = datetime.date(*map(int,regx.search(givenURL).groups()))
theTime = datetime.datetime.strptime(givenTime, "%I:%M%p")

print('\ntheDate ==', theDate, type(theDate))
print('\ntheTime ==', theTime, type(theTime))


# strptime defaults the date to 1900-01-01; overwrite it with the date from the URL
theDateTime = theTime.replace(theDate.year,theDate.month,theDate.day)
print('\ntheDateTime ==', theDateTime, type(theDateTime))

result

 givenURL == http://site.com/?year=2011&month=10&day=5&hour=11
givenTime == 08:53PM

map(int,regx.search(givenURL).groups()) == [2011, 10, 5]

theDate == 2011-10-05 <class 'datetime.date'>

theTime == 1900-01-01 20:53:00 <class 'datetime.datetime'>

theDateTime == 2011-10-05 20:53:00 <class 'datetime.datetime'>

Edit 1

As strptime() is slow, I improved my code to eliminate it

from datetime import datetime
import re
from time import perf_counter


n = 10000

givenURL  = 'http://site.com/?year=2011&month=10&day=5&hour=11'
givenTime = '08:53AM'

# eyquem
regx = re.compile(r'year=(\d\d\d\d)&month=(\d\d?)&day=(\d\d?)&hour=\d\d? (\d\d?):(\d\d?)(PM|pm)?')
t0 = perf_counter()
for i in range(n):
    given = givenURL + ' ' + givenTime
    mat = regx.search(given)
    grps = list(map(int,mat.group(1,2,3,4,5)))
    grps[3] %= 12 # 12 o'clock counts as hour 0 before any PM offset
    if mat.group(6):
        grps[3] += 12 # when it is PM/pm, the hour must be augmented with 12
    theDateTime1 = datetime(*grps)
print(perf_counter()-t0,"seconds   eyquem's code")
print(theDateTime1)


print()

# Artsiom Rudzenka
dateandtimePattern = "http://site.com/?year=%Y&month=%m&day=%d&time=%I:%M%p"
t0 = perf_counter()
for i in range(n):
    theDateTime2 = datetime.strptime(givenURL.split('&hour=')[0] + '&time=' + givenTime, dateandtimePattern)
print(perf_counter()-t0,"seconds   Artsiom's code")
print(theDateTime2)

print()
print(theDateTime1 == theDateTime2)

result

0.460598763251 seconds   eyquem's code
2011-10-05 08:53:00

2.10386180366 seconds   Artsiom's code
2011-10-05 08:53:00

True

My code is about 4.5 times faster. That may be of interest if there are many such conversions to perform.
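Timings like these depend on the machine and the interpreter version, so it may be worth re-measuring before relying on the ratio. The following is a minimal sketch using the standard timeit module; the function names parse_with_regex and parse_with_strptime are hypothetical wrappers around the two approaches compared above, and the regex is tightened slightly to require an explicit AM or PM suffix.

```python
import re
import timeit
from datetime import datetime

GIVEN_URL = "http://site.com/?year=2011&month=10&day=5&hour=11"
GIVEN_TIME = "08:53AM"

URL_AND_TIME = re.compile(
    r"year=(\d{4})&month=(\d\d?)&day=(\d\d?)&hour=\d\d? (\d\d?):(\d\d)(AM|PM)",
    re.IGNORECASE,
)


def parse_with_regex(url, time_text):
    """Regex-only parsing; no strptime involved."""
    match = URL_AND_TIME.search(url + " " + time_text)
    year, month, day, hour, minute, meridiem = match.groups()
    # 12-hour to 24-hour: 12AM becomes hour 0, 12PM stays 12.
    hour24 = int(hour) % 12 + (12 if meridiem.upper() == "PM" else 0)
    return datetime(int(year), int(month), int(day), hour24, int(minute))


def parse_with_strptime(url, time_text):
    """split() plus a single strptime call."""
    pattern = "http://site.com/?year=%Y&month=%m&day=%d&time=%I:%M%p"
    return datetime.strptime(url.split("&hour=")[0] + "&time=" + time_text, pattern)


# Both approaches should agree before their speed is compared.
assert parse_with_regex(GIVEN_URL, GIVEN_TIME) == parse_with_strptime(GIVEN_URL, GIVEN_TIME)

for func in (parse_with_regex, parse_with_strptime):
    seconds = timeit.timeit(lambda: func(GIVEN_URL, GIVEN_TIME), number=10000)
    print(f"{seconds:.3f} seconds   {func.__name__}")
```

timeit runs each callable 10000 times here, matching the loop count used in the benchmark above; the absolute numbers printed will differ from one system to another.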


身边 2024-12-06 20:28:40

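There is no way to do this with a format string. However, if the hour in the URL doesn't matter, you can parse it from the URL as in your first example and then call theDateTime.replace(hour=hour_from_a_different_source).

That way you don't have to do any extra parsing.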
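A minimal sketch of that suggestion, assuming the time retrieved elsewhere arrives as a 12-hour string such as 8:53PM (the example value from the question). The variable names are illustrative only.

```python
from datetime import datetime

given_url = "http://site.com/?year=2011&month=10&day=5&hour=11"
given_time = "8:53PM"

# Parse the whole URL, including the hour that will be thrown away.
url_pattern = "http://site.com/?year=%Y&month=%m&day=%d&hour=%H"
from_url = datetime.strptime(given_url, url_pattern)

# Parse the time that was retrieved elsewhere.
from_elsewhere = datetime.strptime(given_time, "%I:%M%p")

# Overwrite the URL's hour (and the default minute of 0) with the real values.
the_datetime = from_url.replace(hour=from_elsewhere.hour, minute=from_elsewhere.minute)
print(the_datetime)  # 2011-10-05 20:53:00
```

The URL's hour is still parsed, which keeps the format string a literal match for the link, but its value never reaches the final result.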
