Python: How do I calculate date ranges from a list of dates?

Posted on 2024-11-27 19:23:47

I have a list of dates, for example:

['2011-02-27', '2011-02-28', '2011-03-01', '2011-04-12', '2011-04-13', '2011-06-08']

How do I find the contiguous date ranges contained within those dates? In the above example, the ranges should be:

[{"start_date": '2011-02-27', "end_date": '2011-03-01'},
 {"start_date": '2011-04-12', "end_date": '2011-04-13'},
 {"start_date": '2011-06-08', "end_date": '2011-06-08'}
]

Thanks.


兮颜 2024-12-04 19:23:48

Here is an alternative solution. It returns a list of (start, finish) tuples, as that's what I needed ;).

This mutates the list, so I needed to make a copy, which obviously increases the memory usage. I suspect that list.pop(0) is not super-efficient, but that probably depends on the implementation of list in Python.

import datetime

def collapse_dates(date_list):
    if not date_list:
        return date_list
    result = []
    # We are going to alter the list, so create a (sorted) copy.
    date_list = sorted(date_list)
    while date_list:
        # Grab the first item: this is both the start and end of the range.
        start = current = date_list.pop(0)
        # While the first item in the list is the next day, pop that and
        # set it to the end of the range.
        while date_list and date_list[0] == current + datetime.timedelta(days=1):
            current = date_list.pop(0)
        # That's a completed range.
        result.append((start, current))

    return result

You could easily change the append line to append a dict, or yield instead of appending to a list.

Oh, and mine assumes the items are already date objects, not strings.
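
To make the two remarks above concrete (yielding dicts instead of appending tuples, and the cost of list.pop(0)), here is a minimal sketch rather than a definitive version. In CPython, list.pop(0) has to shift every remaining element, while collections.deque.popleft() runs in constant time. The name iter_date_ranges is made up for illustration, and the sketch assumes the input is ISO 'YYYY-MM-DD' strings, which is not what the function above expects.

```python
import datetime
from collections import deque


def iter_date_ranges(date_strings):
    """Yield one dict per run of consecutive days (hypothetical helper)."""
    # Assumption: inputs are ISO 'YYYY-MM-DD' strings. Parse, de-duplicate, sort.
    pending = deque(sorted({datetime.date.fromisoformat(s) for s in date_strings}))
    one_day = datetime.timedelta(days=1)
    while pending:
        # The first pending day is both the start and (so far) the end of a range.
        start = current = pending.popleft()  # O(1), unlike list.pop(0)
        # Extend the range while the next pending day is exactly one day later.
        while pending and pending[0] == current + one_day:
            current = pending.popleft()
        yield {"start_date": start.isoformat(), "end_date": current.isoformat()}


ranges = list(iter_date_ranges(
    ["2011-02-27", "2011-02-28", "2011-03-01", "2011-04-12", "2011-04-13", "2011-06-08"]
))
```

Because it is a generator, the caller decides whether to materialize the result with list() or to consume the ranges one at a time.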

╭⌒浅淡时光〆 2024-12-04 19:23:47

This works, but I'm not happy with it; I will work on a cleaner solution and edit the answer. Done, here is a clean, working solution:

import datetime
import pprint

def parse(date):
    return datetime.date(*[int(i) for i in date.split('-')])

def get_ranges(dates):
    while dates:
        end = 1
        try:
            while dates[end] - dates[end - 1] == datetime.timedelta(days=1):
                end += 1
        except IndexError:
            pass

        yield {
            'start-date': dates[0],
            'end-date': dates[end-1]
        }
        dates = dates[end:]

dates = [
    '2011-02-27', '2011-02-28', '2011-03-01',
    '2011-04-12', '2011-04-13',
    '2011-06-08'
]

# Parse each date string into a date object. get_ranges expects sorted input,
# so drop 'sorted' only if the dates are already in order.
dates = sorted(parse(d) for d in dates)

pprint.pprint(list(get_ranges(dates)))

And the corresponding output:

[{'end-date': datetime.date(2011, 3, 1),
  'start-date': datetime.date(2011, 2, 27)},
 {'end-date': datetime.date(2011, 4, 13),
  'start-date': datetime.date(2011, 4, 12)},
 {'end-date': datetime.date(2011, 6, 8),
  'start-date': datetime.date(2011, 6, 8)}]
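
The output above uses 'start-date'/'end-date' keys holding date objects, whereas the question asked for 'start_date'/'end_date' keys holding strings. A small sketch of the conversion, assuming dicts shaped like the output shown; to_question_format is a hypothetical helper name, not part of the answer:

```python
import datetime


def to_question_format(ranges):
    """Hypothetical helper: rename the keys and render dates as ISO strings."""
    return [
        {
            "start_date": r["start-date"].isoformat(),
            "end_date": r["end-date"].isoformat(),
        }
        for r in ranges
    ]


# Sample input shaped like the output shown above.
sample = [
    {"start-date": datetime.date(2011, 2, 27), "end-date": datetime.date(2011, 3, 1)},
    {"start-date": datetime.date(2011, 6, 8), "end-date": datetime.date(2011, 6, 8)},
]
converted = to_question_format(sample)
```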

时光磨忆 2024-12-04 19:23:47

Attempting to ninja GaretJax's edit: ;)

import datetime

def date_to_number(date):
  return datetime.date(*[int(i) for i in date.split('-')]).toordinal()

def number_to_date(number):
  return datetime.date.fromordinal(number).strftime('%Y-%m-%d')

def day_ranges(dates):
  day_numbers = set(date_to_number(d) for d in dates)
  start = None
  # We loop including one element guaranteed not to be in the set, to force the
  # closing of any range that's currently open.
  for n in range(min(day_numbers), max(day_numbers) + 2):
    if start is None:
      if n in day_numbers: start = n
    else:
      if n not in day_numbers:
        yield {
          'start_date': number_to_date(start),
          'end_date': number_to_date(n - 1)
        }
        start = None

list(
  day_ranges([
    '2011-02-27', '2011-02-28', '2011-03-01',
    '2011-04-12', '2011-04-13', '2011-06-08'
  ])
)
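
Note that the loop above visits every day between the earliest and latest date, so its work grows with the calendar span rather than with the number of dates. As a possible alternative, here is a sketch of a different technique built on the same ordinal conversion: grouping with itertools.groupby. The function name day_ranges_grouped is made up, and the sketch assumes ISO 'YYYY-MM-DD' strings as input.

```python
import datetime
from itertools import groupby


def day_ranges_grouped(date_strings):
    """Hypothetical variant: group consecutive ordinals with itertools.groupby."""
    # Assumption: inputs are ISO 'YYYY-MM-DD' strings; the set removes duplicates.
    ordinals = sorted(
        {datetime.date.fromisoformat(s).toordinal() for s in date_strings}
    )
    # Within a run of consecutive days, (ordinal - index) stays constant,
    # so that difference works as the grouping key.
    for _, run in groupby(enumerate(ordinals), key=lambda pair: pair[1] - pair[0]):
        days = [ordinal for _, ordinal in run]
        yield {
            "start_date": datetime.date.fromordinal(days[0]).isoformat(),
            "end_date": datetime.date.fromordinal(days[-1]).isoformat(),
        }


result = list(day_ranges_grouped([
    "2011-02-27", "2011-02-28", "2011-03-01",
    "2011-04-12", "2011-04-13", "2011-06-08",
]))
```

This only touches the dates actually supplied, and unlike min()/max() on an empty set it simply yields nothing for an empty list.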

总以为 2024-12-04 19:23:47

import pprint
from datetime import datetime, timedelta

dates = ['2011-02-27', '2011-02-28', '2011-03-01', '2011-04-12', '2011-04-13', '2011-06-08']
d = [datetime.strptime(date, '%Y-%m-%d') for date in dates]
test = lambda x: x[1] - x[0] != timedelta(1)
slices = [0] + [i+1 for i, x in enumerate(zip(d, d[1:])) if test(x)] + [len(dates)]
ranges = [{"start_date": dates[s], "end_date": dates[e-1]} for s, e in zip(slices, slices[1:])]

Results in the following:

>>> pprint.pprint(ranges)
[{'end_date': '2011-03-01', 'start_date': '2011-02-27'},
 {'end_date': '2011-04-13', 'start_date': '2011-04-12'},
 {'end_date': '2011-06-08', 'start_date': '2011-06-08'}]

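The slices list comprehension collects every index at which the previous date is not exactly one day before the current date. With 0 added to the front and len(dates) added to the end, each range runs from dates[slices[i]] to dates[slices[i+1]-1] inclusive, i.e. it is the slice dates[slices[i]:slices[i+1]]. Note that this assumes dates is already sorted.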
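
To see what the boundaries look like for the sample data, here is a small worked sketch of the same idea. The names breaks and groups are introduced only for illustration and are not in the original answer.

```python
from datetime import datetime, timedelta

dates = ['2011-02-27', '2011-02-28', '2011-03-01',
         '2011-04-12', '2011-04-13', '2011-06-08']
d = [datetime.strptime(s, '%Y-%m-%d') for s in dates]

# Index i+1 starts a new range whenever d[i+1] is not exactly one day after d[i].
breaks = [i + 1 for i, (prev, cur) in enumerate(zip(d, d[1:]))
          if cur - prev != timedelta(days=1)]
slices = [0] + breaks + [len(dates)]  # [0, 3, 5, 6] for the sample data

# Each consecutive pair (s, e) of boundaries delimits the half-open slice dates[s:e].
groups = [dates[s:e] for s, e in zip(slices, slices[1:])]
```

The first and last element of each group are the start_date and end_date of that range.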

半窗疏影 2024-12-04 19:23:47

My slight variation on the theme (I originally built start/end lists and zipped them to return tuples, but I preferred @Karl Knechtel's generator approach):

from datetime import date, timedelta

ONE_DAY = timedelta(days=1)

def find_date_windows(dates):
    # guard against getting empty list
    if not dates:
        return

    # convert strings to sorted list of datetime.dates
    dates = sorted(date(*map(int,d.split('-'))) for d in dates)

    # build list of window starts and matching ends
    lastStart = lastEnd = dates[0]
    for d in dates[1:]:
        if d-lastEnd > ONE_DAY:
            yield {'start_date':lastStart, 'end_date':lastEnd}
            lastStart = d
        lastEnd = d
    yield {'start_date':lastStart, 'end_date':lastEnd}

Here are the test cases:

tests = [
    ['2011-02-27', '2011-02-28', '2011-03-01', '2011-04-12', '2011-04-13', '2011-06-08'],
    ['2011-06-08'],
    [],
    ['2011-02-27', '2011-02-28', '2011-03-01', '2011-04-12', '2011-04-13', '2011-06-08', '2011-06-10'],
]
for dates in tests:
    print(dates)
    for window in find_date_windows(dates):
        print(window)
    print()

Prints:

['2011-02-27', '2011-02-28', '2011-03-01', '2011-04-12', '2011-04-13', '2011-06-08']
{'start_date': datetime.date(2011, 2, 27), 'end_date': datetime.date(2011, 3, 1)}
{'start_date': datetime.date(2011, 4, 12), 'end_date': datetime.date(2011, 4, 13)}
{'start_date': datetime.date(2011, 6, 8), 'end_date': datetime.date(2011, 6, 8)}

['2011-06-08']
{'start_date': datetime.date(2011, 6, 8), 'end_date': datetime.date(2011, 6, 8)}

[]

['2011-02-27', '2011-02-28', '2011-03-01', '2011-04-12', '2011-04-13', '2011-06-08', '2011-06-10']
{'start_date': datetime.date(2011, 2, 27), 'end_date': datetime.date(2011, 3, 1)}
{'start_date': datetime.date(2011, 4, 12), 'end_date': datetime.date(2011, 4, 13)}
{'start_date': datetime.date(2011, 6, 8), 'end_date': datetime.date(2011, 6, 8)}
{'start_date': datetime.date(2011, 6, 10), 'end_date': datetime.date(2011, 6, 10)}