确定一组日期的事件重复模式
我正在寻找一种模式、算法或库,它将采用一组日期并在其中一个退出时返回重复的描述,即集合 [11-01-2010, 11-08-2010, 11-15-2010 , 11-22-2010, 11-29-2010] 会产生类似“十一月的每个星期一”的结果。
有没有人以前见过这样的事情或者对实施它的最佳方法有任何建议?
I am looking for a pattern, algorithm, or library that will take a set of dates and return a description of the recurrence if one exits, i.e. the set [11-01-2010, 11-08-2010, 11-15-2010, 11-22-2010, 11-29-2010] would yield something like "Every Monday in November".
Has anyone seen anything like this before or have any suggestions on the best way to implement it?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(8)
语法进化 (GE) 适合此类问题,因为您正在寻找一个答案坚持某种语言。语法进化还用于程序生成、作曲、设计等。
我会这样处理这个任务:
用语法构建问题空间。
构造一个可以表示所有所需重复模式的上下文无关语法。考虑如下的生产规则:
由这些规则生成的模式的一个示例是:“2010 年每月的第二个和第三个星期三以及 2011 年的每个星期二”
实现此类的一种方法语法将通过类层次结构进行,稍后您将通过反射对其进行操作,就像我在下面的示例中所做的那样。
将此语言映射到一组日期
您应该创建一个函数,该函数从您的语言中获取一个子句,并递归地返回它所涵盖的所有日期的集合。这使您可以将您的答案与输入进行比较。
在语法的指导下,搜索潜在的解决方案
您可以使用遗传算法或模拟退火将日期与语法相匹配,使用动态编程试试运气,或者从简单的强力枚举所有可能的方法开始条款。
如果您使用遗传算法,您的突变概念应该包括根据您的生产规则之一的应用将一个表达式替换为另一个表达式。
请查看以下 GE 相关网站以获取代码和信息:
http://www.bangor.ac.uk/~eep201/jge/
http://nohejl.name/age/
http://www.genicprogramming.us/Home_Page.html
评估每个解决方案
函数可以考虑解决方案的文本长度、多次生成的日期数量、错过的日期数量以及生成的错误日期数量。
示例代码
根据请求,并且因为这是一个非常有趣的挑战,我编写了该算法的基本实现来帮助您入门。虽然它可以工作,但它还没有完成,但设计绝对应该得到更多的思考,一旦您从这个示例中收集到基本要点,我建议您考虑使用我上面提到的库之一。
希望这能让你走上正轨。请务必在某处分享您的实际解决方案;我认为它在很多场景下都会非常有用。
Grammatical Evolution (GE) is suitable for this kind of problem, because you are searching for an answer that adheres to a certain language. Grammatical Evolution is also used for program generation, composing music, designing, etcetera.
I'd approach the task like this:
Structure the problem space with a grammar.
Construct a Context-free Grammar that can represent all desired recurrence patterns. Consider production rules like these:
An example of a pattern generated by these rules is: 'every second and third wednesday of the month in the year 2010 and every tuesday in the year 2011'
One way to implement such a grammar would be through a class hierarchy that you will later operate on through reflection, as I've done in the example below.
Map this language to a set of dates
You should create a function that takes a clause from your language and recursively returns the set of all dates covered by it. This allows you to compare your answers to the input.
Guided by the grammar, search for potential solutions
You could use a Genetic algorithm or Simulated Annealing to match the dates to the grammar, try your luck with Dynamic Programming or start simple with a brute force enumeration of all possible clauses.
Should you go with a Genetic Algorithm, your mutation concept should consist of substituting an expression for another one based on the application of one of your production rules.
Have a look at the following GE-related sites for code and information:
http://www.bangor.ac.uk/~eep201/jge/
http://nohejl.name/age/
http://www.geneticprogramming.us/Home_Page.html
Evaluate each solution
The fitness function could take into account the textual length of the solution, the number of dates generated more than once, the number of dates missed, as well as the number of wrong dates generated.
Example code
By request, and because it's such an interesting challenge, I've written a rudimentary implementation of the algorithm to get you started. Although it works it is by no means finished, the design should definitively get some more thought, and once you have gleaned the fundamental take-aways from this example I recommend you consider using one the libraries I've mentioned above.
Hope this gets you on track. Be sure to share your actual solution somewhere; I think it will be quite useful in lots of scenarios.
如果您的目的是生成人类可读的模式描述(如“十一月的每个星期一”),那么您可能希望从枚举可能的描述开始。描述可以分为频率和界限,例如,
频率:
:
每种不会那么多,可以一一查。
If your purpose is to generate human-readable descriptions of the pattern, as in your "Every Monday in November", then you probably want to start by enumerating the possible descriptions. Descriptions can be broken down into frequency and bounds, for example,
Frequency:
Bounds:
There won't be all that many of each, and you can check for them one by one.
我会做什么:
考虑到这一点,您可能想要使用模拟退火或遗传算法。此外,如果您有描述,您可能需要比较描述以生成示例。
What I would do:
Looking at that you may want to use either Simulated Annealing, or an Genetic Algorithm. Also, if you have the descriptions, you may want to compare the descriptions to generate a sample.
您可以访问系统日期或系统日期和时间,并根据调用或函数结果返回的日期和星期几在内存中构造粗略的日历点。然后使用相关月份中的天数对其进行求和,并添加输入中日期变量的天数和/或访问从星期日或星期一开始的相关周的日历点,并计算或将索引向前增加到正确的值天。使用固定字符构造文本字符串,并根据需要插入相关变量,例如星期几的全名。可能需要多次遍历才能获得要显示或计数其出现次数的所有事件。
You could access the system date or system dateandtime and construct crude calendar points in memory based on the date and the day of the week as returned by the call or function result. Then use the number of days in relevant months to sum them and add on the number of days of the day variable in the input and/or access the calendar point for the relevant week starting sunday or monday and calculate or increment index forward to the correct day. Construct text string using fixed characters and insert the relevant variable such as the full name of the day of the week as required. There may be multiple traversals needed to obtain all the events of which the occurrences are to be displayed or counted.
首先,找一个序列,如果存在的话:
最后,处理“11月”、“2007年到2010年”等,应该是显而易见的。
华泰
First, find a sequence, if it exists:
Finally, deal with "in November", "from 2007 to 2010" etc., should be obvious.
HTH
我喜欢@arjen 的回答,但我认为不需要复杂的算法。这就是这么简单。如果有模式,就有模式……因此一个简单的算法就可以了。首先,我们需要考虑我们正在寻找的模式类型:每日、每周、每月和每年。
如何识别?
每日:每天都有记录
每周:每周有一条记录
每月:每月有一条记录
每年:每年都有记录
难吗?不会。只需数一下有多少次重复,然后进行分类即可。
这是我的实现
RecurrencePatternAnalyser.java
RecurrencePattern.java
RecurrencePatternMonthly.java
RecurrencePatternYearly.java
Main.java
I like @arjen answer but I don't think there is any need for complex algorithm. This is so so simple. If there is a pattern, there is a pattern... therefore a simple algorithm would work. First we need to think of the types of patterns we are looking for: daily, weekly, monthly and yearly.
How to recognize?
Daily: there is a record every day
Weekly: there is a record every week
Monthly: there is a record every month
Yearly: there is a record every year
Difficult? No. Just count how many repetitions you have and then classify.
Here is my implementation
RecurrencePatternAnalyser.java
RecurrencePattern.java
RecurrencePatternMonthly.java
RecurrencePatternYearly.java
Main.java
我认为你必须建造它,而且我认为这将是一个细节问题的项目。首先获得更彻底的要求。您想识别哪些日期模式?列出您希望算法成功识别的示例列表。编写你的算法来满足你的例子。将您的示例放入测试套件中,这样当您稍后收到不同的需求时,您可以确保没有破坏旧的需求。
我预测你会写 200 个 if-then-else 语句。
好吧,我确实有一个想法。熟悉
集合
、并
、覆盖
、交集
等概念。列出您要搜索的简短模式,例如“十月的每一天”、“十一月的每一天”和“十二月的每一天”。如果这些短模式包含在日期集中,则定义一个 Union 函数,该函数可以以智能方式组合较短的模式。例如,假设您匹配了我上面提到的三种模式。如果将它们联合
在一起,您会得到“十月到十二月的每一天”。您的目标可能是返回最简洁的集
联合
,覆盖
您的日期集或类似内容。I think you'll have to build it, and I think it will be a devil in the details kind of project. Start by getting much more thorough requirements. Which date patterns do you want to recognize? Come up with a list of examples that you want your algorithm to successfully identify. Write your algorithm to meet your examples. Put your examples in a test suite so when you get different requirements later you can make sure you didn't break the old ones.
I predict you will write 200 if-then-else statements.
OK, I do have one idea. Get familiar with the concepts of
sets
,unions
,coverage
,intersection
and so on. Have a list of short patterns that you search for, say, "Every day in October", "Every day in November", and "Every day in December." If these short patterns are contained within the set of dates, then define aunion
function that can combine shorter patterns in intelligent ways. For example, let's say you matched the three patterns I mention above. If youUnion
them together you get, "Every day in October through December." You could aim to return the most succinctset
ofunions
thatcover
your set of dates or something like that.看看您最喜欢的日历程序。查看它可以生成哪些事件重复模式。对它们进行逆向工程。
Have a look at your favourite calendar program. See what patterns of event recurrence it can generate. Reverse engineer them.