Is it possible to search for dates as strings in a database-agnostic way?
I have a Ruby on Rails application with a PostgreSQL database; several tables have created_at and updated_at timestamp attributes. When displayed, those dates are formatted in the user's locale; for example, the timestamp 2009-10-15 16:30:00.435 becomes the string 15.10.2009 - 16:30 (the date format for this example being dd.mm.yyyy - hh.mm).
The requirement is that the user must be able to search for records by date, as if they were strings formatted in the current locale. For example, searching for 15.10.2009 would return records with dates on October 15th 2009, searching for 15.10 would return records with dates on October 15th of any year, and searching for 15 would return all dates that match 15 (be it day, month or year). Since the user can use any part of a date as a search term, it cannot be converted to a date/timestamp for comparison.
One (slow) way would be to retrieve all records, format the dates, and perform the search on that. This could be sped up by retrieving only the id and dates at first, performing the search, and then fetching the data for the matching records; but it could still be slow for large numbers of rows.
Another (not database-agnostic) way would be to cast/format the dates to the right format in the database with PostgreSQL functions or operators, and have the database do the matching (with the PostgreSQL regexp operators or whatnot).
Is there a way to do this efficiently (without fetching all rows) in a database-agnostic way? Or do you think I am going in the wrong direction and should approach the problem differently?
Comments (4)
Building on the answer from Carlos, this should allow all of your searches without full table scans if you have indexes on all the date and date part fields. Function-based indexes would be better for the date part columns, but I'm not using them since this should not be database-specific.
If the user enters only Year use WHERE Date BETWEEN 'YYYY-01-01' AND 'YYYY-12-31'
If the user enters Year and Month use WHERE Date BETWEEN 'YYYY-MM-01' AND 'YYYY-MM-31' (may need adjustment for 30/29/28)
If the user enters the three values use SELECT .... WHERE Date = 'YYYY-MM-DD'
If the user enters Month and Day
If the user enters Month or Day (you could optimize to not check values > 12 as a month)
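The last two cases above are listed without a query. One possible shape for them is sketched here, assuming the date parts are stored in indexed integer columns next to the timestamp. The column names (`created_year`, `created_month`, `created_day`) and both helper names are hypothetical, not taken from the answer.

```ruby
# Hypothetical schema: next to the timestamp column there are three indexed
# integer columns holding its parts (all names here are assumptions).
PART_COLUMNS = {
  year: "created_year",
  month: "created_month",
  day: "created_day"
}.freeze

# Month and day given, year unknown: match on the two part columns.
def month_day_condition(month, day)
  ["#{PART_COLUMNS[:month]} = ? AND #{PART_COLUMNS[:day]} = ?", month, day]
end

# A single number that could be a day or a month: check both columns,
# but skip the month column when the value cannot be a month.
def month_or_day_condition(number)
  columns = [PART_COLUMNS[:day]]
  columns << PART_COLUMNS[:month] if number.between?(1, 12)
  sql = columns.map { |column| "#{column} = ?" }.join(" OR ")
  ["(#{sql})", *Array.new(columns.size, number)]
end
```

Each helper returns a SQL fragment with `?` placeholders followed by its bind values, which is the form a Rails `where` call accepts, for example `Record.where(*month_day_condition(10, 15))` with a hypothetical `Record` model.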
"Database agnostic way" is usually a synonym for "slow way", so the solutions are unlikely to be efficient.
Parsing all records on the client side would be the least efficient solution in any case.
You can process your locale string on the client side and form a correct condition for a LIKE, RLIKE or REGEXP_SUBSTR operator. The client side of course should be aware of the database the system uses.
Then you should apply the operator to a string formed according to the locale with a database-specific formatting function, like this (in Oracle):
A more efficient way (that works only in PostgreSQL, though) would be creating a GIN index on the individual dateparts:
and use it in a query:
This will select records having all three numbers (11, 19 and 2010) in any of the dateparts: like, all records of November 19 2010 plus all records of 19:11 in 2010, etc.
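As a rough illustration of the first approach (a client that knows which database it talks to and applies LIKE to a database-formatted string), the sketch below picks a formatting expression per database and builds an escaped LIKE condition. The adapter keys, the `created_at` column name and the helper name are assumptions made for this example; the formatting functions themselves (`to_char`, `TO_CHAR`, `DATE_FORMAT`, `strftime`) are the standard ones on each database.

```ruby
# SQL expressions that render a timestamp as "dd.mm.yyyy - hh:mm" on a few
# databases. The adapter keys and the column name are assumptions.
FORMAT_SQL = {
  postgresql: "to_char(%s, 'DD.MM.YYYY - HH24:MI')",
  oracle:     "TO_CHAR(%s, 'DD.MM.YYYY - HH24:MI')",
  mysql:      "DATE_FORMAT(%s, '%%d.%%m.%%Y - %%H:%%i')",
  sqlite:     "strftime('%%d.%%m.%%Y - %%H:%%M', %s)"
}.freeze

# Returns [sql_fragment, like_pattern] for a substring search on the
# formatted date. "!" is used as the LIKE escape character so that "%" and
# "_" typed by the user are matched literally.
def formatted_like_condition(adapter, term, column: "created_at")
  template = FORMAT_SQL.fetch(adapter)
  escaped = term.gsub(/[!%_]/) { |char| "!#{char}" }
  ["#{format(template, column)} LIKE ? ESCAPE '!'", "%#{escaped}%"]
end
```

A condition built this way cannot use a plain index on the timestamp column, so the database still scans and formats every row; it only avoids shipping all rows to the client.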
Whatever the user enters, you should extract three values: Year, Month and Day, using his locale as a guide. Some values may be empty.
If the user enters only Year, use WHERE Date BETWEEN 'YYYY-01-01' AND 'YYYY-12-31'
If the user enters Year and Month, use WHERE Date BETWEEN 'YYYY-MM-01' AND 'YYYY-MM-31' (may need adjustment for 30/29/28)
If the user enters the three values, use SELECT .... WHERE Date = 'YYYY-MM-DD'
If the user enters Month and Day, you'll have to use the 'slow' way
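The range cases above can be sketched with half-open ranges, which sidestep the 30/29/28 adjustment. If the column is a timestamp rather than a date, as in the question, they also cover rows later in the last day, which an equality test or a BETWEEN ending at midnight would miss. The helper name and the `created_at` default are assumptions for illustration.

```ruby
require "date"

# Builds a half-open range condition [start, finish) for a year, a
# year+month, or a full date. Returns nil when a day is given without a
# month, which this sketch does not cover.
def date_range_condition(year:, month: nil, day: nil, column: "created_at")
  return nil if month.nil? && !day.nil?

  start = Date.new(year, month || 1, day || 1)
  finish =
    if day
      start + 1   # the following day
    elsif month
      start >> 1  # first day of the following month
    else
      start >> 12 # first day of the following year
    end

  ["#{column} >= ? AND #{column} < ?", start.iso8601, finish.iso8601]
end
```

Because both bounds are plain ISO dates compared against the raw column, an ordinary index on that column can serve all three cases.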
IMHO, the short answer is No. But definitely avoid loading all rows.
A few notes:
[...] DATE (YYYY-MM-DD, ex: 2010-02-01) or DATETIME. But since you seem to need queries like "all years for October 15th", you need custom queries anyway.
[...] SQL WHERE clause. I am certain that you will end up having fewer than a dozen cases, so you can have optimal WHEREs for each of them. This way you will avoid loading all records.
[...] year, you might create a COMPUTED COLUMN which would contain only the YEAR and have an index on it.
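The idea of reducing the input to a handful of cases, each with its own WHERE clause, might start with a small parser in application code, kept apart from the SQL. The sketch below assumes a numeric, dot-separated format and only understands leading parts of it (so `15.10` works for dd.mm.yyyy, but `10.2009` would be misread); the function names and the `order` parameter are hypothetical.

```ruby
# Splits a partial date typed in a numeric, dot-separated locale format
# into named parts. `order` is the field order of the locale's date format
# (an assumption: dd.mm.yyyy gives [:day, :month, :year]).
def parse_date_query(input, order: [:day, :month, :year])
  numbers = input.strip.split(".").map { |piece| Integer(piece, 10) }
  if numbers.empty? || numbers.size > order.size
    raise ArgumentError, "expected 1 to #{order.size} numeric parts"
  end
  # A lone number is ambiguous: it may be a day, a month or a year.
  return { any: numbers.first } if numbers.size == 1

  order.first(numbers.size).zip(numbers).to_h
end

# Names the case so that each one can get its own hand-written WHERE clause.
def query_case(parts)
  parts.keys.sort.join("_").to_sym
end
```

A caller could then dispatch on the symbol returned by `query_case` (for example `:day_month` or `:any`) to pick the matching condition.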