如何在MySQL中标准化肮脏的日期字段
我有一个MySQL表,其中包含表示日期的列,并将其存储为字符串。
此列中的日期(日期)不是标准(肮脏),范围从
“ 2004年1月5日”或“ 6月22:45 AM”,
对于缺少一年的记录,我还有另一列(开放日期)可以为null或“ 2005年6月22日”和“截止日期”,这是一个肮脏的专栏,其价值(“ 2004年1月26日”,“ 2005年7月1日”,下午6点 关于:Bearingpoint,Inc。商法和经济法规 程序”)
我如何在日期字段中获得值的归一化表示。
对于其他表,我能够使用以下查询来归一化日期字段,但是对于此表,我提出的解决方案是太响了,甚至不接近准确。
SELECT DATE_FORMAT(STR_TO_DATE(DATE,'%M %d, %Y'), '%Y-%m-%d') FROM `data job posts`
I have a MySQL table that contains column representing a date and is stored as a string.
The dates in this column (date) are not standard (dirty) and can range from
"Jan 5, 2004" or "Jun 22 2:45 AM"
For the records that are missing the year I have another column (OpeningDate) that can be null or "22 June 2005" and "Deadline" which is a dirty column with values like ("26 January 2004", "01 July 2005, 6 pm
ABOUT: BearingPoint, Inc. Commercial Law and Economic Regulation
Program")
How do I go about to get a normalized representation of the values in the date field.
For other tables I've been able to normalize the date field by using the following queries but for this table the solutions I come up with are too convoluted and not even close to accurate.
SELECT DATE_FORMAT(STR_TO_DATE(DATE,'%M %d, %Y'), '%Y-%m-%d') FROM `data job posts`
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
我不确定是否有一种干净的方法可以做到这一点,因为字符串非常非归一化。最清洁的方法可能是要缩小正在修改的数据并确定可识别的模式,以便您可以将数据集的大小减少到一组较小的高度无标准字符串。
作为一个类似的示例:
在操作完成后将旧列丢弃时,将其重命名为新列并将其重命名为新列,或者如果当前表是有效的,则可以将其命名为可查询作为更新记录可能会锁定表。
I'm not sure there is a clean way to do this since strings are very much non normalized. The cleanest approach would likely be to chunk the data being modified and identify patterns that are identifiable so that you can reduce the size of the dataset to a smaller group of highly unnormalized strings.
As an example something similar to this:
It might help to create this as a new column and rename the new column when dropping the old one once the operation is complete or creating this as a new table if the current table is live and needs to be queryable as updating records may lock the table.