Java SimpleDateFormat parsing problem with WEKA

Posted on 2024-11-02 11:52:20

I swear I'm using the correct date format but I keep getting a parse error when loading into WEKA.

"MonFeb2116:00:00+0000"
"EEEMMMddHH:mm:ssZ"

Here is an example dataset:

@RELATION example

@ATTRIBUTE tweetid STRING 
@ATTRIBUTE timestamp DATE "EEEMMMddhh:mm:ssZ"
@ATTRIBUTE I NUMERIC
@ATTRIBUTE a NUMERIC
@ATTRIBUTE cool NUMERIC
@ATTRIBUTE foo NUMERIC
@ATTRIBUTE bar NUMERIC
@ATTRIBUTE temp NUMERIC
@ATTRIBUTE class {POS,NEG}

@DATA
39715973388828673,"MonFeb2116:00:00+0000",0,0,0,0,2,2,?
39716148329197568,"MonFeb2116:00:42+0000",0,1,0,0,0,1,?
39715973388828673,"MonFeb2116:00:51+0000",1,0,0,0,0,0,?
39723030380941312,"MonFeb2116:28:03+0000",0,0,0,0,0,0,?
39723030531944448,"MonFeb2116:28:03+0000",0,0,0,0,0,0,?
39723031433707520,"MonFeb2116:28:03+0000",0,0,0,0,0,0,?

WEKA Error:

unparseable date "MonFeb2116:00:00+0000, read Token[MonFeb2116:00:00+0000], line 21

I have double-checked against the API documentation. Am I missing something?

http://download.oracle.com/javase/1.4.2/docs/api/java/text/SimpleDateFormat.html

EDIT -----------

@RELATION example

@ATTRIBUTE tweetid STRING 
@ATTRIBUTE timestamp DATE "EEE MMM dd hh:mm:ss Z"
@ATTRIBUTE I NUMERIC
@ATTRIBUTE a NUMERIC
@ATTRIBUTE cool NUMERIC
@ATTRIBUTE foo NUMERIC
@ATTRIBUTE love NUMERIC
@ATTRIBUTE temp NUMERIC
@ATTRIBUTE class {POS,NEG}

@DATA
39715973388828673,"Mon Feb 21 16:00:00 +0000",0,0,0,0,2,2,?
39716148329197568,"Mon Feb 21 16:00:42 +0000",0,1,0,0,0,1,?
39715973388828673,"Mon Feb 21 16:00:51 +0000",1,0,0,0,0,0,?
39723030380941312,"Mon Feb 21 16:28:03 +0000",0,0,0,0,0,0,?
39723030531944448,"Mon Feb 21 16:28:03 +0000",0,0,0,0,0,0,?
39723031433707520,"Mon Feb 21 16:28:03 +0000",0,0,0,0,0,0,?

I reformatted the dates so the tokens are separated by spaces. WEKA still refuses to parse them...

Comments (2)

灰色世界里的红玫瑰 2024-11-09 11:52:20

Which default locale are you using? Using an English locale, the String "MonFeb2116:00:00+0000" should be parseable with the pattern "EEEMMMddHH:mm:ssZ". Note however, that the year will default to 1970, if not present in the pattern or parsed string. That is probably not what you really want.
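
The point about the missing year can be checked outside WEKA with a minimal sketch like the one below. It parses the sample string from the question with an explicit English locale and reports the year of the resulting date. The class name `NoYearParseDemo` and the helper `parsedYear` are made up for illustration, and the formatter is left in its default lenient mode, which may differ from how WEKA configures it.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class, not part of WEKA or the original post.
class NoYearParseDemo {

    // Parses `text` with `pattern` under an English locale (default lenient
    // mode) and returns the year of the parsed instant, evaluated in UTC.
    public static int parsedYear(String pattern, String text) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ENGLISH);
        try {
            Date date = format.parse(text);
            Calendar utc = Calendar.getInstance(TimeZone.getTimeZone("UTC"), Locale.ENGLISH);
            utc.setTime(date);
            return utc.get(Calendar.YEAR);
        } catch (ParseException e) {
            // Wrap so callers do not need to handle a checked exception.
            throw new IllegalStateException("could not parse: " + text, e);
        }
    }

    public static void main(String[] args) {
        // The pattern has no year field, so the year comes from the epoch default.
        System.out.println(parsedYear("EEEMMMddHH:mm:ssZ", "MonFeb2116:00:00+0000"));
    }
}
```

If the program that reads the data parses strictly (`setLenient(false)`), the defaulted year may matter more than it appears: 21 February 1970 fell on a Saturday, so the "Mon" in the data could conflict with the computed date. Adding a year to both the data and the pattern would avoid that question entirely.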

泪痕残 2024-11-09 11:52:20

Well, I don't know whether it'll sort everything out or not, but try changing hh (12-hour format) to HH (24-hour format). I'm not sure whether it'll be able to read a "day of the week / month name" without any spaces even so... do you have to get the value in that format? If you could put a space after the 3rd and 6th characters it would help...
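
The `hh` versus `HH` difference can be illustrated in isolation with the sketch below. It deliberately parses only the time portion of the sample timestamps so that the hour field is the only variable, and it disables lenient parsing. That strictness is an assumption about how the consuming program behaves; the thread does not confirm how WEKA configures its formatter. The class name `HourPatternDemo` and the helper `parsesStrictly` are hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

// Hypothetical demo class, not part of WEKA or the original post.
class HourPatternDemo {

    // Returns true if `text` parses under `pattern` with lenient parsing off.
    public static boolean parsesStrictly(String pattern, String text) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ENGLISH);
        format.setLenient(false); // assumption: the consumer parses strictly
        try {
            format.parse(text);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String time = "16:00:00 +0000"; // time part of the question's first data row
        // hh is the 1-12 clock hour, so 16 is out of range when parsing strictly.
        System.out.println("hh accepts 16: " + parsesStrictly("hh:mm:ss Z", time));
        // HH is the 0-23 hour of day, so 16 is in range.
        System.out.println("HH accepts 16: " + parsesStrictly("HH:mm:ss Z", time));
    }
}
```

In the default lenient mode `hh` tends to tolerate out-of-range hours, which may explain why a pattern that works in a quick standalone test can still fail in a stricter consumer.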
