使用标准模式解析 CharSequence 中的日期

发布于 2024-10-16 00:09:52 字数 437 浏览 4 评论 0原文

我正在为外部工具的命令行界面编写一个解析器,并且正在使用 Scala 的解析器组合器库。作为此过程的一部分,我需要解析格式为 EEE MMM d HH:mm:ss yyyy Z 的标准日期。

Scala 的解析器组合器是“基于流的”,并且使用 CharSequence 而不是字符串。这使得我很难使用 JodaTime 中的 java.text.DateTimeFormatDateTimeFormat,因为它们都使用字符串。

到目前为止,我不得不像这样编写自己的正则表达式解析器来解析日期,但我更愿意将使用 JodaTime 完成的工作合并到我的解析器中。我真的不想重新发明轮子。我一直在查看 JodaTime 的源代码,但我不太确定为什么它需要使用字符串而不仅仅是 CharSequences。我错过了某些方面吗?

I'm writing a parser for a command line interface of an external tool and I'm using Scala's parser combinators library. As part of this I need to parse a standard date of the format EEE MMM d HH:mm:ss yyyy Z.

Scala's parser-combinators are "stream-based" and works with CharSequence's instead of Strings. That makes it hard for me to use either java.text.DateTimeFormat or DateTimeFormat from JodaTime since they both work with Strings.

As of now, I hade to write my own regex-parser like this to parse the date, but I would much rather incorporate the work that has been done with JodaTime into my parser. I really don't want to reinvent the wheel. I've been looking at the source-code of JodaTime and I'm not really sure why it needs to work with Strings instead of just CharSequences. Am I missing some aspect?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

何以笙箫默 2024-10-23 00:09:53

明白了,现在。好吧,有一个比分叉更简单的解决方案。这里:

trait DateParsers extends RegexParsers {
  def dateTime(pattern: String): Parser[DateTime] = new Parser[DateTime] {
    val dateFormat = DateTimeFormat.forPattern(pattern);

    def jodaParse(text: CharSequence, offset: Int) = {
      val mutableDateTime = new MutableDateTime
      val maxInput = text.source.subSequence(offset, dateFormat.estimateParsedLength + offset).toString
      val newPos = dateFormat.parseInto(mutableDateTime, maxInput, 0)
      (mutableDateTime.toDateTime, newPos + offset)
    }

    def apply(in: Input) = {
      val source = in.source
      val offset = in.offset
      val start = handleWhiteSpace(source, offset)
      val (dateTime, endPos) = jodaParse(source, start)
      if (endPos >= 0)
        Success(dateTime, in.drop(endPos - offset))
      else
        Failure("Failed to parse date", in.drop(start - offset))
    }
  }
}

Got it, now. Ok, there's a simpler solution than forking. Here:

trait DateParsers extends RegexParsers {
  def dateTime(pattern: String): Parser[DateTime] = new Parser[DateTime] {
    val dateFormat = DateTimeFormat.forPattern(pattern);

    def jodaParse(text: CharSequence, offset: Int) = {
      val mutableDateTime = new MutableDateTime
      val maxInput = text.source.subSequence(offset, dateFormat.estimateParsedLength + offset).toString
      val newPos = dateFormat.parseInto(mutableDateTime, maxInput, 0)
      (mutableDateTime.toDateTime, newPos + offset)
    }

    def apply(in: Input) = {
      val source = in.source
      val offset = in.offset
      val start = handleWhiteSpace(source, offset)
      val (dateTime, endPos) = jodaParse(source, start)
      if (endPos >= 0)
        Success(dateTime, in.drop(endPos - offset))
      else
        Failure("Failed to parse date", in.drop(start - offset))
    }
  }
}
墨落成白 2024-10-23 00:09:53

我不确定你在问什么。您是否在问为什么 RegexParser.parse()in 参数采用 CharSequence ?如果是这样,还有另一个重载的 RegexParser.parse() 需要一个 Reader,您可以编写一个简单的转换函数,如下所示:

def stringToReader(str: String): Reader = new StringReader(str)

至于日期格式,我发现它非常完美可以将其定义为解析器中的标记。

希望这有帮助。

I'm not sure what you are asking. Are you asking why RegexParser.parse()'s in parameter takes a CharSequence? If so there's another overloaded RegexParser.parse() that takes a Reader, which you can write a simple conversion function like so:

def stringToReader(str: String): Reader = new StringReader(str)

As to the date format, I find it perfectly fine to define it as a token in the parser.

Hope this helps.

夏夜暖风 2024-10-23 00:09:53

这是我现在的解决方案:

我分叉了 joda-time 并做了一些小更改,使其可以在 CharSequence 上工作,而不是在 String 上工作。就在这里 https://github.com/hedefalk/joda-time/commit/ef3bdafd89b334fb052ce0dd192613683b348 6a4< /a>

然后我可以像这样编写一个 DateParser

trait DateParsers extends RegexParsers {
  def dateTime(pattern: String): Parser[DateTime] = new Parser[DateTime] {
    val dateFormat = DateTimeFormat.forPattern(pattern);

    def jodaParse(text: CharSequence, offset: Int) = {
      val mutableDateTime = new MutableDateTime
      val newPos = dateFormat.parseInto(mutableDateTime, text, offset)
      (mutableDateTime.toDateTime, newPos)
    }

    def apply(in: Input) = {
      val source = in.source
      val offset = in.offset
      val start = handleWhiteSpace(source, offset)
      val (dateTime, endPos) = jodaParse(source, start)
      if (endPos >= 0)
        Success(dateTime, in.drop(endPos - offset))
      else
        Failure("Failed to parse date", in.drop(start - offset))
    }
  }
}

然后我可以使用这个特征来制定生产规则,例如:

private[this] def dateRow = "date:" ~> dateTime("EEE MMM d HH:mm:ss yyyy Z")

我是否过度劳累了?我现在真的很累了……

This is my solution right now:

I forked joda-time and made small changes for it to work on CharSequences instead of Strings. It's over here https://github.com/hedefalk/joda-time/commit/ef3bdafd89b334fb052ce0dd192613683b3486a4

Then I could write a DateParser like this:

trait DateParsers extends RegexParsers {
  def dateTime(pattern: String): Parser[DateTime] = new Parser[DateTime] {
    val dateFormat = DateTimeFormat.forPattern(pattern);

    def jodaParse(text: CharSequence, offset: Int) = {
      val mutableDateTime = new MutableDateTime
      val newPos = dateFormat.parseInto(mutableDateTime, text, offset)
      (mutableDateTime.toDateTime, newPos)
    }

    def apply(in: Input) = {
      val source = in.source
      val offset = in.offset
      val start = handleWhiteSpace(source, offset)
      val (dateTime, endPos) = jodaParse(source, start)
      if (endPos >= 0)
        Success(dateTime, in.drop(endPos - offset))
      else
        Failure("Failed to parse date", in.drop(start - offset))
    }
  }
}

Then I can use this trait to have production rules like:

private[this] def dateRow = "date:" ~> dateTime("EEE MMM d HH:mm:ss yyyy Z")

Am I overworking this? I'm really tired right now…

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文