How should I specify the type of JSON-like unstructured data in Scala?

Posted 2024-07-16 19:49:30

I'm considering porting a very simple text-templating library to Scala, mostly as an exercise in learning the language. The library is currently implemented in both Python and JavaScript, and its basic operation more or less boils down to this (in Python):

template = CompiledTemplate('Text {spam} blah {eggs[1]}')
data = { 'spam': 1, 'eggs': [ 'first', 'second', { 'key': 'value' }, True ] }
output = template.render(data)

None of this is terribly difficult to do in Scala, but the thing I'm unclear about is how to best express the static type of the data parameter.

Basically this parameter should be able to contain the sorts of things you'd find in JSON: a few primitives (strings, ints, booleans, null), or lists of zero or more items, or maps of zero or more items. (For the purposes of this question the maps can be constrained to having string keys, which seems to be how Scala likes things anyways.)

My initial thought was just to use a Map[String, Any] as a top-level object, but that doesn't seem entirely correct to me. In fact I don't want to add arbitrary objects of any sort of class in there; I want only the elements I outlined above. At the same time, I think in Java the closest I'd really be able to get would be Map<String, ?>, and I know one of the Scala authors designed Java's generics.

One thing I'm particularly curious about is how other functional languages with similar type systems handle this sort of problem. I have a feeling that what I really want to do here is come up with a set of case classes that I can pattern-match on, but I'm not quite able to envision how that would look.

I have Programming in Scala, but to be honest my eyes started glazing over a bit at the covariance / contravariance stuff and I'm hoping somebody can explain this to me a bit more clearly and succinctly.

Comments (3)

枫林﹌晚霞¤ 2024-07-23 19:49:30

You're spot on that you want some sort of case classes to model your datatypes. In functional languages these sorts of things are called algebraic data types (ADTs), and you can read all about how Haskell uses them by Googling around a bit. Scala's equivalent of Haskell's ADTs uses sealed traits and case classes.

Let's look at a rewrite of the JSON parser combinator from the Scala standard library or the Programming in Scala book. Instead of using Map[String, Any] to represent JSON objects, and instead of using Any to represent arbitrary JSON values, it uses an algebraic data type, JsValue, to represent JSON values. JsValue has several subtypes, representing the possible kinds of JSON values: JsString, JsNumber, JsObject, JsArray, JsBoolean (JsTrue, JsFalse), and JsNull.
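As a rough idea of what such a hierarchy could look like, here is a minimal sketch. The field names, the use of BigDecimal for numbers, and the single JsBoolean case class (rather than separate JsTrue/JsFalse objects) are my own assumptions for illustration, not the actual definitions from that rewrite.

```scala
// Hypothetical definitions: a sealed trait plus one case per kind of JSON value.
// "sealed" means every subtype must live in this file, so the compiler knows
// the full list of cases.
sealed trait JsValue
case object JsNull extends JsValue
final case class JsBoolean(value: Boolean) extends JsValue
final case class JsNumber(value: BigDecimal) extends JsValue
final case class JsString(value: String) extends JsValue
final case class JsArray(items: List[JsValue]) extends JsValue
final case class JsObject(fields: Map[String, JsValue]) extends JsValue

object JsExample {
  // The template data from the question, written with these types.
  val data: JsValue = JsObject(Map(
    "spam" -> JsNumber(1),
    "eggs" -> JsArray(List(
      JsString("first"),
      JsString("second"),
      JsObject(Map("key" -> JsString("value"))),
      JsBoolean(true)
    ))
  ))
}
```

With this shape, something like JsObject(Map("x" -> new java.util.Date)) simply doesn't compile, which is the restriction the question asks for.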

Manipulating JSON data of this form involves pattern matching. Since JsValue is sealed, the compiler will warn you if you haven't dealt with all the cases. For example, the code for toJson, a method that takes a JsValue and returns a String representation of that value, looks like this:

  def toJson(x: JsValue): String = x match {
    case JsNull => "null"
    case JsBoolean(b) => b.toString
    case JsString(s) => "\"" + s + "\""
    case JsNumber(n) => n.toString
    case JsArray(xs) => xs.map(toJson).mkString("[",", ","]")
    case JsObject(m) => m.map{case (key, value) => toJson(key) + " : " + toJson(value)}.mkString("{",", ","}")
  }

Pattern matching both lets us make sure we're dealing with every case, and also "unwraps" the underlying value from its JsType. It provides a type-safe way of knowing that we've handled every case.
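To tie this back to the templating use case, here is a hedged sketch of how a placeholder like {eggs[1]} might be resolved by pattern matching. The Step, Field, Index and resolve names are hypothetical, and the JsValue definitions are my own simplified assumption, repeated only so the snippet compiles on its own.

```scala
// Assumed, simplified JsValue hierarchy (not the real library's definitions).
sealed trait JsValue
case object JsNull extends JsValue
final case class JsBoolean(value: Boolean) extends JsValue
final case class JsNumber(value: BigDecimal) extends JsValue
final case class JsString(value: String) extends JsValue
final case class JsArray(items: List[JsValue]) extends JsValue
final case class JsObject(fields: Map[String, JsValue]) extends JsValue

object Lookup {
  // One step of a template path such as eggs[1]: a field name or a list index.
  sealed trait Step
  final case class Field(name: String) extends Step
  final case class Index(i: Int) extends Step

  // Walks the path one step at a time; None means the path does not fit the data.
  def resolve(value: JsValue, path: List[Step]): Option[JsValue] = (value, path) match {
    case (v, Nil) => Some(v)
    case (JsObject(fields), Field(name) :: rest) =>
      fields.get(name).flatMap(resolve(_, rest))
    case (JsArray(items), Index(i) :: rest) =>
      items.lift(i).flatMap(resolve(_, rest)) // lift turns an out-of-range index into None
    case _ => None // e.g. indexing into a string, or a field lookup on an array
  }
}
```

Calling Lookup.resolve(data, List(Lookup.Field("eggs"), Lookup.Index(1))) on the question's data would give Some(JsString("second")); a path that doesn't fit comes back as None instead of throwing.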

Furthermore, if you know at compile-time the structure of the JSON data you're dealing with, you can do something really cool like n8han's extractors. Very powerful stuff, check it out.

冷弦 2024-07-23 19:49:30

Well, there are a couple ways to approach this. I would probably just use Map[String, Any], which should work just fine for your purposes (as long as the map is from collection.immutable rather than collection.mutable). However, if you really want to go through some pain, it is possible to give a type for this:

sealed trait InnerData[+A] {
  val value: A
}

case class InnerString(value: String) extends InnerData[String]
case class InnerMap[A, +B](value: Map[A, B]) extends InnerData[Map[A, B]]
case class InnerBoolean(value: Boolean) extends InnerData[Boolean]

Now, assuming that you were reading the JSON data field into a Scala field named jsData, you would give that field the following type:

val jsData: Map[String, Either[Int, InnerData[_]]]

Every time you pull a field out of jsData, you would need to pattern match, checking whether the value was of type Left[Int] or Right[InnerData[_]] (the two sub-types of Either[Int, InnerData[_]]). Once you have the inner data, you would then pattern match on that to determine whether it represents an InnerString, InnerMap or InnerBoolean.
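A minimal sketch of that two-level match, assuming the InnerData definitions above (restated so the snippet stands alone). The Field alias, the describe function and the sample values are made up for illustration.

```scala
sealed trait InnerData[+A] { def value: A }
final case class InnerString(value: String) extends InnerData[String]
final case class InnerMap[A, +B](value: Map[A, B]) extends InnerData[Map[A, B]]
final case class InnerBoolean(value: Boolean) extends InnerData[Boolean]

object EitherExample {
  // Hypothetical alias for the value type of jsData.
  type Field = Either[Int, InnerData[_]]

  // First level: Left vs Right. Second level: which kind of InnerData.
  def describe(field: Field): String = field match {
    case Left(n)                => s"int $n"
    case Right(InnerString(s))  => s"string $s"
    case Right(InnerMap(m))     => s"map with ${m.size} entries"
    case Right(InnerBoolean(b)) => s"boolean $b"
  }

  val jsData: Map[String, Field] = Map(
    "spam"  -> Left(1),
    "fresh" -> Right(InnerBoolean(true))
  )
}
```

Because both Either and InnerData are sealed, leaving out one of the four cases should get you a non-exhaustive match warning from the compiler.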

Technically, you have to do this sort of pattern matching anyway in order to use the data once you pull it out of JSON. The advantage to the well-typed approach is the compiler will check you to ensure that you haven't missed any possibilities. The disadvantage is that you can't just skip impossibilities (like 'eggs' mapping to an Int). Also, there is some overhead imposed by all of these wrapper objects, so watch out for that.
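For contrast, this is roughly what consuming the untyped Map[String, Any] version looks like. The show function is a hypothetical example, not part of any library. The point is the catch-all case at the end: nothing stops an unsupported value from getting into the map, so the failure moves to runtime.

```scala
object AnyExample {
  val data: Map[String, Any] = Map(
    "spam" -> 1,
    "eggs" -> List("first", "second", Map("key" -> "value"), true)
  )

  // The compiler cannot check this match for completeness: Any is open-ended,
  // so a fallback case is needed for everything we did not anticipate.
  def show(v: Any): String = v match {
    case null         => "null"
    case s: String    => s
    case n: Int       => n.toString
    case b: Boolean   => b.toString
    case xs: List[_]  => xs.map(show).mkString("[", ", ", "]")
    case m: Map[_, _] =>
      m.map { case (key, value) => s"$key: ${show(value)}" }.mkString("{", ", ", "}")
    case other        => sys.error(s"unsupported value: $other")
  }
}
```

So the runtime matching is about the same amount of code either way; what the wrapper types buy you is that the last case disappears.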

Note that Scala does allow you to define a type alias which should cut down on the amount of LoC required for this:

type DataType[A] = Map[String, Either[Int, InnerData[A]]]

val jsData: DataType[_]

Add a few implicit conversions to make the API pretty, and you should be all nice and dandy.
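Something along these lines is what I have in mind for the conversions. Treat it as a sketch: the Field alias, the conversion names and the data helper are all hypothetical, and only two of the InnerData cases are restated to keep it short.

```scala
import scala.language.implicitConversions

sealed trait InnerData[+A] { def value: A }
final case class InnerString(value: String) extends InnerData[String]
final case class InnerBoolean(value: Boolean) extends InnerData[Boolean]

object Conversions {
  type Field = Either[Int, InnerData[_]]

  // Each conversion wraps a plain Scala value in the matching branch.
  implicit def intToField(n: Int): Field = Left(n)
  implicit def stringToField(s: String): Field = Right(InnerString(s))
  implicit def booleanToField(b: Boolean): Field = Right(InnerBoolean(b))

  // Hypothetical helper: the expected element type (String, Field) is what
  // lets the compiler apply the conversions to the right-hand sides.
  def data(entries: (String, Field)*): Map[String, Field] = entries.toMap
}

object ConversionsDemo {
  import Conversions._

  // Plain literals on the right; the wrappers are added implicitly.
  val jsData: Map[String, Field] =
    data("spam" -> 1, "name" -> "eggs", "fresh" -> true)
}
```

The conversions only apply where they are imported, so the wrapping stays out of the way elsewhere in the code.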

沩ん囻菔务 2024-07-23 19:49:30

JSON is used as an example in "Programming in Scala", in the chapter on combinator parsing.
