Is functional-to-relational mapping easier than object-to-relational?

Posted on 2024-07-07 17:55:41


若相惜即相离 2024-07-14 17:55:41

What's the purpose of an ORM?

The main purpose of using an ORM is to bridge between the networked model (object orientation, graphs, etc.) and the relational model. And the main difference between the two models is surprisingly simple. It's whether parents point to children (networked model) or children point to parents (relational model).

With this simplicity in mind, I believe there is no such thing as an "impedance mismatch" between the two models. The problems people usually run into are purely implementation specific, and should be solvable, if there were better data transfer protocols between clients and servers.

How can SQL address the problems we have with ORMs?

In particular, The Third Manifesto tries to address the shortcomings of the SQL language and relational algebra by allowing for nested collections, which have been implemented in a variety of databases, including:

  • Oracle (probably the most sophisticated implementation)
  • PostgreSQL (to some extent)
  • Informix
  • SQL Server, MySQL, etc. (through "emulation" via XML or JSON)

In my opinion, if all databases implemented the SQL standard MULTISET() operator (e.g. Oracle does), people would no longer use ORMs for mapping (perhaps still for object graph persistence), because they could materialise nested collections directly from within the databases, e.g. this query:

SELECT actor_id, first_name, last_name,
  MULTISET (
    SELECT film_id, title
    FROM film AS f
    JOIN film_actor AS fa USING (film_id)
    WHERE fa.actor_id = a.actor_id
  ) AS films
FROM actor AS a

would yield all the actors and their films as a nested collection, rather than a denormalised join result (where actors are repeated for each film).
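To make the difference concrete, here is a minimal sketch (plain Java 16+, no database) of the regrouping a client has to do by hand when it only gets the denormalised join result. The names `JoinRow`, `ActorWithFilms` and `nest` are made up for this illustration and are not part of any library.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class NestedActors {
    // One row of the denormalised join: the actor columns repeat for every film.
    record JoinRow(int actorId, String firstName, String lastName, int filmId, String title) {}

    record Film(int filmId, String title) {}

    // The nested shape that a MULTISET query would return directly.
    record ActorWithFilms(int actorId, String firstName, String lastName, List<Film> films) {}

    // Fold the flat rows back into one entry per actor, keeping first-seen order.
    static List<ActorWithFilms> nest(List<JoinRow> rows) {
        Map<Integer, ActorWithFilms> byActor = new LinkedHashMap<>();
        for (JoinRow r : rows) {
            byActor.computeIfAbsent(r.actorId(),
                    id -> new ActorWithFilms(id, r.firstName(), r.lastName(), new ArrayList<>()))
                .films().add(new Film(r.filmId(), r.title()));
        }
        return new ArrayList<>(byActor.values());
    }

    public static void main(String[] args) {
        List<JoinRow> rows = List.of(
            new JoinRow(1, "Ann", "Lee", 10, "Alpha"),
            new JoinRow(1, "Ann", "Lee", 11, "Beta"),
            new JoinRow(2, "Bob", "Ray", 10, "Alpha"));
        nest(rows).forEach(System.out::println);
    }
}
```

With nested collections produced by the database, this whole regrouping step (which is a large part of what people use an ORM for) would not be needed on the client.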

Functional paradigm at the client side

The question of whether a functional programming language at the client side is better suited for database interactions is really orthogonal. ORMs help with object graph persistence, so if your client side model is a graph, and you want it to be a graph, you will need an ORM, regardless of whether you're manipulating that graph using a functional programming language.

However, because object orientation is less idiomatic in functional programming languages, you are less likely to shoehorn every data item into an object. For someone writing SQL, projecting arbitrary tuples is very natural. SQL embraces structural typing. Each SQL query defines its own row type without the need to previously assign a name to it. That resonates very well with functional programmers, especially when type inference is sophisticated, in which case you won't ever think of mapping your SQL result to some previously defined object / class.

An example in Java using jOOQ from this blog post could be:

// Higher order, SQL query producing function.
// ctx is assumed to be a jOOQ DSLContext that is in scope (for example a static field).
public static ResultQuery<Record2<String, String>> actors(Function<Actor, Condition> p) {
    return ctx.select(ACTOR.FIRST_NAME, ACTOR.LAST_NAME)
              .from(ACTOR)
              .where(p.apply(ACTOR));
}

This approach makes SQL statements far more composable than they would be if the SQL language were abstracted away by some ORM, or if queries were assembled from plain strings, as SQL's "string based" nature invites. The above function can now be used like this:

// Get only actors whose first name starts with "A"
for (Record rec : actors(a -> a.FIRST_NAME.like("A%")))
    System.out.println(rec);

FRM abstraction over SQL

Some FRMs try to abstract over the SQL language, usually for these reasons:

  • They claim SQL is not composable enough (jOOQ disproves this; it's just very hard to get right).
  • They claim that API users are more used to "native" collection APIs, so, for example, JOIN is translated to flatMap() and WHERE is translated to filter().
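As a rough illustration of the second point, this is what a join plus a filter looks like when written against Java's own Stream API over in-memory lists. It is only a sketch of the collection-style shape, not the API of any particular FRM; the records and the method name are invented for the example.

```java
import java.util.List;
import java.util.stream.Collectors;

final class CollectionStyleQuery {
    record Actor(int actorId, String firstName) {}
    record FilmActor(int actorId, String title) {}

    // Roughly: SELECT a.first_name, fa.title
    //          FROM actor a JOIN film_actor fa ON fa.actor_id = a.actor_id
    //          WHERE a.first_name LIKE 'A%'
    static List<String> titlesOfActorsStartingWithA(List<Actor> actors, List<FilmActor> filmActors) {
        return actors.stream()
            .filter(a -> a.firstName().startsWith("A"))          // WHERE
            .flatMap(a -> filmActors.stream()                     // JOIN
                .filter(fa -> fa.actorId() == a.actorId())        // ON
                .map(fa -> a.firstName() + ": " + fa.title()))    // SELECT
            .collect(Collectors.toList());
    }
}
```

An FRM that follows this style would translate such a pipeline into SQL instead of running it over in-memory collections.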

To answer your question

FRM is not "easier" than ORM; it solves a different problem. In fact, FRM doesn't really solve any problem at all, because SQL, being a declarative programming language itself (which is not so different from functional programming), is a very good match for other functional client programming languages. So, if anything at all, an FRM simply bridges the gap between SQL, the external DSL, and your client language.

(I work for the company behind jOOQ, so this answer is biased)

雨夜星沙 2024-07-14 17:55:41

The hard problems of extending the relational database are extended transactions, data-type mismatches, automated query translation, and things like N+1 Select. These are fundamental problems of leaving the relational system and, in my opinion, they do not go away by changing the receiving programming paradigm.
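For readers who have not met the N+1 Select problem, here is a toy sketch that only counts simulated round trips. Nothing here talks to a real database; `findActorIds`, `findFilms` and `findActorsWithFilms` are hypothetical stand-ins for queries.

```java
import java.util.List;
import java.util.Map;

final class NPlusOneDemo {
    static int roundTrips = 0;

    static final Map<Integer, List<String>> FILMS_BY_ACTOR =
        Map.of(1, List.of("Alpha", "Beta"), 2, List.of("Gamma"));

    // Stand-in for: SELECT actor_id FROM actor
    static List<Integer> findActorIds() { roundTrips++; return List.of(1, 2); }

    // Stand-in for: SELECT title FROM film_actor ... WHERE actor_id = ?
    static List<String> findFilms(int actorId) { roundTrips++; return FILMS_BY_ACTOR.get(actorId); }

    // Stand-in for a single query that joins actors and films.
    static Map<Integer, List<String>> findActorsWithFilms() { roundTrips++; return FILMS_BY_ACTOR; }

    // One query for the parents, then one more query per parent: N + 1 round trips.
    static int naive() {
        roundTrips = 0;
        for (int id : findActorIds()) {
            findFilms(id);
        }
        return roundTrips;
    }

    // One round trip, however many actors there are.
    static int joined() {
        roundTrips = 0;
        findActorsWithFilms();
        return roundTrips;
    }
}
```

The loop has the same shape whether the caller is written in an object-oriented or a functional style, which is the point being made above.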

飘然心甜 2024-07-14 17:55:41

That depends on your needs

  1. If you want to focus on the data structures, use an ORM like JPA/Hibernate.
  2. If you want to put the emphasis on the processing, take a look at FRM libraries: QueryDSL or jOOQ.
  3. If you need to tune your SQL requests for specific databases, use JDBC and native SQL requests.

The strength of the various "Relational Mapping" technologies is portability: you ensure your application will run on most ACID databases.
Otherwise, you will have to cope with the differences between SQL dialects when you write the SQL requests by hand.

Of course, you can restrict yourself to the SQL92 standard (and then do some functional programming), or you can reuse some concepts of functional programming with ORM frameworks.

The ORM's strengths are built on a session object, which can act as a bottleneck:

  1. It manages the lifecycle of the objects for as long as the underlying database transaction is running.
  2. It maintains a one-to-one mapping between your Java objects and your database rows (and uses an internal cache to avoid duplicate objects).
  3. It automatically detects association updates and the orphan objects to delete.
  4. It handles concurrency issues with optimistic or pessimistic locking.

Nevertheless, its strengths are also its weaknesses:

  1. The session must be able to compare objects, so you need to implement the equals/hashCode methods.
    But object equality must be rooted in "business keys" and not in the database ID (new transient objects have no database ID!).
    However, some reified concepts have no business equality (an operation, for instance).
    A common workaround relies on GUIDs, which tend to upset database administrators.

  2. The session must track relationship changes, but its mapping rules push you towards collections that are unsuitable for the business algorithms.
    Sometimes you would like to use a HashMap, but the ORM requires the key to be another "Rich Domain Object" instead of a lightweight one...
    Then you have to implement object equality on the rich domain object acting as a key...
    But you can't, because this object has no counterpart in the business world.
    So you fall back to a simple list that you have to iterate over (and performance issues follow).

  3. The ORM API is sometimes unsuitable for real-world use.
    For instance, real-world web applications try to enforce session isolation by adding some "WHERE" clauses when you fetch data...
    Then "Session.get(id)" doesn't suffice and you need to turn to a more complex DSL (HQL, Criteria API) or go back to native SQL.

  4. The database objects conflict with other objects dedicated to other frameworks (like OXM frameworks = Object/XML Mapping).
    For instance, suppose your REST services use the Jackson library to serialize a business object,
    and this Jackson object maps exactly onto a Hibernate one.
    Then either you merge both, and a strong coupling appears between your API and your database,
    or you must implement a translation, and all the code the ORM saved you is lost there...
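Point 1 can be sketched in plain Java as follows. `Book` and its `isbn` business key are hypothetical, and the sketch ignores ORM specifics such as proxies; it only shows why equality based on a business key stays stable when the database ID is assigned later.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

final class BusinessKeyEquality {
    // Hypothetical entity: isbn is the business key, id is assigned by the database on insert.
    static final class Book {
        private Long id;             // null while the object is still transient
        private final String isbn;

        Book(String isbn) { this.isbn = Objects.requireNonNull(isbn); }

        void assignId(long newId) { this.id = newId; }

        Long id() { return id; }

        // Equality and hash code depend only on the business key, never on id.
        @Override public boolean equals(Object o) {
            return o instanceof Book && isbn.equals(((Book) o).isbn);
        }

        @Override public int hashCode() { return isbn.hashCode(); }
    }

    // A book added to a set before it is persisted is still found afterwards.
    static boolean stillFoundAfterPersist() {
        Set<Book> books = new HashSet<>();
        Book b = new Book("978-0");
        books.add(b);
        b.assignId(42L);             // simulates the insert assigning a database ID
        return books.contains(b) && books.contains(new Book("978-0"));
    }
}
```

Had equals/hashCode used `id`, the hash code would change at the moment of the insert and the set lookup would fail, which is why the answer insists on business keys.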

On the other hand, FRM is a trade-off between "Object Relational Mapping" (ORM) and native SQL queries (with JDBC).

The best way to explain the differences between FRM and ORM is to adopt a DDD approach.

  • Object Relational Mapping enables the use of "Rich Domain Objects", which are Java classes whose state is mutable during the database transaction.
  • Functional Relational Mapping relies on "Poor Domain Objects", which are immutable (so much so that you have to create a new copy each time you want to alter the content).
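A minimal sketch of such an immutable object, using a Java record (Java 16+). The `withLastName` helper is an assumption made for the illustration, not something a particular FRM library generates.

```java
final class ImmutablePerson {
    record Person(String firstName, String lastName) {
        // "Altering" a field means building a new value; the original stays untouched.
        Person withLastName(String newLastName) {
            return new Person(firstName, newLastName);
        }
    }

    public static void main(String[] args) {
        Person john = new Person("John", "Doe");
        Person renamed = john.withLastName("Smith");
        System.out.println(john);
        System.out.println(renamed);
    }
}
```

Because no instance ever changes, there is nothing for a session to track; the price is that writing the new value back to the database becomes an explicit step.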

FRM lifts the constraints imposed by the ORM session and relies most of the time on a DSL over SQL (so portability is not a concern).
On the other hand, you have to deal with the transaction details and the concurrency issues yourself.

List<Person> persons = queryFactory.selectFrom(person)
  .where(
    person.firstName.eq("John"),
    person.lastName.eq("Doe"))
  .fetch();

爱的故事 2024-07-14 17:55:41

I'd guess functional to relational mapping should be easier to create and use than OO to RDBMS. As long as you only query the database, that is. I don't really see (yet) how you could do database updates without side effects in a nice way.

The main problem I see is performance. Today's RDBMSs are not designed to be used with functional queries, and will probably behave poorly in quite a few cases.

GRAY°灰色天空 2024-07-14 17:55:41

I haven't done functional-relational mapping, per se, but I have used functional programming techniques to speed up access to an RDBMS.

It's quite common to start with a dataset, do some complex computation on it, and store the results, where the results are a subset of the original with additional values, for example. The imperative approach dictates that you store your initial dataset with extra NULL columns, do your computation, then update the records with the computed values.

Seems reasonable. But the problem with that is it can get very slow. If your computation requires another SQL statement besides the update query itself, or even needs to be done in application code, you literally have to (re-)search for the records that you are changing after the computation to store your results in the right rows.

You can get around this by simply creating a new table for results. This way, you can just always insert instead of update. You end up having another table, duplicating the keys, but you no longer need to waste space on columns storing NULL – you only store what you have. You then join your results in your final select.

I (ab)used an RDBMS this way and ended up writing SQL statements that looked mostly like this...

create table temp_foo_1 as select ...;
create table temp_foo_2 as select ...;
...
create table foo_results as
  select * from temp_foo_n inner join temp_foo_1 ... inner join temp_foo_2 ...;

What this is essentially doing is creating a bunch of immutable bindings. The nice thing, though, is you can work on entire sets at once. Kind of reminds you of languages that let you work with matrices, like Matlab.

I imagine this would also make parallelism much easier.

An extra perk is that types of columns for tables created this way don't have to be specified because they are inferred from the columns they're selected from.

嘿嘿嘿 2024-07-14 17:55:41

I'd think that, as Sam mentioned, if the DB has to be updated, the same concurrency issues have to be faced as in the OO world. The functional nature of the program could perhaps be even a little more problematic than the object nature, because of the state the RDBMS holds (data, transactions, etc.).

But for reading, a functional language could be more natural for some problem domains (as it seems to be regardless of the DB).

The functional<->RDBMS mapping should not differ much from OO<->RDBMS mappings. But I think that depends a lot on what kind of data types you want to use, and on whether you want to develop a program with a brand new DB schema or work against a legacy DB schema, etc.

Lazy fetching of associations, for example, could probably be implemented quite nicely with concepts related to lazy evaluation. (Even though it can be done quite nicely with OO as well.)
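One way to picture this is a memoising thunk: the association is represented by a value that runs its loader on first access and caches the result. The sketch below is plain Java; `Lazy` and `loadFilms` are hypothetical, and the loader only simulates a query.

```java
import java.util.List;
import java.util.function.Supplier;

final class LazyAssociation {
    // Wraps a loader so that it runs at most once, on first access (not thread-safe).
    static final class Lazy<T> implements Supplier<T> {
        private Supplier<T> loader;
        private T value;
        private boolean loaded;

        Lazy(Supplier<T> loader) { this.loader = loader; }

        @Override public T get() {
            if (!loaded) {
                value = loader.get();
                loaded = true;
                loader = null;       // the loader is no longer needed
            }
            return value;
        }
    }

    static int loadCount = 0;

    // Stand-in for a query such as: SELECT title FROM film_actor ... WHERE actor_id = ?
    static List<String> loadFilms(int actorId) {
        loadCount++;
        return List.of("Film for actor " + actorId);
    }
}
```

In a language with built-in lazy evaluation the wrapper class disappears, but the behaviour (no query until the association is actually read) is the same.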

Edit: With some googling I found HaskellDB (an SQL library for Haskell). Could that be worth trying?

撧情箌佬 2024-07-14 17:55:41

Databases and Functional Programming can be fused.

for example:

Clojure is a functional programming language based on relational database theory.

               Clojure -> DBMS, Super Foxpro
                   STM -> Transaction,MVCC
Persistent Collections -> db, table, col
              hash-map -> indexed data
                 Watch -> trigger, log
                  Spec -> constraint
              Core API -> SQL, Built-in function
              function -> Stored Procedure
             Meta Data -> System Table

Note: In the latest spec2, spec is more like RMDB.
see: spec-alpha2 wiki: Schema-and-select

I advocate building a relational data model on top of hash-map to combine the advantages of NoSQL and RMDB. This is actually a reverse implementation of PostgreSQL.

Duck typing: if it looks like a duck and quacks like a duck, it must be a duck.

If Clojure's data model is like an RMDB, Clojure's facilities are like an RMDB, and Clojure's data manipulation is like an RMDB, then Clojure must be an RMDB.

Clojure is a functional programming language based on relational database theory

Everything is RMDB

Implement relational data model and programming based on hash-map (NoSQL)

心碎的声音 2024-07-14 17:55:41

Being functional and being OO are two orthogonal concepts. The issue of mapping flat tables to trees of objects is orthogonal to Functional vs Imperative.

However, functional vs imperative does solve one particular mismatch, namely the mismatch between imperative updates and MVCC. In imperative programming, locking the table you are working with while you update the tables is the most intuitive approach, and anything non-sequential is extremely counterintuitive.

In FP, MVCC is much more natural than locks. The natural way to write is to compute the result set, compute the diff with read data, write (i.e. pick the updated dataset as the new one, sharing the data they have in common using persistent data structures), and do a rollback & retry if there is a write-write conflict. This matches exactly what MVCC does.
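The read, compute, write, retry cycle can be sketched in memory with an `AtomicReference` holding immutable snapshots. This is only an analogy for what an MVCC database does, and it copies the whole map on each write rather than sharing structure the way a real persistent data structure would; `OptimisticStore` and `increment` are names made up for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

final class OptimisticStore {
    // The current committed version; every version is an unmodifiable map.
    private final AtomicReference<Map<String, Integer>> current =
        new AtomicReference<>(Map.of());

    Map<String, Integer> snapshot() { return current.get(); }

    // Read a snapshot, compute the next version from it, and publish it
    // only if no other writer has committed in the meantime.
    void update(UnaryOperator<Map<String, Integer>> change) {
        while (true) {
            Map<String, Integer> seen = current.get();
            Map<String, Integer> next = Map.copyOf(change.apply(seen));
            if (current.compareAndSet(seen, next)) {
                return;              // commit succeeded
            }
            // Write-write conflict: drop `next` and retry against the newer snapshot.
        }
    }

    // A pure function from one version to the next.
    static Map<String, Integer> increment(Map<String, Integer> m, String key) {
        Map<String, Integer> copy = new HashMap<>(m);
        copy.merge(key, 1, Integer::sum);
        return copy;
    }
}
```

Readers holding an older snapshot are never blocked and never see a half-applied change, which is the property that makes this style feel natural in functional code.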
