The relationship between programming languages and scalability

Posted on 2024-08-12 17:04:35

I came across the following statement in Trapexit, an Erlang community website:

"Erlang is a programming language used to build massively scalable soft real-time systems with requirements on high availability."

Also, I recall reading somewhere that Twitter switched from Ruby to Scala to address scalability problems.

Hence, I wonder what is the relation between a programming language and scalability?

I would think that scalability depends only on the system design, exception handling etc. Is it because of the way a language is implemented, the libraries, or some other reasons?

Hope for enlightenment. Thanks.

Comments (8)

静若繁花 2024-08-19 17:04:35

Erlang is highly optimized for a telecommunications environment, running at around five nines (99.999%) of uptime.

It contains a set of libraries called OTP, and it is possible to reload code into the application 'on the fly' without shutting down the application! In addition, there is a framework of supervisor modules and so on, so that when something fails, it gets automatically restarted, or else the failure can gradually work itself up the chain until it gets to a supervisor module that can deal with it.

That would be possible in other languages too, of course. In C++ you can reload DLLs on the fly and load plugins. In Python you can reload modules. In C# you can load code on the fly, use reflection, and so on.

It's just that that functionality is built in to Erlang, which means that:

  • it's more standard: any Erlang developer knows how it works
  • less stuff to re-implement oneself

That said, there are some fundamental differences between languages, to the extent that some are interpreted, some run off bytecode, some are native compiled, so the performance, and the availability of type information and so on at runtime differs.

Python (in its standard CPython implementation) has a global interpreter lock around its runtime, so threads within a single process cannot make use of SMP.

Erlang only recently had changes added to take advantage of SMP.

Generally I would agree with you in that I feel that a significant difference is down to the built-in libraries rather than a fundamental difference between the languages themselves.

Ultimately, I feel that any project that gets very large risks getting 'bogged down', no matter what language it is written in. As you say, architecture and design are pretty fundamental to scalability, and choosing one language over another will not magically give you awesome scalability.

心在旅行 2024-08-19 17:04:35

Erlang comes from another culture in thinking about reliability and how to achieve it. Understanding the culture is important, since Erlang code does not become fault-tolerant by magic just because it's Erlang.

A fundamental idea is that high uptime does not only come from a very long mean-time-between-failures, it also comes from a very short mean-time-to-recovery, if a failure happened.
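To make the point about recovery time concrete, here is a minimal sketch in Python (not Erlang) using the standard steady-state formula availability = MTBF / (MTBF + MTTR). The function name and the figures are assumptions chosen only for illustration.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: the fraction of time the system is up."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Same failure rate (one failure every 1000 hours), different recovery times.
# Both figures are hypothetical.
slow_recovery = availability(1000.0, 1.0)          # one hour to recover
fast_recovery = availability(1000.0, 1.0 / 3600)   # one second to recover

print(f"{slow_recovery:.6f}")
print(f"{fast_recovery:.9f}")
```

With the failure rate held fixed, shortening recovery from an hour to a second is what moves the result from about three nines to better than five nines.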

One then realizes that automatic restarts are needed when a failure is detected, and that at the first sign of something not being quite right, one should "crash" to cause a restart. The recovery needs to be optimized, and the possible information loss needs to be minimal.

This strategy is followed by many successful pieces of software, such as journaling filesystems and transaction-logging databases. But overwhelmingly, software tends to consider only the mean time between failures: it sends messages about error indications to the system log and then tries to keep running until that is no longer possible. This typically requires a human to monitor the system and reboot it manually.

Most of these strategies take the form of libraries in Erlang. The part that is a language feature is that processes can "link" to and "monitor" each other. The first is a bidirectional contract: "if you crash, I get your crash message, which will crash me too unless I trap it". The second is one-directional: "if you crash, I get a message about it".

Linking and monitoring are the mechanisms the libraries use to make sure that other processes have not crashed (yet). Processes are organized into "supervision" trees. If a worker process in the tree fails, its supervisor will attempt to restart it, or all workers at the same level of that branch of the tree. If that fails, the failure escalates up the tree, and so on. If the top-level supervisor gives up, the application crashes and the virtual machine quits, at which point the system operator should make the computer restart.
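The restart-then-escalate behaviour described above can be sketched roughly as follows. This is Python, not Erlang, and it is not the OTP supervisor API: `supervise`, `RestartLimitExceeded` and `flaky_worker` are hypothetical names, and a plain function call stands in for an isolated process.

```python
from typing import Callable


class RestartLimitExceeded(Exception):
    """Raised when the supervisor gives up and escalates to its parent."""


def supervise(worker: Callable[[], str], max_restarts: int) -> str:
    """Run worker; restart it after any crash, up to max_restarts times."""
    restarts = 0
    while True:
        try:
            return worker()
        except Exception as crash:
            if restarts >= max_restarts:
                # Give up: escalate the failure to whoever supervises us.
                raise RestartLimitExceeded(
                    f"gave up after {restarts} restarts"
                ) from crash
            restarts += 1


# Hypothetical flaky worker: crashes twice, then succeeds.
attempts = {"count": 0}


def flaky_worker() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        # Fail fast instead of trying to limp along in a bad state.
        raise RuntimeError("transient failure")
    return "done"


result = supervise(flaky_worker, max_restarts=5)
print(result, attempts["count"])
```

The worker contains no recovery logic at all: it simply crashes, and the decision about restarting or escalating lives in one place above it.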

The complete isolation between process heaps is another reason Erlang fares well. With few exceptions, it is not possible to "share values" between processes. This means that all processes are very self-contained and are often not affected by another process crashing. This property also holds between nodes in an Erlang cluster, so it is low-risk to handle a node failing out of the cluster. Replicate and send out change events rather than have a single point of failure.

The philosophies adopted by Erlang have many names: "fail fast", "crash-only system", "recovery oriented programming", "expose errors", "micro-restarts", "replication", ...

蹲在坟头点根烟 2024-08-19 17:04:35

Erlang is a language designed with concurrency in mind. While most languages depend on the OS for multi-threading, concurrency is built into Erlang. Erlang programs can be made from thousands to millions of extremely lightweight processes that can run on a single processor, can run on a multicore processor, or can run on a network of processors. Erlang also has language level support for message passing between processes, fault-tolerance etc. The core of Erlang is a functional language and functional programming is the best paradigm for building concurrent systems.

In short, making a distributed, reliable and scalable system in Erlang is easy, as it is a language designed specifically for that purpose.
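For readers who have not seen the message-passing style, here is a rough Python analogy. It is only a sketch: Python threads are OS threads and far heavier than Erlang processes, nothing here enforces isolation, and `counter_process` and the `"stop"` message are made up for the example.

```python
import queue
import threading


def counter_process(mailbox: queue.Queue, replies: queue.Queue) -> None:
    """Keeps its state private; the only way to reach it is by message."""
    total = 0
    while True:
        message = mailbox.get()
        if message == "stop":
            replies.put(total)
            return
        total += message


mailbox: queue.Queue = queue.Queue()
replies: queue.Queue = queue.Queue()
worker = threading.Thread(target=counter_process, args=(mailbox, replies))
worker.start()

for n in (1, 2, 3):
    mailbox.put(n)       # asynchronous send: the sender does not wait
mailbox.put("stop")

worker.join()
final_total = replies.get()
print(final_total)
```

In Erlang the mailbox, the send operation and the isolation come with the language and runtime, which is the difference the answer above is pointing at.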

趁微风不噪 2024-08-19 17:04:35

In short, the "language" primarily affects the vertical axis of scaling, but not all aspects, as you already alluded to in your question. Two things here:

1) Scalability needs to be defined in relation to a tangible metric. I propose money.

S = # of users / cost

Without an adequate definition, we will be discussing this point ad vitam aeternam. Using my proposed definition, it becomes easier to compare system implementations. For a system to be scalable (read: profitable), then:

Scalability grows with S

2) A system can be made to scale along two primary axes:

  • a) Vertical
  • b) Horizontal

a) Vertical scaling relates to enhancing nodes in isolation i.e. bigger server, more RAM etc.

b) Horizontal scaling relates to enhancing a system by adding nodes. This process is more involved, since it requires dealing with real-world properties such as the speed of light (latency), tolerance to partitions, failures of many kinds, etc.

(Node => physical separation, different "fate sharing" from another)
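As a small illustration of the metric S from point 1, the sketch below compares a system before and after growth. The user counts and costs are invented for the example and do not describe any real system.

```python
def scalability(users: int, cost: float) -> float:
    """S = number of users / cost, the metric proposed in point 1."""
    return users / cost


# Hypothetical figures, for illustration only.
baseline = scalability(users=10_000, cost=1_000.0)
after_growth = scalability(users=100_000, cost=8_000.0)

# By the definition above, the system scales if S does not shrink as it grows.
scales_well = after_growth >= baseline
print(baseline, after_growth, scales_well)
```

Whether the extra users were served by bigger nodes (vertical) or more nodes (horizontal) only shows up in the cost term, which is what makes the two approaches comparable.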

The term scalability is too often abused unfortunately.


Too many times folks confuse a language with its libraries and implementation. These are all different things. What makes a language a good fit for a particular system often has more to do with the support around said language: libraries, development tools, and efficiency of the implementation (i.e. memory footprint, performance of built-in functions, etc.).

In the case of Erlang, it just happens to have been designed with real world constraints (e.g. distributed environment, failures, need for availability to meet liquidated damages exposure etc.) as input requirements.

Anyways, I could go on for too long here.

指尖上得阳光 2024-08-19 17:04:35

First, you have to distinguish between languages and their implementations. For instance, the Ruby language supports threads, but in the official implementation threads do not make use of multicore chips.

Then, a language/implementation/algorithm is often termed scalable when it supports parallel computation (for instance via multithreading) AND exhibits a good increase in speedup as the number of CPUs goes up (see Amdahl's law).
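Amdahl's law can be written down in a few lines. The sketch below uses the usual form, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of CPUs; the function name and the 95% figure are assumptions for the example.

```python
def amdahl_speedup(parallel_fraction: float, cpus: int) -> float:
    """Upper bound on speedup when only part of the work parallelizes."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cpus)


# Hypothetical workload in which 95% of the work can run in parallel.
for cpus in (1, 2, 8, 64, 1024):
    print(cpus, round(amdahl_speedup(0.95, cpus), 2))
```

Even with 95% of the work parallelized, the speedup can never exceed 20x however many CPUs are added, because the remaining serial 5% dominates.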

Some languages, like Erlang, Scala, Oz, etc., also have syntax (or nice libraries) that helps in writing clear, clean parallel code.

此生挚爱伱 2024-08-19 17:04:35

In addition to the points made here about Erlang (which I was not aware of), there is a sense in which some languages are more suited to scripting and smaller tasks.

Languages like Ruby and Python have some features that are great for prototyping and creativity but terrible for large-scale projects. Arguably their best feature is their lack of "formality", which hurts you in large projects.

For example, static typing is a hassle on small script-type things, and makes languages like Java very verbose. But on a project with hundreds or thousands of classes, you can easily see variable types. Compare this to maps and arrays that can hold heterogeneous collections, where as a consumer of a class you can't easily tell what kind of data it's holding. This kind of thing gets compounded as systems get larger. You can also do things that are really difficult to trace, like dynamically adding bits to classes at runtime (which can be fun but is a nightmare if you're trying to figure out where a piece of data comes from) or calling methods that raise exceptions without being forced by the compiler to declare the exception. Not that you couldn't solve these kinds of things with good design and disciplined programming; it's just harder to do.
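The difference between an opaque map and a declared type can be seen even within Python, using optional type annotations. This is only a sketch: `User` and both function names are hypothetical, and Python does not enforce annotations at runtime, so the benefit comes from readers and from external checkers such as mypy.

```python
from dataclasses import dataclass


# Untyped: nothing in the signature says what "user" must contain.
def display_name_untyped(user):
    return user["first"] + " " + user["last"]


@dataclass
class User:
    first: str
    last: str


# Typed: the signature documents the shape of the data, and a static
# checker can flag a wrong call before the program runs.
def display_name(user: User) -> str:
    return f"{user.first} {user.last}"


print(display_name(User(first="Ada", last="Lovelace")))
```

In a script of fifty lines the first version is perfectly fine. Across hundreds of classes, the second version spares each caller from reading the function body to learn which keys are expected.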

As an extreme case, you could (performance issues aside) build a large system out of shell scripts, and you could probably deal with some of the issues of messiness, lack of typing and global variables by being very strict and careful with coding and naming conventions (in which case you'd sort of be creating a static typing system "by convention"), but it wouldn't be a fun exercise.

莫言歌 2024-08-19 17:04:35

Twitter switched some parts of their architecture from Ruby to Scala because when they started they used the wrong tool for the job. They were using Ruby on Rails—which is highly optimised for building green field CRUD Web applications—to try to build a messaging system. AFAIK, they're still using Rails for the CRUD parts of Twitter e.g. creating a new user account, but have moved the messaging components to more suitable technologies.

内心荒芜 2024-08-19 17:04:35

Erlang is at its core based on asynchronous communication (both for co-located and distributed interactions), and that is the key to the scalability made possible by the platform. You can program with asynchronous communication on many platforms, but Erlang the language and the Erlang/OTP framework provide the structure to make it manageable, both technically and in your head. For instance, without the isolation provided by Erlang processes, you will shoot yourself in the foot. With the link/monitor mechanism you can react to failures sooner.
