How can I learn a formal, top-down approach to software architecture?

Posted 2024-08-12 13:29:56

I'm a software developer interested in information retrieval. I'm currently working on my third search engine project, and I am very frustrated by the amount of boilerplate code that gets written again and again, with the same bugs each time.

A basic search engine is a very simple beast that could be described in a formal language consisting of two "layers":

  1. "Layer of primitives" (or axioms, or a kernel language; I don't know what to call them). It consists of several sets (such as a set of resources: files, websites), relations on those sets (such as 'site A links to site B'), and simple operations such as 'open stream to resource A', 'read record from stream', 'merge N streams', 'index set of records by field F', etc. There is also a lot of data conversion, such as 'save stream in YAML format', 'load stream from XML format', etc.

  2. "Application layer" - several very high-level operations that form the search engine lifecycle, such as 'harvest new resources', 'crawl harvested resources', 'merge crawled resources into the database', 'index crawled resources', 'merge indexes', etc. Every one of these high-level operations could be expressed in terms of the "primitives" from 1.

Such a high-level representation could be easily tested, maybe even proved formally, and implemented (or code-generated) in the programming language of choice.
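To make the idea concrete, here is a minimal sketch of the two layers in Python. Everything in it is an assumption made for illustration: the `Record` type, the tab-separated input format, and the names `read_records`, `merge_streams`, `index_by` and `merge_and_index` are all hypothetical.

```python
import heapq
from collections import defaultdict
from typing import Iterable, Iterator

Record = dict  # hypothetical record type: a plain dict of fields


# --- Layer of primitives (all names are illustrative assumptions) ---

def read_records(lines: Iterable[str]) -> Iterator[Record]:
    """'Read record from stream': parse 'url<TAB>text' lines."""
    for line in lines:
        url, text = line.rstrip("\n").split("\t", 1)
        yield {"url": url, "text": text}


def merge_streams(*streams: Iterable[Record], key: str) -> Iterator[Record]:
    """'Merge N streams': each input must already be sorted by `key`."""
    return heapq.merge(*streams, key=lambda record: record[key])


def index_by(records: Iterable[Record], field: str) -> dict:
    """'Index set of records by field F'."""
    index = defaultdict(list)
    for record in records:
        index[record[field]].append(record)
    return dict(index)


# --- Application layer: composed only from the primitives above ---

def merge_and_index(sources: list, field: str) -> dict:
    """'Merge crawled resources' followed by 'index crawled resources'."""
    streams = [read_records(source) for source in sources]
    return index_by(merge_streams(*streams, key=field), field)
```

Because the application-level operation is only a composition of primitives, it can be tested with small in-memory inputs, for example `merge_and_index([lines_a, lines_b], "url")` with two lists of strings.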

So, the question: does anybody design systems this way, formally and rigorously (maybe even at the level of algebra/group theory), with a strict top-down approach? What can I read to learn about it?

Comments (6)

卸妝后依然美 2024-08-19 13:29:57

I recommend looking at IEEE-1471.

对不⑦ 2024-08-19 13:29:57

I would challenge your assumption that reusable code needs to be written in such a way.

I have seen workplaces where systems were designed with code reuse as a goal, and they ended up reusing very little while carrying extra complexity all around.

I find that sticking to the SOLID principles, doing TDD, and keeping DRY, YAGNI and KISS in mind goes a long way toward achieving a reasonable level of reuse.

The operations you mentioned are perfect examples of different responsibilities that shouldn't all end up in the same class:

'open stream to resource A', 'read record from stream', 'merge N streams', 'index set of records by field F', etc. Also, there is a lot of data conversion, as 'save stream in YAML format', 'load stream from XML format', etc.
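As a rough illustration of keeping those responsibilities apart, here is a small Python sketch. It is only an assumption about how one might split things: the names `RecordReader`, `RecordWriter`, `JsonLinesReader`, `JsonLinesWriter` and `convert` are hypothetical, and JSON lines stand in for YAML/XML so that the example needs only the standard library.

```python
import io
import json
from typing import Iterable, Iterator, Protocol, TextIO


class RecordReader(Protocol):
    """Responsibility: turn a text stream into records."""
    def read(self, stream: TextIO) -> Iterator[dict]: ...


class RecordWriter(Protocol):
    """Responsibility: serialize records to a text stream."""
    def write(self, records: Iterable[dict], stream: TextIO) -> None: ...


class JsonLinesReader:
    """Parses one JSON object per line; knows nothing about the source."""
    def read(self, stream: TextIO) -> Iterator[dict]:
        for line in stream:
            if line.strip():  # skip blank lines
                yield json.loads(line)


class JsonLinesWriter:
    """Writes one JSON object per line; knows nothing about the target."""
    def write(self, records: Iterable[dict], stream: TextIO) -> None:
        for record in records:
            stream.write(json.dumps(record, sort_keys=True) + "\n")


def convert(reader: RecordReader, writer: RecordWriter,
            src: TextIO, dst: TextIO) -> None:
    """Format conversion is just a reader plugged into a writer."""
    writer.write(reader.read(src), dst)


# Usage with in-memory streams (stand-ins for real files or sockets).
src = io.StringIO('{"url": "a.example"}\n\n{"url": "b.example"}\n')
dst = io.StringIO()
convert(JsonLinesReader(), JsonLinesWriter(), src, dst)
```

Each class has one reason to change, and a new format only requires one new reader or writer rather than a change to a shared class.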

I recommend this ebook on SOLID.

When trying to design it top down, be careful about repeatedly asking 'what if x', 'what if y', and so on, because it is all too easy to end up adding plenty of stuff that you don't need in the end, or that is not modeled in a reusable way (even if reuse was the reason you added it).

沙沙粒小 2024-08-19 13:29:56

Critical systems (nuclear power plants, airplanes, train control systems, ...) are developed with a top-down approach similar to the one you are looking for. But the upper levels are not programmatic at all. It's not about a kernel layer and an application layer; it is about a high-level design refined into components and sub-components, with precise specifications at each level.

The specifications can be formal (intended to be verified automatically once the specified component is available) or not (intended to be verified by tests, code reviews or whatever method is appropriate). To be frank, in 2009, they aren't formal most of the time, although the trend is clearly to move in that direction.

Since you mention formal approaches in your question's tags, you must be interested in the topic, but it's a niche at the moment. I especially don't see how these methods could be applied economically to search engine projects. Anyway, if you want to learn more about how these methods are applied in fields where they work, here are a few links:

Someone mentioned Z: Z is the specification language; the framework in which you refine specifications step by step until they become executable is called B. You might also be interested in Alloy. Lastly, there are formal specification languages for existing programming languages. The trend started with JML for Java, which inspired many others. I work in a group of people who defined such a specification language for C: ACSL.
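To give a flavour of what a function-level specification expresses, here is a rough Python sketch. It is not JML or ACSL syntax, and it checks the contract at run time with assertions rather than proving it statically; the function `merge_sorted` and the helper `is_sorted` are hypothetical names chosen for illustration.

```python
from collections import Counter


def is_sorted(xs: list) -> bool:
    """True if xs is in non-decreasing order."""
    return all(a <= b for a, b in zip(xs, xs[1:]))


def merge_sorted(left: list, right: list) -> list:
    # Precondition ("requires"): both inputs are sorted.
    assert is_sorted(left) and is_sorted(right), "inputs must be sorted"

    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])
    result.extend(right[j:])

    # Postcondition ("ensures"): the output is sorted and is a
    # permutation of the combined inputs.
    assert is_sorted(result)
    assert Counter(result) == Counter(left + right)
    return result
```

A formal specification language states the same two conditions as annotations on the function, and a verification tool then tries to establish them for all inputs instead of only the ones that happen to be executed.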

甩你一脸翔 2024-08-19 13:29:56

The short answer is, "Yes, to varying degrees."

Different organizations approach software development with varying degrees of rigor, but the concept of layered design, in which each layer deals with its responsibilities in terms of a very constrained, precisely-designed interface to the services provided by the next layer down, is well-established. I would point to the growing acceptance of test-driven development, dependency injection, and design to interfaces as evidence that these ideas are slowly becoming established as standard in software development.

However, software development is pursued at a wide variety of scales and for a wide variety of purposes. Just as the level of precision engineering increases in physical fabrication as scale and complexity increase (e.g. jet engine manufacturer vs. picture framer), some software developers deal with systems whose performance and scale of use are small enough that they can tolerate lack of precision, or even long-lived defects (e.g. a typical web developer vs. a developer working on avionics or embedded medical devices).

My observation is that precision and strict layering have often been regarded as costs to be borne only when the consequences of defects are sufficiently high. But I see that slowly changing for the better, at least in the development of mission-critical systems that work at Internet scale.

傲世九天 2024-08-19 13:29:56

For most of our projects, we use an architecture based on a standard 3-layer architecture:

  • UI: tested manually
  • Business: tested with mocking
  • Proxy / Data Access: tested with integration tests
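As a sketch of what "tested with mocking" can look like for the business layer, here is a small Python example using the standard library's `unittest.mock`. The `SearchService` class and the `find_documents` method on the data-access object are hypothetical names, not part of any particular framework.

```python
from unittest.mock import Mock


class SearchService:
    """Business layer: depends on a data-access object passed in."""

    def __init__(self, repository):
        self._repository = repository

    def search(self, query: str) -> list:
        terms = query.lower().split()
        if not terms:
            return []
        # Delegate storage lookups to the data-access layer, then keep
        # only the documents that match every term.
        hits = [set(self._repository.find_documents(term)) for term in terms]
        return sorted(set.intersection(*hits))


# Test the business layer against a mock instead of a real database.
repository = Mock()
repository.find_documents.side_effect = lambda term: {
    "search": ["doc1", "doc2"],
    "engine": ["doc2", "doc3"],
}.get(term, [])

service = SearchService(repository)
result = service.search("Search Engine")
```

The mock records every call, so the test can also check how the business layer used the data-access layer, for example with `repository.find_documents.assert_any_call("search")`.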

To learn more about architectural patterns see http://en.wikipedia.org/wiki/Architectural_pattern_(computer_science)

明月松间行 2024-08-19 13:29:56

Hmm. Don't know if this helps, but have you looked at Z notation? I heard about it at uni but have not used it (I didn't take that module).
