“分布式计算”是如何实现的?适用于网络开发或一般编程吗?

发布于 2024-10-08 00:14:42 字数 337 浏览 0 评论 0原文

我即将使用 Apache Hadoop,标题如下:

Apache Hadoop 项目开发 开源软件可靠, 可扩展的分布式计算。

我可以将“可扩展性”与编程联系起来,但我只是不知道这种“分布式”如何帮助我的开发。根据维基百科:

分布式系统包括 多台自治计算机 通过计算机进行通信 网络。计算机与计算机交互 彼此为了达到一个目的 共同目标

那么这是否意味着我可以在多台计算机上部署我的网络应用程序并进行某种“密集计算”?我想到的术语是内容交付网络和云计算。

I am about to use Apache Hadoop, the headlines read:

The Apache Hadoop project develops
open-source software for reliable,
scalable, distributed computing.

I can relate "scalability" to programming, but I just don't know how this "distributing" can help me in my development. According to wikipedia:

A distributed system consists of
multiple autonomous computers that
communicate through a computer
network. The computers interact with
each other in order to achieve a
common goal

So does this mean I can deploy my web apps across multiple computers and do some sort of "intense computing"? The terms that come into my mind are Content Delivery Networks and Cloud Computing.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

爱给你人给你 2024-10-15 00:14:42

Web 开发一直与分布式计算有关,因为客户端与其对话的服务器位于不同的计算机上,网页可以从许多服务器中提取资源来构建页面的内容,并且服务器可以与其他计算机对话以实现其目标。 CDN 使这一点比以前更加明显,但实际上它们只是一种演变,在您所要求的内容和用于提供它的硬件之间引入了虚拟化/间接层。

云是采用虚拟化的概念并将其应用于远程托管,包括低级操作系统和高级软件平台。它们真正有趣的是,这使得客户能够实现不同的业务模型(并且也具有不同的风险,但这主要与它是分布式计算这一事实无关,而是与它并不完全在您自己的控制之下)管辖权)。

我发现分布式计算最有效的使用是当你考虑将不同的服务连接在一起时,每个服务都具有不同的功能(这可能是出于技术原因,也可能不是;有时,这是出于商业或法律原因事物必须被划分),并且其中每项服务都可以由多个位置的许多组件提供。在通用能力图的整体背景下平衡性能需求(将组件聚集在一起的力量)和稳健性需求(这往往导致分布和复制)存在并且仍然存在问题。

天哪!那一段听起来就像是可怕的废话!我想说的是,这都是权衡的结果,你应该做好第一次就做不好的准备。

(Hadoop 是一种用于进行分布式文件存储以及在整个数据集上高效应用某些操作类别的机制,这些操作非常适合 MapReduce 或其他类似的分散收集算法。如果那只鞋适合,就使用它。但它并不能解决所有问题,谢天谢地!可以做所有事情的东西往往看起来非常像实际上根本不能做任何事情,并且有用性和可理解性来自于限制。)

Web development has always been about distributed computing, since clients have been on different machines to the servers they talk to, web pages can pull in resources from many servers to build a page's content, and servers may talk to other machines to achieve their goals. CDNs make this more obvious than before, but really they're just an evolution, an introduction of a virtualization/indirection layer between what you ask for and the hardware used to provide it.

Clouds are about taking the concepts of virtualization and applying them to remote hosting, both of low-level OSes and higher-level software platforms. The really interesting thing about them is that this enables different business models on the part of customers (and with different risks too, but that's mostly not related to the fact that it's distributed computing but rather that it is not wholly under your control in your own jurisdiction).

I've found that the most effective use of distributed computing is when you think in terms of connecting together distinct services, each of which with different capabilities (which might be for technical reasons, or might not; sometimes, it's for business or legal reasons that things have to be divided up) and where each of those services may be provided by many components in multiple locations. There are, and continue to remain, issues with balancing the need for performance (which is a force that brings components together) and the need for robustness (which tends to lead to distribution and replication) within the overall context of the general capabilities map.

My goodness! That paragraph sounds like terrible piffle! What I'm trying to say is that it's all trade-offs, and you should be prepared for not getting it right first time.

(Hadoop is a mechanism for doing a distributed file store, and for efficiently applying certain classes of operation – those that fit well with MapReduce or other similar scatter-gather algorithms – across that whole dataset. If that shoe fits, use it. But it doesn't solve all problems, and thank goodness for that! Things that can do everything tend to look very much like things that can't actually do anything at all, and usefulness and comprehensibility come in the restrictions.)

挽你眉间 2024-10-15 00:14:42

Hadoop 通常用于通过将数据集的处理分布在多台机器上来处理海量数据集。

这意味着您可能不想使用它来“部署应用程序”。但是,您可以使用它来处理应用程序的统计信息。例如,您可能有非常大的用户数据日志。如果您的用户数据变得太大而无法容纳在单个硬盘驱动器上,并且/或者一台机器需要太长时间来处理统计数据(使用 SQL 查询等标准方法),就会发生这种情况。

Hadoop is typically used to process massive data sets by distributing the processing of that data set across multiple machines.

What this means is you probably don't want to use it to "deploy an application". You might use it to process stats on your application, however. For instance, you might have very large logs of user data. This would happen if your user data grows to become too large to fit on a single hard drive, and/or would take too long for one machine to process stats on (using standard methods like an SQL query).

心清如水 2024-10-15 00:14:42

伊甘。虽然“客户端”和“服务器”的传统角色从 1960 年到 2005 年左右一直相当稳定。

我相信,分布式计算就是我们所有人口袋里都带着处理器。

手机负责计算工作。手机不需要集中式服务器,但它们确实从中受益。

手机、智能手机、平板电脑是分布式计算发展的一个例子。

现在你可以用 Android 设备制作一个 WiFi 基站。所以现在手机变成了某种服务器,就在咖啡店里的那一刻,你为旁边那个可爱的人打开了它,没有网络……现在我离题了……

Ygam. While the traditional role of "clients" and "servers" have been pretty stable from 1960 till about 2005.

I believe with every fiber of my being, that distributed computing is that we all carry processors around in our pockets.

Phones do computing work. Phones do NOT need centralized servers, but they DO benefit from them.

Phones , Smartphones, tablets are an example of where distributed computation is going.

You can make a wifi base-station out of an Android device now. So now a phone becomes a server of sorts for just that instant in the coffee shop that you turn it on for that cute person next to you without internet ....and now I digress.......

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文