Hadoop fully distributed mode

Posted 2024-11-14 04:04:58 · 194 words · 0 views · 0 comments

I am a newbie to Hadoop. I have managed to develop a simple Map/Reduce application that works fine in 'pseudo distributed mode'. I want to test it in 'fully distributed mode'. I have a few questions about that:

  1. How many machines (nodes) do I need (minimum and recommended) to process files of 1-10GB?
  2. What are the hardware requirements (mainly, I want to know the number of cores, memory, and disk space)?

Comments (1)

感情旳空白 2024-11-21 04:04:58

I'd check out Cloudera's hardware recommendations: http://www.cloudera.com/blog/2010/03/clouderas-support-team-shares-some-basic-hardware-recommendations/

A snippet from that page:

Various hardware configurations for different workloads, including our original “base” recommendation:

  • Light Processing Configuration (1U/machine): Two quad core CPUs, 8GB memory, and 4 disk drives (1TB or 2TB). Note that CPU-intensive work such as natural language processing involves loading large models into RAM before processing data and should be configured with 2GB RAM/core instead of 1GB RAM/core.
  • Balanced Compute Configuration (1U/machine): Two quad core CPUs, 16 to 24GB memory, and 4 disk drives (1TB or 2TB) directly attached using the motherboard controller. These are often available as twins with two motherboards and 8 drives in a single 2U cabinet.
  • Storage Heavy Configuration (2U/machine): Two quad core CPUs, 16 to 24GB memory, and 12 disk drives (1TB or 2TB). The power consumption for this type of machine starts around ~200W in idle state and can go as high as ~350W when active.
  • Compute Intensive Configuration (2U/machine): Two quad core CPUs, 48-72GB memory, and 8 disk drives (1TB or 2TB). These are often used when a combination of large in-memory models and heavy reference data caching is required.
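
To relate those configurations to the 1-10GB input in the question, here is a rough Python sketch. It computes the RAM per core implied by each quoted configuration (two quad core CPUs are taken as 8 cores) and the approximate number of HDFS blocks a file of a given size would occupy. The 128MB block size is an assumption (the default in Hadoop 2.x and later; Hadoop 1.x defaulted to 64MB), and the helper names `ram_per_core` and `hdfs_blocks` are made up for this illustration, not part of any Hadoop API.

```python
import math

# Configurations as quoted above: (name, cores, min RAM GB, max RAM GB, disks).
# "Two quad core CPUs" is taken as 2 * 4 = 8 cores.
CONFIGS = [
    ("Light Processing", 8, 8, 8, 4),
    ("Balanced Compute", 8, 16, 24, 4),
    ("Storage Heavy", 8, 16, 24, 12),
    ("Compute Intensive", 8, 48, 72, 8),
]


def ram_per_core(ram_gb, cores):
    """RAM available per core, in GB (hypothetical helper)."""
    return ram_gb / cores


def hdfs_blocks(file_size_gb, block_size_mb=128):
    """Approximate number of HDFS blocks for one file (hypothetical helper).

    Assumption: block_size_mb is the cluster's configured block size.
    For a splittable input, the number of map tasks is usually close
    to this number.
    """
    return math.ceil(file_size_gb * 1024 / block_size_mb)


for name, cores, ram_lo, ram_hi, disks in CONFIGS:
    lo = ram_per_core(ram_lo, cores)
    hi = ram_per_core(ram_hi, cores)
    print(f"{name}: {lo:.1f}-{hi:.1f} GB RAM/core, {disks} disks")

for size_gb in (1, 10):
    print(f"{size_gb}GB file: about {hdfs_blocks(size_gb)} blocks at 128MB")
```

Under these assumptions a 1GB file spans about 8 blocks and a 10GB file about 80, which is only a rough indicator of how much parallelism the input offers, not a node count recommendation.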