How to get started learning Hadoop

Posted on 2024-12-03 14:06:42

Comments (8)

始终不够 2024-12-10 14:06:42

You can access Hadoop from many different languages, and a number of services will set up Hadoop for you. You could try Amazon's Elastic MapReduce (EMR), for instance, without going through the hassle of configuring servers, workers, and so on. This is a good way to get your head around MapReduce processing while putting off, for a while, learning how to use HDFS well, how to manage your scheduler, and so on.

It's not hard to search for your favorite language and find Hadoop APIs for it, or at least some tutorials on linking it with Hadoop. For instance, here's a walkthrough of a PHP app run on Hadoop: http://www.lunchpauze.com/2007/10/writing-hadoop-mapreduce-program-in-php.html
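
To make the idea of using a non-Java language concrete, here is a minimal word-count sketch in the style of Hadoop Streaming, where the mapper and reducer exchange tab-separated text lines. The function names and the `run_locally` helper are hypothetical, invented for illustration; on a real cluster the framework performs the sort between the two stages, and the scripts would read from standard input rather than from an in-memory list.

```python
from itertools import groupby


def map_words(lines):
    """Mapper: emit one 'word<TAB>1' record per word."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_counts(sorted_records):
    """Reducer: sum the counts for each run of identical keys.

    The input must already be sorted by key, as it would be after the shuffle.
    """
    parsed = (rec.rstrip("\n").split("\t", 1) for rec in sorted_records)
    for word, group in groupby(parsed, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


def run_locally(lines):
    """Hypothetical local stand-in for the framework: map, sort, reduce."""
    return list(reduce_counts(sorted(map_words(lines))))


if __name__ == "__main__":
    # Sample input chosen for illustration only.
    sample = ["Hadoop is simple", "hadoop is written in Java"]
    for record in run_locally(sample):
        print(record)
```

Trying the logic locally like this, before submitting anything to a cluster or to EMR, keeps the MapReduce concepts separate from the operational details of HDFS and scheduling.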

等风来 2024-12-10 14:06:42

Answer 1:

  • It is very desirable to know Java. Hadoop is written in Java, and its popular SequenceFile format depends on Java.
  • Even if you use Hive or Pig, you'll probably need to write your own UDF someday. Some people still try to write them in other languages, but I would guess that Java has the most robust, first-class support for them.
  • Most Hadoop tools (Sqoop, HCatalog, and so on) are not mature enough yet, so you'll see many Java error stack traces, and you'll probably want to hack on the source code someday.

Answer 2:

  • It is not required for you to know Java.
  • As the others said, it would be very helpful, depending on how complex your processing is. However, there is an incredible amount you can do with just Pig and, say, Hive.
  • I agree that it is fairly likely you will eventually need to write a user-defined function (UDF). However, I've written those in Python, and it is very easy to write UDFs in Python.
  • Granted, if you have very stringent performance requirements, then a Java-based MapReduce program would be the way to go. However, great advancements in performance are being made all the time in both Pig and Hive.
  • So the short answer to your question is "No": you do not need to know Java in order to do Hadoop development.

Source:
http://www.linkedin.com/groups/Is-it-must-Hadoop-Developer-988957.S.141072851
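
As a rough illustration of how little Python such a function can need, here is a sketch of a row-transforming script of the kind Hive can call through its `TRANSFORM ... USING` clause, which streams rows to a script as tab-separated lines. This is an assumption about one possible approach, not necessarily what the answer's author used. The two-column layout (`user`, `url`) and the function name are hypothetical.

```python
from urllib.parse import urlparse


def transform_rows(lines):
    """Turn tab-separated (user, url) rows into (user, domain) rows.

    The two-column layout is a hypothetical example. Rows that do not
    have exactly two fields are skipped.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 2:
            continue
        user, url = fields
        domain = urlparse(url).netloc.lower()
        yield f"{user}\t{domain}"


if __name__ == "__main__":
    # In a real job the rows would arrive on standard input;
    # an in-memory sample is used here for illustration.
    sample = ["alice\thttp://Example.com/a\n", "bob\thttps://hadoop.apache.org/docs\n"]
    for row in transform_rows(sample):
        print(row)
```

Keeping the logic in a function that takes any iterable of lines makes it easy to test on a laptop before wiring it into a query.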

佞臣 2024-12-10 14:06:42

1) Learn Java. No way around that, sorry.

2) Profit! It'll be very easy after that -- Hadoop is pretty darn simple.

凑诗 2024-12-10 14:06:42

It sounds like you are on the right track. I recommend setting up some virtual machines on your home computer so you can take what you see in the books and implement it in your VMs. As with many things, the only way to get better at something is to practice it. Once you get into it, I am sure you will have enough knowledge to start a small project built on Hadoop. Here are some examples of things people have built with Hadoop: Powered by Hadoop

诗笺 2024-12-10 14:06:42

Go through the Yahoo Hadoop tutorial before going through Hadoop: The Definitive Guide. The Yahoo tutorial gives you a very clean and easy understanding of the architecture.
I think the concepts are not arranged well in the book, which makes it a little difficult to study from.
So do not study them together. Go through the web tutorial first.

风情万种。 2024-12-10 14:06:42

I just put together a paper on this topic. Great resources above, but I think you'll find some additional pointers here: http://images.globalknowledge.com/wwwimages/whitepaperpdf/WP_CL_Learning_Hadoop.pdf

尸血腥色 2024-12-10 14:06:42

Feel free to visit my blog about Big Data - https://oyermolenko.blog. I've been working with Hadoop for a couple of years, and in this blog I want to share my experience from the very start. I came from a .NET environment and faced a couple of challenges in switching from one language to another. My blog is aimed at people who haven't worked with Hadoop but, like you, have a general technical background. Step by step, I want to cover the whole family of Big Data services and describe the concepts and common problems I ran into while working with them. I hope you will enjoy it.
