Does Hive QL have the same expressive power as writing your own MapReduce jobs directly on Hadoop?
To put it another way:
Is there a problem that can be solved by directly defining your MapReduce jobs, but for which you cannot form a Hive QL query?
If yes, then Hive QL is limited in its expressive power and cannot express all possible MapReduce jobs.
Practically, that means Hive QL is not a complete replacement for defining your own MapReduce jobs.
Comments (1)
Hive QL does not express everything that can be written with MapReduce. There will always be cases where you know something about the data that Hive cannot infer.
I don't think it would be fair to avoid using Hive for this reason, if that's what you're considering. SQL has the same problem in relation to writing an algorithm over flat files. With a proper design, an algorithm written from scratch can always do at least as well as a query language, and often better.
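As a toy illustration of "knowing something about the data", here is a minimal sketch in plain Python (it uses no Hadoop or Hive APIs). It assumes, hypothetically, that the input records are already sorted by key. The function names and sample records are made up for the example. A generic grouping step has to remember every key it has seen, while hand-written code that trusts the ordering can aggregate in a single streaming pass.

```python
from collections import defaultdict
from itertools import groupby


def generic_group_sum(records):
    """Sum values per key without assuming anything about input order.

    Every distinct key must be kept in memory until the input ends.
    """
    totals = defaultdict(int)
    for key, value in records:
        totals[key] += value
    return sorted(totals.items())


def sorted_input_group_sum(records):
    """Sum values per key, ASSUMING records arrive sorted by key.

    Only one run of equal keys is held at a time; if the assumption
    is violated, the same key is emitted more than once.
    """
    for key, run in groupby(records, key=lambda kv: kv[0]):
        yield key, sum(value for _, value in run)


# Hypothetical sample data, already sorted by key.
records = [("a", 1), ("a", 2), ("b", 5), ("c", 3), ("c", 4)]

# Both approaches agree when the sortedness assumption holds.
assert list(sorted_input_group_sum(records)) == generic_group_sum(records)
```

The trade-off is the one the answer describes: the specialised version is cheaper but only correct because of a fact the author knows and a general query engine may not.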