Real-time stream processing: why was Clojure chosen for Storm, when the similar JStorm performs no worse?


4 answers

For a question like this, the person to ask is of course the author himself [1]:

Why you choose Clojure as the development language of Storm? Could you talk about your long practical experience about using this language (like its advantages and disadvantages)? Which feature won't appear in the Storm, if you were not using Clojure?

Clojure is the best language I've ever used, by far. I use it because it makes me vastly more productive by allowing me to easily use techniques like immutability and functional programming. Its dynamic nature by being Lisp-based ensures that I can always mold Clojure as necessary to formulate the best possible abstractions. Storm would not be any different if I didn't use Clojure, it just would have been far more painful to build.


What do you do to improve your skills as a programmer?

I get better by doing a lot of programming and trying new things. One of the best ways to become a better programmer is to learn new programming languages. By learn I mean more than just learning the syntax of the language, I mean understanding the language's idioms and writing something substantial in it. For me, learning Clojure made me a much better programmer in all languages.

And from the mailing list [2]:

"Is Storm mostly written in Java?"

If you look at the languages graph on Github, it says that Storm is
"64% Java". However, this is inaccurate because those numbers include
the Java code generated by the Thrift compiler. If you exclude the
generated code, you'll find that Storm is over 50% Clojure in terms of
line count. In terms of functionality though, Storm is around 98%
Clojure. The Java code I wrote is mostly interfaces and small classes
that a user of Storm would encounter in the public API (Java is, ahem,
verbose).

"Why isn't Storm written completely in Clojure?"

I want Storm to be as accessible to as wide an audience as possible. A
user's language preference or constraints shouldn't prevent them from
being able to use Storm to solve their realtime computation problems.
This is why I chose to define Storm's main interfaces in Java, and
this is also why Storm supports using any language (including non-JVM
ones) on top of Storm. That said, Storm has a Clojure DSL for
programming topologies which is what I personally use for developing
topologies.

Clojure was a magnificent language to use to build Storm. Storm is a
complex, intricate system, and Clojure helped a great deal in managing
the complexity of the implementation.

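So:

  1. Nathan likes Clojure
  2. Nathan believes Clojure's abstraction mechanisms are very powerful and well suited to building the system he needed

This is not the first time Alibaba has taken an open-source project and reworked it; nginx got the same treatment. It would be strange if JStorm performed worse than the original. The reasons for developing it come down to a few: the original could not meet specific requirements, experienced Clojure developers are hard to hire, and so on.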


[1]

nathanmarz.com/blog/int

[2]

groups.google.com/forum

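I'm not familiar with Storm, but judging from the comparison between Spark and Hadoop MR, choosing a language that natively supports passing functions for a data-processing and parallel-computing platform can greatly simplify the code, and it also makes the platform easier for users to work with.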
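To make the function-passing point concrete, here is a minimal sketch in Python. It uses no real framework: `run_pipeline`, `run_job`, `Mapper`, and `SplitLowercaseMapper` are hypothetical names invented for illustration, and neither half is Spark's or Hadoop's actual API. It only contrasts handing behaviour over as plain functions with wrapping the same behaviour in a class.

```python
from abc import ABC, abstractmethod
from collections import Counter
from typing import Callable, Dict, Iterable


# --- Function-passing style (hypothetical helper, not a real framework API) ---
def run_pipeline(
    records: Iterable[str],
    flat_map_fn: Callable[[str], Iterable[str]],
    key_fn: Callable[[str], str],
) -> Dict[str, int]:
    """Count items per key; the caller supplies behaviour as plain functions."""
    counts: Counter = Counter()
    for record in records:
        for item in flat_map_fn(record):
            counts[key_fn(item)] += 1
    return dict(counts)


# --- Class-based style (hypothetical interface, for contrast only) ---
class Mapper(ABC):
    @abstractmethod
    def map(self, record: str) -> Iterable[str]:
        """Turn one input record into zero or more keys."""


class SplitLowercaseMapper(Mapper):
    def map(self, record: str) -> Iterable[str]:
        # Same logic as the two functions above, wrapped in a class.
        return [word.lower() for word in record.split()]


def run_job(records: Iterable[str], mapper: Mapper) -> Dict[str, int]:
    """Count keys produced by a Mapper object."""
    counts: Counter = Counter()
    for record in records:
        for key in mapper.map(record):
            counts[key] += 1
    return dict(counts)


lines = ["Storm storm", "Clojure"]

# Functions are passed directly; no extra type has to be declared.
functional_result = run_pipeline(lines, str.split, str.lower)

# The same job needs a dedicated class when only objects can be passed.
class_based_result = run_job(lines, SplitLowercaseMapper())
```

Both calls compute the same word counts; the difference is only in how much scaffolding the user has to write around the two small pieces of logic.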