当前位置：文江博客话题详情

有人可以向我解释为什么我需要函数式编程而不是面向对象编程吗？

发布于 2024-10-11 16:40:00 字数 320 浏览 10 评论 0原文

可能的重复：
函数式编程与面向对象编程

有人可以向我解释为什么我需要函数式编程吗？编程而不是OOP？

例如，为什么我需要使用 Haskell 而不是 C++（或类似的语言）？

函数式编程相对于 OOP 有哪些优势？

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

爱人如己 2024-10-18 16:40:00

我在函数式编程中最喜欢的一大事情是缺乏“幽灵般的远距离动作”。所见即所得——仅此而已。这使得代码更容易推理。

让我们用一个简单的例子。假设我在 Java (OOP) 或 Erlang（函数式）中遇到了代码片段 X = 10。在 Erlang 中，我可以很快知道这些事情：

变量 X 位于我所处的直接上下文中。就这样。它要么是传递给我正在读取的函数的参数，要么是第一次（也是唯一的—参见下文）被分配的参数。
从此时起，变量 X 的值为 10。在我正在阅读的代码块内它不会再次改变。它不能。

在 Java 中，情况更复杂：

变量 X 可能被定义为参数。
它可能在方法的其他地方定义。
它可能被定义为该方法所在类的一部分。
无论如何，由于我没有在这里声明它，所以我正在更改它的值。这意味着我不知道 X 的值是什么，除非不断向后扫描代码以找到它被显式或隐式分配或修改的最后位置（例如在 for 循环中）。
当我调用另一个方法时，如果 X 碰巧是一个类变量，它可能会从我下面发生变化，如果不检查该方法的代码，我就无法知道这一点。
在线程程序的上下文中，情况甚至更糟。 X 可以被我在直接环境中看不到的东西改变。另一个线程可能正在调用 #5 中修改 X 的方法。

而Java是一种相对简单的OOP语言。在 C++ 中可以使用 X 的方式数量甚至更多，而且可能更加晦涩难懂。

事情是这样的？这只是一个简单示例，说明 OOP（或其他命令式）语言中的常见操作比函数式语言中的常见操作要复杂得多。它还没有解决不涉及可变状态等的函数式编程的好处，例如高阶函数。

回复收藏 0 原文

各空 2024-10-18 16:40:00

我认为 Haskell 有以下三点非常酷：

1) 它是一种静态类型语言，具有极强的表现力，可以让您快速构建高度可维护和可重构的代码。 Java 和 C# 等静态类型语言与 Python 和 Ruby 等动态语言之间存在着激烈的争论。 Python 和 Ruby 可让您快速构建程序，所需行数仅为 Java 或 C# 等语言的一小部分。因此，如果您的目标是快速进入市场，Python 和 Ruby 是不错的选择。但是，由于它们是动态的，因此重构和维护代码很困难。在Java中，如果您想向方法添加参数，可以很容易地使用IDE找到该方法的所有实例并修复它们。如果您错过了一个，编译器会捕获它。对于 Python 和 Ruby，重构错误只会被捕获为运行时错误。因此，对于传统语言，您一方面需要在快速开发和糟糕的可维护性之间做出选择，另一方面又要在缓慢的开发和良好的可维护性之间做出选择。这两种选择都不是很好。

但使用 Haskell，您不必做出这种选择。 Haskell 是静态类型的，就像 Java 和 C# 一样。因此，您可以获得所有的可重构性、IDE 支持的潜力以及编译时检查。但同时，编译器可以推断类型。因此，它们不会像传统静态语言那样妨碍您。此外，该语言还提供许多其他功能，使您只需几行代码即可完成大量工作。因此，您可以获得 Python 和 Ruby 的开发速度以及静态语言的安全性。

2）并行性。由于函数没有副作用，因此编译器可以更轻松地并行运行事物，而不需要开发人员做太多工作。考虑以下伪代码：

a = f x
b = g y
c = h a b

在纯函数语言中，我们知道函数 f 和 g 没有副作用。因此，没有理由 f 必须在 g 之前运行。顺序可以交换，也可以同时运行。事实上，我们实际上根本不需要运行 f 和 g，直到函数 h 需要它们的值为止。在传统语言中情况并非如此，因为对 f 和 g 的调用可能会产生副作用，可能需要我们以特定顺序运行它们。

随着计算机上的内核越来越多，函数式编程变得越来越重要，因为它允许程序员轻松地利用可用的并行性。

3) Haskell 最后一个很酷的事情也可能是最微妙的：惰性求值。要理解这一点，请考虑编写一个程序来读取文本文件并打印出文件每一行中单词“the”出现的次数的问题。假设您正在使用传统的命令式语言进行编写。

尝试 1：编写一个函数来打开文件并一次读取一行。对于每一行，您计算“the”的数量，然后将其打印出来。这很好，除了你的主要逻辑（计算单词数）与你的输入和输出紧密耦合。假设您想在其他上下文中使用相同的逻辑？假设您想从套接字读取文本数据并计算字数？或者您想从用户界面读取文本？你必须重新重写你的逻辑！

最糟糕的是，如果您想为新代码编写自动化测试怎么办？您必须构建输入文件、运行代码、捕获输出，然后将输出与预期结果进行比较。这是可行的，但是很痛苦。一般来说，当 IO 与逻辑紧密耦合时，测试逻辑就变得非常困难。

尝试2：那么，让我们将IO和逻辑解耦。首先，将整个文件读入内存中的一个大字符串。然后，将字符串传递给一个函数，该函数将字符串分成几行，对每行上的“the”进行计数，并返回计数列表。最后，程序可以循环计数并输出它们。现在很容易测试核心逻辑，因为它不涉及 IO。现在可以轻松地将核心逻辑与来自文件、套接字或 UI 的数据结合使用。所以，这是一个很好的解决方案，对吗？

错误的。如果有人传入 100GB 的文件怎么办？你会耗尽你的记忆，因为整个文件必须加载到一个字符串中。

尝试 3：围绕读取文件和生成结果构建抽象。您可以将这些抽象视为两个接口。第一个有方法 nextLine() 和 did()。第二个有outputCount()。您的主程序实现 nextLine() 和 did() 来从文件中读取，而 outputCount() 只是直接打印出计数。这允许您的主程序在恒定内存中运行。您的测试程序可以使用此抽象的替代实现，其中 nextLine() 和 did() 从内存中提取测试数据，而 outputCount() 检查结果而不是输出结果。

第三次尝试在分离逻辑和 IO 方面效果很好，并且它允许您的程序在恒定内存中运行。但是，它比前两次尝试要复杂得多。

简而言之，传统的命令式语言（无论是静态还是动态）经常让开发人员在以下两者之间做出选择：

a）IO 和逻辑的紧密耦合（难以测试和重用）

b）将所有内容加载到内存中（效率不高）

c）构建抽象（复杂，并且会减慢实现速度）

这些选择在读取文件、查询数据库、读取套接字等时出现。程序员似乎更倾向于选项 A，而单元测试因此而受到影响。

那么，Haskell 如何帮助解决这个问题呢？在 Haskell 中，您可以像尝试 2 一样解决这个问题。主程序将整个文件加载到一个字符串中。然后它调用一个函数来检查字符串并返回计数列表。然后主程序打印计数。由于核心逻辑与 IO 隔离，因此测试和重用它非常容易。

但是内存使用情况又如何呢？ Haskell 的惰性求值会帮你解决这个问题。因此，即使您的代码看起来将整个文件内容加载到字符串变量中，但实际上并未加载整个内容。相反，仅在消耗字符串时才读取文件。这允许一次读取一个缓冲区，并且您的程序实际上将在常量内存中运行。也就是说，你可以在100GB的文件上运行这个程序，并且它会消耗很少的内存。

同样，您可以查询数据库，构建包含大量行的结果列表，并将其传递给函数进行处理。处理函数不知道这些行来自数据库。因此，它与其 IO 解耦。在幕后，行列表将被延迟且高效地获取。因此，尽管您查看代码时看起来像这样，但完整的行列表永远不会同时全部位于内存中。

最终结果是，您可以测试处理数据库行的函数，甚至根本不需要连接到数据库。

惰性评估确实很微妙，你需要一段时间才能了解它的威力。但是，它允许您编写易于测试和重用的漂亮简单的代码。

这是最终的 Haskell 解决方案和方法 3 Java 解决方案。两者都使用常量内存并将 IO 与处理分开，以便测试和重用很容易。

哈斯克尔：

module Main
    where

import System.Environment (getArgs)
import Data.Char (toLower)

main = do
  (fileName : _) <- getArgs
  fileContents <- readFile fileName
  mapM_ (putStrLn . show) $ getWordCounts fileContents

getWordCounts = (map countThe) . lines . map toLower
    where countThe = length . filter (== "the") . words

Java：

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.Reader;

class CountWords {
    public interface OutputHandler {
        void handle(int count) throws Exception;
    }

    static public void main(String[] args) throws Exception {
        BufferedReader reader = null;
        try {
            reader = new BufferedReader(new FileReader(new File(args[0])));

            OutputHandler handler = new OutputHandler() {
                public void handle(int count) throws Exception {
                    System.out.println(count);
                }
            };

            countThe(reader, handler);
        } finally {
            if (reader != null) reader.close();
        }
    }

    static public void countThe(BufferedReader reader, OutputHandler handler) throws Exception {
        String line;
        while ((line = reader.readLine()) != null) {
            int num = 0;
            for (String word: line.toLowerCase().split("([.,!?:;'\"-]|\\s)+")) {
                if (word.equals("the")) {
                    num += 1;
                }
            }
            handler.handle(num);
        }
    }
}

There are three things about Haskell that I think are really cool:

1) It's a statically-typed language that is extremely expressive and lets you build highly maintainable and refactorable code quickly. There's been a big debate between statically typed languages like Java and C# and dynamic languages like Python and Ruby. Python and Ruby let you quickly build programs, using only a fraction of the number of lines required in a language like Java or C#. So, if your goal is to get to market quickly, Python and Ruby are good choices. But, because they're dynamic, refactoring and maintaining your code is tough. In Java, if you want to add a parameter to a method, it's easy to use the IDE to find all instances of the method and fix them. And if you miss one, the compiler catches it. With Python and Ruby, refactoring mistakes will only be caught as run-time errors. So, with traditional languages, you get to choose between quick development and lousy maintainability on the one hand and slow development and good maintainability on the other hand. Neither choice is very good.

But with Haskell, you don't have to make this type of choice. Haskell is statically typed, just like Java and C#. So, you get all the refactorability, potential for IDE support, and compile-time checking. But at the same time, types can be inferred by the compiler. So, they don't get in your way like they do with traditional static languages. Plus, the language offers many other features that allow you to accomplish a lot with only a few lines of code. So, you get the speed of development of Python and Ruby along with the safety of static languages.

2) Parallelism. Because functions don't have side effects, it's much easier for the compiler to run things in parallel without much work from you as a developer. Consider the following pseudo-code:

a = f x
b = g y
c = h a b

In a pure functional language, we know that functions f and g have no side effects. So, there's no reason that f has to be run before g. The order could be swapped, or they could be run at the same time. In fact, we really don't have to run f and g at all until their values are needed in function h. This is not true in a traditional language since the calls to f and g could have side effects that could require us to run them in a particular order.

As computers get more and more cores on them, functional programming becomes more important because it allows the programmer to easily take advantage of the available parallelism.

3) The final really cool thing about Haskell is also possibly the most subtle: lazy evaluation. To understand this, consider the problem of writing a program that reads a text file and prints out the number of occurrences of the word "the" on each line of the file. Suppose you're writing in a traditional imperative language.

Attempt 1: You write a function that opens the file and reads it one line at a time. For each line, you calculate the number of "the's", and you print it out. That's great, except your main logic (counting the words) is tightly coupled with your input and output. Suppose you want to use that same logic in some other context? Suppose you want to read text data off a socket and count the words? Or you want to read the text from a UI? You'll have to rewrite your logic all over again!

Worst of all, what if you want to write an automated test for your new code? You'll have to build input files, run your code, capture the output, and then compare the output against your expected results. That's do-able, but it's painful. Generally, when you tightly couple IO with logic, it becomes really difficult to test the logic.

Attempt 2: So, let's decouple IO and logic. First, read the entire file into a big string in memory. Then, pass the string to a function that breaks the string into lines, counts the "the's" on each line, and returns a list of counts. Finally, the program can loop through the counts and output them. It's now easy to test the core logic since it involves no IO. It's now easy to use the core logic with data from a file or from a socket or from a UI. So, this is a great solution, right?

Wrong. What if someone passes in a 100GB file? You'll blow out your memory since the entire file must be loaded into a string.

Attempt 3: Build an abstraction around reading the file and producing results. You can think of these abstractions as two interfaces. The first has methods nextLine() and done(). The second has outputCount(). Your main program implements nextLine() and done() to read from the file, while outputCount() just directly prints out the count. This allows your main program to run in constant memory. Your test program can use an alternate implementation of this abstraction that has nextLine() and done() pull test data from memory, while outputCount() checks the results rather than outputting them.

This third attempt works well at separating the logic and the IO, and it allows your program to run in constant memory. But, it's significantly more complicated than the first two attempts.

In short, traditional imperative languages (whether static or dynamic) frequently leave developers making a choice between

a) Tight coupling of IO and logic (hard to test and reuse)

b) Load everything into memory (not very efficient)

c) Building abstractions (complicated, and it slows down implementation)

These choices come up when reading files, querying databases, reading sockets, etc. More often than not, programmers seem to favor option A, and unit tests suffer as a consequence.

So, how does Haskell help with this? In Haskell, you would solve this problem exactly like in Attempt 2. The main program loads the whole file into a string. Then it calls a function that examines the string and returns a list of counts. Then the main program prints the counts. It's super easy to test and reuse the core logic since it's isolated from the IO.

But what about memory usage? Haskell's lazy evaluation takes care of that for you. So, even though your code looks like it loaded the whole file contents into a string variable, the whole contents really aren't loaded. Instead, the file is only read as the string is consumed. This allows it to be read one buffer at a time, and your program will in fact run in constant memory. That is, you can run this program on a 100GB file, and it will consume very little memory.

Similarly, you can query a database, build a resulting list containing a huge set of rows, and pass it to a function to process. The processing function has no idea that the rows came from a database. So, it's decoupled from its IO. And under-the-covers, the list of rows will be fetched lazily and efficiently. So, even though it looks like it when you look at your code, the full list of rows is never all in memory at the same time.

End result, you can test your function that processes the database rows without even having to connect to a database at all.

Lazy evaluation is really subtle, and it takes a while to get your head around its power. But, it allows you to write nice simple code that is easy to test and reuse.

Here's the final Haskell solution and the Approach 3 Java solution. Both use constant memory and separate IO from processing so that testing and reuse are easy.

Haskell:

module Main
    where

import System.Environment (getArgs)
import Data.Char (toLower)

main = do
  (fileName : _) <- getArgs
  fileContents <- readFile fileName
  mapM_ (putStrLn . show) $ getWordCounts fileContents

getWordCounts = (map countThe) . lines . map toLower
    where countThe = length . filter (== "the") . words

Java:

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.Reader;

class CountWords {
    public interface OutputHandler {
        void handle(int count) throws Exception;
    }

    static public void main(String[] args) throws Exception {
        BufferedReader reader = null;
        try {
            reader = new BufferedReader(new FileReader(new File(args[0])));

            OutputHandler handler = new OutputHandler() {
                public void handle(int count) throws Exception {
                    System.out.println(count);
                }
            };

            countThe(reader, handler);
        } finally {
            if (reader != null) reader.close();
        }
    }

    static public void countThe(BufferedReader reader, OutputHandler handler) throws Exception {
        String line;
        while ((line = reader.readLine()) != null) {
            int num = 0;
            for (String word: line.toLowerCase().split("([.,!?:;'\"-]|\\s)+")) {
                if (word.equals("the")) {
                    num += 1;
                }
            }
            handler.handle(num);
        }
    }
}

回复收藏 0 原文