How can I read data from IO into a data structure, then process the data structure?
First off, sorry for doing the typical "where do I begin" thing, but I'm totally lost.
I've been reading the "Learn You a Haskell for Great Good" site for what feels like an age now (pretty much half a semester). I'm just about to finish the "Input and Output" chapter, and I still have no clue how to write a multi-line program.
I've seen the do statement, and that you can only use it to combine IO actions into a single function, but I can't see how I'm going to go about writing a realistic application.
Can someone point me in the right direction?
I'm from a C background, and basically I'm using Haskell for one of my modules this semester at uni. I want to compare C++ against Haskell (in many aspects). I'm looking to create a series of searching and sorting programs so that I can comment on how easy they are to write in the respective languages versus how fast they run.
However, I'm really starting to lose my faith in using Haskell, as it's been six weeks and I still have no idea how to write a complete application, and the chapters in the site I'm reading seem to be getting longer and longer.
I basically need to create a basic object which will be stored in the structure (which I know how to do). What I'm struggling with is how to create a program which reads data in from some text file and populates the structure with that data in the first place, then goes on to process it. As Haskell seems to split IO from other operations, and it won't just let me write multiple lines in a program, I'm looking for something like this:
main = data <- getContent
let allLines = lines data
let myStructure = generateStruct allLines
sort/search/etc
print myStructure
How do I go about this? Are there any good tutorials which will help me get going with realistic programs?
-A
You mentioned seeing `do` notation; now it's time to learn how to use `do`. Since your example `main` is an `IO` action, you should be using do syntax or binds. Written that way, you have a `main` that gets `stdin`, turns it into `[String]` via `lines`, presumably parses it into a structure, and runs sorting and searches on that structure. Notice that the interesting code is all pure: `mySort`, `mySearch`, and `generateStruct` don't need to be IO (and can't be, being inside a let binding), so you are actually properly using pure and effectful code together.
I suggest you look at how bind (`>>=`) works and how do notation desugars into bind. This SO question should help.
See also Explaining Haskell IO without Monads by Neil Mitchell.
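The `main` described in this answer could be sketched roughly as follows. This is only an illustration under assumptions: the `Record` type, the one-integer-per-line input format, and the bodies of `generateStruct`, `mySort` and `mySearch` are made up here, since the question does not say what the structure holds.

```haskell
import Data.List (sort)

-- Hypothetical record type; the question does not say what is stored.
newtype Record = Record Int
  deriving (Show, Eq, Ord)

-- Pure: parse one integer per line (assumed format; `read` fails on bad input).
generateStruct :: [String] -> [Record]
generateStruct = map (Record . read)

-- Pure stand-ins for the sorting and searching code to be compared with C++.
mySort :: [Record] -> [Record]
mySort = sort

mySearch :: Record -> [Record] -> Bool
mySearch = elem

main :: IO ()
main = do
  input <- getContents                       -- IO: read all of stdin
  let allLines    = lines input              -- pure from here on
      myStructure = generateStruct allLines
      sorted      = mySort myStructure
  print sorted                               -- IO: write the results
  print (mySearch (Record 7) sorted)
```

Only the first and the last two lines of `main` are IO actions; everything bound with `let` is ordinary pure code that can be tested without touching `stdin`.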
I'll try to start with a simplified example. Let's say we want to get a list of Ints, sort it, reverse it, and print the result, and that we have these functions to use: `getContent`, `sort`, `reverse`, `show` and `putStrLn`.
Just so we are clear, I'll have a word about these functions:
- `getContent :: IO [Int]`: I made up this function, but if there were such a function, that would be its signature (you can use `getContent = return [3,7,2,1]` for testing purposes). I'm sure you've seen such a signature before and at least vaguely understand that, since it does IO, its signature cannot be just `getContent :: [Int]`.
- `sort`: a function defined in the Data.List module. Usage is simple: `sort [3,1,2]` returns `[1,2,3]`.
- `reverse`: also defined in the Data.List module: `reverse [1,3,2]` returns `[2,3,1]`.
- `show`: no need to import anything, just use it: `show 11` returns the string `"11"`, `show [1,2,3]` returns the string `"[1,2,3]"`, etc.
- `putStrLn :: String -> IO ()`: prints a string followed by a newline.
OK, now we have all we need to create our program; the problem now is connecting these functions together. Let's start by connecting `getContent :: IO [Int]` with `sort :: [Int] -> [Int]`.
I think if you get this part, you'll easily get the rest as well. So, the problem is that since `getContent` returns `IO [Int]` and not just `[Int]`, you can't just ignore or get rid of the `IO` part and shove the result into `sort`. That is, this is what you cannot do to connect these functions: `sort (getRidOfIO getContent)`
Here is where the `(>>=) :: m a -> (a -> m b) -> m b` operation comes to the rescue. Now notice that `m`, `a` and `b` are type variables, so if we substitute `IO` for `m`, `[Int]` for `a` and `[Int]` for `b`, we get the signature: `(>>=) :: IO [Int] -> ([Int] -> IO [Int]) -> IO [Int]`
Have a look again at `getContent` and `sort` and their signatures, and try to think about how they'll fit into `>>=`. I'm sure you'll notice that you can use `getContent` directly as the first argument to `>>=`. So far, what `>>=` will do is take the `[Int]` out of `getContent` and shove it into the function provided as the second argument. But what will be the function in the second argument? We can't use `sort :: [Int] -> [Int]` directly. The next best thing we can try is `\listOfInts -> sort listOfInts`, but that still has the signature `[Int] -> [Int]`, so it does not help much. Here is where the other hero comes into play: `return :: a -> m a`.
Again, `a` and `m` are type variables. Let's substitute them and we get `return :: [Int] -> IO [Int]`, so putting `\listOfInts -> sort listOfInts` and `return` together we get: `(\listOfInts -> return $ sort listOfInts) :: [Int] -> IO [Int]`
This is exactly what we want to put as the second argument to `>>=`. So let's finally connect `getContent` and `sort` using our glue: `getContent >>= (\listOfInts -> return $ sort listOfInts)`, which is the same thing as (using the `do` notation) `do { listOfInts <- getContent; return $ sort listOfInts }`.
There, that is the end of the most terrifying part. And now comes possibly one of the aha moments: try to think about what the result type of the connection we just made is. I'll spoil it for you: the type of `getContent >>= (\listOfInts -> return $ sort listOfInts)` is `IO [Int]` again.
Let's summarize: we took something of type `IO [Int]` and something of type `[Int] -> [Int]`, glued those two things together, and again got something of type `IO [Int]`!
Now go ahead and try exactly the same thing: take the `IO [Int]` object we have just created and glue it together (using `>>=` and `return`) with `reverse :: [Int] -> [Int]`.
I think I wrote way too much, but let me know if anything was not clear or if you need help with the rest.
What I've described so far can look something like this:
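A minimal sketch of that program, assuming the made-up `getContent` simply returns the fixed test list suggested earlier. The names `sortedThenReversed` and `sortedThenReversed'` are invented for illustration.

```haskell
import Data.List (sort)

-- Made-up input action from the answer; returns a fixed list for testing.
getContent :: IO [Int]
getContent = return [3, 7, 2, 1]

-- getContent glued to sort, then to reverse, each time with (>>=) and return.
sortedThenReversed :: IO [Int]
sortedThenReversed =
  getContent
    >>= (\listOfInts -> return $ sort listOfInts)
    >>= (\sortedInts -> return $ reverse sortedInts)

-- The same chain written with do notation.
sortedThenReversed' :: IO [Int]
sortedThenReversed' = do
  listOfInts <- getContent
  let sortedInts = sort listOfInts
  return (reverse sortedInts)

-- Glue once more: turn the list into a String and print it.
main :: IO ()
main = sortedThenReversed >>= (\result -> putStrLn (show result))
-- prints [7,3,2,1]
```

Both versions have type `IO [Int]`, which is why either can be fed into a further `>>=` in `main`.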
If it is a question of reading from `stdin` and writing a result to `stdout`, with no further intervening user input (as your mention of `getContents` suggests), then the ancient `interact :: (String -> String) -> IO ()`, or one of the several other versions, e.g. `Data.ByteString.interact :: (ByteString -> ByteString) -> IO ()` or `Data.Text.IO.interact :: (Text -> Text) -> IO ()`, is all that is needed. `interact` is basically the "make a little unix tool out of this function" function: it maps pure functions of the right type to executable actions (i.e. values of the type `IO ()`). All Haskell tutorials should mention it on the third or fourth page, with instructions on compilation.
So if you write a program whose `main` just hands a reversing function `arthur :: String -> String` to `interact`, and compile it with `ghc --make -O2 Reverse.hs -o reverse`, then whatever you pipe to `./reverse` will be understood as a list of characters and emerge reversed. Similarly, whatever you pipe to a program built the same way around a function that drops empty lines will emerge with the empty lines omitted. More interestingly, one built around `luther` will take a stream of characters separated by newlines, read them as `Int`s, remove the odd ones, and yield the suitably filtered stream, while one built around `emma` will print the sum of the squares of the newline-separated numerals.
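The little tools described above might look something like the following sketch. The names `arthur`, `luther` and `emma` come from the answer, but their bodies here are guesses based on the descriptions; `dropBlankLines`, `readInts`, `lutherTool` and `emmaTool` are names invented for illustration.

```haskell
-- Reverse.hs: each *Tool function is a String -> String suitable for interact.

-- First example: reverse the whole input, character by character.
arthur :: String -> String
arthur = reverse

-- Invented name: drop the empty lines from the input.
dropBlankLines :: String -> String
dropBlankLines = unlines . filter (not . null) . lines

-- Keep only the even numbers ("removing the odd ones").
luther :: [Int] -> [Int]
luther = filter even

-- Sum of the squares.
emma :: [Int] -> Int
emma = sum . map (^ 2)

-- Shared parsing step: one number per line (read fails on malformed input).
readInts :: String -> [Int]
readInts = map read . lines

lutherTool :: String -> String
lutherTool = unlines . map show . luther . readInts

emmaTool :: String -> String
emmaTool = (++ "\n") . show . emma . readInts

main :: IO ()
main = interact arthur  -- swap in dropBlankLines, lutherTool or emmaTool
```

After compiling as described, something like `printf 'abc' | ./reverse` should print `cba`. Note that all the IO lives in the single word `interact`; every other definition is a pure function.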
In these last two cases, `luther` and `emma`, the internal "data structure" is `[Int]`, which is pretty dull, and the function applied to it is idiot simple, of course. The main point is to let one of the forms of `interact` take care of all of the IO, and thus get images like "populating a structure" and "processing it" out of your head. To use `interact` you need to use composition to make the whole yield some sort of `String -> String` function. But even here, as in the runt first example `arthur :: String -> String`, you are defining a genuine function in something more like the mathematical sense. Values in the types `String` and `ByteString` are just as pure as those in `Bool` or `Int`.
In more complicated cases of this basic `interact` type, your task is thus, first, to think how the desired pure values of the function you will be focussing on can be mapped to `String` values (here, it's just `show` for an `Int` or `unlines . map show` for an `[Int]`); `interact` knows what to "do" with the string. And then to figure out how to define a pure mapping from `String` or `ByteString` (which will contain your "raw" data) to values in the type or types your principal function takes as arguments. Here I was just using `map read . lines`, resulting in an `[Int]`. If you are working on some more complicated structure, say a tree, you'd need a function from `[Int]` to `MyTree Int`. A more elaborate function to put in this position would be a parser, of course.
Then you can go to town. In this sort of case there is really no reason to think of yourself as "programming", "populating" and "processing" at all. This is where all the cool devices of LYAH kick in. Your duty is to define a mapping within the specific definitional discipline. In the last two cases, these are from `[Int]` to `[Int]` and from `[Int]` to `Int`, but here is a similar example derived from the excellent, still incomplete, tutorial on the super-excellent `Vector` package, where the initial numerical structure one is dealing with is `Vector Int`.
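A sketch of what such a `Vector`-based tool could look like. It assumes the `vector` package is installed; the body of `roman` is a placeholder (the text only says it maps a `Vector Int` to an `Int`), and `parseInts` is a name invented here.

```haskell
import qualified Data.Vector as V

-- Invented helper: one number per input line, collected into a Vector.
parseInts :: String -> V.Vector Int
parseInts = V.fromList . map read . lines

-- Placeholder body: any function from Vector Int to Int would do.
roman :: V.Vector Int -> Int
roman = V.sum

main :: IO ()
main = interact ((++ "\n") . show . roman . parseInts)
```

The shape is the same as before: parse the `String` into the structure, apply the pure function of interest, and render the result back to a `String` for `interact`.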
Here again `roman` is moronic; any function from a `Vector` of `Int`s to an `Int`, however complex, can take its place. Writing a better `roman` will never be a question of "populating", "multi-line programming", "processing", etc., though of course we speak this way; it is just a question of defining a genuine function by composition of the functions in Data.Vector and elsewhere. The sky is the limit; check out that tutorial too.