F# getting items from a sequence
I'm trying to learn F#.
What I would like to do is download a webpage, split it into a sequence, then find the index of an item and take the next 3 items after it.
Here's the code. Can someone show me what I'm doing wrong, please?
let find = "<head>"
let page = downloadUrl("http://www.stackoverflow.com")
let lines = seq ( page.Replace("\r", System.String.Empty).Split([|"\n"|], StringSplitOptions.RemoveEmptyEntries) )
let pos = lines |> Seq.findIndex(fun a -> a == find) // throws System.Collections.Generic.KeyNotFoundException
let result = // now to get the next 3 items
printfn "%A" (Seq.toList result);;
Comments (2)
You are doing some F# text processing. Here are some possible problems:

After downloading the HTML page, you don't do any preprocessing, such as removing the HTML tags.

page.Replace("\r", System.String.Empty).Split([|"\n"|], StringSplitOptions.RemoveEmptyEntries)
is problematic because I guess you want to split out the individual items/words. This line only splits the page into lines.

let pos = lines |> Seq.findIndex(fun a -> a == find)
Change == to =. In F#, = is the equality comparison operator and returns a bool.

let result = lines |> Seq.take pos
only takes the first pos items. You should skip those items first and then take the items that follow them.
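The skip-then-take step from the last point can be sketched as follows. This is only a minimal illustration under assumptions: the hard-coded sample lines stand in for the downloaded page (downloadUrl from the question is not shown, so it is not used here), and Seq.truncate is my choice rather than something the answer prescribes.

```fsharp
// Stand-in for the split page; the real input would come from the downloaded HTML.
let lines = seq [ "<html>"; "<head>"; "<title>t</title>"; "<meta />"; "</head>"; "<body>" ]

let find = "<head>"

// Use = for equality; this throws KeyNotFoundException if no line matches.
let pos = lines |> Seq.findIndex (fun a -> a = find)

// Skip the matched line and everything before it, then keep at most 3 lines.
// Seq.truncate is used instead of Seq.take so that a short tail does not throw.
let result =
    lines
    |> Seq.skip (pos + 1)
    |> Seq.truncate 3
    |> Seq.toList

printfn "%A" result
```

With this sample data pos is 1, so result holds the three lines after <head>. Seq.take 3 would give the same result here, but it throws InvalidOperationException when fewer than 3 lines remain.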
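This line skips everything before the found item; it does not take the 3 items after it.
EDIT: Seq.findIndex throws System.Collections.Generic.KeyNotFoundException if the item searched for doesn't exist. You want Seq.tryFindIndex, which returns an option instead of throwing.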
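A minimal sketch of the Seq.tryFindIndex approach, under assumptions: itemsAfter is a hypothetical helper name, the sample data stands in for the downloaded page, and returning an empty list when nothing matches is just one possible way to handle the None case.

```fsharp
// Hypothetical helper: returns up to `count` items that follow the first
// line equal to `find`, or an empty list when no line matches.
let itemsAfter (find: string) (count: int) (lines: seq<string>) : string list =
    match lines |> Seq.tryFindIndex (fun a -> a = find) with
    | Some pos ->
        lines
        |> Seq.skip (pos + 1)      // drop the match and everything before it
        |> Seq.truncate count      // keep at most `count` items
        |> Seq.toList
    | None -> []                   // no match: nothing to return, no exception

// Stand-in for the split page.
let sample = seq [ "<html>"; "<head>"; "<title>t</title>"; "</head>" ]

printfn "%A" (itemsAfter "<head>" 3 sample)
printfn "%A" (itemsAfter "<nav>" 3 sample)
```

The first call returns the two lines that follow <head> in the sample (fewer than 3 remain), and the second returns an empty list because <nav> never appears.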