No response when scraping a website in Go
I'm trying to use Go and Colly to scrape a few details about some listings on Zillow. Here's the script I'm using:
package main

import (
	"encoding/csv"
	"log"
	"os"
	"time"

	"github.com/gocolly/colly"
	"github.com/gocolly/colly/proxy"
)

func main() {
	// filename for data
	fName := "data.csv"
	// create a file
	file, err := os.Create(fName)
	// check for errors
	if err != nil {
		log.Fatalf("Could not create file, error : %q", err)
		return
	}
	// close file afterwards
	defer file.Close()
	// instantiate a csv writer
	writer := csv.NewWriter(file)
	// flush contents afterwards
	defer writer.Flush()
	// instantiate a collector
	c := colly.NewCollector(
		colly.AllowedDomains("https://www.zillow.com/austerlitz-ny/sold/"),
	)
	// point to the webpage structure you need to fetch
	c.OnHTML(".list-card-info", func(e *colly.HTMLElement) {
		// write the desired data into csv
		writer.Write([]string{
			e.ChildText("h1"),
			e.ChildText("a"),
		})
	})
	// show completion
	log.Printf("Scraping Finished\n")
	log.Println(c)
}
The script seems to run with no errors, but it collects no data. The terminal reports "Requests made: 0 (0 responses) | Callbacks: OnRequest: 0, OnHTML: 1, OnResponse: 0, OnError: 0", and data.csv is empty as well.
Any idea why this is happening and how to resolve it?
Comments (1)
You should read the Colly examples first. Colly only starts making requests and fetching data to parse once c.Visit is called.
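To make the point about c.Visit concrete, here is a minimal sketch rather than a definitive fix. It also touches on a second thing that may be worth checking, which the answer does not mention: the script passes a full URL to colly.AllowedDomains, while the Colly examples pass bare host names there. The runnable part uses only the standard library to derive the host name from the listing URL; allowedDomainFor is a hypothetical helper name, not part of Colly. The commented lines at the end show where the host name and the c.Visit call would go in the script, assuming the same Colly API the question already uses.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// allowedDomainFor is a hypothetical helper (not part of Colly): it returns
// the bare host name of rawURL, e.g. "www.zillow.com" for a Zillow page URL.
func allowedDomainFor(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	host := u.Hostname()
	if host == "" {
		// Happens when the input has no scheme, e.g. "www.zillow.com".
		return "", errors.New("no host in URL: " + rawURL)
	}
	return host, nil
}

func main() {
	startURL := "https://www.zillow.com/austerlitz-ny/sold/"

	host, err := allowedDomainFor(startURL)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(host) // prints: www.zillow.com

	// In the script from the question, the two values would be used
	// roughly like this (assumption: same Colly API as in the question):
	//
	//   c := colly.NewCollector(colly.AllowedDomains(host))
	//   c.OnHTML(".list-card-info", func(e *colly.HTMLElement) { ... })
	//   if err := c.Visit(startURL); err != nil { // this starts the request
	//       log.Fatal(err)
	//   }
}
```

With a c.Visit call in place, the "Requests made" counter in the collector's log line should no longer stay at 0. If requests are sent but still fail, registering a c.OnError callback is one way to see the reason.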