Getting specific information from Wikipedia in Google Spreadsheets (not the entire table)
I have a table of "Lead rolling actors" from Wikipedia, and I want to add columns to it with the date of birth, years active, etc. for every actor.
This is the first time I have used the IMPORTXML formula. For Robert Downey Jr, I am trying the following:
- Born: =IMPORTXML(G1!,"//span[@class='bday']")
<span class="bday">1965-04-04</span>
- Years Active: =IMPORTXML(G1!,"//td[@class='infobox-data']")
<td class="infobox-data">1970–present</td>
In both cases I get errors. What am I doing wrong? I looked at https://www.benlcollins.com/spreadsheets/google-sheet-web-scraper/ for guidance, but I can't find my error.
Comments (1)
From your question and the image shown, unfortunately, I cannot see the URL used for Robert Downey Jr. But if the URL is https://en.wikipedia.org/wiki/Robert_Downey_Jr, I think that your XPath //span[@class='bday'] returns 1965-04-04, whereas your XPath //td[@class='infobox-data'] returns multiple values.

In this answer, the values 1965-04-04 and 1970–present are retrieved from https://en.wikipedia.org/wiki/Robert_Downey_Jr.

Sample 1:
In this sample, 1965-04-04 is retrieved from https://en.wikipedia.org/wiki/Robert_Downey_Jr.

Sample 2:
In this sample, 1970–present is retrieved from https://en.wikipedia.org/wiki/Robert_Downey_Jr.

Note:
Regarding the URL of Robert Downey Jr, for example, how about checking the URL again? When I use https://en.wikipedia.org/wiki/Robert_Downey_Jr, your expected values can be retrieved.
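
To illustrate why the second XPath returns several values while the first returns one, here is a minimal offline sketch in Python. It does not fetch Wikipedia: SAMPLE is a hand-written, simplified stand-in for an infobox (an assumption, not the real page markup), and the row-label query .//tr[th='Years active']/td is a hypothetical way to narrow the match that does not come from the answer above. The live page may be structured differently, so treat this only as a demonstration of how the matching behaves.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified stand-in for a Wikipedia infobox.
# Assumption: the real page markup is more complex than this.
SAMPLE = """
<table class="infobox">
  <tr><th class="infobox-label">Born</th>
      <td class="infobox-data"><span class="bday">1965-04-04</span></td></tr>
  <tr><th class="infobox-label">Occupation</th>
      <td class="infobox-data">Actor</td></tr>
  <tr><th class="infobox-label">Years active</th>
      <td class="infobox-data">1970–present</td></tr>
</table>
"""

root = ET.fromstring(SAMPLE)


def texts(xpath):
    """Return the stripped text content of every node matching xpath."""
    return ["".join(node.itertext()).strip() for node in root.findall(xpath)]


# Only one element carries class="bday", so this yields a single value.
bday = texts(".//span[@class='bday']")

# Every data cell of the infobox shares class="infobox-data",
# so this yields one value per row.
all_cells = texts(".//td[@class='infobox-data']")

# Hypothetical narrowing: keep only the row whose header text is "Years active".
years_active = texts(".//tr[th='Years active']/td")

print(bday)
print(all_cells)
print(years_active)
```

In the sample markup, the class-based query matches every row of the infobox, while the label-based query matches only the "Years active" row. Whether the same label-based XPath works inside IMPORTXML against the live page is untested here.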