Get specific information from Wikipedia in Google Sheets (not the entire table)

Posted on 2025-01-27 08:41:38 · Length 833 · Views 1 · Comments 0

I have a table of "Lead rolling actors" from Wikipedia, and I want to add columns with the date of birth, years active, etc. for every actor.

Lead rolling actors

This is the first time I have used the IMPORTXML formula. For Robert Downey Jr., I am trying the following:

- Born: =IMPORTXML(G1!,"//span[@class='bday']")

<span class="bday">1965-04-04</span>

- Years Active: =IMPORTXML(G1!,"//td[@class='infobox-data']")

<td class="infobox-data">1970–present</td>

In both cases it gives me errors. What am I doing wrong? I looked at https://www.benlcollins.com/spreadsheets/google-sheet-web-scraper/ for guidance, but I can't find my error.

Comments (1)

说谎友 2025-02-03 08:41:38

From your question and the image shown, unfortunately, I cannot see the URL you are using for Robert Downey Jr. If the URL is https://en.wikipedia.org/wiki/Robert_Downey_Jr, I think your XPath //span[@class='bday'] returns 1965-04-04. However, your XPath //td[@class='infobox-data'] returns multiple values, because every data cell in the infobox has that class.

In this answer, the values 1965-04-04 and 1970–present are retrieved from https://en.wikipedia.org/wiki/Robert_Downey_Jr.

Sample 1:

In this sample, 1965-04-04 is retrieved from https://en.wikipedia.org/wiki/Robert_Downey_Jr.

=IMPORTXML("https://en.wikipedia.org/wiki/Robert_Downey_Jr","//span[@class='bday']")

Sample 2:

In this sample, 1970–present is retrieved from https://en.wikipedia.org/wiki/Robert_Downey_Jr.

=IMPORTXML("https://en.wikipedia.org/wiki/Robert_Downey_Jr","//td[@class='infobox-data' and ../th[contains(text(),'active')]]")
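
To illustrate why the second XPath needs the extra condition on the th cell, here is a minimal Python sketch run against a hypothetical, heavily simplified infobox snippet (not the real Wikipedia markup). It uses only the standard library's xml.etree.ElementTree, which supports a subset of XPath, so the "row whose th contains 'active'" condition is written as a plain Python filter instead of the ../th[contains(text(),'active')] predicate. IMPORTXML itself is not involved; this only mimics the selection logic.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the infobox; the real page is far larger.
SNIPPET = """
<table class="infobox">
  <tr><th class="infobox-label">Born</th>
      <td class="infobox-data">Robert John Downey Jr. <span class="bday">1965-04-04</span></td></tr>
  <tr><th class="infobox-label">Occupation</th>
      <td class="infobox-data">Actor</td></tr>
  <tr><th class="infobox-label">Years active</th>
      <td class="infobox-data">1970–present</td></tr>
</table>
"""

root = ET.fromstring(SNIPPET)

# Counterpart of //span[@class='bday']: a single element matches.
bday = [el.text for el in root.findall(".//span[@class='bday']")]

# Counterpart of //td[@class='infobox-data']: one match per infobox row,
# which is why the unfiltered XPath returns multiple values.
all_cells = root.findall(".//td[@class='infobox-data']")

# Counterpart of the th-based predicate: keep only the data cell of the
# row whose header text contains "active".
years_active = [
    row.find("td").text
    for row in root.iter("tr")
    if row.find("th") is not None and "active" in (row.find("th").text or "")
]

print(bday)
print(len(all_cells))
print(years_active)
```

In this sketch the unfiltered td query matches every row of the snippet, while the header-based filter leaves only the "Years active" cell, which is the same idea as the predicate in Sample 2.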

Note:

  • I'm not sure which URL you are currently using for Robert Downey Jr., so how about checking it again? When I use https://en.wikipedia.org/wiki/Robert_Downey_Jr, your expected values are retrieved.