Fetching PubMed abstracts by title using R

Posted on 2025-01-15 10:49:52 | Words: 763 | Views: 5 | Comments: 0

I have been trying for a while to fetch PubMed abstracts by title. For instance, if I enter the following title in the PubMed search box at https://pubmed.ncbi.nlm.nih.gov/ :

A Pituitary-Derived MEG3 Isoform Functions as a Growth Suppressor in Tumor Cells

I obtain a page showing the following abstract:

Abstract
Human pituitary adenomas are the most common intracranial neoplasm. Typically monoclonal in origin, a somatic mutation is a prerequisite event in tumor development. To identify underlying pathogenetic mechanisms in tumor formation, we compared the difference in gene expression between normal human pituitary tissue and clinically nonfunctioning pituitary adenomas by cDNA-representational difference analysis. We cloned a cDNA, the expression of which was absent in these tumors, that represents a novel transcript from the previously described MEG3, a maternal imprinting gene with unknown function. It was expressed in normal human gonadotrophs, from which clinically nonfunctioning pituitary adenomas are derived. Additional investigation by Northern blot and RT-PCR demonstrated that this gene was also not expressed in functioning pituitary tumors as well as many human cancer cell lines. Moreover, ectopic expression of this gene inhibits growth in human cancer cells including HeLa, MCF-7, and H4. Genomic analysis revealed that MEG3 is located on chromosome 14q32.3, a site that has been predicted to contain a tumor suppressor gene involved in the pathogenesis of meningiomas. Taken together, our data suggest that MEG3 may represent a novel growth suppressor, which may play an important role in the development of human pituitary adenomas.

Is there a function in any R package that can do the same? I have been playing with tools like 'easyPubMed' and 'rentrez', but I was a little intimidated by their complexity.
Thanks in advance.

Comments (2)

愁以何悠 2025-01-22 10:49:52

I think the easyPubMed package is relatively easy to use, as implied by the name. Here's a complete example.

You can create a query as a character value, in this case simply the title from the post followed by the [ti] (title) field tag.
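As a small illustration of the query-building step, the sketch below wraps a title in double quotes and appends the [ti] field tag. The helper name title_query is hypothetical (it is not part of easyPubMed), and quoting the title to request a phrase match is an assumption about what you want from the search.

```r
# Hypothetical helper (not part of easyPubMed): build a title-only PubMed query.
title_query <- function(title) {
  # Drop any double quotes in the title so they cannot break the phrase quoting.
  clean <- gsub('"', "", title, fixed = TRUE)
  # Quote the title for a phrase search and restrict it to the title field.
  paste0('"', clean, '"[ti]')
}

my_query <- title_query(
  "A Pituitary-Derived MEG3 Isoform Functions as a Growth Suppressor in Tumor Cells"
)
print(my_query)
```

The resulting string can be passed to get_pubmed_ids in place of the paste() call used in this answer.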

You can perform the PubMed query using get_pubmed_ids and retrieve the records using fetch_pubmed_data.

Then, using table_articles_byAuth, you can put your results into a data.frame. By setting included_authors to "first", you will only get information on the first author of each record. Also, with max_chars you can limit the number of characters kept from each abstract.

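library(easyPubMed)

# Build the query: the article title followed by the [ti] (title) field tag
my_query <- paste(
  'A Pituitary-Derived MEG3 Isoform Functions as a Growth Suppressor in Tumor Cells',
  '[ti]'
)

# Run the search, then download the matching records
my_pubmed_ids <- get_pubmed_ids(my_query)
my_data <- fetch_pubmed_data(my_pubmed_ids, encoding = "ASCII")

# Convert the records to a data.frame, one row per record (first author only)
df <- table_articles_byAuth(my_data,
                            included_authors = "first",
                            max_chars = 2000,
                            encoding = "ASCII")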

The resulting columns of your data.frame will be the following:

names(df)

 [1] "pmid"      "doi"       "title"     "abstract"  "year"      "month"     "day"       "jabbrv"    "journal"   "keywords" 
[11] "lastname"  "firstname" "address"   "email" 

If you want to see all your abstracts, they will be in the abstract column of your data.frame:

df$abstract

[1] "Human pituitary adenomas are the most common intracranial neoplasm...
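Since the question also mentions rentrez, here is a minimal sketch of the same lookup with that package, for comparison. It assumes rentrez is installed and NCBI is reachable; the function name fetch_abstract_by_title is hypothetical, and taking only the first hit is an assumption that the title matches a single record.

```r
# Sketch only: assumes the rentrez package is installed and NCBI is reachable.
# fetch_abstract_by_title is a hypothetical helper, not part of rentrez.
fetch_abstract_by_title <- function(title) {
  # Search PubMed, restricting the match to the title field.
  hits <- rentrez::entrez_search(db = "pubmed", term = paste0(title, "[Title]"))
  if (length(hits$ids) == 0) {
    return(NA_character_)  # nothing matched this title
  }
  # Fetch the first hit as plain text (citation details plus the abstract).
  rentrez::entrez_fetch(db = "pubmed", id = hits$ids[1],
                        rettype = "abstract", retmode = "text")
}

# Only run the network call in an interactive session with rentrez available.
if (interactive() && requireNamespace("rentrez", quietly = TRUE)) {
  cat(fetch_abstract_by_title(
    "A Pituitary-Derived MEG3 Isoform Functions as a Growth Suppressor in Tumor Cells"
  ))
}
```

Unlike table_articles_byAuth, this returns one block of text rather than a data.frame, so it suits a quick look at a single abstract more than building a table.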

A君 2025-01-22 10:49:52

We can use rvest to get the abstract by filling in and submitting the PubMed search form.

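library(rvest)  # rvest re-exports %>%, so dplyr is not needed here

# Start a browsing session on the PubMed home page
# (rvest >= 1.0.0: session() and session_submit() replace the deprecated
#  html_session() and submit_form())
url <- 'https://pubmed.ncbi.nlm.nih.gov/'
ncbi <- session(url)

# Grab the first form on the page (the search form)
search <- ncbi %>% html_element("form") %>% html_form()

# Fill in the search term
search <- search %>%
  html_form_set("term" = "A Pituitary-Derived MEG3 Isoform Functions as a Growth Suppressor in Tumor Cells")

# Submit the form and keep the resulting page as a new session
result <- session_submit(ncbi, search)

# Extract the abstract text from the result page
abstract <- result %>% html_elements('.abstract-content') %>% html_text()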
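The text returned by html_text() may carry the page's leading newlines and indentation. A minimal base-R sketch for tidying it follows; the helper name squish and the sample string are made up for illustration.

```r
# Hypothetical helper: collapse runs of whitespace into single spaces and trim the ends.
squish <- function(x) trimws(gsub("\\s+", " ", x))

# Made-up sample imitating the whitespace a scraped abstract might contain.
raw <- "\n      Human pituitary adenomas are\n   the most common intracranial neoplasm.\n  "
print(squish(raw))
```

Applied to the result above, this would be squish(abstract). rvest's html_text2() is an alternative that normalises whitespace while extracting the text.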