NCBI gene database question
I'm trying to find the gene_info file with gene names and chromosomal locations. However, I can't seem to locate it on the NCBI FTP site. Can anyone give me a pointer?
Comments (1)
See ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/README for details of what each file on the NCBI FTP site contains.
If you want to get the data from NCBI itself, you will need to combine multiple files, probably gene2accession (which also includes position information) and gene_info, which maps IDs to symbols, names, etc.
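As a rough illustration of that combination step, here is a minimal Python sketch that joins the two files on GeneID. It assumes both files are tab-delimited with a header line starting with `#`, and that the columns are named `GeneID`, `Symbol`, `genomic_nucleotide_accession.version`, `start_position_on_the_genomic_accession` and `end_position_on_the_genomic_accession`. Check those names against the README before relying on them. The helper names `read_tsv` and `join_symbols_with_positions` are made up for this example.

```python
import csv


def read_tsv(handle):
    """Yield one dict per row of a tab-delimited file whose header starts with '#'."""
    # QUOTE_NONE: treat quote characters in free-text columns as ordinary text.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = [name.lstrip("#") for name in next(reader)]
    for row in reader:
        yield dict(zip(header, row))


def join_symbols_with_positions(gene_info, gene2accession):
    """Return (GeneID, Symbol, genomic accession, start, end) tuples.

    Both arguments are open text handles. Column names are assumptions
    taken from memory of the NCBI layout; verify them in the README.
    """
    # Map each GeneID to its symbol using gene_info.
    symbols = {row["GeneID"]: row["Symbol"] for row in read_tsv(gene_info)}

    joined = []
    for row in read_tsv(gene2accession):
        gene_id = row["GeneID"]
        start = row["start_position_on_the_genomic_accession"]
        end = row["end_position_on_the_genomic_accession"]
        # Skip genes without a symbol and rows where the position is "-" (missing).
        if gene_id not in symbols or start == "-" or end == "-":
            continue
        joined.append((
            gene_id,
            symbols[gene_id],
            row["genomic_nucleotide_accession.version"],
            int(start),
            int(end),
        ))
    return joined
```

To use it, open the two decompressed files in text mode and pass the handles in, e.g. `join_symbols_with_positions(open("gene_info"), open("gene2accession"))`. The full files are large, so filtering by `tax_id` first is probably worthwhile.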
It is probably more convenient to go to the UCSC site for this information. They also provide a public MySQL database if you want to explore what is available:
http://workshops.arl.arizona.edu/sql1/sql_workshop/mysql/mysqlclient.html
If you just want human, mouse, or rat data, the Rat Genome Database has already compiled the data you want (fresh from the NCBI and Ensembl sources):
ftp://rgd.mcw.edu/pub/data_release
For example, for human data, look at: ftp://rgd.mcw.edu/pub/data_release/GENES_HUMAN.txt