Has anyone used pubchemdb? Is there a similar API?

Posted on 2024-11-06 04:39:33

Update: The link in the answer is both interesting and useful, but unfortunately it does not address the need for a Java API, so I am still looking forward to any input.

I'm building a database of chemical compounds. I need all the synonyms (IUPAC and common names) as well as safety data for each compound.
I'll be using the freely available data at PubChem (http://pubchem.ncbi.nlm.nih.gov/).

There's an easy way to query each compound with a simple HTTP GET request. For example, to obtain the glycerol data, the URL is:

http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=753

The following URL returns an easy-to-parse format:

http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=753&disopt=DisplaySDF

However, it returns only very basic information: there is no safety data and only a few common names.
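The two URLs above differ only in the `disopt=DisplaySDF` parameter, so the request can be sketched in Java as follows. This is a minimal sketch, assuming Java 11+ for `java.net.http.HttpClient`; the class and method names (`PubChemSummaryFetch`, `buildUrl`, `fetch`) are hypothetical, and the `summary.cgi` endpoint is the one quoted in the question, which may have changed since.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class PubChemSummaryFetch {
    // Base URL copied from the question; assumed, not verified against the current service.
    static final String BASE = "http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi";

    /** Builds the summary URL for a compound ID (CID), optionally requesting SDF output. */
    static String buildUrl(int cid, boolean sdf) {
        if (cid <= 0) {
            throw new IllegalArgumentException("cid must be positive");
        }
        String url = BASE + "?cid=" + cid;
        return sdf ? url + "&disopt=DisplaySDF" : url;
    }

    /** Sends a GET request and returns the response body as text. */
    static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL) // follow redirects such as http -> https
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        String url = buildUrl(753, true); // 753 is the glycerol CID used in the question
        System.out.println(url);
        // Only touch the network when explicitly asked to.
        if (args.length > 0 && args[0].equals("--fetch")) {
            System.out.println(fetch(url));
        }
    }
}
```

Run without arguments it only prints the URL; pass `--fetch` to actually download the record. Keeping URL construction separate from the network call makes the former easy to check offline.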

There is one public-domain Java API that seems very complete, developed by a group at Scripps (citation). The code is here.

Unfortunately, this API is not very well documented, and it is quite difficult to follow because of the complexity of the data involved.
From what I gathered, pubchemdb uses the PubChem Power User Gateway (PUG) XML API.

Has anyone used this API (or any other one available)? I would appreciate a short description or tutorial on how to start with it.


Comments (1)

羁绊已千年 2024-11-13 04:39:33

The Cactvs Chemoinformatics toolkit (free for academic/educational use) has full PubChem integration. Using the scripting environment, you can easily do something like this:

cactvs>ens create 753

ens0

cactvs>ens get ens0 E_NAMESET

PROPANE-1,2,3-TRIOL GLYCEROL 8043-29-6 29796-42-7 30049-52-6 37228-54-9 75398-78-6 78630-16-7 8013-25-0 175385-78-1 25618-55-7 64333-26-2 56-81-5 {Tegin M} LS-1377 G8773_SIGMA 15523_RIEDEL {Glycerin, natural} NCGC00090950-03 191612_ALDRICH 15524_RIEDEL {Glycerol solution} L-glycerol 49767_FLUKA {Biodiesel impurity} 49770_FLUKA 49771_FLUKA NCGC00090950-01 49927_FLUKA Glycerol-Gelatine G7757_SIAL GOL D-glycerol G9012_SIAL {Polyhydric alcohols} c0066 MOON {NSC 9230} G2025_SIGMA ZINC00895048 49781_FLUKA {Concentrated glycerin} {Concentrated glycerin (JP15)} D00028 {Glycerin (JP15/USP)} 44892U_SUPELCO {Glycerin, concentrated (JAN)} CRY 49782_FLUKA NCGC00090950-02 G6279_SIAL W252506_ALDRICH G7893_SIAL {Glycerin, concentrated} 33224_RIEDEL Bulbold Cristal Glyceol G9281_SIGMA Glycerol-1,2,3-3H G1901_SIGMA G7043_SIGMA 1,2,3-trihydroxypropane 1,2,3-trihydroxypropanol glycerin G2289_SIAL G9406_SIGMA {Glycerol-[2-3H]} CHEBI:17754 Glyzerin Oelsuess InChI=1/C3H8O3/c4-1-3(6)2-5/h3-6H,1-2H {90 Technical glycerine} Dagralax {Glycerin, anhydrous} {Glycerin, synthetic} Glycerine Glyceritol {Glycyl alcohol} Glyrol Glysanin NSC9230 Ophthalgan Osmoglyn Propanetriol {Synthetic glycerin} {Synthetic glycerine} Trihydroxypropane Vitrosupos {WLN: Q1YQ1Q} Glycerol-1,3-14C {4-01-00-02751 (Beilstein Handbook Reference)} AI3-00091 {BRN 0635685} {CCRIS 2295} {Caswell No. 469} {Citifluor AF 2} {Clyzerin, wasserfrei [German]} {EINECS 200-289-5} {EPA Pesticide Chemical Code 063507} {FEMA No. 2525} {Glicerina [DCIT]} {Glicerol [INN-Spanish]} {Glycerin (mist)} {Glycerin [JAN]} {Glycerin mist} {Glycerine mist} Glycerinum {Glycerolum [INN-Latin]} Grocolene {HSDB 492} IFP {Incorporation factor} 1,2,3-Propanetriol C00116 Optim {Propanetriol (VAN)} {1,2,3-PROPANETRIOL, HOMOPOLYMER} {Glycerol polymer} {Glycerol, polymers} {HL 80} {PGL 300} {PGL 500} {PGL 700} Polyglycerin Polyglycerine Polyglycerol {Unigly G 2} {Unigly G 6} G5516_SIGMA MolMap_000024

cactvs>

This hides all the PUG ugliness, although I dare say that PUG is well documented in any case. The toolkit goes well beyond simple data downloads: you can even open and query PubChem like a local SD file if you want to.

PubChem does not contain safety data, though. Safety data is country- and region-dependent and strictly regulated, so you should be very careful not to incur liability. Have your approach checked by legal personnel!
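If the E_NAMESET output shown above is captured as text and loaded from Java, the names need to be split carefully, because multi-word names are wrapped in braces (for example `{Tegin M}`). The following is a rough sketch only: it assumes the output is a Tcl-style list in which whitespace separates elements and `{...}` groups one element, it ignores backslash escapes, and the class name `NameSetParser` is hypothetical rather than part of Cactvs or pubchemdb.

```java
import java.util.ArrayList;
import java.util.List;

class NameSetParser {
    /** Splits a whitespace-separated list in which {...} groups a single element. */
    static List<String> parse(String text) {
        List<String> names = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;             // current brace nesting level
        boolean inElement = false; // true while an element is being collected
        for (char c : text.toCharArray()) {
            if (c == '{') {
                if (depth > 0) current.append(c); // keep nested braces literally
                depth++;
                inElement = true;
            } else if (c == '}' && depth > 0) {
                depth--;
                if (depth > 0) current.append(c);
            } else if (Character.isWhitespace(c) && depth == 0) {
                // Whitespace outside braces ends the current element, if any.
                if (inElement) {
                    names.add(current.toString());
                    current.setLength(0);
                    inElement = false;
                }
            } else {
                current.append(c);
                inElement = true;
            }
        }
        if (inElement) names.add(current.toString());
        return names;
    }

    public static void main(String[] args) {
        // A short excerpt of the glycerol name set from the session above.
        String sample = "GLYCEROL 56-81-5 {Tegin M} {Glycerin, natural} {Glycerol-[2-3H]} CHEBI:17754";
        for (String name : parse(sample)) {
            System.out.println(name);
        }
    }
}
```

Each printed line is one synonym, with the grouping braces removed and inner spaces, commas and brackets preserved.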
