Having trouble determining the file type of a text database file

Posted 2024-09-03 06:11:43 · 817 words · 3 views · 0 comments

So the USDA has a weird database of general nutrition facts about food, and naturally we're going to steal it for use in our app. Anyhow, the lines are formatted like this:

~01001~^~0100~^~Butter, salted~^~BUTTER,WITH SALT~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87
~01002~^~0100~^~Butter, whipped, with salt~^~BUTTER,WHIPPED,WITH SALT~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87
~01003~^~0100~^~Butter oil, anhydrous~^~BUTTER OIL,ANHYDROUS~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87
~01004~^~0100~^~Cheese, blue~^~CHEESE,BLUE~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87

The values are separated by those odd ~ and ^ characters. The file also lacks a header row, but that's OK; I can figure the columns out from the other material on their site: http://www.ars.usda.gov/Services/docs.htm?docid=8964

Any help would be great! If it matters, we're making an open/free API with Ruby to query this data.

Additionally, I'm having a tough time posing this question, so I've made it a community wiki so we can all pitch in!

Comments (2)

愚か者の日 2024-09-10 06:11:43

This looks like a very standard CSV (comma-separated values) file, except that the field separator character was changed from , to ^ and the quote character from " to ~.

Unfortunately, I'm not familiar enough with Ruby to recommend a library, but in Perl there's a boatload of standard CPAN modules, the best of which let you configure both the field separator and the quote character of a CSV reader. I would expect Ruby to have something similar. If so, you're in luck!
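
For what it's worth, Ruby's bundled CSV library does appear to offer this kind of configuration through its col_sep and quote_char options. Here is a minimal sketch, assuming the file really does follow CSV quoting rules with ^ and ~ swapped in. The helper name parse_usda is made up for illustration, and the sample rows are copied from the question rather than read from the real USDA file:

```ruby
require "csv"

# Two sample rows copied from the question. In practice you would read
# the downloaded USDA file instead of an inline string.
SAMPLE = <<~DATA
  ~01001~^~0100~^~Butter, salted~^~BUTTER,WITH SALT~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87
  ~01004~^~0100~^~Cheese, blue~^~CHEESE,BLUE~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87
DATA

# parse_usda is a hypothetical helper name, not part of any library.
# col_sep sets the field separator; quote_char sets the string delimiter.
def parse_usda(text)
  CSV.parse(text, col_sep: "^", quote_char: "~")
end

ROWS = parse_usda(SAMPLE)

# Print the first column (an ID-like code) and the third (a description).
ROWS.each { |row| puts "#{row[0]}: #{row[2]}" }
```

Each row comes back as an array of strings, so numeric columns such as 6.38 would still need converting (for example with to_f) before you do arithmetic on them.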

不离久伴 2024-09-10 06:11:43

^ appears to be a field delimiter and ~ a string delimiter. Normally I'd expect to see , and " in those roles, but the choice of the very uncommon characters means that a string like

Cheese, Bleu

won't get all trippy with the string parser.
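
To illustrate the point, here is a rough sketch in Ruby showing that even a naive split handles these rows, as long as no field ever contains ^ or ~ itself (an assumption, not something confirmed by the USDA documentation). The helper name split_fields is made up for this example:

```ruby
# One row copied from the question.
LINE = "~01004~^~0100~^~Cheese, blue~^~CHEESE,BLUE~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87"

# split_fields is a hypothetical helper. It assumes that neither ^ nor ~
# ever appears inside a field's actual text.
def split_fields(line)
  # The -1 limit keeps trailing empty fields instead of dropping them.
  line.chomp.split("^", -1).map do |field|
    field.delete_prefix("~").delete_suffix("~")
  end
end

FIELDS = split_fields(LINE)

puts FIELDS[2]               # the comma stays inside the description
puts LINE.split(",").length  # splitting on commas would cut text fields apart
```

A proper CSV parser configured with these two characters is still the safer choice, since it also copes with any edge cases the quoting rules allow.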
