Data browsing/querying tool for loosely structured data

Posted on 2024-08-21 04:42:59

I have a set of statistical data (about 100M in size) organized as key-value pairs. Some of the values are just numbers (e.g. a person's age or weight) and some are hierarchical (e.g. a person's employment history, which can hold a set of employment records, each again containing key/value pairs, and so on). The real data is not exactly this, but the structure is similar.

I need to query this data with an arbitrary set of criteria. For example, I may want to ask "where did the 20 oldest persons work 3 years ago", or "what is the sum of all salaries for all people who ever worked at company X for more than a year", or "give me everything you know about people who found a new job recently", etc.

I can program each individual query pretty easily, but since there can be many of them and they vary all the time, it becomes tedious to program each one anew. So the question is whether there is an existing tool that would make such queries easier (if it has a nice GUI, that's a bonus :). Something like SQL wouldn't work well, because the data fields aren't really fixed and making the hierarchy work in SQL would be too much trouble IMHO. Is there a tool that I could use with relative ease for this task, i.e. without learning a whole new language for it? If I had to do that, I'd rather stay with hand-coding the queries.

Comments (1)

夏夜暖风 2024-08-28 04:42:59

You may want to look at MongoDB. It is a document store that holds JSON-like documents, so it essentially works with key/value pairs, and you can nest documents within documents. Its shell uses JavaScript, and queries are themselves written as JSON-like documents. Of course, you'd need to convert your data to JSON, but this is not difficult.
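To make the nested-document idea concrete, here is a minimal sketch in plain Python, with no database involved. All field names (`employments`, `company`, `years`, `salary`) and all values are invented for illustration, and the "sum of all salaries" query from the question is read as "add up every salary of each person who has at least one matching job". The last dictionary shows how the same condition could be written as a MongoDB-style filter using the `$elemMatch` and `$gt` operators; it is only built here, not run against a server.

```python
# Hypothetical records: every field name and value is invented for illustration.
people = [
    {
        "name": "Alice",
        "age": 61,
        "employments": [
            {"company": "X", "years": 3.0, "salary": 50000},
            {"company": "Y", "years": 0.5, "salary": 40000},
        ],
    },
    {
        "name": "Bob",
        "age": 45,
        "employments": [{"company": "X", "years": 0.5, "salary": 30000}],
    },
    {
        "name": "Carol",
        "age": 70,
        "weight": 58,  # records do not need to share the same fields
        "employments": [{"company": "X", "years": 2.0, "salary": 65000}],
    },
]


def worked_at(person, company, min_years):
    """True if any employment record matches the company and exceeds min_years."""
    return any(
        job.get("company") == company and job.get("years", 0) > min_years
        for job in person.get("employments", [])
    )


def total_salaries(records, company, min_years):
    """Sum all salaries of every person with at least one matching job."""
    return sum(
        job.get("salary", 0)
        for person in records
        if worked_at(person, company, min_years)
        for job in person.get("employments", [])
    )


def oldest(records, n):
    """Return the n oldest persons, oldest first."""
    return sorted(records, key=lambda p: p.get("age", 0), reverse=True)[:n]


# The same "worked at X for more than a year" condition as a MongoDB-style
# filter document. Built for comparison only; nothing here talks to a server.
mongo_style_filter = {
    "employments": {"$elemMatch": {"company": "X", "years": {"$gt": 1}}}
}

if __name__ == "__main__":
    print(total_salaries(people, "X", 1))
    print([p["name"] for p in oldest(people, 2)])
```

The point of the sketch is that missing fields and nested lists need no schema changes; each new question becomes a small filter over the same records.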

Another option may be a graph database like Neo4j. Each record is a node and you can define relationships between nodes (visualized as edges).
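As a rough illustration of the node-and-relationship model, the sketch below uses plain Python structures rather than Neo4j itself. The `WORKED_AT` relationship type, the `start`/`end` properties and all names and values are assumptions made up for this example.

```python
# Hypothetical graph: node names, relationship type and properties are invented.
nodes = {
    "Alice": {"kind": "person", "age": 61},
    "Carol": {"kind": "person", "age": 70},
    "X": {"kind": "company"},
    "Y": {"kind": "company"},
}

# Each edge is (source node, relationship type, target node, properties).
edges = [
    ("Alice", "WORKED_AT", "X", {"start": 2001, "end": 2004}),
    ("Alice", "WORKED_AT", "Y", {"start": 2005, "end": 2009}),
    ("Carol", "WORKED_AT", "X", {"start": 2003, "end": 2008}),
]


def employers_in(person, year):
    """Follow WORKED_AT edges from a person that cover the given year."""
    return [
        dst
        for src, rel, dst, props in edges
        if src == person
        and rel == "WORKED_AT"
        and props["start"] <= year <= props["end"]
    ]


def where_oldest_worked(n, year):
    """Map each of the n oldest persons to their employers in the given year."""
    persons = [name for name, attrs in nodes.items() if attrs["kind"] == "person"]
    persons.sort(key=lambda name: nodes[name]["age"], reverse=True)
    return {name: employers_in(name, year) for name in persons[:n]}


if __name__ == "__main__":
    print(where_oldest_worked(1, 2006))
```

A question such as "where did the oldest persons work in a given year" then becomes a walk along edges from person nodes to company nodes.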

I do not think either of these has a GUI of any kind, but both are pretty easy to query. MongoDB uses JS, with bindings you can use to call the DB. Neo4j uses Java, but there are bindings for some other languages.

Querying this with SQL would be challenging, but it would work. I will also throw in PostgreSQL as an option, since it is somewhat object-oriented, but I am more familiar with the others.
