Does anyone know what the infamous NSA call database used (or uses) as its DBMS?

Posted on 2024-08-24 11:25:25

Another question on SO suddenly got me wondering what the largest database in the world is (and how big it could be). A quick Google search turned up this: the NSA call database, created by the U.S. National Security Agency. Supposedly this database contains over 1.9 trillion records containing details relating to phone calls placed through AT&T and Verizon from as far back as 2001.

Does anyone have any idea what kind of DB system was used for this database? 1.9 trillion records seems to me like a lot more than even your typical large-scale commercial databases would have. But maybe I'm wrong. I also didn't research this extensively by any means, so perhaps the claim that the NSA call database is the biggest in the world is flat-out false.

Still, I'm interested to know what kind of DBMS, if any, could reasonably deal with this many records.


Comments (1)

甲如呢乙后呢 2024-08-31 11:25:25

1.9 trillion rows multiplied by, say, 8000 bytes/row is, ummm, 15 petabytes? Did I do that arithmetic right? That's just one order of magnitude bigger than several well-known business databases. Googling "petabyte databases" gave me

  • eBay: one 2+ petabyte data warehouse and one 6+ petabyte data warehouse (2009)
  • Facebook: 2+ petabyte data warehouse (2010)
  • Walmart: 2+ petabyte data warehouse (2010)
  • Bank of America: 1+ petabyte data warehouse (2010)
  • Dell: 1+ petabyte data warehouse (2010)
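
For anyone who wants to check the back-of-the-envelope arithmetic at the top of this answer, here is a minimal Python sketch. The 8000 bytes/row figure is only the guess made above, not a known property of the NSA database, and the constant and function names are made up for illustration. It uses decimal petabytes (10^15 bytes).

```python
# Rough size estimate. BYTES_PER_ROW is a guess, not a known figure.
ROWS = 1_900_000_000_000   # 1.9 trillion records, as claimed in the question
BYTES_PER_ROW = 8_000      # assumed average row size
PETABYTE = 10**15          # decimal petabyte, in bytes


def estimated_petabytes(rows: int, bytes_per_row: int) -> float:
    """Return the raw data size in petabytes, ignoring indexes and overhead."""
    return rows * bytes_per_row / PETABYTE


size_pb = estimated_petabytes(ROWS, BYTES_PER_ROW)
print(f"Estimated size: {size_pb:.1f} PB")

# Compare against the smallest and largest warehouse sizes listed above.
for name, pb in [("1 PB warehouse", 1), ("6 PB warehouse", 6)]:
    print(f"{name}: estimate is {size_pb / pb:.1f}x larger")
```

With these assumptions the estimate comes out at about 15 PB, roughly 2.5 to 15 times the size of the warehouses in the list. A smaller row size would shrink the estimate proportionally.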

1.9 trillion rows are easily (cough) row-addressable in the range of a 64-bit unsigned int.
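
The row-addressing claim is easy to sanity-check as well. This is only a sketch in Python, and the variable names are illustrative: it counts how many bits a row number of that size needs and compares that with 32-bit and 64-bit unsigned ranges.

```python
ROWS = 1_900_000_000_000   # 1.9 trillion records, as claimed in the question
UINT32_MAX = 2**32 - 1
UINT64_MAX = 2**64 - 1

# Number of bits needed to represent the largest row number.
bits_needed = ROWS.bit_length()

print(f"Bits needed for a row id: {bits_needed}")
print(f"Fits in 32-bit unsigned: {ROWS <= UINT32_MAX}")
print(f"Fits in 64-bit unsigned: {ROWS <= UINT64_MAX}")

# How many times over the 64-bit range covers the row count.
print(f"Headroom: about {UINT64_MAX // ROWS:,}x")
```

So a 32-bit row id is too small, while a 64-bit one leaves several orders of magnitude to spare.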

Physicists and astronomers seem to have the biggest targets. Stanford needs to manage about 155 petabytes of data for their Large Synoptic Survey Telescope. An astronomy project down the street from me generates about 10 petabytes a day, but they don't store nearly that much.

Heck, I almost forgot the point of the question. Greenplum and Teradata showed up the most often. But I don't think anybody who knows what the NSA actually uses will talk about it.

@Tomislav Nakic-Alfirevic: An awk program to print every 1000th line:

NR % 1000 == 0 {print $0}

Do you think the NSA would pay me for that? My house needs a new roof.
