What database does Google use?
Is it Oracle or MySQL or something they have built themselves?
Bigtable
A Distributed Storage System for Structured Data
Some features
Architecture
BigTable is not a relational database. It does not support joins nor does it support rich SQL-like queries. Each table is a multidimensional sparse map. Tables consist of rows and columns, and each cell has a time stamp. There can be multiple versions of a cell with different time stamps. The time stamp allows for operations such as "select 'n' versions of this Web page" or "delete cells that are older than a specific date/time."
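The data model described above can be pictured with a small in-memory sketch. This is only an illustration, not Bigtable's implementation: the class name `SparseTable` and its methods are hypothetical, and timestamps are plain integers.

```python
from collections import defaultdict


class SparseTable:
    """Toy model of a Bigtable-style table: (row, column, timestamp) -> value."""

    def __init__(self):
        # Only cells that were written take up space, so the map is sparse.
        # (row, column) -> [(timestamp, value), ...], newest first.
        self._cells = defaultdict(list)

    def put(self, row, column, timestamp, value):
        versions = self._cells[(row, column)]
        versions.append((timestamp, value))
        # Keep the newest version first.
        versions.sort(key=lambda tv: tv[0], reverse=True)

    def get_versions(self, row, column, n=1):
        """Return up to n most recent (timestamp, value) pairs for a cell."""
        return self._cells.get((row, column), [])[:n]

    def delete_older_than(self, cutoff):
        """Drop every cell version whose timestamp is below cutoff."""
        for key in list(self._cells):
            kept = [(ts, v) for ts, v in self._cells[key] if ts >= cutoff]
            if kept:
                self._cells[key] = kept
            else:
                del self._cells[key]
```

Calling `get_versions(row, column, n=2)` corresponds to "select 'n' versions of this Web page", and `delete_older_than(cutoff)` corresponds to deleting cells older than a given time.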
To manage the huge tables, Bigtable splits tables at row boundaries and saves them as tablets. A tablet is around 200 MB, and each machine stores about 100 tablets. This setup allows tablets from a single table to be spread among many servers. It also allows for fine-grained load balancing. If one tablet is receiving many queries, its machine can shed other tablets or move the busy tablet to another machine that is not so busy. Also, if a machine goes down, its tablets can be spread across many other servers so that the performance impact on any given machine is minimal.
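As a rough sketch of splitting at row boundaries, the snippet below cuts a sorted list of row keys into contiguous ranges and spreads them over servers. It simplifies in two labeled ways: tablets are split by row count rather than by size (the text cites roughly 200 MB), and placement is plain round-robin. All function names are hypothetical.

```python
import bisect


def split_into_tablets(sorted_rows, max_rows_per_tablet):
    """Split a sorted list of row keys into contiguous row ranges (tablets)."""
    return [
        sorted_rows[i:i + max_rows_per_tablet]
        for i in range(0, len(sorted_rows), max_rows_per_tablet)
    ]


def assign_round_robin(tablets, servers):
    """Spread tablets over servers so one table lives on many machines."""
    return {index: servers[index % len(servers)] for index in range(len(tablets))}


def find_tablet(tablets, row_key):
    """Return the index of the tablet whose row range covers row_key."""
    end_keys = [tablet[-1] for tablet in tablets]
    index = bisect.bisect_left(end_keys, row_key)
    # Keys past the last end key fall into the final tablet.
    return min(index, len(tablets) - 1)
```

Because each tablet is a contiguous range of sorted keys, a binary search over the range end keys is enough to find which tablet holds a row.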
Tables are stored as immutable SSTables and a tail of logs (one log per machine). When a machine runs out of system memory, it compresses some tablets using Google proprietary compression techniques (BMDiff and Zippy). Minor compactions involve only a few tablets, while major compactions involve the whole table system and recover hard-disk space.
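The "immutable files plus a log" idea can be sketched as follows. This assumes the terminology of the published Bigtable paper rather than the wording above: a minor compaction freezes recent in-memory writes into a new immutable SSTable, and a major compaction merges all SSTables into one. Compression (BMDiff, Zippy) is left out, and the class name `TabletStore` and its fields are hypothetical.

```python
class TabletStore:
    """Toy log-structured store: commit log + memtable + immutable SSTables."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit
        self.log = []        # append-only commit log of (key, value) writes
        self.memtable = {}   # recent writes held in memory
        self.sstables = []   # immutable sorted runs, oldest first

    def put(self, key, value):
        self.log.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.minor_compaction()

    def minor_compaction(self):
        """Freeze the memtable into a new immutable, sorted SSTable."""
        if self.memtable:
            self.sstables.append(tuple(sorted(self.memtable.items())))
            self.memtable = {}

    def major_compaction(self):
        """Merge all SSTables into one, keeping only the newest value per key."""
        merged = {}
        for sstable in self.sstables:  # oldest first, so newer values overwrite
            merged.update(sstable)
        self.sstables = [tuple(sorted(merged.items()))] if merged else []

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):  # newest first
            for k, v in sstable:
                if k == key:
                    return v
        return None
```

Existing SSTables are never modified in place. Stale values are only dropped when a major compaction rewrites the files, which is where disk space is recovered.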
The locations of Bigtable tablets are stored in cells. The lookup of any particular tablet is handled by a three-tiered system. The client gets a pointer to the META0 table, of which there is only one. The META0 table keeps track of many META1 tablets, which contain the locations of the tablets being looked up. Both META0 and META1 make heavy use of pre-fetching and caching to minimize bottlenecks in the system.
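The tiered lookup can be illustrated with a small sketch in which a single root tier (standing in for META0) points at second-level tiers (standing in for META1 tablets), which in turn point at the servers holding the data. The classes `LocationTier` and `Client` are hypothetical, and the cache is keyed by exact row key for simplicity, whereas a real client would more plausibly cache whole row ranges.

```python
import bisect


class LocationTier:
    """One metadata tier: maps the last row key of each range to a target."""

    def __init__(self, entries):
        # entries: list of (end_row_key, target), sorted by end_row_key.
        self.end_keys = [end for end, _ in entries]
        self.targets = [target for _, target in entries]

    def lookup(self, row_key):
        index = bisect.bisect_left(self.end_keys, row_key)
        if index == len(self.end_keys):
            raise KeyError(row_key)
        return self.targets[index]


class Client:
    def __init__(self, meta0):
        self.meta0 = meta0   # the single root tier
        self.cache = {}      # row_key -> server, to skip repeated lookups
        self.lookups = 0     # number of metadata reads performed

    def locate(self, row_key):
        if row_key in self.cache:
            return self.cache[row_key]
        meta1 = self.meta0.lookup(row_key)   # root tier -> second-level tier
        server = meta1.lookup(row_key)       # second-level tier -> server
        self.lookups += 2
        self.cache[row_key] = server
        return server
```

A cache hit skips both metadata reads, which is why caching keeps the single root table from becoming a bottleneck.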
Implementation
BigTable is built on Google File System (GFS), which is used as a backing store for log and data files. GFS provides reliable storage for SSTables, a Google-proprietary file format used to persist table data.
Another service that BigTable makes heavy use of is Chubby, a highly-available, reliable distributed lock service. Chubby allows clients to take a lock, possibly associating it with some metadata, which it can renew by sending keep alive messages back to Chubby. The locks are stored in a filesystem-like hierarchical naming structure.
There are three primary server types of interest in the Bigtable system:
Example from Google's research paper:
API
Typical operations on BigTable are creating and deleting tables and column families, writing data, and deleting columns from a row. BigTable provides these functions to application developers through an API. Transactions are supported at the row level, but not across several row keys.
Here is the link to the PDF of the research paper.
And here you can find a video showing Google's Jeff Dean in a lecture at the University of Washington, discussing the Bigtable content storage system used in Google's backend.
It's something they've built themselves - it's called Bigtable.
http://en.wikipedia.org/wiki/BigTable
There is a paper by Google on the database:
http://research.google.com/archive/bigtable.html
Spanner is Google's globally distributed relational database management system (RDBMS), the successor to BigTable. Google claims it is not a pure relational system because each table must have a primary key.
Here is the link to the paper.
Another database invented by Google is Megastore. Here is the abstract:
As others have mentioned, Google uses a homegrown solution called BigTable, and they've published a few papers describing it to the outside world.
The Apache folks have an implementation of the ideas presented in these papers called HBase. HBase is part of the larger Hadoop project, which according to their site "is a software platform that lets one easily write and run applications that process vast amounts of data." Some of the benchmarks are quite impressive. Their site is at http://hadoop.apache.org.
Although Google uses BigTable for all their main applications, they also use MySQL for other (perhaps minor) apps.
It may also be handy to know that BigTable is not a relational database (like MySQL) but a huge distributed hash table, which has very different characteristics. You can play around with a limited version of BigTable yourself on the Google AppEngine platform.
Besides Hadoop, mentioned above, there are many other implementations that try to solve the same problems as BigTable (scalability, availability). I saw a nice blog post yesterday listing most of them here.
Google primarily uses Bigtable.
Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size.
For more information, download the document from here.
Google also uses Oracle and MySQL databases for some of their applications.
Any more information you can add is highly appreciated.
Google services have a polyglot persistence architecture. BigTable is leveraged by most of its services, such as YouTube, Google Search, and Google Analytics. The search service initially used MapReduce for its indexing infrastructure but later transitioned to BigTable during the Caffeine release.
Google Cloud Datastore backs over 100 production applications at Google, serving both internal and external users. Applications like Gmail, Picasa, Google Calendar, Android Market, and AppEngine use Cloud Datastore and Megastore.
Google Trends uses MillWheel for stream processing. Google Ads initially used MySQL and later migrated to F1 DB, a custom-written distributed relational database. YouTube uses MySQL with Vitess. Google stores exabytes of data across commodity servers with the help of the Google File System.
Source: Google Databases: How Do Google Services Store Petabyte-Exabyte Scale Data?
YouTube Database – How Does It Store So Many Videos Without Running Out Of Storage Space?