Importing a Wikipedia database dump kills Navicat - anyone got any ideas?

Posted 2024-07-19 09:09:36 · 256 words · 4 views · 0 comments

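OK, I've downloaded the Wikipedia XML dumps, and one of the tables alone is a whopping 12 GB of data. I want to import it into a MySQL database on my localhost. However, it's a humongous 12 GB file, and Navicat is obviously taking its sweet time importing it, or more likely it has hung :(.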

Is there a way to import this dump, or at least import it partially, you know, bit by bit?

Let me correct that: it's 21 GB of data, not that it helps :\ Does anyone have any idea how to import such a huge file into a MySQL database?

一萌ing 2024-07-26 09:09:36

Use the command line instead. Navicat is horrible at importing large files and will likely take 20x longer than the CLI.
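
For reference, a minimal sketch of what a command-line import can look like. The `import_dump` wrapper, the `MYSQL_BIN` override and the example names below are my own placeholders, not something from the answer above; the only real piece is feeding the file to the stock `mysql` client on stdin.

```shell
# Hypothetical wrapper around the stock mysql client.
# $1 = dump file, $2 = database name, $3 = MySQL user (defaults to root).
# MYSQL_BIN is an assumed override so a different client binary can be used.
import_dump() {
  "${MYSQL_BIN:-mysql}" -u "${3:-root}" -p "$2" < "$1"
}
```

Something like `import_dump enwiki-20140811-page.sql wiki root` would then run the import, assuming a database called `wiki` already exists. The bare `-p` makes the client prompt for the password instead of taking it from the command line.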


地狱即天堂 2024-07-26 09:09:36

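Take a look at a SAX parser. It lets you read the corpus piece by piece instead of reading the whole 12 GB into memory. I'm not too sure how you would interface it with MySQL, though.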


爱殇璃 2024-07-26 09:09:36

This is quite an old question, FWIW, but I'm refreshing it with a new answer. I've run into the same issue: sitting for hours while a single massive SQL file runs is risky, and hitting any problem basically means you start all over again. Here is what I did to reduce the risk and gain some performance via the CLI.

Split the massive SQL file into smaller, more manageable chunks. For example, 'enwiki-20140811-page.sql' can be split into files of about 75 MB each.
```
split -l 75 enwiki-20140811-page.sql split_
```
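This will produce a fair number of files with the 'split_' prefix in their names (split_aa, split_ab, and so on). Note that -l 75 counts lines, not megabytes, so the roughly 75 MB chunk size relies on each line of the dump being about 1 MB long.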
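
If the lines in your dump are a different length, the line count can be estimated from the file itself. This is only a rough sketch: `lines_for_size` is a made-up helper name, and it works from the average line length, so short header lines at the top of the dump will skew the result slightly.

```shell
# Hypothetical helper: print how many lines per chunk give roughly
# $2 bytes per chunk for file $1, based on the file's average line length.
lines_for_size() {
  # the $(( )) wrapper strips any padding some wc implementations add
  total_bytes=$(( $(wc -c < "$1") ))
  total_lines=$(( $(wc -l < "$1") ))
  # an empty file has no average line length; fall back to 1
  [ "$total_bytes" -gt 0 ] || { echo 1; return; }
  n=$(( $2 * total_lines / total_bytes ))
  # never go below one line per chunk
  [ "$n" -ge 1 ] || n=1
  echo "$n"
}
```

For a target of about 75 MB per chunk, the call would be along the lines of `split -l "$(lines_for_size enwiki-20140811-page.sql 75000000)" enwiki-20140811-page.sql split_`.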

Iterate over this file list and import one file at a time, using a simple shell script such as this:

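    # HOST, USER, PSWD and DB must be set before running this
    for f in split_*
    do
      echo "Processing $f file..."
      # stop at the first failed chunk so the import can be resumed from it
      mysql -h "$HOST" -u "$USER" -p"$PSWD" "$DB" < "$f" || break
    done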

If this ever breaks for some reason, you can easily resume where you left off.
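
One way to make resuming mechanical rather than manual is to move each chunk out of the way once it has been imported. The sketch below is my own variation, not part of the original script: `import_chunks` and the `done/` directory are hypothetical names, and the import command is passed in as arguments so it can be anything that reads SQL on stdin.

```shell
# Hypothetical resumable import.
# $1 = directory holding the split_* chunks; remaining args = import command.
import_chunks() {
  chunk_dir=$1
  shift
  mkdir -p "$chunk_dir/done"
  for f in "$chunk_dir"/split_*
  do
    # skip the unexpanded pattern when no chunks are left
    [ -f "$f" ] || continue
    echo "Processing $f file..."
    # stop at the first failure and leave that chunk in place
    "$@" < "$f" || { echo "Failed on $f" >&2; return 1; }
    # a chunk that imported cleanly is moved aside so a rerun skips it
    mv "$f" "$chunk_dir/done/"
  done
}
```

A call would look something like `import_chunks . mysql -h "$HOST" -u "$USER" -p"$PSWD" "$DB"`. After a failure, fix the problem and run the same command again; only the chunks still sitting in the directory are processed.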

Splitting the SQL file by line count prevents breaking any large INSERT statements, since each one sits on a single line. However, if you set the line count too low, you could split the DROP and CREATE statements at the beginning of the SQL file. This is easily fixed by opening the first few split files and repairing the broken statements by hand.
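
A quick way to spot that situation before importing is to check the first chunk. This is a rough sketch that assumes the usual mysqldump layout, where the CREATE TABLE statement comes before the first INSERT; `header_intact` is a made-up name.

```shell
# Hypothetical check: succeeds when chunk $1 holds both the start of the
# CREATE TABLE statement and at least one INSERT, which (assuming the
# CREATE comes before the INSERTs) means the CREATE was not cut in two.
# It also fails for a chunk that has no CREATE TABLE at all.
header_intact() {
  grep -q '^CREATE TABLE' "$1" && grep -q '^INSERT INTO' "$1"
}
```

For example, `header_intact split_aa || echo "schema header was split, fix it by hand"` flags the case described above.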
