如何从 MySQL 备份文件中删除空表

发布于 2024-08-22 22:38:16 字数 1238 浏览 0 评论 0原文

我有多个大型 MySQL 备份文件，全部来自不同的数据库并具有不同的架构。我想将备份加载到我们的 EDW 中，但我不想加载空表。

现在我正在使用 AWK 删除备份文件上的空表，但我想知道是否有更好的方法来做到这一点。

如果有人感兴趣，这是我的 AWK 脚本：

编辑：我今天注意到这个脚本有一些问题，如果您想实际尝试使用它，请注意。您的输出可能是错误的...我将在做出更改后发布它们。

# File: remove_empty_tables.awk
# Copyright (c) Northwestern University, 2010
# http://edw.northwestern.edu

/^--$/ {
  i = 0;
  line[++i] = $0; getline

  if ($0 ~ /-- Definition/) {
    inserts = 0;
    while ($0 !~ / ALTER TABLE .* ENABLE KEYS /) {
      # If we already have an insert:
      if (inserts > 0)
        print
      else {
        # If we found an INSERT statement, the table is NOT empty:
        if ($0 ~ /^INSERT /) {
          ++inserts
          # Dump the lines before the INSERT and then the INSERT:
          for (j = 1; j <= i; ++j) print line[j]
          i = 0
          print $0
        }
        # Otherwise we may yet find an insert, so save the line:
        else line[++i] = $0
      }
      getline # go to the next line
    }
    line[++i] = $0; getline
    line[++i] = $0; getline
    if (inserts > 0) {
      for (j = 1; j <= i; ++j) print line[j]
      print $0
    }
    next
  } else {
    print "--"
  }
}

{
  print
}

原文

I have multiple large MySQL backup files all from different DBs and having different schemas. I want to load the backups into our EDW but I don't want to load the empty tables.

Right now I'm cutting out the empty tables using AWK on the backup files, but I'm wondering if there's a better way to do this.

If anyone is interested, this is my AWK script:

EDIT: I noticed today that this script has some problems, please beware if you want to actually try to use it. Your output may be WRONG... I will post my changes as I make them.

# File: remove_empty_tables.awk
# Copyright (c) Northwestern University, 2010
# http://edw.northwestern.edu

/^--$/ {
  i = 0;
  line[++i] = $0; getline

  if ($0 ~ /-- Definition/) {
    inserts = 0;
    while ($0 !~ / ALTER TABLE .* ENABLE KEYS /) {
      # If we already have an insert:
      if (inserts > 0)
        print
      else {
        # If we found an INSERT statement, the table is NOT empty:
        if ($0 ~ /^INSERT /) {
          ++inserts
          # Dump the lines before the INSERT and then the INSERT:
          for (j = 1; j <= i; ++j) print line[j]
          i = 0
          print $0
        }
        # Otherwise we may yet find an insert, so save the line:
        else line[++i] = $0
      }
      getline # go to the next line
    }
    line[++i] = $0; getline
    line[++i] = $0; getline
    if (inserts > 0) {
      for (j = 1; j <= i; ++j) print line[j]
      print $0
    }
    next
  } else {
    print "--"
  }
}

{
  print
}

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

幽梦紫曦～ 2024-08-29 22:38:16

我想不出 mysqldump 中的任何选项可以跳过备份中的空表。也许是 -where 选项，但不确定你可以做通用的事情。恕我直言，第二个脚本的后期处理并没有那么糟糕。

回复收藏 0 原文

生生漫 2024-08-29 22:38:16

使用正则表达式和 Perl Oneliner。它的工作原理是匹配注释标头+空格+下一个标头的开头。一个用于有序转储，下一个用于非有序转储。

perl -0777 -pi -e 's/--\s*-- Dumping data for table \`\w+\`\s*--\s*-- ORDER BY\:  [^\n]+\s+(?=--)//g' "dump.sql"
perl -0777 -pi -e 's/--\s*-- Dumping data for table \`\w+\`\s*--\n(?!--)\s*(?=--)//g' "dump.sql"

Using regex and perl one liners. It works by matching the comment header + white space + start of next header. One is for ordered dumps and the next is for non-ordered dumps.

perl -0777 -pi -e 's/--\s*-- Dumping data for table \`\w+\`\s*--\s*-- ORDER BY\:  [^\n]+\s+(?=--)//g' "dump.sql"
perl -0777 -pi -e 's/--\s*-- Dumping data for table \`\w+\`\s*--\n(?!--)\s*(?=--)//g' "dump.sql"

回复收藏 0 原文

~没有更多了~