How to generate dumps for various databases
I have a CSV file and want to generate dumps of the data for sqlite, mysql, postgres, oracle, and mssql.
Is there a common API (ideally Python based) to do this?
I could use an ORM to insert the data into each database and then export dumps, however that would require installing each database. It also seems a waste of resources - these CSV files are BIG.
I am wary of trying to craft the SQL myself because of the variations with each database. Ideally someone has already done this hard work, but I haven't found it yet.
Comments (3)
SQLAlchemy is a database library that, in addition to its ORM functionality, supports SQL generation in the dialects (http://www.sqlalchemy.org/docs/dialects/index.html) of all the databases you mention, and more.
In normal use, you could create a SQL expression / instruction (using a schema.Table object), create a database engine, and then bind the instruction to the engine, to generate the SQL.
However, the engine is not strictly necessary; the dialects each have a compiler that can generate the SQL without a connection; the only caveat being that you need to stop it from generating bind parameters as it does by default:
The above example actually works; it assumes you know the target database table schema; it should be easily adaptable to import from a CSV and generate for multiple target database dialects.
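As an illustration of the engine-free approach described above, here is a minimal sketch. It assumes a reasonably recent SQLAlchemy release and a hypothetical `people` table with `id` and `name` columns; the table, the `DIALECTS` mapping and the `render_insert` helper are made up for this example. It relies on the `literal_binds` compile flag, which asks the compiler to inline values as SQL literals instead of emitting bind parameters.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table
from sqlalchemy.dialects import mssql, mysql, oracle, postgresql, sqlite

# Hypothetical target table; replace with the real schema.
metadata = MetaData()
people = Table(
    "people",
    metadata,
    Column("id", Integer),
    Column("name", String(50)),
)

# One dialect object per target database. No engine or connection is created.
DIALECTS = {
    "sqlite": sqlite.dialect(),
    "mysql": mysql.dialect(),
    "postgres": postgresql.dialect(),
    "oracle": oracle.dialect(),
    "mssql": mssql.dialect(),
}


def render_insert(row, dialect):
    """Return one INSERT statement with the row's values inlined as literals."""
    stmt = people.insert().values(**row)
    compiled = stmt.compile(
        dialect=dialect,
        compile_kwargs={"literal_binds": True},  # no bind parameters
    )
    return str(compiled) + ";"


if __name__ == "__main__":
    row = {"id": 1, "name": "O'Brien"}
    for name, dialect in DIALECTS.items():
        print(f"-- {name}")
        print(render_insert(row, dialect))
```

To turn a CSV into a dump, one could loop over `csv.DictReader` rows and write each rendered statement to a per-dialect output file. Literal rendering only covers simple types such as integers and strings, so columns with dates or binary data may need extra handling.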
I am no database wizard, but AFAIK there is no common API in Python that does what you ask out of the box. PEP 249 defines an API that modules accessing databases should use, and AFAIK it is used at least by the MySQL and PostgreSQL Python modules, so it could perhaps be a starting point.
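For the SQLite target specifically, the standard library already gets part of the way there, since `sqlite3` follows PEP 249 and needs no server. The sketch below assumes a CSV with a header row and a hypothetical table name `people`, and it leaves all columns untyped. `Connection.execute`, `Connection.executemany` and `iterdump()` are `sqlite3` extensions rather than part of PEP 249.

```python
import csv
import io
import sqlite3


def sqlite_dump_from_csv(csv_file, table="people", database=":memory:"):
    """Load CSV rows into SQLite and return the resulting SQL dump as text.

    csv_file is any iterable of lines, e.g. an open file. For very large
    inputs, pass a file path as `database` instead of ":memory:".
    """
    reader = csv.reader(csv_file)
    header = next(reader)  # the first row is assumed to hold column names
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)

    conn = sqlite3.connect(database)  # PEP 249 module-level connect()
    try:
        conn.execute(f'CREATE TABLE "{table}" ({columns})')
        # executemany consumes the reader lazily, one row at a time
        conn.executemany(
            f'INSERT INTO "{table}" VALUES ({placeholders})', reader
        )
        conn.commit()
        return "\n".join(conn.iterdump())  # sqlite3-specific, not PEP 249
    finally:
        conn.close()


if __name__ == "__main__":
    sample = io.StringIO("id,name\n1,O'Brien\n2,Smith\n")
    print(sqlite_dump_from_csv(sample))
```

This only covers SQLite; the other four databases have no equivalent serverless module in the standard library.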
The road I would attempt to follow myself - however - would be another one:
The scripts for processing the dump file could be very compact, although they might be somewhat tricky if you use regex for parsing the lines. Here's an example script MySQL → SQLite that I simply pasted from this page:
You could write your script in Python (in which case you should have a look at re.compile for performance).
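A rough idea of what such a Python script could look like, with the patterns compiled once via `re.compile`. This is a deliberately simplified sketch and not the script referred to above: the pattern names and the `convert_line` helper are made up, and it only handles backtick identifiers, table options after the closing parenthesis, a few MySQL-only statements, and `\'` escapes.

```python
import re

# Compile every pattern once, outside the per-line loop.
BACKTICK = re.compile(r"`([^`]*)`")                # `name` -> "name"
TABLE_OPTIONS = re.compile(r"^\)\s*ENGINE=.*;\s*$")  # ") ENGINE=InnoDB ...;"
SKIP_LINE = re.compile(r"^(LOCK TABLES|UNLOCK TABLES|SET |/\*!)")
ESCAPED_QUOTE = re.compile(r"\\'")                 # \' -> ''


def convert_line(line):
    """Convert one MySQL dump line to SQLite-friendly SQL, or None to drop it.

    Simplification: a backslash-escaped backslash right before a closing
    quote is not handled correctly.
    """
    line = line.rstrip("\n")
    if SKIP_LINE.match(line):
        return None
    if TABLE_OPTIONS.match(line):
        return ");"
    line = BACKTICK.sub(r'"\1"', line)
    return ESCAPED_QUOTE.sub("''", line)


def convert_dump(lines):
    """Yield converted lines lazily so large dumps are never held in memory."""
    for line in lines:
        converted = convert_line(line)
        if converted is not None:
            yield converted


if __name__ == "__main__":
    sample = [
        "LOCK TABLES `t` WRITE;\n",
        "INSERT INTO `t` VALUES (1,'O\\'Brien');\n",
    ]
    for out in convert_dump(sample):
        print(out)
```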
The rationale behind my choice would be:
EDIT: A template-based approach
If for any reason you don't feel confident enough to write the SQL yourself, you could use a sort of template-based script. Here's how I would do it:
The obvious drawback of this approach is that your "template" will only work for one table. Its strongest point is that writing such a script would be extremely easy and quick.
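One possible shape for a template-based script, as a sketch only. It assumes a hypothetical `people` table with an integer `id` column and a text `name` column; the `TEMPLATES` mapping and the helper names are invented for this example. It treats empty CSV fields as NULL and escapes single quotes by doubling them. Other dialect-specific details, such as backslash handling in MySQL string literals, are not covered.

```python
import csv
import io

# Hypothetical per-database templates for a single table.
TEMPLATES = {
    "sqlite": 'INSERT INTO "people" ("id", "name") VALUES ({id}, {name});',
    "mysql": "INSERT INTO `people` (`id`, `name`) VALUES ({id}, {name});",
    "postgres": 'INSERT INTO "people" ("id", "name") VALUES ({id}, {name});',
    "oracle": "INSERT INTO people (id, name) VALUES ({id}, {name});",
    "mssql": "INSERT INTO [people] ([id], [name]) VALUES ({id}, {name});",
}


def sql_literal(value, numeric=False):
    """Render one CSV field as a SQL literal (empty field -> NULL)."""
    if value == "":
        return "NULL"
    if numeric:
        return str(int(value))  # raises ValueError on non-integer input
    return "'" + value.replace("'", "''") + "'"


def render_dump(csv_file, target):
    """Yield one INSERT statement per CSV row for the given target database.

    csv_file is any iterable of lines, so a large file is streamed row by row.
    """
    template = TEMPLATES[target]
    for row in csv.DictReader(csv_file):
        yield template.format(
            id=sql_literal(row["id"], numeric=True),
            name=sql_literal(row["name"]),
        )


if __name__ == "__main__":
    sample = io.StringIO("id,name\n1,O'Brien\n2,\n")
    for statement in render_dump(sample, "mysql"):
        print(statement)
```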
HTH at least a bit!
You could do this - Create SQL tables from CSV files
or Generate Insert Statements from CSV file
or try this Generate .sql from .csv python
Of course, you might need to tweak the scripts mentioned to suit your needs.