Tools for generating mock data?
I'm looking for recommendations of a good, free tool for generating sample data for the purpose of loading into test databases. By analogy, something that produces "lorem ipsum" text for any RDBMS. Features I'm looking for include:
- Flexibility to generate data for an existing table definition.
- Ability to generate small and large data sets (1 million rows or more).
- Generate in SQL script format (INSERT statements) or else in a flat file format suitable for bulk import (which is usually faster).
- A command-line interface for easy scripting.
- Extensible, open source, written in a dynamic language (these are nice-to-haves, not strong requirements).
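To make the SQL script format concrete, here is a minimal sketch in Python (standard library only) that emits batched INSERT statements. The table name `users`, its columns, and the helper names are hypothetical, chosen only for illustration. Multi-row `VALUES` lists are accepted by many databases but not all, so treat the batching as an assumption to check against your RDBMS.

```python
import random
import string


def sql_literal(value):
    """Render a Python value as a SQL literal; single quotes are doubled."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def insert_statements(table, columns, rows, batch_size=1000):
    """Yield multi-row INSERT statements with up to batch_size rows each."""
    header = "INSERT INTO " + table + " (" + ", ".join(columns) + ") VALUES\n"
    batch = []
    for row in rows:
        batch.append("(" + ", ".join(sql_literal(v) for v in row) + ")")
        if len(batch) == batch_size:
            yield header + ",\n".join(batch) + ";"
            batch = []
    if batch:
        # Flush the final, possibly shorter, batch.
        yield header + ",\n".join(batch) + ";"


def random_rows(count, seed=0):
    """Generate (id, name, age) tuples; a fixed seed makes runs repeatable."""
    rng = random.Random(seed)
    for row_id in range(1, count + 1):
        name = "".join(rng.choices(string.ascii_lowercase, k=8))
        yield (row_id, name, rng.randint(18, 90))


if __name__ == "__main__":
    # Hypothetical table and columns, for illustration only.
    for statement in insert_statements("users", ["id", "name", "age"],
                                       random_rows(5), batch_size=2):
        print(statement)
```

Because both functions are generators, rows are produced and written one batch at a time, so the same script should scale to large row counts without holding the whole data set in memory.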
PS: I did search for a duplicate question on StackOverflow, but I didn't find one. If there is one, I'll be grateful to get a pointer to it.
Thanks for the great responses everyone! I should amend my requirements to note that I use Mac OS X as my primary development environment, not Windows (though I did say a command-line interface is desirable, and that practically rules out Windows). The Windows-specific suggestions will no doubt be useful to other readers of this question, though, so thanks.
Here is my conclusion:
- GenerateData:
  - PHP web app interface, not command line
  - limited to generating 200 records (or pay $20 for a license to generate 5,000 records)
- RedGate SQL Data Generator
  - not free, price $295
  - requires Windows, .NET, SQL Server
- Visual Studio 2008 Database Edition
  - requires Windows
  - requires costly MSDN or ISV subscription
- Banner Datatect
  - not free, price $595
  - requires Windows (?)
  - no support for MySQL (?)
  - GUI, not command line or scriptable
- Ruby Faker gem
  - way too slow to use ActiveRecord for bulk data load
- Super Smack
  - chiefly a load-testing tool, with a random data generator built in
  - pretty simple to use nevertheless
  - overall a good runner-up tool
- Databene Benerator
  - best solution for my needs
  - XML scripts, compatible with DbUnit
  - open source (GPL) Java code
  - command-line usage
  - access many databases directly via JDBC
Comments (16)
Take a look at databene benerator, a test data generator that looks close to your requirements.
I would give it a try.
BTW, a list of similar products is available on databene benerator's web site.
This looks quite promising: generatedata.com. Open-source, has lots of built-in data types.
There are several others listed here: Test (Sample) Data Generators. I don't have experience with any of them, but a few on that list look like they could be pretty decent.
Try http://www.mockaroo.com
This is a tool my company made to help test our own applications. We've made it free for anyone to use. It's basically the Forgery ruby gem with a web app wrapped around it. You can generate data in CSV, txt, or SQL formats. Hope this helps.
I know you said you were looking for a free tool, but this is one case where I would suggest that spending $295 will pay you back quickly in time saved. I've been using the RedGate tool SQL Data Generator for the last year and it is, in short, an awesome tool. It allows for setting dependencies between columns and generates realistic data for business objects such as phone numbers, URLs, names, etc. I can honestly state that this tool has paid for itself time and time again.
If you are looking for, or willing to use, something MySQL-specific, you could take a look at Super Smack. It is currently maintained by Tony Bourke.
Super Smack allows you to generate random data to insert into your database tables. It is customizable, allowing you to use the packaged words.dat file, or any test data of your choice.
One of the nice things about it is that its command line is highly customizable. There are some fairly decent examples of usage in the book High Performance MySQL, which is also excerpted here.
Not sure if that is along the lines of what you are looking for, but just a thought.
A Ruby script with one of the available fake data generators should do you just fine.
http://faker.rubyforge.org/ is one such gem. Unfortunately, this doesn't fulfill all your requirements.
Here is another: http://random-data.rubyforge.org/
And a tutorial for using Faker: http://www.rubyandhow.com/how-to-generate-fake-names-addresses-in-ruby/
RE: Flexibility to generate data for an existing table definition. Combine the Faker gem with one of the available ORMs. ActiveRecord would probably be easiest.
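As a rough illustration of the script-based approach, here is a minimal sketch that writes fake rows to a flat CSV file instead of going through an ORM. It is written in Python with only the standard library, as a stand-in for Faker rather than an example of the Faker API. The name lists, column names, and function names are all hypothetical.

```python
import csv
import random
import sys

# Tiny hypothetical name pools; a real generator library ships much larger ones.
FIRST_NAMES = ["Alice", "Bob", "Carol", "Dave", "Eve"]
LAST_NAMES = ["Smith", "Jones", "Nguyen", "Garcia", "Kim"]


def fake_person(rng, row_id):
    """Build one row: id, first name, last name, and a derived email address."""
    first = rng.choice(FIRST_NAMES)
    last = rng.choice(LAST_NAMES)
    email = (first + "." + last + str(row_id) + "@example.com").lower()
    return [row_id, first, last, email]


def write_rows(handle, row_count, seed=42):
    """Write a header line plus row_count fake rows as CSV to an open file."""
    rng = random.Random(seed)  # fixed seed so test data is reproducible
    writer = csv.writer(handle)
    writer.writerow(["id", "first_name", "last_name", "email"])
    for row_id in range(1, row_count + 1):
        writer.writerow(fake_person(rng, row_id))


if __name__ == "__main__":
    # Usage: python make_people.py 1000000 > people.csv
    count = int(sys.argv[1]) if len(sys.argv) > 1 else 10
    write_rows(sys.stdout, count)
```

The resulting file can then be loaded with the database's own bulk import facility (for example `LOAD DATA INFILE` in MySQL or `COPY` in PostgreSQL), which avoids issuing one insert per row.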
Normally very costly, but if you are a small ISV you can get Visual Studio 2008 Database Edition very cheaply; see the Empower and BizSpark promotions. It provides a lot more functionality than just generating test data (integration with SCC, unit testing, DB refactoring, etc.).
As I like the fact that Red Gate tools are so easy to learn, I would still look at SQL Data Generator.
A tool that really should not be missing from the list is the Data Generator from Datanamic, which populates databases directly or generates insert scripts and has a large collection of pre-installed generators (and supports multiple databases...).
http://www.datanamic.com/datagenerator/index.html
I know you're not looking for actual lorem ipsum text; but in case anyone else searches for an actual lorem ipsum generator and finds this thread: lipsum.com does a great job of it.
Not free, but Visual Studio 2008 Database Edition is a good alternative and it provides a lot more functionality (Integration with SCC, Unit Testing, DB Refactoring, etc...)
I use a tool called Datatect.
I've used this tool to generate as many as 40,000,000 rows of data for a SQL Server database, and 8,000,000 rows of data for an Oracle database.
I am in no way affiliated with Banner Systems, just a satisfied customer.
Here is a list of such tools (both free and commercial):
http://c2.com/cgi/wiki?TestDataGenerator
For OS X there is Data Creator (US $7). The download is free for testing purposes, so you can use it to evaluate the software and its features.
It requires OS X Lion or later. It can generate many different field types and has a custom export mode plus some presets (TSV, CSV, HTML table, web page with a table inside).
http://www.tensionsoftware.com/osx/datacreator/
It is also available on the App Store:
https://itunes.apple.com/us/app/data-creator/id491686136?mt=12
You can use DbSchema (www.dbschema.com). It's a database management tool, and it has a Random Data Generator to populate your database.
Not a direct answer to your question, but this can be helpful for certain kinds of data:
Fake Name Generator (http://www.fakenamegenerator.com/) can be useful. It is not suited to everything, but it works for user accounts and the like. AFAIK they support bulk orders.
+1 for Benerator: I tried 3 or 4 of the other tools on offer (including dbmonster) but found Benerator to be very quick, to deliver realistic data and to be flexible. I also got very quick & helpful feedback from the tool's creator when I posted on the forum.