执行时的 SQLAlchemy 模型定义
我正在使用 Elixir 和 SQLAlchemy 编写一个 ORM 来处理将类似电子表格的数据移动到 SQL 中。一般来说,类似电子表格的数据的内容是未知的,pyparsing 从文本文件中解析有关类似电子表格的数据的(元)数据
(例如:人口普查以固定的平面文件发布当前人口调查,并附有txt 文件描述了数据的内容,包括数据中每一列的列规范和文档)
正如我想象的那样,ORM 看起来像这样
class DataSet(entity)
"""a collection of many spreadsheet-like files"""
class DataFile(entity)
"""describes a single spreadsheet-like file"""
class Variable(entity)
"""describes a single column in spreadsheet-like file"""
所以,这个模型描述了一堆位于硬盘上的平面文件的内容驾驶。现在,如果我想将这些平面文件转换为 SQL,我应该
尝试将 SQL 编写为字符串并替换上面编写的模型中的信息
尝试定义一个新的 Elixir/SQLAlchemy 实体
第三个选项
最终,什么我认为我想要的是 SQL 中所有类似电子表格的数据文件作为类似电子表格的表,以及用于处理所有元数据的 Elixir/SQLAlchemy 魔法
我已经阅读了很多 SQLAlchemy 文档以及什么不是,但它们似乎都是为“所以你想写博客”类型的应用程序编写的,或者至少是在编写代码之前完全识别数据结构的应用程序。我想我正在尝试编写一个与其列规格无关的模型。
I'm writing an ORM using Elixir and SQLAlchemy to handle moving spreadsheet-like data into SQL. In general, the content of the spreadsheet-like data is unknown, and pyparsing parses (meta) data about the spreadsheet-like data from a text file
(example: the census publishes the Current Population Survey in fixed with flat files, accompanied by a txt file which describes contents of the data, including column specification and documentation for each column in data)
As I've imagined it, the ORM looks something like this
class DataSet(entity)
"""a collection of many spreadsheet-like files"""
class DataFile(entity)
"""describes a single spreadsheet-like file"""
class Variable(entity)
"""describes a single column in spreadsheet-like file"""
So, this model describes the contents of a bunch of flat files sitting on a hard drive. Now, if I want to take those flat files to SQL, should I
try to write the SQL as a string and substitute information from the model written above
try to define a new Elixir/SQLAlchemy entity
some third option
At the end of the day, what I think I want is all of the spreadsheet like data files in SQL as spreadsheet-like tables, and Elixir/SQLAlchemy magic for handling all the meta data
I've read a lot of SQLAlchemy docs and what not, but they all seem to be written for 'so you want to write a blog' type applications, or at least applications where the structure of the data is completely identified before writing code. I guess I'm trying to write a model which is agnostic about its column specifications.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
我的第一个想法是,长生不老药对你解决问题不会有太大帮助。
我的建议是选择 2),因为您尝试根据您拥有的元数据构建一个表。
(重新)阅读架构文档,了解如何以编程方式添加列,然后创建表:
http://www.sqlalchemy.org/docs/core/schema.html
例如
然后您可以开始插入到新创建的表中。
您还需要跟踪生成的表数据文件之间的映射。
如果生成的表与架构匹配,则可以将其重新用于另一个数据文件。
My first thought would be that elixir won't benefit you much towards a solution.
My advice would be to go with 2) in that you try to build up a table based on the metadata that you have.
(re)read the schema docs to see how you'd add columns programmatically and then create the table:
http://www.sqlalchemy.org/docs/core/schema.html
e.g.
Then you can start inserting into your newly created table.
You'll also want to keep track of the mapping between generated tables datafiles.
You could reuse a generated table for another datafile if it's schema matched.