Storing SAS data (including table structure) in a single flat file
I need to convert SAS data tables into flat files (or "ASCII files", as they were once called, as opposed to binary files), with exactly one flat file for each original SAS table.
The challenge is that I also want the flat file to contain some structural information about the original SAS table, specifically:
- Variable/Column name
- Variable/Column label
- Variable/Column type
- Variable/Column length
- Variable/Column format
- Variable/Column informat
Additional information:
- I will only need to convert small data (< 100 obs).
- Performance is not an issue (within reasonable limits).
- The flat file should form a basis for recreating the original SAS table; I don't need to be able to use the file directly as a table in DATA or PROC steps.
Standard SAS tables, transport files, XPORT files, etc. are all binary formats, while the standard XML table format in SAS and CSV files don't preserve table structure. So none of these options help.
What is my best option?
Comments (5)
I'm not aware of any easy solutions.
Possibly:
Now you've got an ASCII description of the table (spread over two CSV files). Reversing the process would be trickier. Basically you'd have to read in the description data set, use CALL SYMPUT in a loop to create a set of macro variables holding that information, and then use those macro variables to build a PROC IMPORT for the CSV file...
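The steps listed under "Possibly:" are not shown here, so the following is only one plausible way to end up with the two CSV files mentioned, not necessarily what this answer proposed. It assumes PROC CONTENTS with OUT= for the description and PROC EXPORT for both files; the table name work.mytable and the /tmp paths are hypothetical.

```sas
/* Sketch only: work.mytable and the /tmp paths are hypothetical. */

/* 1. Write the column metadata (NAME, TYPE, LENGTH, LABEL, FORMAT,
      INFORMAT, ...) to a data set instead of the listing output. */
proc contents data=work.mytable out=work.mytable_meta noprint;
run;

/* 2. Export the description as CSV. */
proc export data=work.mytable_meta
    outfile="/tmp/mytable_meta.csv"
    dbms=csv replace;
run;

/* 3. Export the data itself as CSV. */
proc export data=work.mytable
    outfile="/tmp/mytable.csv"
    dbms=csv replace;
run;
```

In the OUT= data set, TYPE is numeric (1 for numeric variables, 2 for character variables), which matters when the description is read back to rebuild the table.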
Create the code to export the table to text (this is straightforward; just google it or look at 'The Little SAS Book' if you have a copy).
Then append the 'meta' info from sashelp.vcolumn, which is where SAS exposes information (metadata) about SAS data sets. It's a SAS table itself, so you could use a PROC SQL union operation to combine it with the actual columns that it describes (though you will need a transpose-type operation, because the metadata about the columns is stored in rows, not columns).
You haven't said exactly how you want the metadata to appear in the text file, so that's as far as I can go.
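As a rough illustration of the sashelp.vcolumn lookup described above, the sketch below pulls the six attributes the question asks for and writes them to a delimited text file. The table WORK.MYTABLE, the output path, and the pipe delimiter are assumptions made for the example; the union/transpose step is left out.

```sas
/* Sketch only: WORK.MYTABLE and the output path are hypothetical. */
proc sql;
    create table work.mytable_cols as
    select varnum, name, label, type, length, format, informat
    from sashelp.vcolumn
    /* libname and memname are stored in upper case */
    where libname = 'WORK' and memname = 'MYTABLE'
    order by varnum;
quit;

/* Write one pipe-delimited line per column of the original table. */
data _null_;
    set work.mytable_cols;
    file "/tmp/mytable_meta.txt" dlm='|' dsd;
    put varnum name label type length format informat;
run;
```

Here TYPE is a character value ('num' or 'char'), unlike the numeric TYPE code that PROC CONTENTS writes to its OUT= data set.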
PROC SQL's DESCRIBE syntax might be handy for getting the metadata portion, including lengths, types, formats, indexes, etc.
Code:
Log:
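The code and log for this answer are not shown here. A minimal sketch of the call, using the sample table sashelp.class as a stand-in for the real table, might look like this:

```sas
/* Sketch only: sashelp.class stands in for the table of interest. */
proc sql;
    describe table sashelp.class;
quit;
```

DESCRIBE TABLE writes a CREATE TABLE statement for the table to the SAS log, so the log itself would have to be captured (for example with PROC PRINTTO) to get that text into a file.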
With SAS 9.2, you can create an XML file from a data set, and the XML contains variable/column metadata such as format, label, etc. See the section of the SAS 9.2 XML LIBNAME Engine: User's Guide titled "Using the XML Engine to Transport SAS Data Sets across Operating Environments". A link to it is here:
http://support.sas.com/documentation/cdl/en/engxml/61740/HTML/default/a002594382.htm
Here's a section of code from the manual that shows how to use the XML92 libname engine and PROC COPY to create the XML:
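The manual's code is not shown here. The sketch below shows the general shape of such a step rather than the manual's exact example; the libref trans, the file path, the table name mytable, and the choice of XMLTYPE=EXPORT are assumptions that should be checked against the linked guide.

```sas
/* Sketch only: libref, path, and table name are hypothetical. */
libname trans xml92 '/tmp/mytable.xml' xmltype=export;

/* Copy one data set from WORK into the XML document. */
proc copy in=work out=trans;
    select mytable;
run;

libname trans clear;
```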
In SAS 9.1.3, you may have to create a custom tagset to get the same result. SAS Technical Support may be able to offer some help.
BTW, you haven't said why you need to do this. In this case, there is no good reason (there might be a compelling reason, such as somebody with power saying 'do it, or be fired', but there's no good reason).
I'd give up the idea of merging the metadata and data in each file, unless there's some incredibly strong reason to do so. Instead, export the metadata for data set A into a file called metadata_A; this will result in paired files. Anybody looking to use those files in a database program or statistical program would then have a clearly labeled metadata file to work with.