基于 SAS 中的 2 个变量合并数据集
我正在使用不同的数据库。它们全部包含 1000 多家公司的信息。公司由其股票代码(通常在股票报价板上看到的名称的缩写(即福特为 F)定义)。
除了要合并的股票代码之外,我还必须按时间合并。我在整个时间序列中使用月份作为计数变量。 的回归
最终目的是进行Y(jt) = c + X(jt) +X1(jt)
等与 j = company
(股票代码) 和t = 时间
(月)。
想象一下我有两个数据库,其中一个是基本数据库,其中包含股票、月份、公司贝塔值(风险度量)等变量,第二个数据库包含额外变量(比如说市值)。
然后我想做的是根据股票代码和月份合并这两个数据库。
示例: 基础数据库:
Ticker ____ Month ____ Betas AA ____ 4 ____ 1.2 BB ____ 8 ____ 1.18
第二个数据库:
Ticker ____ Month ____ MCAP AA ____ 4 ____ 8542 BB ____ 6 ____ 1245
然后合并后我希望得到这样的结果:
Ticker ____ Month ____ Betas ____ MCAP AA ____ 4 ____ 1.2 ____ 8542
因此,所有与日期和股票代码都不匹配的观察结果都必须被删除。我确信这是可能的,只是找不到正确类型的代码。
PS:我猜下划线与字体布局有关,但粗体和斜体都应该是正常的:)
I'm working with different databases. All of them contain information about 1000+ companies. A company is defined by its ticker code (the short version of the name (i.e. Ford as F) usually seen on stock quotation boards).
Aside from the ticker code to merge on I also have to merge on the time. I used month as a count variable throughout my time series. The final purpose is to have a regression in the kind of
Y(jt) = c + X(jt) +X1(jt)
etc with j = company
(ticker) and t = time
(month).
So imagine I have 2 databases, one of which is the base database with variables such as Tickers, months, betas of a company (risk measure) etc. and a second database which has an extra variable (let's say market capitalisation).
What I want to do then is to merge these 2 databases based on the ticker and the month.
Example: Base database:
Ticker ____ Month ____ Betas AA ____ 4 ____ 1.2 BB ____ 8 ____ 1.18
Second database:
Ticker ____ Month ____ MCAP AA ____ 4 ____ 8542 BB ____ 6 ____ 1245
Then after merge I would like to have something like this:
Ticker ____ Month ____ Betas ____ MCAP AA ____ 4 ____ 1.2 ____ 8542
So all observations that do not match BOTH the date and ticker have to be dropped. I'm sure this is possible, just can't find the right type of code.
PS: I'm guessing the underscores have something to do with font layout but both the bold as italic is supposed to be normal :)
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
同意乔纳森的观点...在按代码和时间对两个数据集进行独立排序后,合并的数据步骤就是我将使用的......稍加修改
两个数据集中没有共同代码和时间的记录将被自动删除..
Agree with Jonathan... after sorting both datasets independently by ticker and time, the data step of merging is what I would use..... little modification
Records that don't have common ticker and time in both datasets would be dropped automatically..
调用两个数据集 base 和 mcap,并假设它们都已按代码和月份排序,您可以这样做:
子集 if 不会接受任何与 Bath 数据集中不匹配的行。
Calling the two datasets base and mcap, and assuming that they have both been sorted by ticker and month, you can do it this way:
The subsetting if will not accept any row that does not match in bath datasets.
好吧,看来你可以很容易地做到这一点:
这是我使用的代码,我在发布之前尝试过这个,因为我不想看起来像个水蛭,但碰巧我尝试的两个数据库没有任何共同点( 70.000 个观察值的几率是多少 :D),我重试了它并且它有效(现在!)
无论如何谢谢!
Ok so it appears you can just do it very easily by:
That's the code I used, I tried this before posting because I didn't want to look like a leecher but it so happens that the 2 databases i tried had nothing in common (what are the odds with 70.000 observations :D), I retried it and it works (for now!)
Thanks anyway!