计算交易的最高进先出
我正在尝试在交易中使用先进先出的会计方法。 “最高进先出”意味着当您出售时,您首先出售最昂贵的股票。
这是我的买入和卖出(示例借自R计算总收益或使用先进先出法的损失 - 这是一个类似但不同的问题):
buy = data.frame(BuyTransactionID = c(1:10),
Ticker=c(rep('MSFT',4),rep('AMZN',3),rep('DOCU',3)),
Date=c(rep('01-01-2018',2),rep('01-14-2020',2),rep('01-01-2018',2),rep('01-14-2020',1),'01-01-2018','03-15-2020','04-06-2020'),
Price=c(100,102,102,107,2000,2010,2011,197,182,167),
Quantity=c(10,10,5,5,1,1,2,12,15,15))
sell = data.frame(SellTransactionID=c(1:7),
Ticker=c('MSFT','MSFT','AMZN','AMZN','DOCU','DOCU','DOCU'),
Date=c('01-07-2020','01-20-2020','01-01-2020','01-30-2020','01-15-2020','04-10-2020','04-20-2020'),
Price=c(97,110,2100,2050,210,205,225),
Quantity=c(15,12,1,3,10,5,3))
规则如下:
- 您首先出售最昂贵(最高价格)的股票。
- 您不能在购买股票之前出售股票
- 您不能多次出售相同的股票
示例问题:
第一次销售 (SellTransactionID = 1) 是 01-07 的 15 股 MSFT -2020年。因此,在此日期之前购买的任何商品都可以出售。根据日期,符合资格出售的股票是来自 BuyTransactionID 1 和 2 的股票。BuyTransactionID 2 是最高价格。因此,BuyTransactionID 2 的所有 10 股均已售出,其余 5 股来自 BuyTransactionID 1。
期望输出:
“售出日期”= 售出日期(不言自明) ;
'Ticker' = 已售出的股票代码;
“收益”= 销售总额;
“成本基础”= 已售股票的加权平均值。
示例解决方案:
这是 SellTransactionID 1 的解决方案。正确的解决方案会自动执行此操作并计算所有 SellTransactionID。
result <- data.frame(SellDate = '01-07-2020', Ticker = "MSFT", Proceeds = 1455, CostBasis = 101.33)
成本基础示例:
成本基础按加权平均值计算。对于前面的示例,成本基础的计算方式如下:(Quantity1 * Price1 + Quantity2 * Price2 + .....)/所有数量的总和
因此例如上面的示例:(10 * 102 + 5 * 100)/ 15
I am trying to use the Highest In, First Out accounting method on trades.
Highest In, First Out means that when you sell, you sell your most expensive shares first.
Here are my buys and sells (example borrowed from R calculate aggregate gains or loss using FIFO method - this is a similar, but different problem):
buy = data.frame(BuyTransactionID = c(1:10),
Ticker=c(rep('MSFT',4),rep('AMZN',3),rep('DOCU',3)),
Date=c(rep('01-01-2018',2),rep('01-14-2020',2),rep('01-01-2018',2),rep('01-14-2020',1),'01-01-2018','03-15-2020','04-06-2020'),
Price=c(100,102,102,107,2000,2010,2011,197,182,167),
Quantity=c(10,10,5,5,1,1,2,12,15,15))
sell = data.frame(SellTransactionID=c(1:7),
Ticker=c('MSFT','MSFT','AMZN','AMZN','DOCU','DOCU','DOCU'),
Date=c('01-07-2020','01-20-2020','01-01-2020','01-30-2020','01-15-2020','04-10-2020','04-20-2020'),
Price=c(97,110,2100,2050,210,205,225),
Quantity=c(15,12,1,3,10,5,3))
Here are the rules:
- You sell the most expensive (highest price) shares first.
- You cannot sell shares before you purchased them
- You cannot sell the same shares multiple times
Example problem:
The first sale (SellTransactionID = 1) is 15 shares of MSFT on 01-07-2020. So, any purchase made before that date can be sold. Based on date, the eligible shares to be sold are those from BuyTransactionID 1 and 2. BuyTransactionID 2 is the highest price. Therefore, all 10 shares of BuyTransactionID 2 are sold and the remaining 5 shares come from BuyTransactionID 1.
Desired output:
'Date Sold' = the date sold (self-explanatory);
'Ticker' = the ticker sold;
'Proceeds' = the total dollar amount sold;
'Cost basis' = a weighted average of the shares sold.
Example solution:
This is the solution for SellTransactionID 1. A properly solution automates this and calculates for all SellTransactionIDs.
result <- data.frame(SellDate = '01-07-2020', Ticker = "MSFT", Proceeds = 1455, CostBasis = 101.33)
Cost Basis Example:
Cost basis is calculated as a weighted average. For the preceding example, Cost Basis is calculated as such: (Quantity1 * Price1 + Quanity2 * Price2 + .....)/sum of all Quantity(s)
So for example above: (10 * 102 + 5 * 100)/15
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
@DPH 的答案非常好,但不幸的是不够准确。我会解释原因。
这是一个新数据集,其中所有购买都在销售之前:
如果应用 @DPH 的解决方案,您将得到以下结果:
没有改变,最近三笔交易的“Sales_Cost”也没有改变。发生这种情况是因为该函数确定首次出售后剩余的股票数量以及剩余股票的平均价格是多少。首次出售之前购买的股票不能再单独出售。它们现在被视为具有平均价格和剩余股份数量的单一实体。
例如,本例中总共购买了 76 股。首次出售出售 7 股。现在,如“Remain_Qtd”中所示,仍有 69 股保留。计算剩余股票的平均价格 - 该价格为 106.5652 美元。现在,该流程认为所有 69 股股票的定价为 106.5652 美元,剩余销售量会减少“Remain_Qtd”的数量,但不会更改“Remain_Price”。剩余股份不能再按购买时的价格考虑,它们共同构成剩余股份和平均剩余价格。
发生这种情况是因为对象
dfo
以及对象sdf
中dfo
的回收。特别是,该行计算平均剩余价格,然后通过 dfo 和 sdf 回收该价格。Quantity = sum(sdf$Quantity)
将所有剩余份额加在一起。我认为@DPH 的答案非常出色,但希望可以对其进行修改,以单独处理每次购买,而不是汇总过去的购买。
The answer by @DPH is excellent, but unfortunately not quite accurate enough. I will explain why.
Here is a new dataset where all the purchases precede the sales:
If you apply the solution from @DPH, you will get this result:

Notice that the 'Remain_Price' does not change, nor does the 'Sales_Cost' for the last three transactions. This happens because the function determines how many shares remain after the first sale and what the average price of the remaining shares is. The shares purchased preceding the first sale can no longer be sold individually. They are now treated as a single entity with an average price and the remaining number of shares.
For example, a total of 76 shares were bought in this example. The first sale sells 7 shares. Now, 69 shares remain as seen in 'Remain_Qtd'. An average price is calculated for those remaining shares - that price is $106.5652. Now, the process considers all 69 shares to be priced at $106.5652 and the remaining sales reduce the quantity of 'Remain_Qtd', but does not change the 'Remain_Price'. The remaining shares can no longer be considered at the price that they were bought at, they are collectively part of the remaining shares and the average remaining price.
This occurs because of the object
dfo
and the recycling ofdfo
in the objectsdf
. In particular, this line calculates an average remaining price that is then recycled throughdfo
andsdf
.and
Quantity = sum(sdf$Quantity)
adds together all the remaining shares.I think the answer by @DPH is brilliant, but hope that it can be modified to treat each purchase individually rather than aggregating past purchases.
如果我正确理解您的问题,这是一种可能的解决方案。在简历中,我将销售额和购买数据组合在一起,并将其分组到销售块中(由销售ID给出)。这假设销售ID的顺序是根据日期列的。然后,我顺序循环浏览这些销售块,并将中间结果写入单个数据框架。对于每个销售块处理,将对同一股票的最后一个销售块结果过滤此结果数据框。这意味着销售量不得根据时间表大于可用数量(因为您无法出售自己没有的东西,无论如何,我必须将其指出为可能的限制)
提议的循环解决方案1不是在R中使用数据的最佳方法,因为它是一个循环,它会生长一个data.frame。由于您列出了
purrr
标签,因此我为答案的第二部分调整了与Map()函数一起工作的代码。在进入实际编码之前,让我们首先准备数据(以相同方式对答案的两个部分需要):
1标准循环
2循环循环恢复为purrr解决方案(请注意全局分配操作员(&lt;&lt; - 而不是) &lt ;-)用于在功能结束时分配DFO)
编辑
要跟踪其余股票,我们需要第二个DF来保存当前的投资组合数据。我没有优化代码并仅编辑循环,但是
purrr
适应应非常直接。If I understood your problem correctly this is one possible solution. In resume I am combinig the sales and buys data and group it in sales blocks (given by the sales ID). This assumes that the order of sales IDs is according to the date column. I then loop over these sales blocks sequentially and write the intermediate result to a individual dataframe. For each sales block processing this result dataframe is filtered for the last sales block result of the same ticker. This means sales quantity must not be larger than available quantity according to the timeline (since you can not sell what you not have this should not be of concern anyhow I have to point it out as a possible limitation)
The proposed loop solution 1 is not the best way to work data in R since it is a loop, which grows a data.frame. Since you listed the
purrr
tag I adapted the code for the second part of the answer to work with the map() function.Before we get to the actual coding lets prepare the data first (need for both parts of the answer the same way):
1 Standard loop
2 loop rephrase as purrr solution (be aware of the global assignment operartor (<<- instead of <-) for assignment of dfo at end of function)
EDIT
To keep track of the remaining stocks we need a second df to hold the current portfolio data. I did not optimize the code and editted only the loop, the
purrr
adaption should be pretty straight foreward though.这是我在@DPH 的帮助下得出的最终工作解决方案。我对@DPH 编辑的解决方案做了一些更改。
df[]
)更改为 dplyr 子集化(即df %>% filter()
)。出于某种原因,基本子集化导致我的实际数据集中出现具有 NA 值的行,即使它并没有导致示例数据集。 NA 行导致解决方案无法正常工作:数据框准备:
函数和 purrr:
如果有人有任何问题,请告诉我,我想将其放入闪亮的应用程序中,如果您有兴趣合作,请告诉我。
Here is the final working solution that I have come to with the help of @DPH. I have made a couple of changes to @DPH's edited solution.
dfh
object. The updates solution does work with the modified dateset that I provided but not the original dateset. I have modified the answer so that it works when all shares are sold.df[]
) to dplyr subsetting (i.e.,df %>% filter()
. For some reason the base subsetting was resulting in rows with NA values in my actual dataset even though it did not cause that in the sample dataset. The NA rows caused the solution not to work.data frame prep:
function and purrr:
Let me know if anyone has any issues. I'd like to get this into a shinyapp and maybe develop it more. Let me know if you're interested in collaborating.