Count the number of rating records provided per year
My mapper is:
import sys

for line in sys.stdin:
    # split line into the four fields
    fields = line.strip().split("\t")
    value = fields[2]  # rating
    key = fields[3]  # timestamp in unix seconds
    print(key, value, sep="\t")
My reducer is:
import sys

(last_key, count) = (None, 0)
for line in sys.stdin:
    (key, value) = line.strip().split("\t")
    # emit the finished group whenever the key changes
    if last_key and last_key != key:
        print(last_key, count, sep="\t")
        count = 0
    last_key = key
    count += int(value)
print(last_key, count, sep="\t")
How do I get the number of ratings? The mapper works fine. And when should I convert the timestamp (last_key in this case)?
Output should be (year-month \t number of rating records)
1 Answer
If you want to reduce (group by) the year-month string, that needs to be the key of the mapper, rather than a key with second precision. After that, you just need to count the number of values for that key in the reducer. (The mapper's value can just be 1 rather than the actual rating: if you only need to count the ratings, the reducer can simply sum the values.)
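As a sketch of that change (not the only way to do it), the timestamp conversion belongs in the mapper, so the year-month string is already the key by the time records are grouped. This assumes the standard datetime module and UTC timestamps:

Mapper:

import sys
from datetime import datetime, timezone

for line in sys.stdin:
    # split line into the four fields
    fields = line.strip().split("\t")
    ts = int(fields[3])  # timestamp in unix seconds
    # convert to a "YYYY-MM" string so records group by year-month
    key = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m")
    print(key, 1, sep="\t")  # emit 1, since we only need a count

Reducer:

import sys

(last_key, count) = (None, 0)
for line in sys.stdin:
    (key, value) = line.strip().split("\t")
    if last_key and last_key != key:
        print(last_key, count, sep="\t")
        count = 0
    last_key = key
    count += int(value)  # summing the 1s counts the records
if last_key is not None:
    print(last_key, count, sep="\t")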
I'd recommend using the mrjob library rather than manually reading from sys.stdin.
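A minimal mrjob version might look like this (the class name MRRatingCounts is made up; the MRJob base class and its mapper/reducer hooks are the library's real API):

from datetime import datetime, timezone
from mrjob.job import MRJob

class MRRatingCounts(MRJob):
    def mapper(self, _, line):
        fields = line.strip().split("\t")
        ts = int(fields[3])  # timestamp in unix seconds
        yield datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m"), 1

    def reducer(self, key, values):
        yield key, sum(values)  # sum of 1s = number of records

if __name__ == "__main__":
    MRRatingCounts.run()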
Alternatively, you can rewrite the code in PySpark and do the same operation in fewer lines.
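A PySpark sketch under the same assumptions (the input path "ratings.tsv" and the column names are placeholders):

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("rating-counts").getOrCreate()

# four tab-separated fields; "ratings.tsv" is a placeholder path
df = (spark.read.option("sep", "\t")
      .csv("ratings.tsv")
      .toDF("user_id", "movie_id", "rating", "timestamp"))

counts = (df.withColumn("year_month",
                        F.from_unixtime(F.col("timestamp").cast("long"), "yyyy-MM"))
            .groupBy("year_month")
            .count()
            .orderBy("year_month"))

counts.show()  # one row per year-month with its record count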