PySpark UDF to detect "actor"
I have a matrix (a DataFrame) and I want to find all the rows where the row and the column intersect at a '1', that is, where the row's 'character' value matches a column name and that column holds '1' for the row.
Example: Sam is an actor. His row has 'actor' as its 'character' value and a '1' in the 'actor' column, so this is a row I would want returned.
from pyspark.sql import SparkSession

# In the pyspark shell `spark` already exists; getOrCreate() reuses it.
spark = SparkSession.builder.getOrCreate()

df = spark.createDataFrame(
    [
        ("actor", "sam", "1", "0", "0", "0", "0"),
        ("villan", "jack", "0", "0", "0", "0", "0"),
        ("actress", "rose", "0", "0", "0", "1", "0"),
        ("comedian", "mike", "0", "1", "1", "0", "1"),
        ("musician", "young", "1", "1", "1", "1", "0")
    ],
    ["character", "name", "actor", "villan", "comedian", "actress", "musician"]
)
+---------+-----+-----+------+--------+-------+--------+
|character| name|actor|villan|comedian|actress|musician|
+---------+-----+-----+------+--------+-------+--------+
|    actor|  sam|    1|     0|       0|      0|       0|
|   villan| jack|    0|     0|       0|      0|       0|
|  actress| rose|    0|     0|       0|      1|       0|
| comedian| mike|    0|     1|       1|      0|       1|
| musician|young|    1|     1|       1|      1|       0|
+---------+-----+-----+------+--------+-------+--------+