How do I extract specific substrings and separate text from numbers in a pandas DataFrame?
I have some data in a DataFrame in the format below. Please see the image link below
The problem I'm trying to solve is two-fold
- For the salary column, I want to separate the text from the numbers and extract the value. Wherever there is a range, I want to take the average
- Depending on whether the salary is hourly/weekly/yearly etc., I want to add a column for salary type based on which substring is present (such as 'year', 'month', 'week', 'hour' etc.)
The final output should look like what is in the image below
Thanks!
Comments (2)
This can work for you
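One possible way to do this is with a regular expression. The sketch below is only an illustration, not code posted in this thread: the function name split_salary, the PERIODS mapping and its labels are all hypothetical, and it assumes every number in the string is part of the salary.

```python
import re

# Hypothetical mapping from substring to salary-type label; adjust to the real data.
PERIODS = {
    "year": "yearly",
    "month": "monthly",
    "week": "weekly",
    "day": "daily",
    "hour": "hourly",
}

# Matches integers and decimals such as "15" or "15.50".
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def split_salary(text):
    """Return (average_value, salary_type) extracted from one salary string."""
    # Drop thousands separators so "35,000" is read as one number.
    cleaned = text.replace(",", "")
    numbers = [float(n) for n in NUMBER.findall(cleaned)]
    # A single number is its own average; a range is averaged.
    value = sum(numbers) / len(numbers) if numbers else None
    lowered = text.lower()
    # First keyword found in the text decides the salary type.
    salary_type = next(
        (label for key, label in PERIODS.items() if key in lowered), None
    )
    return value, salary_type
```

With a DataFrame df that has a text column named Salary (an assumed name), the function could be applied row by row with df["Salary"].map(split_salary) and the resulting tuples split into two new columns.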
This is an interesting question, but next time please provide the input data as something we can copy/paste.
What you need is a function that converts the salary string into the value and the salary type.
You can parse over the characters in the string to find the numbers, and use a boolean switch when you encounter the "-" (dash) character, in case you need to calculate an average.
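The character-by-character approach described above can be sketched roughly as follows. This is a minimal illustration under assumptions, not the original answer's code: the name parse_salary, the SALARY_TYPES tuple and the sample strings are hypothetical, and it assumes the string contains no numbers other than the salary figures.

```python
# Hypothetical keyword list; extend it to match the real data.
SALARY_TYPES = ("year", "month", "week", "day", "hour")
DASHES = ("-", "\u2013")  # hyphen and en dash


def parse_salary(text):
    """Return (value, salary_type) for a string like '$20 - $30 an hour'."""
    low, high = "", ""
    seen_dash = False  # boolean switch: flips once a dash follows a number
    for ch in text:
        if ch in "0123456789.":
            # Digits go to the second number only after the dash was seen.
            if seen_dash:
                high += ch
            else:
                low += ch
        elif ch in DASHES and low:
            seen_dash = True
    # A sentence-ending period may have been collected; drop it.
    low, high = low.rstrip("."), high.rstrip(".")
    if low and high:
        value = (float(low) + float(high)) / 2  # range: take the average
    elif low:
        value = float(low)
    else:
        value = None
    lowered = text.lower()
    salary_type = next((t for t in SALARY_TYPES if t in lowered), None)
    return value, salary_type
```

Commas and currency symbols are skipped simply because they are neither digits nor dashes, so "$35,000" is read as 35000. The function can then be mapped over the salary column and the returned pairs unpacked into a value column and a salary type column.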