Converting multiple Python output lines to a single DataFrame and merging with the source data
I apologize if this is a rudimentary question. I feel like it should be easy but I cannot figure it out. I have the code that is listed below that essentially looks at two columns in a CSV file and matches up job titles that have a similarity of 0.7. To do this, I use difflib.get_close_matches. However, the output is multiple single lines and whenever I try to convert to a DataFrame, every single line is its own DataFrame and I cannot figure out how to merge/concat them. All code, as well as current and desired outputs are below. Any help would be much appreciated.
Current Code is:
import pandas as pd
import difflib
df = pd.read_csv('name.csv')
aLists = list(df['JTs'])
bLists = list(df['JT'])
n=3
cutoff = 0.7
for aList in aLists:
    best = difflib.get_close_matches(aList, bLists, n, cutoff)
    print(best)
Current Output is:
['SW Engineer']
['Manu Engineer']
[]
['IT Help']
Desired Output is:
Output
0 SW Engineer
1 Manu Engineer
2 (blank)
3 IT Help
The table I am attempting to do this on is:
[Image: snapshot of the required table format] https://i.sstatic.net/8va9u.png
Any help would be greatly appreciated!
Comments (2)
Here is a simple way to achieve this. I first convert each result to a string. Then the first and last brackets are removed from that string, and it is appended to a global list.
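A minimal sketch of that string-based idea, not necessarily the answerer's exact code. The sample lists are hypothetical stand-ins for the 'JTs' and 'JT' columns of name.csv, and stripping the surrounding quotes is an added assumption so that each entry is just the title:

```python
import difflib

# Hypothetical stand-ins for df['JTs'] and df['JT'] from name.csv
a_lists = ["Sr SW Engineer", "Jr Manu Engineer", "Accountant", "IT Helper"]
b_lists = ["SW Engineer", "Manu Engineer", "Nurse", "IT Help"]

n = 3
cutoff = 0.7

results = []  # global list, one entry per row of a_lists
for a in a_lists:
    best = difflib.get_close_matches(a, b_lists, n, cutoff)
    # str(['SW Engineer']) is "['SW Engineer']"; the slice drops the brackets.
    # Assumption: also strip the quotes, which is only clean for a single match.
    results.append(str(best)[1:-1].strip("'"))

print(results)
```

An empty match list becomes an empty string, so the rows stay aligned with the input. If a row can have several matches, ", ".join(best) is probably a cleaner way to flatten it than slicing the string form.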
Output
Here is another, better way to achieve this.
You can simply use numpy to concatenate multiple arrays into a single one. You can then convert it back to a normal Python list if you want.
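One way this might look, as a sketch only. The sample lists are hypothetical stand-ins for the CSV columns, and replacing an empty result with [""] is an assumption added here, because concatenating an empty array would otherwise drop that row and shift everything after it:

```python
import difflib
import numpy as np

# Hypothetical stand-ins for df['JTs'] and df['JT'] from name.csv
a_lists = ["Sr SW Engineer", "Jr Manu Engineer", "Accountant", "IT Helper"]
b_lists = ["SW Engineer", "Manu Engineer", "Nurse", "IT Help"]

arrays = []
for a in a_lists:
    best = difflib.get_close_matches(a, b_lists, n=3, cutoff=0.7)
    # Assumption: keep a blank slot for rows without a match
    arrays.append(np.array(best or [""]))

combined = np.concatenate(arrays)  # one flat array of strings
output = combined.tolist()         # back to a plain Python list

print(output)
```

Note that a row with more than one match contributes more than one element, so the result only lines up with the source rows when each title has at most one match.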
Output
Thanks
You could use pandas' .apply() to run your function on each entry. The result could then either be added as a new column or used to create a new DataFrame. For example:
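A sketch of the new-column variant, under stated assumptions: the DataFrame below is hypothetical sample data in place of name.csv, and best_match is a hypothetical helper that keeps only the closest match (or an empty string when nothing clears the cutoff):

```python
import difflib
import pandas as pd

# Hypothetical stand-in for pd.read_csv('name.csv')
df = pd.DataFrame({
    "JTs": ["Sr SW Engineer", "Jr Manu Engineer", "Accountant", "IT Helper"],
    "JT": ["SW Engineer", "Manu Engineer", "Nurse", "IT Help"],
})

candidates = df["JT"].tolist()

def best_match(title, n=3, cutoff=0.7):
    """Return the closest candidate for title, or '' if none clears the cutoff."""
    best = difflib.get_close_matches(title, candidates, n, cutoff)
    return best[0] if best else ""

# Series.apply calls best_match once per entry in the 'JTs' column
df["Output"] = df["JTs"].apply(best_match)

print(df)
```

Because the result is assigned back to df, the matches stay on the same rows as the source data, so no separate merge or concat step is needed.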
Or for a new dataframe:
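A possible shape for the separate-DataFrame variant, again only a sketch: the data is hypothetical, and joining multiple matches with ", " is an assumption about how rows with more than one match should be shown:

```python
import difflib
import pandas as pd

# Hypothetical stand-in for pd.read_csv('name.csv')
df = pd.DataFrame({
    "JTs": ["Sr SW Engineer", "Jr Manu Engineer", "Accountant", "IT Helper"],
    "JT": ["SW Engineer", "Manu Engineer", "Nurse", "IT Help"],
})

candidates = df["JT"].tolist()

# Each entry of 'matches' is the list returned by get_close_matches
matches = df["JTs"].apply(
    lambda title: difflib.get_close_matches(title, candidates, n=3, cutoff=0.7)
)

# Flatten each list to a string; an empty list becomes ''
out = pd.DataFrame({"Output": matches.apply(", ".join)})

print(out)
```

The new DataFrame shares the index of df, so it can later be combined with the source using pd.concat([df, out], axis=1) if both are needed side by side.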
Giving you:
Or: