How to create a variable by searching for a character (substring) in a string variable using a loop and an if statement
Hi, everyone. This variable contains alphabetic and alphanumeric characters. If a value contains 'm' it is in millions, and if it contains 'Th.' it is in thousands.
df['mkt_value']
0 €15.00m
1 €1.00m
2 €100Th.
3 €3.00m
4 €900Th.
5 Free
I intend to i) identify whether each string value is in millions (m) or thousands (Th.) by creating a dummy variable, and then ii) use this dummy to build a new integer variable in which millions are expressed in thousands.
# Desired output
df['mi']
0 15000
1 1000
2 100
3 3000
4 900
5 nan
So, I first do some setup, then create a dummy with a loop, and finally create an integer for the thousands:
    m = 'm'
    th = 'Th'
    dtype = {"money": "category"}
    l_MKV = df['mkt_value'].tolist()
    df['mi'] = df['mkt_value'].str.strip('mTh.€')
    # loop for var dummy
    for x in l_MKV:
        if m in x:
            df["money"] = 1
        else:
            df["money"] = 0
    # var integer for thousands: 1 million, 0 thousand
    if df["money"] == 1:
        df["miles"] = int(df['mi']) * 100
    else:
        ALL['mi']
The loop (for the dummy variable) is not working. I get:
df["money"]
0 0
1 0
2 0
3 0
4 0
And I get a syntax error for the integer variable, with no further details.
What have I missed?
Thanks for any help.
Comments (1)
The issue with your code is the way you try to modify a single row in a Series. For example,
    df["money"] = 0
will actually set all rows to zero, so after the loop finishes the column holds whatever the last iteration assigned. Rather than messing around with dummy columns, I would create a separate function for parsing the values and use DataFrame.apply():
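The code from the original answer was not preserved here, so the following is only a minimal sketch of the approach it describes, not the author's exact solution. The function name `parse_value_in_thousands` is hypothetical, and the sample frame simply reproduces the values from the question. Since only one column is involved, the sketch calls `apply` on that column (a Series) rather than on the whole DataFrame.

```python
import numpy as np
import pandas as pd


def parse_value_in_thousands(value):
    """Convert strings such as '€15.00m' or '€100Th.' to a number in thousands.

    Hypothetical helper: anything that does not match either suffix
    (for example 'Free') is returned as NaN.
    """
    if not isinstance(value, str):
        return np.nan
    text = value.strip().lstrip("€")
    if text.endswith("m"):
        # millions: 1 million = 1000 thousands
        multiplier, number = 1000, text[:-1]
    elif text.endswith("Th."):
        # already in thousands
        multiplier, number = 1, text[:-3]
    else:
        return np.nan
    try:
        return float(number) * multiplier
    except ValueError:
        return np.nan


# Sample data copied from the question
df = pd.DataFrame(
    {"mkt_value": ["€15.00m", "€1.00m", "€100Th.", "€3.00m", "€900Th.", "Free"]}
)

# Apply the parser to each value of the column
df["mi"] = df["mkt_value"].apply(parse_value_in_thousands)
print(df)
```

Because the 'Free' row becomes NaN, the resulting column is floating point rather than integer. If a true integer column is needed, converting it to the pandas nullable type with `df["mi"].astype("Int64")` is one option.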