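I have a dictionary of key-value pairs. I also have a DataFrame with a column of strings, and those strings contain various keys. If a key appears in a string in that column, I want to put the corresponding value(s) in the adjacent column.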
my_dict = {'elon' : 'is awesome', 'jeff' : 'is not so awesome, but hes ok, ig', 'mustard' : 'is gross', 'pigs' : 'can fly'}
my_dict
import pandas as pd
import numpy as np
pd.DataFrame({'Name (Key)' : ['elon musk', 'jeff bezos and elon musk', 'jeff bezos', 'she bought mustard for elon'], 'Corresponding Value(s)' : [np.nan, np.nan, np.nan, np.nan]})
Desired output:
# Desired output:
pd.DataFrame({'Name (Key)' : ['elon musk', 'jeff bezos and elon musk', 'jeff bezos', 'she bought mustard for elon'],
'Corresponding Value(s)' : [['is awesome'], ['is not so awesome, but hes ok, ig', 'is awesome'], ['is not so awesome, but hes ok, ig'], ['is gross', 'is awesome']]})
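I'm still new to Python, but I assume the apply function would be used here. Or maybe map()? Would an if statement be plausible, or is there a better way to approach this problem?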
Posted on 2022-10-24 15:49:05
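Here is one way to create the additional column using .apply(). Besides the if, you also need to iterate over the words in each value of the Name (Key) column, so that several items can be collected in the list that becomes the value in the new DataFrame column.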
import pandas as pd

df = pd.DataFrame({'Name (Key)' : ['elon musk', 'jeff bezos and elon musk', 'jeff bezos', 'she bought mustard for elon']})

my_dict = {'elon' : 'is awesome',
           'jeff' : 'is not so awesome, but hes ok, ig',
           'mustard' : 'is gross',
           'pigs' : 'can fly'}

def create_corr_vals_column(row_value):
    # Collect the dictionary value for every word in the string that is a key
    cvc_row = []
    for word in row_value.split():
        if word in my_dict:
            cvc_row.append(my_dict[word])
    return cvc_row

# Apply the function to each string in the column; each result is a list
df['Corresponding Value(s)'] = df['Name (Key)'].apply(create_corr_vals_column)
print(df)
This gives:
                    Name (Key)                           Corresponding Value(s)
0                    elon musk                                     [is awesome]
1     jeff bezos and elon musk  [is not so awesome, but hes ok, ig, is awesome]
2                   jeff bezos              [is not so awesome, but hes ok, ig]
3  she bought mustard for elon                           [is gross, is awesome]
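Note that the lookup above only matches whole, whitespace-separated words exactly, so a string such as 'Elon,' would not match the key 'elon'. As a minimal sketch of a more forgiving variant, assuming that case-insensitive matching and ignoring surrounding punctuation are actually what you want (the question does not say), the same idea can be written with a list comprehension. The name lookup_values and its mapping parameter are made up for this illustration.

```python
import string

import pandas as pd

my_dict = {'elon': 'is awesome',
           'jeff': 'is not so awesome, but hes ok, ig',
           'mustard': 'is gross',
           'pigs': 'can fly'}


def lookup_values(text, mapping=my_dict):
    """Return the mapped value for every word of text that is a key in mapping.

    Hypothetical helper. Assumption: words are lowercased and stripped of
    leading/trailing punctuation before the lookup.
    """
    words = (w.strip(string.punctuation).lower() for w in text.split())
    # Values are kept in the order the words appear in the string
    return [mapping[w] for w in words if w in mapping]


df = pd.DataFrame({'Name (Key)': ['elon musk',
                                  'jeff bezos and elon musk',
                                  'jeff bezos',
                                  'she bought mustard for elon']})
df['Corresponding Value(s)'] = df['Name (Key)'].apply(lookup_values)
print(df)
```

Since the function is called once per element, df['Name (Key)'].map(lookup_values) should produce the same column here; for a Series and a plain Python function, apply and map behave alike.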
https://stackoverflow.com/questions/74187627