Handling rows with 2 lines of data
data-mining
python
pandas
missing-data
2022-01-19 19:27:24
1 answer
The code below should achieve the desired result:
import pandas as pd
import numpy as np
import re
# np.nan is used because the np.NaN alias was removed in NumPy 2.0
df2 = pd.DataFrame([[1, 'plugs: $3.00'],
                    [4, np.nan],
                    [7, 'quarts: $3.00']],
                   columns=['name', 'price'])
df2
   name          price
0     1   plugs: $3.00
1     4            NaN
2     7  quarts: $3.00
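def price(x):
    # Return (type, cost) as strings, or ('', '0') when the text does not match
    # Raw string avoids invalid-escape warnings; + requires at least one digit or dot
    rprice = re.search(r'(plugs:|quarts:)\s*\$([\d.]+)', x)
    if rprice is None:
        return ('', '0')
    else:
        return rprice.groups()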
# Replace NaN with an empty string so re.search always receives a str
df2.fillna("", inplace=True)
# Preview the (type, cost) tuples extracted from each row
df2['price'].map(price)
df2['Plugs'] = df2['price'].map(lambda x: float(price(x)[1]) if price(x)[0] == 'plugs:' else 0)
df2['Quart'] = df2['price'].map(lambda x: float(price(x)[1]) if price(x)[0] == 'quarts:' else 0)
df2  # the new columns appear below
   name          price  Plugs  Quart
0     1   plugs: $3.00    3.0    0.0
1     4                   0.0    0.0
2     7  quarts: $3.00    0.0    3.0
A few notes: I used a regular expression to extract the type and cost, and replaced the NA values with empty strings.
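If you would rather avoid calling a Python function on every row, the same idea can probably be done with pandas string methods. This is only a sketch, not part of the answer above: the group names `kind` and `cost` and the variable names `df`, `parts` and `cost` are my own, and it assumes the price text always looks like `plugs: $3.00` or `quarts: $3.00`. Here the NaN row does not need to be filled first, because `str.extract` leaves unmatched rows as NaN.

```python
import numpy as np
import pandas as pd

# Same sample data as in the answer (names chosen for this sketch)
df = pd.DataFrame({"name": [1, 4, 7],
                   "price": ["plugs: $3.00", np.nan, "quarts: $3.00"]})

# Named groups become column names; rows that do not match
# (including NaN) come back as NaN in both columns.
parts = df["price"].str.extract(
    r"(?P<kind>plugs|quarts):\s*\$(?P<cost>\d+(?:\.\d+)?)"
)
cost = pd.to_numeric(parts["cost"])

# Keep the cost only where the type matches, otherwise use 0.0
df["Plugs"] = cost.where(parts["kind"] == "plugs", 0.0)
df["Quart"] = cost.where(parts["kind"] == "quarts", 0.0)

print(df)
```

On the three sample rows this should give the same `Plugs` and `Quart` values as the `map` version, while leaving the original `price` column (including its NaN) untouched.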