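My goal: using pandas, check a column for matching text (not an exact match) and, where it matches, update a new column.
From a CSV file, I create a DataFrame and check whether the values in a specific column, COLUMN_to_Check, match a text pattern such as 'PEA'. Depending on whether the pattern matches, I want a new column in the DataFrame set to YES or NO.
I have the following data in the file DATA2.csv: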
ASSIGNMENT,Open date,Resolved date,COLUMN_to_Check,NUMBER,Open Time,RESOLVED_GROUP,RESOLVED_TIME,SUBCATEGORY
GBL_IS_GRC_PROCESSCONTROL,3/1/2017 13:39,11/1/2017 13:09,APAC_LT-ERP-FICO-BOKADABISH_PRD,IM-17-001200,3/1/2017 13:39,GBL_GSO_MQG,11/1/2017 13:09,Security (breach or weakness)
RSP_SERVICEDESK,12/1/2017 0:08,12/1/2017 0:27,APAC_LT-ERP-SALES-PEA_PRD,IM-17-006462,12/1/2017 0:08,RSP_SERVICEDESK,12/1/2017 0:27,failure
RSP_SERVICEDESK,10/1/2017 5:27,12/1/2017 0:52,APAC_LT-ERP-SUPPLY-PEA_PRD,IM-17-004667,10/1/2017 5:27,RSP_PCS_INCIDENTS,12/1/2017 0:52,failure
RSP_SERVICEDESK,12/1/2017 2:35,12/1/2017 3:03,APAC_LT-ERP-SALES-PEA_PRD,IM-17-006483,12/1/2017 2:35,RSP_SERVICEDESK,12/1/2017 3:03,access
RSP_SAP_BI,10/1/2017 21:04,12/1/2017 6:01,APAC_LT-ERP-SALES-PEA_PRD,IM-17-005498,10/1/2017 21:04,RSP_SAP_SALES,12/1/2017 6:01,SAP Sales
And I am using this code:
import pandas as pd
df = pd.read_csv('DATA2.csv')
Search_for_These_values = ['PEA', 'DEF', 'XYZ']  # creating list
pattern = '|'.join(Search_for_These_values)  # joining list for comparison
IScritical = df['COLUMN_to_Check'].str.contains(pattern)
for CHECK in IScritical:
    if not CHECK:
        print(CHECK)
        df['NEWcolumn'] = 'NO'
    else:
        print(CHECK)
        df['NEWcolumn'] = 'YES'
df.to_csv('OUPUT.csv')
Printing the value of 'CHECK' gives the correct values, i.e. the first row returns False:
C:\Users\ME\Documents\SandBox (master)
λ python numpytest_pub.py
False
True
True
True
True
But the output CSV file shows every value of 'NEWcolumn' as 'YES', whereas the value in row [0] of 'NEWcolumn' should be 'NO', because 'COLUMN_to_Check' in that row does not match the pattern:
,ASSIGNMENT,Open date,Resolved date,COLUMN_to_Check,NUMBER,Open Time,RESOLVED_GROUP,RESOLVED_TIME,SUBCATEGORY,NEWcolumn
0,GBL_IS_GRC_PROCESSCONTROL,3/1/2017 13:39,11/1/2017 13:09,APAC_LT-ERP-FICO-BOKADABISH_PRD,IM-17-001200,3/1/2017 13:39,GBL_GSO_MQG,11/1/2017 13:09,Security (breach or weakness),YES
1,RSP_SERVICEDESK,12/1/2017 0:08,12/1/2017 0:27,APAC_LT-ERP-SALES-PEA_PRD,IM-17-006462,12/1/2017 0:08,RSP_SERVICEDESK,12/1/2017 0:27,failure,YES
2,RSP_SERVICEDESK,10/1/2017 5:27,12/1/2017 0:52,APAC_LT-ERP-SUPPLY-PEA_PRD,IM-17-004667,10/1/2017 5:27,RSP_PCS_INCIDENTS,12/1/2017 0:52,failure,YES
3,RSP_SERVICEDESK,12/1/2017 2:35,12/1/2017 3:03,APAC_LT-ERP-SALES-PEA_PRD,IM-17-006483,12/1/2017 2:35,RSP_SERVICEDESK,12/1/2017 3:03,access,YES
4,RSP_SAP_BI,10/1/2017 21:04,12/1/2017 6:01,APAC_LT-ERP-SALES-PEA_PRD,IM-17-005498,10/1/2017 21:04,RSP_SAP_SALES,12/1/2017 6:01,SAP Sales,YES
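A likely explanation for the all-'YES' output is that assigning a single string to a DataFrame column fills every row, so each pass through the loop overwrites the whole column and only the last assignment survives. The sketch below illustrates that behavior on a two-row stand-in DataFrame; the in-memory data and the names `after_no` and `after_yes` are assumptions made for illustration, not part of the original script.

```python
import pandas as pd

# Tiny stand-in for DATA2.csv (assumption): only the checked column is kept.
df = pd.DataFrame({
    "COLUMN_to_Check": [
        "APAC_LT-ERP-FICO-BOKADABISH_PRD",
        "APAC_LT-ERP-SALES-PEA_PRD",
    ]
})

# Assigning a scalar to a column sets that value in every row.
df["NEWcolumn"] = "NO"
after_no = df["NEWcolumn"].tolist()

# A second scalar assignment replaces the whole column again.
df["NEWcolumn"] = "YES"
after_yes = df["NEWcolumn"].tolist()

print(after_no)   # ['NO', 'NO']
print(after_yes)  # ['YES', 'YES']
```

Since the last value of CHECK in the printed output is True, the final assignment in the loop would be 'YES' for all rows, which matches the file shown above.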
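I sense that something is missing in the CHECK part, but I can't figure out what. Can anyone help?
Let me know if the question needs rephrasing to make it easier to understand or more useful to the community in the future.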
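For reference, one possible way to get a per-row YES/NO is to skip the loop and build the new column from the boolean mask in a single step. This is a minimal sketch, not necessarily the only or best fix: it assumes `numpy.where` is acceptable, uses an in-memory DataFrame in place of DATA2.csv, passes `na=False` so that missing values count as non-matches, and introduces the lowercase names `search_values` and `is_critical` purely for illustration.

```python
import numpy as np
import pandas as pd

# In-memory stand-in for DATA2.csv (assumption): just the column being checked.
df = pd.DataFrame({
    "COLUMN_to_Check": [
        "APAC_LT-ERP-FICO-BOKADABISH_PRD",
        "APAC_LT-ERP-SALES-PEA_PRD",
        "APAC_LT-ERP-SUPPLY-PEA_PRD",
        "APAC_LT-ERP-SALES-PEA_PRD",
        "APAC_LT-ERP-SALES-PEA_PRD",
    ]
})

search_values = ["PEA", "DEF", "XYZ"]
pattern = "|".join(search_values)  # regex alternation: PEA|DEF|XYZ

# Boolean Series, one entry per row; na=False treats missing values as no match.
is_critical = df["COLUMN_to_Check"].str.contains(pattern, na=False)

# Pick 'YES' or 'NO' row by row instead of assigning one scalar to the column.
df["NEWcolumn"] = np.where(is_critical, "YES", "NO")

print(df["NEWcolumn"].tolist())  # ['NO', 'YES', 'YES', 'YES', 'YES']
```

With the real file, the same two lines that compute `is_critical` and `NEWcolumn` could follow `pd.read_csv('DATA2.csv')`, and `df.to_csv(...)` would then write one value per row.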