Data mining - Facing a difficult regex problem when cleaning text data - 吾爱随笔录

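I am trying to replace a sequence of words with some symbols in long strings that appear across multiple documents. For example, suppose I want to remove: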

Decision and analysis and comments

from a long string. Let the string be:

s = Management's decision and analysis and comments is to be removed.

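I want to remove Decision and analysis and comments from s. The problem is that between the words Decision, and, analysis, and, comments, s may contain 0, 1, or more spaces and newlines (\n), varying from document to document with no consistent pattern. For example, one document shows: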

Management's decision  \n \n and analysis\n and \n comments is to be removed

while another has a different pattern. How can I handle this and still remove the phrase from the string?

I tried the following, which of course did not work:

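import re
s = "Management's decision  \n \n and analysis\n and  \n comments is to be removed"
# No match: [\s\n] consumes exactly one whitespace character, and the match is case-sensitive
re.sub(r'Decision[\s\n]and[\s\n]analysis[\s\n]and[\s\n]comments', '', s)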
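
One possible way to address the two problems in the attempt above (a fixed single whitespace character between words, and case sensitivity) is to build the pattern from the phrase itself, joining the words with `\s*` and matching with `re.IGNORECASE`. This is only a minimal sketch: the helper name `remove_phrase` is hypothetical, and it assumes the phrase can be split on whitespace into words that should match literally.

```python
import re

def remove_phrase(text, phrase):
    """Remove `phrase` from `text`, ignoring case and inter-word whitespace.

    `remove_phrase` is a hypothetical helper written for illustration.
    """
    # Escape each word so it matches literally, then allow zero or more
    # whitespace characters (spaces, tabs, newlines) between the words.
    pattern = r"\s*".join(re.escape(word) for word in phrase.split())
    return re.sub(pattern, "", text, flags=re.IGNORECASE)

s = "Management's decision  \n \n and analysis\n and  \n comments is to be removed"
cleaned = remove_phrase(s, "Decision and analysis and comments")
print(repr(cleaned))
```

Because `\s` already includes `\n`, a character class such as `[\s\n]` is not needed. If at least one whitespace character should be required between the words, `\s+` can be used as the joiner instead of `\s*`. The whitespace on either side of the removed phrase is left in place, so a follow-up normalization step may be wanted.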

