Data mining - How to use unigrams and bigrams as features for an SVM or logistic regression

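How can I use unigrams and bigrams as features to build a natural language inference model with an SVM or logistic regression? My dataset has premise, hypothesis, and label columns. I plan to use the unigrams and bigrams of the premise, the hypothesis, or both as one of my training features. For example: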

 premise                                      | hypothesis                       | hypothesis bigrams
===============================================================================================
 I am planning to use the unigram and bigram  | I am planning to use the unigram | [(i, am), (am, planning), (planning, to), (to, use), (use, the), (the, unigram)]

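The hypothesis bigrams column holds a list of tuples (of words), so I can't pass it directly to my SVM or logistic regression as input. Can I convert the hypothesis bigrams into a vector?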
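
One common way to do this is a bag-of-n-grams encoding: assign every distinct unigram and bigram a column index, then represent each text as a vector of counts. The following is only a minimal sketch under some assumptions: tokenization is plain lowercasing plus whitespace splitting, and the helper names (`ngrams`, `extract_features`, `build_vocabulary`, `vectorize`) are hypothetical, not taken from any library. In scikit-learn, `CountVectorizer(ngram_range=(1, 2))` builds the same kind of count matrix.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) found in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(text):
    """Assumed tokenization: lowercase, split on whitespace, then
    collect unigrams followed by bigrams."""
    tokens = text.lower().split()
    return ngrams(tokens, 1) + ngrams(tokens, 2)


def build_vocabulary(texts):
    """Map every n-gram seen in the training texts to a column index."""
    vocab = {}
    for text in texts:
        for gram in extract_features(text):
            vocab.setdefault(gram, len(vocab))
    return vocab


def vectorize(text, vocab):
    """Turn one text into a fixed-length count vector.
    N-grams that are not in the vocabulary are ignored."""
    counts = Counter(extract_features(text))
    vector = [0] * len(vocab)
    for gram, count in counts.items():
        if gram in vocab:
            vector[vocab[gram]] = count
    return vector


premise = "I am planning to use the unigram and bigram"
hypothesis = "I am planning to use the unigram"

# Build the vocabulary from the training texts only.
vocab = build_vocabulary([premise, hypothesis])

# One feature row per example: premise vector followed by hypothesis vector.
row = vectorize(premise, vocab) + vectorize(hypothesis, vocab)
```

Each `row` has a fixed length, so a list of such rows can be passed as the feature matrix to an SVM or logistic regression. Concatenating the premise and hypothesis vectors keeps the two sides distinguishable; whether that helps on a given dataset is something to check empirically.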