Data mining - Why do we need to add START <s> + END </s> symbols when using recurrent neural networks for sequence-to-sequence models?

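In sequence-to-sequence models, we often see START (e.g. <s>) and END (e.g. </s>) symbols added to the inputs and outputs before training the model and before inference/decoding on unseen data.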

For example, http://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html

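SOS_token = 0  # index reserved for the start-of-sequence symbol
EOS_token = 1  # index reserved for the end-of-sequence symbol


class Lang:
    def __init__(self, name):
        self.name = name
        self.word2index = {}  # word -> integer index
        self.word2count = {}  # word -> number of occurrences seen so far
        self.index2word = {0: "SOS", 1: "EOS"}
        self.n_words = 2  # Count SOS and EOS

    def addSentence(self, sentence):
        # Register every space-separated word of the sentence
        for word in sentence.split(' '):
            self.addWord(word)

    def addWord(self, word):
        if word not in self.word2index:
            # New word: assign the next free index (real words start at 2)
            self.word2index[word] = self.n_words
            self.word2count[word] = 1
            self.index2word[self.n_words] = word
            self.n_words += 1
        else:
            self.word2count[word] += 1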

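Is there a technical definition or academic explanation of why this is necessary?
Or is the END symbol only needed for natural language processing tasks in which sentence generation has to terminate?
And what is the START symbol for, other than providing an initial state from which the trained network begins inference?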
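To make the question concrete, here is a minimal sketch of how the two reserved indices are commonly used; it is not the tutorial's actual code. The helper names (encode_target, teacher_forcing_pairs, greedy_decode) and the lookup table toy_next are hypothetical, and toy_next only stands in for a trained decoder. The sketch assumes the usual setup: END is appended to each target so the model has an explicit "stop" label to learn, and START is the first decoder input, since no previous output exists at step one.

```python
SOS_token = 0
EOS_token = 1


def encode_target(word_indices):
    """Append EOS so training includes an explicit 'stop here' label."""
    return list(word_indices) + [EOS_token]


def teacher_forcing_pairs(target_with_eos):
    """Shift the target right by one position.

    The decoder input at step t is the token before the label at step t;
    SOS fills the first slot because nothing precedes the first label.
    """
    decoder_inputs = [SOS_token] + target_with_eos[:-1]
    return list(zip(decoder_inputs, target_with_eos))


def greedy_decode(step_fn, max_length=10):
    """Feed SOS first, then feed each prediction back in until EOS appears."""
    token, output = SOS_token, []
    for _ in range(max_length):
        token = step_fn(token)
        if token == EOS_token:
            break  # the model signalled that the sequence is finished
        output.append(token)
    return output


# Hypothetical stand-in for a trained decoder: a fixed next-token table.
toy_next = {SOS_token: 2, 2: 3, 3: EOS_token}

pairs = teacher_forcing_pairs(encode_target([2, 3]))
decoded = greedy_decode(toy_next.get)
print(pairs)    # (decoder input, label) pairs used during training
print(decoded)  # generated word indices, with EOS stripped
```

In this sketch, removing EOS would leave max_length as the only stopping rule, and removing SOS would leave the first decoding step with no input token at all.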