Abstract:
Sentiment factors, including sentiment words, phrases, and structures, are necessary but not sufficient conditions for identifying sentiment sentences. "Pseudo-sentiment sentences" contain sentiment factors but do not convey any sentiment meaning, and identifying them effectively is a crucial step in improving the accuracy of sentiment sentence recognition. In this paper, we first summarize seven types of semantic features for identifying pseudo-sentiment sentences, derived through corpus induction and synonym expansion: subjective desire, subjective conjecture, hypothesis and concession, purpose and plan, question and inquiry, suggestion and request, and objective reference. Next, specific words (tokens) of each type are added to the semantic lexicon and given the semantic mark "XJC" (sentiment dissolving word), and sentiment dissolving rules such as "sentiment dissolving factor + sentiment factor = pseudo-sentiment sentence" are formulated to eliminate the sentiment bias of sentiment factors governed by sentiment dissolving factors. Finally, the knowledge ontology (sentiment lexicon, semantic lexicon, and sentiment dissolving rules) is implemented in Python as the pseudo-sentiment sentence filtering module of CUCsas, a Chinese sentiment analysis system. In experiments, the module achieves a precision, recall, and F1 score of 91.0%, 87.7%, and 89.3%, respectively.
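To make the rule "sentiment dissolving factor + sentiment factor = pseudo-sentiment sentence" concrete, the following is a minimal sketch rather than the CUCsas implementation. The lexicon entries, the function name `classify`, and the use of English tokens are all hypothetical choices made for illustration. The sketch also assumes, as a simplification, that a dissolving word governs every sentiment word that follows it in the same sentence; the actual system may determine the governed scope differently.

```python
# Hypothetical toy lexicons; the actual CUCsas lexicons are not given here.
SENTIMENT_WORDS = {"good", "excellent", "bad", "terrible"}

# Words carrying the "XJC" (sentiment dissolving word) mark, one example
# per semantic type named in the abstract. The entries are illustrative.
XJC_WORDS = {
    "hope": "subjective desire",
    "probably": "subjective conjecture",
    "if": "hypothesis and concession",
    "plan": "purpose and plan",
    "whether": "question and inquiry",
    "please": "suggestion and request",
    "reportedly": "objective reference",
}


def classify(tokens):
    """Label a tokenized sentence as sentiment, pseudo-sentiment, or non-sentiment.

    Simplifying assumption: a dissolving word governs all sentiment words
    that appear after it in the same sentence.
    """
    dissolver_seen = False
    has_sentiment_factor = False
    has_undissolved_factor = False
    for token in tokens:
        word = token.lower()
        if word in XJC_WORDS:
            dissolver_seen = True
        elif word in SENTIMENT_WORDS:
            has_sentiment_factor = True
            if not dissolver_seen:
                has_undissolved_factor = True
    if not has_sentiment_factor:
        return "non-sentiment"
    if has_undissolved_factor:
        return "sentiment"
    # Every sentiment factor is governed by a dissolving factor.
    return "pseudo-sentiment"


if __name__ == "__main__":
    for text in (
        "the service was excellent",
        "I hope the service is good",
        "the train leaves at noon",
    ):
        print(text, "->", classify(text.split()))
```

A sentence is filtered out as pseudo-sentiment only when it contains at least one sentiment factor and every such factor is governed by a dissolving word; a sentence with no sentiment factor at all is simply non-sentiment.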