Title
Obrada negacije u kratkim neformalnim tekstovima u cilju poboljšanja klasifikacije sentimenta
Creator
Ljajić, Adela B. 1982-
Copyright date
2019
Select license
Attribution-NonCommercial-NoDerivs 3.0 Serbia (CC BY-NC-ND 3.0)
License description
You permit only downloading and distribution of the work, provided the author's name is properly credited, without any changes to the work and without the right to commercial use of the work. This is the most restrictive CC license. Basic description of the license: http://creativecommons.org/licenses/by-nc-nd/3.0/rs/deed.sr_LATN. Full text of the agreement: http://creativecommons.org/licenses/by-nc-nd/3.0/rs/legalcode.sr-Latn
Language
Serbian
Cobiss-ID
Theses Type
Doctoral dissertation
description
Defense date: 04.10.2019.
Other responsibilities
mentor
Stojković, Suzana 1966-
committee member
Stanković, Milena
committee member
Janković, Dragan
committee member
Stoimenov, Leonid
committee member
Kajan, Ejub
Academic Expertise
Technical and technological sciences
University
University of Niš
Faculty
Faculty of Electronic Engineering
Group
Department of Computer Science
Alternative title
Processing negation in short informal text for improving the sentiment classification
Publisher
[A. B. Ljajić]
Format
109 leaves
description
Bibliography: leaves 98-106.
description
Natural Language Processing; Text mining
Abstract (en)
This dissertation proposes a method for classifying short informal
texts by sentiment. The improvement is achieved by processing the
rules of syntactic negation in the Serbian language. The complexity of
Serbian grammar requires a systematic approach to the phenomenon of
negation and the use of linguistic resources in creating the rules for
its treatment. The resources used are negation signals, negative
quantifiers, negation intensifiers, and negation neutralizers. In
addition to these resources for applying the negation rules, a general
sentiment lexicon of positive and negative terms was used in the
classification. The method was evaluated on a set of tweets in Serbian,
using both a lexicon-based method and a supervised machine learning
method. In both cases, the proposed method is compared with two
baseline methods: one that does not process negation, and one that
processes negation but without the rules for syntactic negation. With
the lexicon-based method, classification accuracy is considerably
higher than for either baseline. The relative improvements over the
first baseline are: up to 10.62% for the entire dataset, up to 26.63%
for the set of tweets containing negation, and up to 31.16% for the set
of tweets containing negations that were processed using the rules.
The machine learning method achieves higher classification accuracy
than the lexicon-based method: up to 69.76% for three classes and up to
91.15% for two classes. However, it yields smaller improvements: up to
2.65% for three classes and up to 1.65% for two classes. The results
show a statistically significant improvement when the detected negation
rules are included in the sentiment classification of short informal
texts.
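The lexicon-based setup described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the tiny lexicon, the list of negation signals, the single neutralizer ("samo", as in "ne samo", "not only"), and the fixed two-token negation scope are hypothetical stand-ins, and the fixed window is a deliberate simplification that replaces the dissertation's syntactic negation rules, which the abstract does not spell out.

```python
# Toy resources: illustrative assumptions, NOT the dissertation's lexicons.
SENTIMENT = {"dobar": 1, "odličan": 1, "loš": -1, "užasan": -1}
NEGATION_SIGNALS = {"ne", "nije", "nisam", "nema"}
NEGATION_NEUTRALIZERS = {"samo"}  # assumed: "ne samo" cancels the negation
SCOPE = 2  # assumed fixed window: tokens affected after a negation signal


def score(tokens, handle_negation=True):
    """Sum lexicon polarities, flipping those inside a negation scope."""
    total = 0
    flip_left = 0  # upcoming tokens still inside the negation scope
    for i, tok in enumerate(tokens):
        if handle_negation and tok in NEGATION_SIGNALS:
            nxt = tokens[i + 1] if i + 1 < len(tokens) else None
            # A neutralizer right after the signal switches the flip off.
            flip_left = 0 if nxt in NEGATION_NEUTRALIZERS else SCOPE
            continue
        polarity = SENTIMENT.get(tok, 0)
        if flip_left > 0:
            polarity = -polarity
            flip_left -= 1
        total += polarity
    return total


def classify(text, handle_negation=True):
    """Map the summed score to one of three sentiment classes."""
    s = score(text.lower().split(), handle_negation)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "neutral"
```

With these toy resources, `classify("film nije dobar")` returns `"negative"`, while the same call with `handle_negation=False` returns `"positive"`, which mirrors the contrast between the proposed method and the first baseline that ignores negation.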
Author keywords
sentiment analysis, text mining, negation detection, negation rules,
Serbian language, Twitter, machine learning
Classification
004.738.5:004.77]:81'322+811.163.41(043.3)
Subject
P 176
Type
Tekst