
Nowadays, most people use Twitter, Facebook and other microblogging sites. They share their opinions and feelings about particular topics through comments and reviews. The volume of data generated daily is very large, so it is important to analyse this data to extract useful information. Sentiment analysis mines such data for opinions using text analytics.

The sentiment expressed can be positive, negative or neutral. Twitter has become one of the biggest platforms for people to express opinions, share their thoughts and stay regularly updated about organizations, events and so on. The data collected is therefore huge and is often called big data. Processing data at this scale requires a framework that manages the whole pipeline. People also increasingly use sarcasm in their daily lives.


Sarcasm means saying the opposite of what a person intends, and it is used to make fun of others, to annoy someone or to show anger. Detecting it is therefore important for the accuracy of a sentiment analysis system. Sentiment analysis classifies text as positive, negative or neutral.

However, a sentence with positive sentiment may be either genuinely positive or sarcastic, and a sentence with negative sentiment may be either genuinely negative or sarcastic. If sarcasm is ignored, sentiment analysis suffers, because sarcasm may reverse the polarity of a sentence. Detecting it is therefore important for accurate sentiment analysis for any company or organization.

The online Oxford dictionary defines sarcasm as “the use of irony to mock or convey contempt”. Collins dictionary defines it as “mocking, contemptuous, or ironic language intended to convey scorn or insult”.

According to the Macmillan English Dictionary, sarcasm is “the activity of saying or writing the opposite of what you mean, or of speaking in a way intended to make someone else feel stupid or show them that you are angry”.

Several difficulties make sarcasm detection an interesting task. For example, the sentence “Wow, there is huge amount discount.” reads as a compliment. However, consider the following sentence: “Wow, there is huge amount of discount but I don’t buy anything.” This sentence makes clear that the person did not mean what he or she said. Even for people, such sarcasm can be difficult to detect.

Several kinds of features can be used to detect sarcasm efficiently. Bharti et al. [1] describe the feature types available. First, lexical features detect sarcasm in plain text using unigrams, bigrams and, more generally, n-grams. Bigrams and n-grams have more impact on sentiment analysis. Second, hyperbole features emphasize the meaning of the text. Among these, interjection words have a strong tendency to signal sarcasm, so they play an important role in detection.

Other hyperbole features are punctuation marks, quotes and intensifiers, which are used to improve system performance. For example, “excellent marks” has a stronger impact than “good marks”, so intensifiers make sarcasm easier to detect. Third, pragmatic features express emotion more accurately through smileys, emoticons and replies.
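The lexical and hyperbole features described above can be illustrated with a minimal sketch. The word lists are tiny placeholders taken from the examples in this paper, and the function names (`tokenize`, `ngrams`, `extract_features`) are hypothetical; this is not the authors' implementation.

```python
import re

# Hypothetical, deliberately tiny word lists; a real system would use full lexicons.
INTERJECTIONS = {"wow", "oh", "yeah"}
INTENSIFIERS = {"excellent", "fantastic"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (apostrophes allowed)."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return every contiguous run of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(text):
    """Collect lexical (n-gram) and hyperbole cues for one tweet."""
    tokens = tokenize(text)
    return {
        # Lexical features
        "unigrams": ngrams(tokens, 1),
        "bigrams": ngrams(tokens, 2),
        # Hyperbole features
        "starts_with_interjection": bool(tokens) and tokens[0] in INTERJECTIONS,
        "intensifier_count": sum(t in INTENSIFIERS for t in tokens),
        "repeated_punctuation": bool(re.search(r"[!?]{2,}", text)),
    }


features = extract_features(
    "Wow, there is huge amount of discount but I don't buy anything!!"
)
```

Keeping each cue as a separate entry makes it straightforward to try the features individually or in combination.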

We therefore need to identify which type of feature is in use so that the appropriate algorithm can be applied. In our research, we combine two features, lexical and hyperbole, to improve the accuracy of the system.

Negation words also affect sentiment analysis. We have to consider them when detecting sarcasm because negation reverses the polarity of a sentence.

Here, we consider two features, lexical and hyperbole, and combine them to improve the accuracy of the system. We also consider the negation feature to improve the precision of sarcasm detection. MapReduce is used to reduce execution time; it is a parallel computing platform for building reliable, cost-effective and flexible applications.

There are different types of sarcasm:
(1) Contradiction between positive sentiment and negative situation. For example, “I feel great being ignored”.
(2) Contradiction between negative sentiment and positive situation. For example, “I hate the New Zealand team because it always wins”.
(3) Tweet starts with an interjection word. For example, “Wow, there is huge amount of discount but I don’t buy anything!!”
(4) Likes and dislikes contradiction.
(5) Tweet contradicting universal facts.
(6) Tweet contains positive sentiment with an antonym pair.
(7) Tweet contradicting time-dependent facts.

There are many challenges in detecting sarcasm. Twitter is used as the dataset for sarcasm detection. Twitter limits messages to 140 characters, which creates more ambiguity. Tweets also contain uncommon words, slang and abbreviations of an informal nature, which make sarcasm detection difficult.

Sarcasm has no predefined structure. It is easy to detect if a #sarcasm tag is present in the middle or at the end of a tweet, but difficult when no such tag is available. Joshi et al. [2] highlighted three main challenges: (i) the identification of common knowledge, (ii) the intent to ridicule, and (iii) the speaker-listener (or reader in the case of written text) context.

2. Related Work

There are many approaches available for sarcasm detection.

Different authors consider various features and approaches to improve accuracy. There are two main approaches: (1) machine learning and (2) rule-based. The machine learning approach builds a model that predicts, arranges or classifies data through a statistical process. The rule-based approach exploits semantic, syntactic and stylistic properties of sentences in a language, such as phrase patterns and lexical and structural attributes, to analyse the sentiment of a sentence.

Bouazizi and Ohtsuki [3] proposed a supervised machine learning approach. They focus on the importance of a proposed set of features for detecting sarcasm; for each feature they identified a set of parameters, trained on the data set and tested them.

Sentiment, punctuation, syntactic, semantic and pattern-based features are used to train the classifier. For classification, random forest, maximum entropy, SVM and naïve Bayes are used. Rajadesingan et al. [4] address the difficult task of sarcasm detection on Twitter by leveraging behavioral traits of users who express sarcasm. They employ theories from behavioral and psychological studies to construct a behavioral modeling framework for detecting sarcasm, called SCUBA (Sarcasm Classification Using a Behavioral modeling Approach). They consider several forms of sarcasm: sarcasm as a contrast of sentiments, as a complex form of expression, as a means of conveying emotion, as a possible function of familiarity, and as a form of written expression.

Tungthamthiti et al. [5] use concept-level knowledge to identify contradiction between sentiment and situation. For example, “I love going to work on holidays” contains the positive sentiment word “love” but is actually a sarcastic sentence. Concept-level knowledge indicates that holidays are a relaxed situation while work is a stressful one, so the two contradict each other and the sentence is considered sarcastic. They also focus on coherence, that is, the correlation among sentences, when a text contains multiple sentences.

Bharti et al. [1] proposed algorithms for different types of sarcasm and also considered lexical and interjection features. They captured and processed real-time tweets using Apache Flume and Hive under the Hadoop framework, and proposed one set of algorithms for detecting sarcasm in tweets under the Hadoop framework and a further set of algorithms for detecting sarcasm in tweets. Riloff et al. [6] proposed a bootstrapping algorithm that automatically learns phrases corresponding to positive sentiments and phrases corresponding to negative situations. They use tweets that contain a sarcasm hashtag as positive instances for the learning process.

They use the learned lists of sentiment and situation phrases to recognize sarcasm in new tweets by identifying contexts that contain a positive sentiment in close proximity to a negative situation phrase.

Clews and Kuzma [7] apply string matching against positive sentiment and interjection lexicons to test whether the presence of both can be used to classify content as sarcastic. Because they focus only on positive sentiment, which would suggest a negative feeling, tweets that contained negative sentiment and therefore a positive feeling were ignored. Additionally, the use of interjections is not unique to sarcastic texts, and many tweets may contain them where an author wishes to enhance the expressed sentiment. Vijayalaksmi and Senthilrajan [8] combined several semi-supervised algorithms, namely lexical analysis with n-grams, knowledge extraction, a contrast approach, an emoticon-based approach and a hyperbole approach, into a new rule-based hybrid approach for sarcasm detection.

However, developing the dictionaries for these algorithms is time-consuming. The study also ignored languages other than English, repeated tweets, and empty or single-letter/single-word tweets.

Different authors have proposed different approaches for detecting sarcasm efficiently.

PBLGA (parsing-based lexicon generation algorithm) generates the lexicon that is used to check for sarcasm. A contradiction between sentiment and situation has a high probability of being classified as sarcastic. Another algorithm, IWS (interjection word start), identifies sarcasm in sentences that start with interjection words such as wow, oh and yeah. Table 1 compares the individual algorithms with existing state-of-the-art algorithms on parameters such as precision, recall and F-score.

3. Proposed System

3.1 Data

In this study, we use Twitter data for sarcasm detection.

We therefore have to retrieve tweets through an API (input). Twitter provides several APIs: the Search API searches for and retrieves tweets by keyword, the Streaming API fetches live tweets in real time, and the REST API retrieves tweets from the Twitter database. The tweets are then stored in Hadoop’s HDFS file system for further processing.

3.2 Preprocessing of Data

Tweet preprocessing removes noisy data that is not useful for decisions in sentiment analysis. Tweets carry extra information such as URLs, which give more detail about a topic or link to an image, and @user mentions. Neither is necessary for detecting sarcasm, so both count as noise. Removing this noise improves system performance.
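The noise removal described above can be sketched with two regular expressions. The patterns and the function name `clean_tweet` are illustrative assumptions, not the authors' code, and they cover only URLs and @user mentions.

```python
import re

# Assumed patterns: http(s) links, bare "www." links, and @user mentions.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
MENTION_PATTERN = re.compile(r"@\w+")


def clean_tweet(text):
    """Remove URLs and @user mentions, then collapse leftover whitespace."""
    text = URL_PATTERN.sub(" ", text)
    text = MENTION_PATTERN.sub(" ", text)
    return " ".join(text.split())


cleaned = clean_tweet("@bob I love being ignored http://t.co/abc #sarcasm")
```

Hashtags are left in place here because a #sarcasm tag can itself be a useful signal.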

Table 1. Comparison of individual algorithms with state-of-the-art algorithms

3.3 Part of Speech Tagging

Part-of-speech (POS) tagging takes the words of a text (corpus) as input and assigns the corresponding part of speech to each word as output, based on its definition and context, i.e. its relationship with adjacent and related words in a phrase, sentence or paragraph.

After POS tagging, all phrases are stored in a parse file (PF), which is given as input to our proposed algorithm. For example, “I love being ignored” is tagged as I|PRP love|VBP being|VBG ignored|VBN.
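As a minimal sketch, the tagged output in the example above can be split into (word, tag) pairs for later processing. The code assumes the `word|TAG` format shown in the example; the function names `parse_tagged` and `tags_only` are hypothetical, and the tagger that produces the input is not shown.

```python
def parse_tagged(tagged):
    """Split whitespace-separated 'word|TAG' tokens into (word, tag) pairs."""
    pairs = []
    for token in tagged.split():
        # rpartition splits on the last '|', so a '|' inside the word is kept.
        word, separator, tag = token.rpartition("|")
        if not separator:
            continue  # skip tokens that carry no tag
        pairs.append((word, tag))
    return pairs


def tags_only(pairs):
    """Return just the tag sequence, in order."""
    return [tag for _, tag in pairs]


pairs = parse_tagged("I|PRP love|VBP being|VBG ignored|VBN")
tags = tags_only(pairs)
```

With the tag sequence in a list, the first tag and the tags that follow it can be read off by position.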

After a part of speech has been assigned to each word, each word is also given a tag position so that we can identify the first tag, the second tag and the remaining tags. Separating the tags is useful for identifying sarcasm in tweets that contain interjections. Bharti et al. [9] proposed an algorithm for assigning a tag to each word.

3.4 Sentiment analysis of phrase

After POS tagging, sentiment analysis of each phrase can be performed. This requires a positive ratio and a negative ratio. The positive ratio is the number of positive words in the phrase divided by the total number of words in the phrase. The negative ratio is the number of negative words in the phrase divided by the total number of words in the phrase.
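The two ratios defined above can be computed and combined as their difference, which is the sentiment score used in this section. The sketch below assumes two toy word lists standing in for real sentiment lexicons; the lists and the function name `sentiment_score` are placeholders rather than the authors' resources.

```python
# Hypothetical toy lexicons; real lexicons would be far larger.
POSITIVE_WORDS = {"love", "great", "good", "excellent", "fantastic"}
NEGATIVE_WORDS = {"hate", "ignored", "bad"}


def sentiment_score(phrase):
    """Return positive ratio minus negative ratio for one phrase."""
    words = phrase.lower().split()
    if not words:
        return 0.0  # an empty phrase is treated as neutral
    positive_ratio = sum(w in POSITIVE_WORDS for w in words) / len(words)
    negative_ratio = sum(w in NEGATIVE_WORDS for w in words) / len(words)
    return positive_ratio - negative_ratio


score_positive = sentiment_score("love")
score_negative = sentiment_score("being ignored")
```

A score above zero marks a phrase as positive, below zero as negative, and exactly zero as neutral.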

Intensifiers have a strong effect on sarcasm detection. For example, “fantastic weather” has a stronger impact than “good weather”. If an intensifier is present, a rule-based pattern is applied to find the polarity of the word. The sentiment score is calculated as:

PR = PWP / TWP
NR = NWP / TWP
Sentiment score = PR - NR

where PR = positive ratio, NR = negative ratio, PWP = number of positive words per phrase, NWP = number of negative words per phrase, and TWP = total words in phrase.

3.5 Feature based composite approach

The feature based composite approach (FBCA) using MapReduce is our proposed algorithm, explained in Section 4. It combines two features, lexical and hyperbole, and uses MapReduce for faster execution. It also considers the punctuation and negation features to improve precision. Running the proposed algorithm determines whether a tweet is sarcastic or not; this is the step in which sarcasm is actually detected.

3.6 Compare precision with individual approach

We compute the precision of the proposed approach and compare it with that of the individual approaches so that we can identify the improvement.

4. Proposed Algorithm

FBCA (Feature based Composite Approach)

Input: Tweet corpus, interjection corpus, P.O.S. tag file (TF), parse file (PF)
Output: Classification of tweets as sarcastic or not sarcastic.
Notation: A: adjective, V: verb, R: adverb, N: noun, UH: interjection, T: tweets, C: corpus, t: tag, TWT: tweet wise tag, FT: first tag, INT: immediate next tag, NT: next tag, SF: sentiment file, sf: situation file, PSF: positive sentiment file, NSF: negative sentiment file, psf: positive situation file, nsf: negative situation file, SC: sentiment score, E: exclamation mark more than two, ISC: interjection sarcastic count, IF: interjection file, TWP: tweet wise phrase
Initialisation: TF = { }, SF = { }, sf = { }, PSF = { }, NSF = { }, psf = { }, nsf = { }, count = 0, flag = 0

for T in C do
    Take FT, INT, NT from TWT
    if UH in T then
        if FT = UH && INT = (ADJ || ADV) && NT = E then
            Tweet is sarcastic & increment ISC
            Store tweet into IF
        else if (FT = UH) && (NT = (ADV + ADJ) && (ADJ + N) && (ADV + V)) then
            Tweet is sarcastic & increment ISC
            Store tweet into IF
        else
            Tweet is not sarcastic
        end if
        for T in IF do
            k = find_parse(T)
            PF ← TF ∪ k
        end for
    else
        for TWP in PF do
            k = find_subset(TWP)
            if k = NP || ADJP || (NP + VP) then
                SF ← SF ∪ k
            else if k = VP || (ADVP + VP) || (VP + ADVP) || (ADJP + VP) || (VP + NP) || (VP + ADVP + ADJP) || (V + ADJP + NP) || (ADV + ADJP + NP) then
                sf ← sf ∪ k
            end if
        end for
        for P in SF do
            SC = sentiment_score(P)
            if SC > 0.0 then
                PSF ← PSF ∪ P
            else if SC < 0.0 then
                NSF ← NSF ∪ P
            else
                Neutral sentiment phrase
            end if
        end for
        for P in sf do
            SC = sentiment_score(P)
            if SC > 0.0 then
                psf ← psf ∪ P
            else if SC < 0.0 then
                nsf ← nsf ∪ P
            else
                Neutral situation phrase
            end if
        end for
        while words in tweet do
            if word ∈ PSF && count == 0 then
                count = 1; check nsf; continue
            end
            if word ∈ nsf && count == 1 then
                flag = True; break
            end
            else if word ∈ NSF && count == 0 then
                count = 1; check psf; continue
            end
            if word ∈ psf && count == 1 then
                flag = True; break
            end
        end
        if flag == True then
            Given tweet is sarcastic
        else
            Given tweet is not sarcastic
        end
    end if
end for

5. Conclusion

Sarcasm detection is a challenging task because sarcasm has no predefined structure. Researchers continue to improve the accuracy of sarcasm detection with different algorithms. In this paper, we proposed an algorithm that uses lexical and hyperbole features to detect sarcasm. It considers three types of sarcasm: (i) contradiction between positive sentiment and negative situation, (ii) contradiction between negative sentiment and positive situation, and (iii) occurrence of interjection words. The algorithm also considers punctuation-related features to improve precision. Constructing the sentiment and situation files takes time, so using the Hadoop framework reduces execution time. We combine the two features to improve the accuracy of the system.

In future work, we will consider emoticons for detecting sarcasm: a contradiction between the text and an emoticon indicates sarcasm. Extending the proposed algorithm to other languages also remains an area for future research.

References

[1] Bharti, S. K., Babu, K. S., & Jena, S. K., “Parsing-based sarcasm sentiment recognition in twitter data,” 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), Paris, 2015, pp. 1373-1380.
[2] A. Joshi, P. Bhattacharyya, and M. J. Carman (Feb. 2016). “Automatic sarcasm detection: A survey.” Online. Available: ...03426
[3] M. Bouazizi and T. Otsuki Ohtsuki, “A Pattern-Based Approach for Sarcasm Detection on Twitter,” IEEE Access, vol. 4, pp. 5477-5488, 2016.
[4] Rajadesingan, A., Zafarani, R., & Liu, H. (2015). “Sarcasm detection on twitter: A behavioral modeling approach.” WSDM 2015, Proceedings of the 8th ACM International Conference on Web Search and Data Mining, pp. 97-106.
[5] Tungthamthiti, P., Shirai, K., & Mohd, M. (2014). “Recognition of sarcasm in tweets based on concept level sentiment analysis and supervised learning approaches.” 28th Pacific Asia Conference on Language, Information and Computation, PACLIC 2014, pp. 404-413.
[6] Riloff, E., Qadir, A., Surve, P., De Silva, L., Gilbert, N., & Huang, R. “Sarcasm as contrast between a positive sentiment and negative situation.” Proceedings of EMNLP 2013, pp. 704-714.
[7] Clews, P., & Kuzma, J. (2017). “Rudimentary Lexicon Based Method for Sarcasm Detection.” International Journal of Academic Research and Reflection, 5(4), 24-33.
[8] N. Vijayalaksmi and A. Senthilrajan. “A hybrid approach for Sarcasm Detection of Social Media Data.” International Journal of Scientific and Research Publications (IJSRP), Volume 7, Issue 5, May 2017.
[9] Bharti, S. K., Vachha, B., Pradhan, R. K., Babu, K. S., & Jena, S. K. “Sarcastic sentiment detection in tweets streamed in real time: A big data approach.” Digital Communications and Networks, 2(3), pp. 108-121.

