Idiom Based Sentiment and Intent Analysis of Code-Switched Twitter Conversations
Description: The explosion of social media platforms has created security challenges: these platforms have become an open space in which those with malicious intent can discuss plans openly without fear of detection. Such users employ code-switching to hide the sentiment and intent of their conversations. Code-switching is the concurrent use of more than one language, or more than one variety of a language, in a conversation. While these users often draw on different languages, they frequently rely on a subset of one particular language to enable code-switching. This subset functions much like idioms, where a group of words is established by usage as having a meaning not deducible from those of the individual words. In this paper, we focus on initially developing a sentiment and intent classification model for code-switched Twitter conversations based on the idiomatic usage of a coded subset of a base language. To perform this classification, we compare and contrast Bidirectional Long Short-Term Memory (BiLSTM) and Bidirectional Encoder Representations from Transformers (BERT) to determine which is better suited to this task.
Keywords: Sentiment, Intent, Idioms, Code-switching, Long Short-Term Memory, Bidirectional Encoder Representations from Transformers