Word Embedding and String-Matching Techniques for Automobile Entity Name Identification from Web Reviews

Maity, Satanu and Das, Nilanjana and Majumder, Mukta and Dasadhikary, Dibya Ranjan (2021) Word Embedding and String-Matching Techniques for Automobile Entity Name Identification from Web Reviews. EAI Transactions on Scalable Information Systems. e10. ISSN 2032-9407

eai.14-5-2021.169918.pdf
Available under License Creative Commons Attribution No Derivatives.


Abstract

With the widespread popularity of the Internet, information on a wide range of domains circulates across social media platforms. To extract this information for use in diverse natural language processing applications, identifying entity names is a prerequisite. This study identifies automobile names from noisy web reviews by exploring two widely used machine learning algorithms, Conditional Random Field and Support Vector Machine. The accuracy of machine learning classifiers relies heavily on the size and quality of the training data, which was prepared manually by extracting a discussion forum corpus. This task is time-consuming and laborious, so word embedding is adopted to alleviate the burden. Although word embedding enhances the system's performance, it is unable to spot the noisy names that occur in web reviews. A gazetteer-based string-matching technique is therefore proposed; it recognizes a new set of noisy automobile entities, resulting in a considerable improvement in accuracy.
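
To illustrate the gazetteer-based string-matching idea mentioned in the abstract, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the matching rule, so this example substitutes the similarity ratio from the standard library's `difflib.SequenceMatcher`. The gazetteer entries, the function name `match_noisy_names`, the threshold of 0.8, and the two-token span limit are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical gazetteer of canonical automobile names (illustrative only;
# the list used in the paper is not given in the abstract).
GAZETTEER = ["maruti swift", "hyundai verna", "honda city", "toyota innova"]


def similarity(a, b):
    """Return difflib's similarity ratio in [0, 1] for two strings."""
    return SequenceMatcher(None, a, b).ratio()


def match_noisy_names(tokens, gazetteer=GAZETTEER, threshold=0.8, max_len=2):
    """Find token spans whose text is close to some gazetteer entry.

    Returns a list of (start, end, canonical_name) tuples, where
    tokens[start:end] is the matched span. The threshold and max_len
    values are assumed defaults, not values taken from the paper.
    """
    lowered = [t.lower() for t in tokens]
    matches = []
    for start in range(len(lowered)):
        for n in range(1, max_len + 1):
            end = start + n
            if end > len(lowered):
                break
            candidate = " ".join(lowered[start:end])
            # Pick the gazetteer entry most similar to this candidate span.
            best = max(gazetteer, key=lambda name: similarity(candidate, name))
            if similarity(candidate, best) >= threshold:
                matches.append((start, end, best))
    return matches


if __name__ == "__main__":
    review = "I bought a hundai verna last year".split()
    for start, end, name in match_noisy_names(review):
        print(" ".join(review[start:end]), "->", name)
```

In this sketch a misspelled mention such as "hundai verna" is mapped to the canonical entry "hyundai verna" because the two strings differ by a single character. A real system would likely tune the threshold on held-out data and combine these matches with the classifier output rather than use them alone.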

Item Type: Article
Uncontrolled Keywords: Noisy Name Identification, Automobile Discussion Forum, Machine Learning, Support Vector Machine, Conditional Random Field, Word Embedding, String Matching
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
T Technology > T Technology (General)
Depositing User: EAI Editor IV
Date Deposited: 26 Jul 2021 15:31
Last Modified: 26 Jul 2021 15:31
URI: https://eprints.eudl.eu/id/eprint/5171
