Proceedings of the First International Conference on Computing, Communication and Control System, I3CAC 2021, 7-8 June 2021, Bharath University, Chennai, India

Research Article

An Efficient Nonnegative Matrix Factorization Topic Modeling for Business Intelligence

  • @INPROCEEDINGS{10.4108/eai.7-6-2021.2308681,
        author={K. PrashantGokul and M. Sundararajan},
        title={An Efficient Nonnegative Matrix Factorization Topic Modeling for Business Intelligence},
        proceedings={Proceedings of the First International Conference on Computing, Communication and Control System, I3CAC 2021, 7-8 June 2021, Bharath University, Chennai, India},
        publisher={EAI},
        proceedings_a={I3CAC},
        year={2021},
        month={6},
        keywords={topic modeling factorization ensemble clustering},
        doi={10.4108/eai.7-6-2021.2308681}
    }
    
  • K. PrashantGokul
    M. Sundararajan
    Year: 2021
    An Efficient Nonnegative Matrix Factorization Topic Modeling for Business Intelligence
    I3CAC
    EAI
    DOI: 10.4108/eai.7-6-2021.2308681
K. PrashantGokul¹,*, M. Sundararajan²
  • 1: Research Scholar, Dept. of ECE, Bharath Institute of Higher Education and Research, Chennai, India.
  • 2: Professor & Pro-VC, Bharath Institute of Higher Education and Research, Chennai, India.
*Contact email: kprashantgokul@gmail.com

Abstract

Topic models can give insight into the underlying latent structure of a large corpus of documents. A range of methods have been proposed in the literature, including probabilistic topic models and techniques based on matrix factorization. However, the resulting topics often convey only general, and therefore redundant, information about the data rather than minor but potentially important information for users. To tackle this problem, we propose a novel sparseness-enhanced model of nonnegative matrix factorization for discovering high-quality local topics. In both the probabilistic and the factorization-based cases, however, standard implementations rely on stochastic elements in their initialization phase, which can lead to different results being produced on the same corpus with the same parameter values. To address this issue in the context of matrix factorization for topic modeling, we propose the use of ensemble learning techniques. We demonstrate the practical utility of ENMF on the New York Times dataset and find that ENMF is particularly helpful for applied or broad topics, where the key terms of a topic are not well known. We also find that ENMF achieves higher weighted Jaccard similarity scores than contemporary methods.
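
The abstract reports weighted Jaccard similarity scores but does not define the measure. As a rough illustration only, the sketch below uses the common min/max definition (sum of element-wise minima divided by sum of element-wise maxima) to compare the term weights of one topic across two runs. The function name `weighted_jaccard`, the dictionary representation, and the example weights are assumptions made for this illustration and are not taken from the paper.

```python
def weighted_jaccard(a, b):
    """Weighted Jaccard similarity of two term -> weight maps.

    Assumes all weights are nonnegative, as they are in NMF factors.
    This is the common min/max definition, not necessarily the exact
    variant used by the authors.
    """
    terms = set(a) | set(b)
    # Terms missing from one topic are treated as having weight 0.
    num = sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in terms)
    den = sum(max(a.get(t, 0.0), b.get(t, 0.0)) for t in terms)
    return num / den if den > 0 else 0.0


# Hypothetical term weights for the "same" topic from two runs that
# differ only in their random initialization.
run_a = {"market": 0.5, "stock": 0.3, "bank": 0.2}
run_b = {"market": 0.4, "stock": 0.3, "trade": 0.3}

score = weighted_jaccard(run_a, run_b)
print(round(score, 3))
```

A score of 1 means the two topics assign identical weights to every term, and a score of 0 means they share no weighted terms, so averaging such scores over matched topics is one plausible way to quantify how stable a method is across initializations.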