LSTM-Based Forum Topic Classification Model

Authors

  • Leo Liang

DOI:

https://doi.org/10.61173/g3xexb16

Keywords:

Deep Learning, Natural Language Processing, Long Short-Term Memory Network, Forum, Text Multi-classification

Abstract

With the rapid development of information technology, forums have become an important platform for information exchange. However, classifying forum topics manually is labor-intensive and error-prone. To address this issue, this study proposes a forum topic classification model based on Long Short-Term Memory (LSTM) networks, using the LSTM’s ability to model sequential text to improve the accuracy and efficiency of topic classification. The experiments used approximately 68,000 entries across 29 topic categories, scraped from Zhihu. The data were preprocessed through text cleaning, tokenization, and word vectorization, and a classification model with LSTM and Dropout layers was designed. The experimental results indicate that the model performs well on most topic categories, although overfitting remains an issue for some. The paper concludes by summarizing the model’s advantages and limitations and by discussing how classification accuracy might be improved in the future through more data and further model optimization.
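
The preprocessing steps named in the abstract (text cleaning, tokenization, and conversion of tokens to numeric form) might look roughly like the sketch below. It is an illustration under stated assumptions, not the paper's pipeline: the abstract does not say which cleaning rules or tokenizer were used, so the function names `clean`, `tokenize`, `build_vocab`, and `encode` are hypothetical, and character-level splitting of non-Latin text stands in for a dedicated Chinese word segmenter. The resulting integer ids are the kind of input an embedding layer would map to word vectors; the embedding, LSTM, and Dropout layers themselves are not shown.

```python
import re
from collections import Counter

PAD, UNK = 0, 1  # reserved ids: padding and out-of-vocabulary tokens


def clean(text):
    """Assumed cleaning rules: drop URLs and HTML tags, collapse whitespace, lowercase."""
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"\s+", " ", text).strip().lower()


def tokenize(text):
    """Runs of Latin letters/digits become one token; every other non-space
    character (a CJK character, a punctuation mark) becomes its own token.
    This is a stand-in for a real word segmenter."""
    return re.findall(r"[a-z0-9]+|[^\sa-z0-9]", text)


def build_vocab(token_lists, min_count=1):
    """Map each token seen at least min_count times to an integer id,
    most frequent first, after the two reserved ids."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    vocab = {"<pad>": PAD, "<unk>": UNK}
    for tok, n in counts.most_common():
        if n >= min_count:
            vocab[tok] = len(vocab)
    return vocab


def encode(tokens, vocab, max_len):
    """Convert tokens to ids, truncating or right-padding to max_len."""
    ids = [vocab.get(tok, UNK) for tok in tokens][:max_len]
    return ids + [PAD] * (max_len - len(ids))
```

A fixed `max_len` is used here because recurrent models are usually trained on equal-length batches; the value would need to be chosen from the length distribution of the actual forum posts.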

Published

2024-08-14

Section

Articles