Time Period Categorization in Fiction : A Comparative Analysis of Machine Learning Techniques
Author | |
---|---|
Publication date | Published online: 23 Mar 2024 |
Abstract | This study investigates the automatic categorization of time period metadata in fiction, a critical but often overlooked aspect of cataloging. Using a comparative analysis approach, the performance of three machine learning techniques, namely Latent Dirichlet Allocation (LDA), Sentence-BERT (SBERT), and Term Frequency-Inverse Document Frequency (TF-IDF), was assessed by examining their precision, recall, F1 scores, and confusion matrix results. LDA identifies underlying topics within the text, TF-IDF measures word importance, and SBERT measures sentence semantic similarity. Based on F1-score analysis and confusion matrix outcomes, TF-IDF and LDA categorized text data by time period effectively, while SBERT performed poorly across all time period categories. |
Journal | Cataloging & Classification Quarterly |
Volume/issue | vol. 62, no. 2 |
Pages | 124-153 |
Keywords | Cataloging for digital resources ; time period categorization ; machine learning ; text analysis ; fiction ; LDA ; SBERT ; TF-IDF |
URL |
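
The abstract compares the three techniques by precision, recall, F1 score, and confusion matrix results. As a minimal sketch of how per-category scores can be derived from a confusion matrix, the Python below uses hypothetical time period labels and made-up counts; none of the numbers come from the study, and the function name `per_class_scores` is an illustrative assumption rather than the authors' code.

```python
def per_class_scores(confusion, labels):
    """Compute precision, recall, and F1 for each category.

    confusion[i][j] is the number of items whose true label is labels[i]
    and whose predicted label is labels[j].
    """
    n = len(labels)
    scores = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        # Column sum minus the diagonal: items wrongly predicted as this label.
        fp = sum(confusion[i][k] for i in range(n)) - tp
        # Row sum minus the diagonal: items of this label predicted as another.
        fn = sum(confusion[k]) - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[label] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


# Hypothetical categories and counts, for illustration only.
labels = ["medieval", "19th century", "20th century"]
confusion = [
    [8, 1, 1],  # true: medieval
    [2, 6, 2],  # true: 19th century
    [0, 3, 7],  # true: 20th century
]

scores = per_class_scores(confusion, labels)
for label in labels:
    s = scores[label]
    print(f"{label}: P={s['precision']:.2f} R={s['recall']:.2f} F1={s['f1']:.2f}")
```

Reporting the scores per category, rather than as one overall accuracy figure, makes it possible to see whether a method fails on particular time periods or on all of them.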