Time Period Categorization in Fiction : A Comparative Analysis of Machine Learning Techniques

Author	Fereshta Westin
Publication date	Published online: 23 Mar 2024
Abstract	This study investigates the automatic categorization of time period metadata in fiction, a critical but often overlooked aspect of cataloging. Using a comparative analysis approach, the performance of three machine learning techniques, namely Latent Dirichlet Allocation (LDA), Sentence-BERT (SBERT), and Term Frequency-Inverse Document Frequency (TF-IDF), was assessed by examining their precision, recall, F1 scores, and confusion matrix results. LDA identifies underlying topics within the text, TF-IDF measures word importance, and SBERT measures sentence semantic similarity. Based on F1-score analysis and confusion matrix outcomes, TF-IDF and LDA effectively categorized text data by time period, while SBERT performed poorly across all time period categories.
Journal	Cataloging & Classification Quarterly
Volume/issue	vol. 62, no. 2
Pages	124-153
Keywords	Cataloging for digital resources ; time period categorization ; machine learning ; text analysis ; fiction ; LDA ; SBERT ; TF-IDF
Link	Time Period Categorization in Fiction : A Comparative Analysis of Machine Learning Techniques
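
The abstract compares the techniques by precision, recall, F1 score, and confusion matrix. As a minimal sketch of how those per-category scores follow from a confusion matrix, the Python below uses only the standard library. The time period labels and the true/predicted lists are hypothetical toy values chosen for illustration; they are not data or results from the study, and the function names are assumptions of this sketch rather than anything the article defines.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true categories, columns are predicted categories."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def per_class_scores(matrix, labels):
    """Derive precision, recall, and F1 for each category from the matrix."""
    scores = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]                          # correctly assigned to this category
        fn = sum(matrix[i]) - tp                   # this category, predicted as another
        fp = sum(row[i] for row in matrix) - tp    # another category, predicted as this
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[label] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


# Hypothetical time period categories and toy predictions (illustration only).
labels = ["medieval", "19th century", "contemporary"]
y_true = ["medieval", "medieval", "19th century", "19th century",
          "contemporary", "contemporary"]
y_pred = ["medieval", "19th century", "19th century", "19th century",
          "contemporary", "medieval"]

matrix = confusion_matrix(y_true, y_pred, labels)
scores = per_class_scores(matrix, labels)

for label in labels:
    s = scores[label]
    print(f"{label}: precision={s['precision']:.2f} "
          f"recall={s['recall']:.2f} f1={s['f1']:.2f}")
```

Reading the matrix row by row shows which time periods are confused with one another, which a single F1 figure per category cannot show on its own.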

Posted: 27 July 2024　Last updated: 18 December 2024
