Text Mining (3 posts)

[Text Mining] Text Classification
Contents
• Text Classification
• Naïve Bayes
  – Formalizing the Naïve Bayes Classifier
  – Naïve Bayes: Learning
  – Multinomial Naïve Bayes: A Worked Example
• Precision, Recall, and the F measure
• Text Classification – Evaluation
Text Classification
• Classification: assigning a class or category to an input
  – e.g., "What is the subject of this article?"
• Text categorization
  – Assigning a label t..
2020. 5. 24.

[Text Mining] Text Preprocessing
Text Preprocessing
A taxonomy of text preprocessing tasks
Text Normalization
• Tokenizing (segmenting) words
• Normalizing word formats
• Segmenting sentences
Tokenization: the task of segmenting running text into words
Type vs. Token
• Word types: the distinct words in a text
• Word tokens: the individual occurrences of words in a text, counting repeats
Simple Tokenization in UNIX
• Step 1. Tokenizing
• Step 2. Sorting
Punctuation Issues
Word-inte..
2020. 4. 2.

[Text Mining] Introduction to Text Mining
Text Mining
1. Introduction to Text Mining
Data mining
Data mining: the process of automatically extracting meaningful, useful, previously unknown, and ultimately comprehensible information from large databases.
• Descriptive: understanding underlying processes or behavior
• Predictive: using known data to predict unknown or future values
Why Data Mining?
We are drowning in data, but starving for knowledge..
2020. 3. 30.
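
The type/token distinction and the tokenize-then-sort steps mentioned in the text preprocessing excerpt can be illustrated with a minimal sketch. This is not the UNIX pipeline from the post (the excerpt is truncated before it shows any commands); it is a Python stand-in that assumes a deliberately naive tokenizer keeping only runs of ASCII letters. The names `tokenize`, `type_token_counts`, and the sample sentence are hypothetical, chosen for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return runs of ASCII letters as word tokens.

    Assumption: punctuation, digits, and non-ASCII letters are simply
    dropped, which is far cruder than a real tokenizer.
    """
    return re.findall(r"[a-z]+", text.lower())


def type_token_counts(text):
    """Return (number of word types, number of word tokens, frequency table)."""
    tokens = tokenize(text)
    freq = Counter(tokens)  # one entry per distinct word (type)
    return len(freq), len(tokens), freq


sample = "the cat sat on the mat and the dog sat too"
n_types, n_tokens, freq = type_token_counts(sample)
print(f"types={n_types} tokens={n_tokens}")

# Sort by descending frequency, then alphabetically, similar in spirit
# to a "tokenize, then sort and count" workflow.
for word, count in sorted(freq.items(), key=lambda kv: (-kv[1], kv[0])):
    print(count, word)
```

In the sample sentence, "the" occurs three times, so it contributes three tokens but only one type. Real text raises the punctuation and word-internal issues the post goes on to discuss, which this sketch ignores.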