Studies & Courses (32)

[Text Mining] Text Classification Contents • Text Classification • Naïve Bayes – Formalizing the Naïve Bayes Classifier – Naïve Bayes: Learning – Multinomial Naïve Bayes: A Worked Example • Precision, Recall, and the F-measure • Text Classification – Evaluation Text Classification • Classification: assigning a class or category to an input – e.g., "What is the subject of this article?" • Text categorization – Assigning a label t.. 2020. 5. 24.
[Data Visualization] Challenges and Data Visualization Data visualization and the challenge of discovering value: Miami-Dade County Public Schools data (achievement scores) (Source: The Truthful Art: Data, Charts, and Maps for Communication, Alberto Cairo, 2016) What Went Wrong (Source: The Truthful Art: Data, Charts, and Maps for Communication, Alberto Cairo, 2016) The Hockey Stick Chart (Source: The Truthful Art: Data, Charts, and Maps for Communication, Alberto Cairo, 2016) Data visualization and innovation in expressing value: how neuroscience took shape as an independent discipline over ten years .. 2020. 5. 24.
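[Data Visualization] Intro to Data Visualization Data Visualization and HCI/UX Two principles of the data visualization process - The viewer should be able to understand the visualization easily and clearly, without cognitive burden. - The facts must not be distorted. What is HCI/UX? - HCI : Human-computer interaction is the field that studies the interaction between humans (users) and computers (Source: Wikipedia) - UX : User experience is the overall experience that a user feels and thinks while directly or indirectly using a system, product, or service (Source: Wikipedia) Color scale strategies - Qualitative (Qualitative Color Scales) - Sequential (Sequential Color Scales) - Diverging (Div.. 2020. 5. 24.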
[Probability & Statistics] 4. Bayes’ Rule, Concept of a Random Variable Probability & Statistics Bayes’ Rule, Concept of a Random Variable Chapter 2. Probability Section 2.7 Bayes’ Rule Figure 2.12 Venn diagram for the events A, E and E′ Theorem 2.13 : Rule of Total Probability ★ If events B1, B2, ..., Bk constitute a partition of the sample space S and P(Bi) ≠ 0 for i = 1, 2, ..., k, then for any event A in S: Figure 2.14 Partitioning the sample space S Examp.. 2020. 4. 18.
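The formula that follows "for any event A in S" does not appear in the excerpt above. For reference, the standard statement of the rule of total probability, written with the excerpt's symbols (A, B_i, k), is sketched below; this is the usual textbook form, not a quotation from the post.

```latex
P(A) \;=\; \sum_{i=1}^{k} P(B_i \cap A) \;=\; \sum_{i=1}^{k} P(B_i)\, P(A \mid B_i)
```

The second equality uses the product rule P(B_i ∩ A) = P(B_i) P(A | B_i), which is why the condition P(B_i) ≠ 0 is needed.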
[Probability & Statistics] 2. Sample Space, Events, Counting Sample Points Probability & Statistics Chapter 2. Probability Section 2.1 Sample Space Definition 2.1 Sample Space : the set of all possible outcomes of a statistical experiment Figure 2.1 Tree diagram for Example 2.2 Section 2.2 Events Definition 2.2 Event : a subset of a sample space Definition 2.3 Complement : the complement of an event A with respect to S is the subset of all elements of S that are not in A Definition 2.4 In.. 2020. 4. 18.
[Probability] Intro to Statistics and Data Analysis Probability Intro to Statistics and Data Analysis Why study probability/statistics? • many fields of science and industry use probabilistic/statistical methods : manufacturing, finance, medicine, computer engineering, insurance, physics... • scientific experiments and observations Fundamental relationship between probability and statistics 2020. 4. 18.
[Probability] Probability - Part 2 Probability Probability - Part 2 Section 2.4 : Probability of an Event Definition 2.9 Probability : the probability of an event A is the sum of the weights of all sample points in A. If A1, A2, A3, ... is a sequence of mutually exclusive events, then Example 2.24 A coin is tossed twice. What is the probability that at least 1 head occurs? S = {HH, HT, TH, TT}, E = {HH, HT, TH}, A = {TT} (the complement of E), so P(E) = 1 - P(A) = 1 - 1/4 = 3/4 Rule 2.3 Example 2.2.. 2020. 4. 5.
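The two-toss coin example in the excerpt above can be checked by enumerating the sample space. This is a minimal sketch, assuming a fair coin so that all four outcomes carry equal weight; the variable names are illustrative and do not come from the post.

```python
from fractions import Fraction
from itertools import product

# Sample space for two tosses of a coin: HH, HT, TH, TT.
sample_space = ["".join(p) for p in product("HT", repeat=2)]

# Event E: at least one head occurs.
event = [s for s in sample_space if "H" in s]
# Complement of E: no head occurs (only TT).
complement = [s for s in sample_space if "H" not in s]

# Assumption: outcomes are equally likely, so P = favorable / total.
p_event = Fraction(len(event), len(sample_space))
p_complement = Fraction(len(complement), len(sample_space))

print(p_event)           # direct count
print(1 - p_complement)  # complement rule
```

Both the direct count and the complement rule give the same value, which matches the 1 - (1/4) = 3/4 calculation in the excerpt.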
[Text Mining] Text Preprocessing Text Preprocessing A taxonomy of text preprocessing tasks Text Normalization Tokenizing (segmenting) words Normalizing word formats Segmenting sentences Tokenization : Task of segmenting running text into words Type vs. Token Word types : different words Word tokens : multiple occurrences of words in a text Simple Tokenization in UNIX STEP 1. Tokenizing STEP 2. Sorting Punctuation Issues Word-inte.. 2020. 4. 2.
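The type/token distinction in the excerpt above can be illustrated with a few lines of Python. The excerpt does not show the UNIX commands it refers to, so this is a stand-in sketch rather than the post's method: the regex tokenizer (lowercase alphabetic runs only) and the sample sentence are assumptions made for illustration.

```python
import re
from collections import Counter

def tokenize(text):
    """Split running text into lowercase alphabetic word tokens.

    Assumption: a token is a maximal run of letters a-z; punctuation
    and digits are treated as separators.
    """
    return re.findall(r"[a-z]+", text.lower())

# Hypothetical sample text, not taken from the post.
text = "The cat sat on the mat, and the dog sat too."

tokens = tokenize(text)    # step 1: tokenizing
counts = Counter(tokens)   # step 2: grouping identical tokens and counting

num_tokens = len(tokens)   # word tokens: every occurrence counts
num_types = len(counts)    # word types: distinct words only

print(num_tokens, num_types)
print(counts.most_common(3))
```

A stricter or looser token definition (for example, keeping apostrophes or hyphens) changes both counts, which is the kind of punctuation issue the excerpt goes on to mention.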
[Probability] Probability - Part 1 Probability Probability - Part 1 Definitions Sample Space : the set of all possible outcomes of a statistical experiment Event : a subset of a sample space Complement : the complement of an event A with respect to S is the subset of all elements of S that are not in A Intersection : the event containing all elements that are common to A and B Permutation : an arrangement of all or part of a set of objects. Theorem Th.. 2020. 4. 1.
[Text Mining] Introduction to Text Mining Text Mining 1. Introduction to Text Mining Data mining Data mining : a process of automatically extracting meaningful, useful, previously unknown and ultimately comprehensible information from large databases. Descriptive: Understanding underlying processes or behavior Predictive: Understanding underlying processes or behavior Why Data Mining? : We are drowning in data, but starving for knowledge.. 2020. 3. 30.