[Data Science] 2. Data Science and Machine Learning
Link
https://app.datascientist.fr/learn/learning/57/60/166/762
CRISP-DM Process
1. Opportunity Assessment & Business Understanding
2. Data Understanding & Acquisition
3. Data Preparation & Cleaning & Transformation
4. Modeling
5. Evaluation & Residuals & Metrics
6. Model Deployment & Application
Data Preparation
1. Data Collection
- Data augmentation : Rotating the original versions, cropping them differently, or altering the lighting conditions
- Data labeling
2. Data Processing
- Formatting
- Cleaning : Remove messy data
- Sampling : If you have too much data
3. Data Transformation (Feature engineering)
- Scaling
- Normalizing
- Decomposition
- Feature aggregation : e.g., combining per-channel values (RGB) into a single feature
* Missing & repeated values
* Outliers & errors
(See the Python sketch below for these preparation steps.)
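To make the preparation steps above concrete, here is a minimal Python sketch (assuming NumPy, pandas, and scikit-learn are installed); the toy DataFrame, column names, and thresholds are invented purely for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

# Data augmentation (image case): a horizontal flip of a tiny fake "image"
image = np.array([[1, 2, 3],
                  [4, 5, 6]])
flipped = np.fliplr(image)          # one extra training example from the original

# Toy tabular data with a missing value, a duplicated row, and an outlier
df = pd.DataFrame({
    "height_cm": [170, 165, np.nan, 165, 180, 999],
    "weight_kg": [65, 55, 70, 55, 80, 75],
})

# Cleaning: drop repeated rows, fill the missing value, remove the impossible outlier
df = df.drop_duplicates()
df["height_cm"] = df["height_cm"].fillna(df["height_cm"].median())
df = df[df["height_cm"] < 250]

# Sampling: keep a random subset when there is too much data
sample = df.sample(frac=0.8, random_state=0)

# Scaling / normalizing: zero mean and unit variance per column
scaled = StandardScaler().fit_transform(sample)

# Decomposition: project the scaled features onto one principal component
components = PCA(n_components=1).fit_transform(scaled)
print(components)
```

In a real pipeline the scaler and PCA would be fit on the training split only and then reused unchanged on validation and test data.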
Machine Learning
Supervised Learning
1. Classification : Predict a category (e.g., a yes/no question)
ex) Will it be hot or cold tomorrow?
- Evaluation of Classification (illustrated in the sketch after this list)
+ Confusion Matrix
* Recall = TP/(TP+FN)
* Precision = TP/(TP+FP)
* Accuracy = (TP+TN)/(TP+TN+FP+FN)
- Types
+ Binary Classification
+ Multiclass Classification
+ Multilabel Classification
2. Regression : Predict a numerical value
ex) What will be the temperature tomorrow?
- Evaluation of Regression (also illustrated in the sketch after this list)
+ MSE (Mean Squared Error)
+ RMSE (Root Mean Squared Error)
+ MAE (Mean Absolute Error)
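As a rough illustration of the metrics listed above, the following sketch (assuming scikit-learn) computes recall, precision, and accuracy from a binary confusion matrix, and MSE, RMSE, and MAE for a toy regression; all labels and values are made up.

```python
import numpy as np
from sklearn.metrics import (confusion_matrix, recall_score, precision_score,
                             accuracy_score, mean_squared_error, mean_absolute_error)

# --- Classification metrics on hypothetical binary labels (1 = positive) ---
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# scikit-learn's binary confusion-matrix layout: [[TN, FP], [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("Recall    =", tp / (tp + fn), "==", recall_score(y_true, y_pred))
print("Precision =", tp / (tp + fp), "==", precision_score(y_true, y_pred))
print("Accuracy  =", (tp + tn) / (tp + tn + fp + fn), "==", accuracy_score(y_true, y_pred))

# --- Regression metrics on hypothetical temperature predictions (°C) ---
t_true = np.array([21.0, 25.0, 19.0, 30.0])
t_pred = np.array([20.0, 27.0, 18.5, 29.0])

mse = mean_squared_error(t_true, t_pred)
print("MSE  =", mse)
print("RMSE =", np.sqrt(mse))
print("MAE  =", mean_absolute_error(t_true, t_pred))
```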
Unsupervised Learning
1. Clustering : Group observations into similar-looking groups
- Evaluation of Clustering (see the sketch after this list)
+ Internal Measures
* Cohesion
* Separation
+ External Measures
* Compare with Ground Truth
2. Recommender system : Suggest items a user is likely to be interested in
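A small sketch of clustering evaluation, assuming scikit-learn: the silhouette score serves as an internal measure (it combines cohesion and separation), and the adjusted Rand index as an external measure against a hypothetical ground truth.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score, adjusted_rand_score

# Two well-separated toy blobs in 2D (hypothetical data)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, size=(20, 2)),
               rng.normal(3, 0.3, size=(20, 2))])
ground_truth = np.array([0] * 20 + [1] * 20)

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Internal measure: silhouette combines cohesion (within-cluster distance)
# and separation (distance to the nearest other cluster); closer to 1 is better.
print("Silhouette:", silhouette_score(X, labels))

# External measure: compare the clustering with the known ground-truth labels.
print("Adjusted Rand index:", adjusted_rand_score(ground_truth, labels))
```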
Dataset
1. Training Dataset : The sample of data used to fit the model
2. Validation Dataset : Used to tune hyperparameters and compare candidate models
- Cross Validation (see the sketch after this list)
3. Test Dataset : Held out for the final, unbiased evaluation of the chosen model
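The split and cross-validation above can be sketched like this, assuming scikit-learn and its bundled iris dataset; the logistic-regression model is an arbitrary choice.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split, cross_val_score

X, y = load_iris(return_X_y=True)

# Hold out a test set; the remaining data is used for training/validation.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

model = LogisticRegression(max_iter=1000)

# 5-fold cross-validation on the training/validation portion.
scores = cross_val_score(model, X_trainval, y_trainval, cv=5)
print("CV accuracy per fold:", scores)

# Final fit on all training data, then a single unbiased test-set evaluation.
model.fit(X_trainval, y_trainval)
print("Test accuracy:", model.score(X_test, y_test))
```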
Overfitting & Underfitting
1. Overfitting : Force-fitting; too good to be true
2. Appropriate fitting
3. Underfitting : Too simple to explain the variance
- Model complexity : as complexity grows, training error keeps falling while test error eventually rises
- Training Error < Test Error is a sign of overfitting (see the sketch below)
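A quick sketch of the trade-off, assuming scikit-learn: polynomial regressions of increasing degree are fit to noisy data and their training vs. test errors compared; the degrees and noise level are arbitrary.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Noisy samples from a smooth underlying function (hypothetical data)
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 1, 60)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 60)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

for degree in (1, 4, 15):   # underfit, reasonable fit, overfit
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_err = mean_squared_error(y_train, model.predict(X_train))
    test_err = mean_squared_error(y_test, model.predict(X_test))
    print(f"degree={degree:2d}  train MSE={train_err:.3f}  test MSE={test_err:.3f}")
```

Typically the degree-1 model underfits (both errors high), while the degree-15 model overfits (training error far below test error).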