와이유스토리
[Data Science] 2. Data Science and Machine Learning
Link
https://app.datascientist.fr/learn/learning/57/60/166/762
CRISP-DM Process
1. Opportunity Assessment & Business Understanding
2. Data Understanding & Acquisition
3. Data Preparation & Cleaning & Transformation
4. Modeling
5. Evaluation & Residuals & Metrics
6. Model Deployment & Application
Data Preparation
1. Data Collection
- Data augmentation : Create additional samples by rotating the originals, cropping them differently, or altering the lighting conditions
- Data labeling
2. Data Processing
- Formatting
- Cleaning : Remove messy data
- Sampling : Work with a representative subset when there is too much data
3. Data Transformation (Feature engineering)
- Scaling
- Normalizing
- Decomposition
- Feature aggregation : RGB, Channels
* Missing & Repeated values
* Outliers & Errors
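The cleaning and transformation steps above can be sketched in a few lines. This is only an illustration under assumptions: the notes do not say which variants are meant, so "Scaling" is taken here as min-max scaling to [0, 1] and "Normalizing" as z-score standardization. The function names and the sample rows are hypothetical.

```python
from statistics import mean, pstdev


def drop_missing_and_repeated(rows):
    """Drop rows that contain None and exact duplicate rows, keeping order."""
    seen = set()
    cleaned = []
    for row in rows:
        if any(value is None for value in row):
            continue  # missing value
        if row in seen:
            continue  # repeated value
        seen.add(row)
        cleaned.append(row)
    return cleaned


def min_max_scale(values):
    """Assumed meaning of 'Scaling': map values linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Assumed meaning of 'Normalizing': zero mean, unit standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical rows of (feature_a, feature_b); one is incomplete, one is repeated.
rows = [(1.0, 10.0), (2.0, None), (1.0, 10.0), (3.0, 30.0), (5.0, 20.0)]
cleaned = drop_missing_and_repeated(rows)
first_feature = [row[0] for row in cleaned]
scaled = min_max_scale(first_feature)
standardized = z_score(first_feature)
print(cleaned, scaled, standardized)
```

Min-max scaling keeps the original shape of the distribution but is sensitive to outliers, which is one reason outlier handling is usually done before this step.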
Machine Learning
Supervised Learning
1. Classification : Predict a category (e.g., a Yes/No question)
ex) Will it be hot or cold tomorrow?
- Evaluation of Classification
+ Confusion Matrix
* Recall = TP/(TP+FN)
* Precision = TP/(TP+FP)
* Accuracy = (TP+TN)/(TP+TN+FP+FN)
- Types
+ Binary Classification
+ Multiclass Classification
+ Multilabel Classification
2. Regression : Predict a numerical value
ex) What will the temperature be tomorrow?
- Evaluation of Regression
+ MSE
+ RMSE
+ MAE
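As a worked example of the evaluation metrics listed above, the sketch below computes recall, precision, and accuracy from confusion-matrix counts, and MSE, RMSE, and MAE from prediction errors. The labels and temperatures are made-up sample data, the function names are hypothetical, and the inputs are assumed to be non-empty.

```python
from math import sqrt


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN for a binary classification problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive=1):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    return {
        "recall": tp / (tp + fn) if tp + fn else 0.0,        # TP/(TP+FN)
        "precision": tp / (tp + fp) if tp + fp else 0.0,     # TP/(TP+FP)
        "accuracy": (tp + tn) / (tp + tn + fp + fn),         # (TP+TN)/all
    }


def regression_metrics(y_true, y_pred):
    errors = [truth - pred for truth, pred in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / len(errors)   # mean squared error
    mae = sum(abs(e) for e in errors) / len(errors)  # mean absolute error
    return {"mse": mse, "rmse": sqrt(mse), "mae": mae}


# Made-up labels: 1 = "hot", 0 = "cold".
labels_true = [1, 1, 1, 0, 0, 0, 0, 1]
labels_pred = [1, 0, 1, 0, 1, 0, 0, 1]
cls = classification_metrics(labels_true, labels_pred)

# Made-up temperatures in degrees.
temps_true = [20.0, 22.0, 19.0, 25.0]
temps_pred = [21.0, 20.0, 19.0, 28.0]
reg = regression_metrics(temps_true, temps_pred)
print(cls, reg)
```

RMSE is in the same unit as the target, and because it squares the errors it penalizes the single 3-degree miss more heavily than MAE does.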
Unsupervised Learning
1. Clustering : Group observations into similar-looking groups
- Evaluation of Clustering
+ Internal Measures
* Cohesion
* Separation
+ External Measures
* Compare with Ground Truth
2. Recommender system : Recommendation
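For the internal clustering measures mentioned above, one common formulation (an assumption here, since the notes give no formulas) treats cohesion as the within-cluster sum of squared distances to each cluster centroid, and separation as the size-weighted sum of squared distances from each cluster centroid to the overall centroid. A small sketch with made-up 2D points:

```python
def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cohesion(clusters):
    """Within-cluster sum of squares: lower means tighter clusters."""
    return sum(sq_dist(p, centroid(c)) for c in clusters for p in c)


def separation(clusters):
    """Between-cluster sum of squares: higher means clusters lie further apart."""
    all_points = [p for c in clusters for p in c]
    overall = centroid(all_points)
    return sum(len(c) * sq_dist(centroid(c), overall) for c in clusters)


# Two made-up clusters of 2D points.
clusters = [
    [(1.0, 1.0), (1.0, 3.0)],
    [(5.0, 1.0), (5.0, 3.0)],
]
wss = cohesion(clusters)
bss = separation(clusters)
print(wss, bss)
```

Under this definition, cohesion plus separation equals the total sum of squared distances to the overall centroid, so improving one necessarily improves the other for a fixed dataset.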
Dataset
1. Training Dataset : The sample of data used to fit the model
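2. Validation Dataset : The sample of data used to tune hyperparameters and compare models
- Cross Validation
3. Test Dataset : The sample of data held out for the final evaluation of the fitted model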
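The three-way split and cross validation above could look roughly like this. It is a sketch, not a reference implementation: the 60/20/20 ratios, the fixed seed, and the function names are assumptions chosen for illustration.

```python
import random


def train_val_test_split(items, val_ratio=0.2, test_ratio=0.2, seed=0):
    """Shuffle once, then cut into train / validation / test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    n_val = int(len(shuffled) * val_ratio)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx); every sample is used for validation exactly once."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


train, val, test = train_val_test_split(range(10))
folds = list(k_fold_indices(10, 3))
print(len(train), len(val), len(test))
print([len(val_idx) for _, val_idx in folds])
```

With cross validation the model is fit k times, each time validating on a different fold, and the k scores are averaged. The test set stays untouched until the very end.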
Overfitting & Underfitting
1. Overfitting : Forcefitting, Too good to be true
2. Appropriate fitting
3. Underfitting : Too simple to explain the variance
- Model complexity
- Training Error < Test Error
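To make the three fitting regimes concrete, the sketch below compares three deliberately chosen models on a tiny made-up dataset: a constant mean predictor (too simple), a least-squares line (appropriate for this data), and a nearest-neighbor lookup that memorizes the training points (too complex). The data and model choices are assumptions for illustration only.

```python
def mse(model, xs, ys):
    """Mean squared error of a model (a function x -> prediction)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_mean(xs, ys):
    """Underfitting candidate: always predict the training mean."""
    avg = sum(ys) / len(ys)
    return lambda x: avg


def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


def fit_nearest(xs, ys):
    """Overfitting candidate: return the label of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


# Made-up data: roughly y = 2x, with noise only in the training labels.
x_train = [0.0, 1.0, 2.0, 3.0, 4.0]
y_train = [0.5, 1.5, 4.5, 5.5, 8.5]
x_test = [0.4, 1.4, 2.6, 3.6]
y_test = [0.8, 2.8, 5.2, 7.2]

results = {}
for name, fit in [("mean", fit_mean), ("line", fit_line), ("nearest", fit_nearest)]:
    model = fit(x_train, y_train)
    results[name] = (mse(model, x_train, y_train), mse(model, x_test, y_test))
    print(name, results[name])
```

The memorizing model reaches zero training error but a clearly larger test error, which is the Training Error < Test Error signature of overfitting. The mean predictor is poor on both sets, and the line is close on both.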