220117) Random Forest practice with the Iris data

고양이호랑이 2022. 1. 17. 17:06

* load_iris dataset

- The samples can be divided into three species (setosa, versicolor, virginica) based on petal length, petal width, sepal length, and sepal width.
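As a quick check of the description above, here is a minimal sketch that looks at the feature names, the species names, and the size of the data. The variable names (`iris`, `class_counts`, and so on) are only for illustration and are not part of the original post.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()  # hypothetical variable name, used only in this sketch

feature_names = list(iris.feature_names)              # the four measurements (in cm)
target_names = [str(n) for n in iris.target_names]    # the three species
n_samples, n_features = iris.data.shape               # rows = flowers, columns = measurements
class_counts = np.bincount(iris.target)               # number of samples per species

print(feature_names)
print(target_names)
print(n_samples, n_features)
print(class_counts)
```

The dataset is balanced across the three species, so a plain random split is usually good enough for a small practice run like this one.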

Loading the data

from sklearn.datasets import load_iris

load_iris() # returns the iris data as a dict-like Bunch object
ir_dic = load_iris()

X = ir_dic['data'] # store the independent variables (features)
y = ir_dic.target # store the dependent variable (target)

Training the Random Forest Classifier

from sklearn.ensemble import RandomForestClassifier
rfc = RandomForestClassifier(n_estimators=10) # create an object that will build 10 Decision Trees

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
rfc.fit(X_train, y_train) # build the trees

rfc.estimators_[8] # inspect the 9th Decision Tree (index 8)
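To make the role of `rfc.estimators_` a bit more concrete, here is a small sketch showing that the fitted forest holds one decision tree per estimator, and that averaging the class probabilities of the individual trees reproduces the forest's own `predict_proba`. The `random_state=0` arguments are an assumption added here only so the sketch is reproducible; the original code does not set them.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
# random_state is an assumption of this sketch, not part of the original post
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

rfc = RandomForestClassifier(n_estimators=10, random_state=0)
rfc.fit(X_train, y_train)

n_trees = len(rfc.estimators_)   # one fitted tree per estimator
ninth_tree = rfc.estimators_[8]  # index 8 is the 9th tree

# Average the per-tree class probabilities by hand ...
manual_proba = np.mean([t.predict_proba(X_test) for t in rfc.estimators_], axis=0)
# ... and compare with the probabilities reported by the forest itself
forest_proba = rfc.predict_proba(X_test)

print(n_trees, type(ninth_tree).__name__)
print(np.allclose(manual_proba, forest_proba))
```

Each tree is trained on a different bootstrap sample, so a single tree such as the 9th one can look quite different from its neighbors even though the averaged prediction is stable.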

Visualizing one of the 10 randomly built decision trees

from sklearn import tree
import pydotplus
from IPython.display import Image

dt_dot_data = tree.export_graphviz(rfc.estimators_[8],
                     feature_names=['sepal length', 'sepal width', 'petal length', 'petal width'],
                     class_names=['setosa', 'versicolor', 'virginica']) # export the 9th tree in Graphviz DOT format

dt_graph = pydotplus.graph_from_dot_data(dt_dot_data) # create a graph object that renders dt_dot_data (the tree description) as an image
Image(dt_graph.create_png())
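If Graphviz or pydotplus is not installed, a text rendering is one alternative. The sketch below swaps in `sklearn.tree.export_text`, which is not what the code above uses, to print the same 9th tree as indented rules. For brevity it fits the forest on the whole dataset, and `random_state=0` is again an assumption made for reproducibility. The leaves show numeric class labels rather than species names.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import export_text

X, y = load_iris(return_X_y=True)

# Fitted on all samples here only to keep the sketch short
rfc = RandomForestClassifier(n_estimators=10, random_state=0)
rfc.fit(X, y)

# Render the 9th tree (index 8) as plain-text decision rules
tree_text = export_text(
    rfc.estimators_[8],
    feature_names=['sepal length', 'sepal width', 'petal length', 'petal width'],
)
print(tree_text)
```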

The resulting image can also be exported as a PDF.

dt_graph.write_pdf('9th_tree_to_cherish.pdf')

Performance metrics will be covered next time. (https://horangcat.tistory.com/16)

(References)

https://en.wikipedia.org/wiki/Confusion_matrix
Confusion matrix - Wikipedia

https://towardsdatascience.com/the-f1-score-bec2bbc38aa6
The F1 score
