💡
scikit-learn (sklearn): an open-source Python library specialized for implementing machine learning.
scikit-learn
"We use scikit-learn to support leading-edge basic research [...]" "I think it's the most well-designed ML package I've seen so far." "scikit-learn's ease-of-use, performance and overall variety of algorithms implemented has proved invaluable [...]."
https://scikit-learn.org/stable/
  • sklearn implements most of the widely used classical machine learning algorithms.
  • It is structured very cleanly, which makes it easy to use.
  • Because it was built for classical machine learning, it is not well suited to deep learning.
  • Most machine learning libraries created after sklearn follow the sklearn-style interface.
  • Popular machine learning libraries such as PyCaret, XGBoost, LightGBM, and CatBoost also offer sklearn-compatible interfaces, and some of them depend on sklearn directly.
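
To make "sklearn-style" concrete, here is a minimal sketch of a custom estimator that follows the same fit/predict convention. MajorityClassifier is a hypothetical toy class invented for this illustration (it is not part of sklearn), and the tiny arrays are made-up data.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin


class MajorityClassifier(ClassifierMixin, BaseEstimator):
    """Hypothetical toy classifier: always predicts the most frequent training label."""

    def fit(self, X, y):
        # Learn the label that appears most often in the training targets.
        labels, counts = np.unique(y, return_counts=True)
        self.classes_ = labels
        self.majority_ = labels[np.argmax(counts)]
        return self  # returning self allows chaining, e.g. Model().fit(X, y)

    def predict(self, X):
        # Ignore the features and repeat the majority label for every row.
        return np.full(len(X), self.majority_)


# Made-up data: three samples of class 1 and one sample of class 0.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1, 1, 1, 0])

clf = MajorityClassifier().fit(X, y)
pred = clf.predict(X)
score = clf.score(X, y)  # score() is inherited from ClassifierMixin (accuracy)
print(pred, score)
```

Because the class exposes fit and predict the same way the built-in estimators do, it can be used anywhere code expects a sklearn-style model.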

Getting Started

The purpose of this guide is to illustrate some of the main features that scikit-learn provides. It assumes a very basic working knowledge of machine learning practices (model fitting, predicting, cross-validation, etc.). Please refer to our installation instructions for installing scikit-learn. Scikit-learn is an open source machine learning library that supports supervised and unsupervised learning.
https://scikit-learn.org/stable/getting_started.html
  • If you follow the page above, you can easily write code that loads data and trains a model with sklearn.
  • The overall way sklearn code is written follows a fixed pattern.
# Example: import an ML model from sklearn, then train and evaluate a classifier.
# X_train, y_train, X_test, and y_test are assumed to have been prepared beforehand.
# 1. Import the model to use.
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# 2. Create the model object.
model = RandomForestClassifier()

# 3. Train on the training data.
model.fit(X_train, y_train)

# 4. Run inference on the test data.
pred = model.predict(X_test)

# 5. Evaluate with an evaluation metric.
print("Accuracy : %.4f" % accuracy_score(y_test, pred))

>> Accuracy : 0.8976
  • The pattern above is used almost like a formula in machine learning projects built on sklearn, so be sure to learn it!
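
The snippet above relies on train/test splits that already exist. The following is a minimal self-contained sketch of the same five steps. The Iris dataset, the 80/20 split, and random_state=0 are assumptions chosen for illustration and do not come from the original post, so the printed accuracy will not match the 0.8976 shown above.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Assumption: the Iris dataset stands in for the post's unspecified data.
X, y = load_iris(return_X_y=True)

# Hold out 20% of the samples for testing; stratify keeps class ratios equal.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Steps 2-5 of the pattern: create, fit, predict, evaluate.
model = RandomForestClassifier(random_state=0)
model.fit(X_train, y_train)
pred = model.predict(X_test)
accuracy = accuracy_score(y_test, pred)
print("Accuracy : %.4f" % accuracy)
```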


Hands-on

  1. Find about three classification models that sklearn provides and write down their names.
  2. In sklearn, every model implements training as fit and prediction as predict. What is the advantage of implementing it this way? (Hint: OOP)
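
One way to explore the second question is to run several classifiers through exactly the same loop. The sketch below is only an illustration: the three models and the synthetic dataset from make_classification are assumptions picked for the example, not choices made in the original post.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Assumption: a small synthetic binary classification problem.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Three different algorithms, all exposing the same fit/predict interface.
models = [
    LogisticRegression(max_iter=1000),
    DecisionTreeClassifier(random_state=0),
    KNeighborsClassifier(),
]

results = {}
for model in models:
    # The loop body never needs to know which algorithm it is handling.
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[type(model).__name__] = accuracy_score(y_test, pred)

for name, acc in results.items():
    print(f"{name}: {acc:.4f}")
```

Swapping a model in or out only changes the list, not the training and evaluation code, which is the kind of benefit the OOP hint points toward.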