Using the data from the kaggle.com competition House Prices: Advanced Regression Techniques, build the following models:

Logistic Regression

glm(Survived ~ ., family = binomial(link = "logit"), data = train)

K-Means Clustering
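
No snippet is given for this model, so here is a minimal sketch using kmeans() from base R (the stats package). The data set (the numeric columns of the built-in iris data), the choice of 3 clusters, and the nstart value are assumptions made only for illustration; they do not come from the assignment.

```r
# Illustrative data (an assumption): the four numeric columns of iris
mydata <- scale(iris[, 1:4])  # standardise so no single feature dominates the distance

set.seed(42)  # k-means starts from random centres; fix the seed for reproducibility
fit <- kmeans(mydata, centers = 3, nstart = 25)  # 3 clusters, best of 25 random starts

groups <- fit$cluster  # cluster index (1..3) for every row
table(groups)          # cluster sizes
fit$centers            # cluster centres in the scaled feature space
```

For the House Prices data the same call should work on any all-numeric matrix or data frame without missing values.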


Hierarchical clustering

d <- dist(mydata, method = "euclidean")  # distance matrix
fit <- hclust(d, method = "ward.D")      # "ward" was renamed "ward.D" in R 3.1.0
groups <- cutree(fit, k = 5)             # cut tree into 5 clusters

Classification Tree

library(rpart)  # provides rpart() and the kyphosis data set
fit <- rpart(Kyphosis ~ Age + Number + Start, method = "class", data = kyphosis)

Random Forest

library(randomForest)
rf <- randomForest(label ~ ., data = train)  # label must be a factor for classification
predictions <- predict(rf, newdata = test)

Support Vector Machines

library(e1071)  # provides svm(); the cats data set comes from the MASS package
model <- svm(Sex ~ ., data = cats)

L1, L2 regularization
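
No snippet is given here either. One possible sketch uses the glmnet package, which is an assumption: the notes do not name a package for regularization. In glmnet, alpha = 1 gives the L1 (lasso) penalty and alpha = 0 gives the L2 (ridge) penalty. The built-in mtcars data and the 5 folds are illustrative choices only.

```r
library(glmnet)

# Illustrative data (an assumption): predict mpg from the other mtcars columns
x <- as.matrix(mtcars[, -1])  # glmnet needs a numeric predictor matrix
y <- mtcars$mpg

set.seed(42)  # cross-validation folds are random
lasso <- cv.glmnet(x, y, alpha = 1, nfolds = 5)  # L1 penalty (lasso)
ridge <- cv.glmnet(x, y, alpha = 0, nfolds = 5)  # L2 penalty (ridge)

# Coefficients at the penalty strength with the lowest cross-validated error
coef(lasso, s = "lambda.min")
coef(ridge, s = "lambda.min")
```

The lasso typically sets some coefficients exactly to zero, while ridge only shrinks them towards zero.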


The caret package


More information: http://topepo.github.io/caret/
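
As a rough illustration of how caret wraps model fitting and resampling behind one interface, the sketch below trains a classification tree with 5-fold cross-validation. The iris data, the "rpart" method, and the number of folds are assumptions chosen for the example, not part of the assignment.

```r
library(caret)

set.seed(42)  # resampling is random; fix the seed for reproducibility

# 5-fold cross-validation (the number of folds is an arbitrary choice here)
ctrl <- trainControl(method = "cv", number = 5)

# method = "rpart" fits a classification tree; caret tunes its cp parameter
model <- train(Species ~ ., data = iris, method = "rpart", trControl = ctrl)

print(model)                                 # resampled accuracy per candidate cp
predictions <- predict(model, newdata = iris)
confusionMatrix(predictions, iris$Species)   # confusion matrix on the training data
```

Changing the method argument (for example to "rf" or "glmnet") should switch the underlying model while the rest of the code stays the same, provided the corresponding package is installed.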
Lesson 3. Introduction to machine learning

