This Is What Graduate School in Vietnam Is Really Like!

Enrolling as a working adult in the master's program at Trường Đại Học Bách Khoa Thành Phố Hồ Chí Minh

Nhận dạng mẫu và học máy: midterm exam & 8th class & mini project

November 3, 2018
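Today was the midterm exam that had been announced well in advance.
Here are the questions.
[Photos of the exam paper, 2 images]
Some questions matched the practice problems handed out beforehand and the 2017 exam, but unfortunately many were new to me, and the 70 minutes of exam time felt like an instant.
Note that this midterm counts for 35% of the grade.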

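I gave it everything I had until time ran out and the answer sheets were collected. I stayed in the classroom to rework the problems I had not been able to solve, and then, to my surprise, we were told there would be a lecture.
Normally the day of a midterm ends with the exam, and no class is held.
The other students seemed worn out too: they were browsing Facebook or doing work unrelated to the class, and hardly anyone was really listening.
Still, the teacher wrote his email address on the board, so I was glad I had not gone home right after the exam.
I assumed the address was for complaints about the exam or for score inquiries, but when I checked to be sure, it was neither. It was for sending in our preferred topic for the mini project, which counts for 25% of the grade.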

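About two days later, the class leader, whom I call Crew Cut, sent out the textbook and practice problems for the coming classes, along with the list of mini projects in question.
It contains the following 33 topics.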

1. Study the PCA tool of WEKA and apply it in feature extraction for the following dataset:
(1, 1, 1), (1, 2, 1), (1, 3, 1), (2, 1, 1), (2, 2, 1), (2, 3, 1), (2, 3.5, 1), (2.5, 2, 1), (3.5, 1, 1), (3.5, 2, 1), (3.5, 3, 2), (3.5, 4, 2), (4.5, 1, 2), (4.5, 2, 2), (4.5, 3, 2), (5, 4, 2), (5, 5, 2), (6, 3, 2), (6, 4, 2), (6, 5, 2)
where each pattern is represented by feature 1, feature 2 and the class label.
2. Implement LDA method for the following training data set and apply it in feature extraction:
(1, 1, 1), (1, 2, 1), (1, 3, 1), (2, 1, 1), (2, 2, 1), (2, 3, 1), (2, 3.5, 1), (2.5, 2, 1), (3.5, 1, 1), (3.5, 2, 1), (3.5, 3, 2), (3.5, 4, 2), (4.5, 1, 2), (4.5, 2, 2), (4.5, 3, 2), (5, 4, 2), (5, 5, 2), (6, 3, 2), (6, 4, 2), (6, 5, 2)
where each pattern is represented by feature 1, feature 2 and the class label. (*)
3. Study how to use the X-means algorithm of WEKA and apply it in a clustering problem.
4. Study how to use the EM algorithm of WEKA and apply it in a clustering problem.
5. Study how to use the Bagging and ADABOOST algorithm of WEKA and apply it in a classification problem.
6. Implement ADABOOST algorithm and apply it in a classification problem. (*)
7. Study how to use Decision Tree classifier (J48) of WEKA and apply it in a classification problem.
8. Study how to use SVM of LibSVM and apply it in a classification problem.
9. Study how to use SVM of MATLAB and apply it in a classification problem.
10. Implement back-propagation algorithm to train ANN and apply it in a classification problem. (*)
11. Implement k-NN algorithm and apply it in a classification problem. (*)
12. Implement distance-weighted k-nearest neighbor algorithm and apply it in a classification problem. (*)
13. Implement Naïve Bayes classifier and apply it in a classification problem. (*)
14. Implement logistic regression method and apply it in a classification problem. (*)
15. Study how to use the tools for attribute selection in WEKA and apply it in a benchmark dataset for customer churn prediction problem.
16. Study how to use ANN tool of Spice-Neuro software and apply it in a classification problem.
17. Study how to use ANN tool of MATLAB and apply it in a classification problem.
18. Study how to use RBF network tool of WEKA and apply it in a classification problem.
19. Study how to use RBF network tool of MATLAB and apply it in a classification problem.
20. Implement Squeezer algorithm and apply it in a clustering problem with categorical data. (*)
21. Implement k-means algorithm (with some centroid initialization technique) and apply it in a clustering problem. (*)
22. Implement HAC clustering algorithm and apply it in a clustering problem. (*)
23. Implement Condensed Nearest Neighbors algorithm and apply it in a classification problem. (*)
24. Implement the incremental clustering algorithm Leaders and apply it in a clustering problem. (*)
25. Implement Naïve Rank algorithm for reducing the training set and apply it in a classification using k-Nearest-neighbors. (*)
26. Implement anytime classification algorithm and apply it in a classification problem. (*)
27. Implement the improved k-NN algorithm with branch-and-bound technique. (*)
28. Implement a prototype selection method which is based on clustering and apply it in a classification problem using k-Nearest-neighbors. (*)
29. Implement the improved k-NN algorithm with the support of k-d-tree. (*)
30. Study how to use HMM (hidden Markov model) of MATLAB.
31. Study how to use PCA tool of MATLAB and apply it in the feature extraction for the following dataset:
(1, 1, 1), (1, 2, 1), (1, 3, 1), (2, 1, 1), (2, 2, 1), (2, 3, 1), (2, 3.5, 1), (2.5, 2, 1), (3.5, 1, 1), (3.5, 2, 1), (3.5, 3, 2), (3.5, 4, 2), (4.5, 1, 2), (4.5, 2, 2), (4.5, 3, 2), (5, 4, 2), (5, 5, 2), (6, 3, 2), (6, 4, 2), (6, 5, 2)
where each pattern is represented by feature 1, feature 2 and the class label.
32. Study how to use Decision Tree classifier of RapidMiner and apply it in a classification problem.
33. Study how to use the tools for attribute selection in RapidMiner and apply it in a benchmark dataset for customer churn prediction problem.
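
Topics 1, 2 and 31 share one small dataset, so it is easy to get a feel for them by hand. What follows is a minimal sketch in plain Python, not the WEKA or MATLAB tooling the assignment asks for. It computes the principal axes of the 20 points (PCA ignores the class label) and a projection direction for the two classes. The function names (`mean_and_scatter`, `pca_2d`, `fisher_lda_direction`) are made up for illustration, and "LDA" is taken here to mean two-class Fisher discriminant analysis, which is an assumption because the list does not say which variant is meant.

```python
import math

# Dataset copied from topics 1, 2 and 31: (feature 1, feature 2, class label).
DATA = [
    (1, 1, 1), (1, 2, 1), (1, 3, 1), (2, 1, 1), (2, 2, 1),
    (2, 3, 1), (2, 3.5, 1), (2.5, 2, 1), (3.5, 1, 1), (3.5, 2, 1),
    (3.5, 3, 2), (3.5, 4, 2), (4.5, 1, 2), (4.5, 2, 2), (4.5, 3, 2),
    (5, 4, 2), (5, 5, 2), (6, 3, 2), (6, 4, 2), (6, 5, 2),
]


def mean_and_scatter(points):
    """Return the mean (mx, my) and scatter entries (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    syy = sum((y - my) ** 2 for _, y in points)
    return (mx, my), (sxx, sxy, syy)


def pca_2d(points):
    """PCA for 2-D points: return (eigenvalues, unit eigenvectors), largest first."""
    _, (sxx, sxy, syy) = mean_and_scatter(points)
    n = len(points)
    # Sample covariance matrix [[cxx, cxy], [cxy, cyy]].
    cxx, cxy, cyy = sxx / (n - 1), sxy / (n - 1), syy / (n - 1)
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = (cxx + cyy) / 2
    radius = math.hypot((cxx - cyy) / 2, cxy)
    eigenvalues = (half_trace + radius, half_trace - radius)
    if cxy == 0:
        # Features are uncorrelated, so the principal axes are the coordinate axes.
        vectors = [(1.0, 0.0), (0.0, 1.0)] if cxx >= cyy else [(0.0, 1.0), (1.0, 0.0)]
    else:
        vectors = []
        for lam in eigenvalues:
            # (cxy, lam - cxx) solves (C - lam * I) v = 0 when cxy != 0.
            vx, vy = cxy, lam - cxx
            norm = math.hypot(vx, vy)
            vectors.append((vx / norm, vy / norm))
    return eigenvalues, vectors


def fisher_lda_direction(class_a, class_b):
    """Two-class Fisher LDA: w = Sw^-1 (mean_b - mean_a), Sw = within-class scatter."""
    (ax, ay), (a_xx, a_xy, a_yy) = mean_and_scatter(class_a)
    (bx, by), (b_xx, b_xy, b_yy) = mean_and_scatter(class_b)
    sxx, sxy, syy = a_xx + b_xx, a_xy + b_xy, a_yy + b_yy
    det = sxx * syy - sxy * sxy
    dx, dy = bx - ax, by - ay
    # Closed-form inverse of the symmetric 2x2 matrix Sw, applied to (dx, dy).
    return ((syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det)


points = [(x, y) for x, y, _ in DATA]
class_1 = [(x, y) for x, y, label in DATA if label == 1]
class_2 = [(x, y) for x, y, label in DATA if label == 2]

eigenvalues, (pc1, pc2) = pca_2d(points)
w = fisher_lda_direction(class_1, class_2)

print("PCA eigenvalues:", eigenvalues)
print("First principal axis:", pc1)
print("Fisher LDA direction:", w)
```

Feature extraction then amounts to projecting each centered point onto `pc1` (PCA) or onto `w` (LDA), which reduces the two features to a single one.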

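From these 33 topics, you pick one and email the teacher.
However, a topic that someone else has already requested cannot be chosen, so it is first come, first served.
For a start, the topics that name MATLAB make it concrete what has to be done, so I tried running each of

9 SVM of MATLAB.
17 ANN tool of MATLAB.
19 RBF network tool of MATLAB.
30 HMM (hidden Markov model) of MATLAB.
31 PCA tool of MATLAB.

in MATLAB. HMM (hidden Markov model) looked the easiest, so I sent in my request, and since nobody else had asked for it, it was approved without any trouble.
No deadline for the mini project has been announced yet, but I expect we will be told during upcoming classes.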
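
For readers who have not met HMMs before, here is a minimal sketch of the forward algorithm, which computes the probability of an observation sequence under a discrete HMM. It is written in plain Python and is not the MATLAB tooling used for the project. The two-state model and all of its probabilities are hypothetical values chosen only for illustration, and `forward_likelihood` is a made-up name.

```python
def forward_likelihood(initial, transition, emission, observations):
    """Return P(observations) under a discrete HMM using the forward algorithm.

    initial[i]       = P(first state is i)
    transition[i][j] = P(next state is j | current state is i)
    emission[i][o]   = P(observing symbol o | state is i)
    """
    n_states = len(initial)
    # alpha[i] = P(observations so far, current state is i)
    alpha = [initial[i] * emission[i][observations[0]] for i in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[i] * transition[i][j] for i in range(n_states)) * emission[j][obs]
            for j in range(n_states)
        ]
    return sum(alpha)


# Hypothetical two-state, two-symbol model (illustrative numbers only).
INITIAL = [0.6, 0.4]
TRANSITION = [[0.7, 0.3], [0.4, 0.6]]
EMISSION = [[0.9, 0.1], [0.2, 0.8]]

print(forward_likelihood(INITIAL, TRANSITION, EMISSION, [0, 1, 0]))
```

The running `alpha` list keeps the cost linear in the sequence length, instead of summing over every possible state path.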