Final Year Project Progress - Week 2

This week I focused on choosing the best method for my final year project (FYP). My project is about predicting the altered state of consciousness. The data given to me has 31 source (input) attributes and 2 target (output) attributes.

This project was actually done before by my senior, who used a neural network to predict both targets. So now I have to use another method to predict them. I have done some research on predictive modelling.

https://en.wikipedia.org/wiki/Predictive_modelling

And also other sources that may be related:

https://www.quora.com/What-are-some-Machine-Learning-algorithms-that-you-should-always-have-a-strong-understanding-of-and-why

http://www.tutorialspoint.com/data_mining/dm_classification_prediction.htm

http://rayli.net/blog/data/top-10-data-mining-algorithms-in-plain-english/

Based on the research I've done, for example in the Quora link above, Sean Owen encourages using Random Forest for classification/regression. Another method that caught my attention is Naive Bayes.

Using the data given to me by my supervisor (Shamimi A. Halim), I started to play with it in Weka.

The four methods I used in this research:

1) Multilayer Perceptron (Backpropagation)
2) Naive Bayes
3) Random Forest
4) Logistic Regression


Multilayer Perceptron

Want to learn more :
https://en.wikipedia.org/wiki/Multilayer_perceptron

For this trial-and-error research, the data has been preprocessed, and I focus on just one output, which is status (Alive or Dead). 90% of the data set is used for training and the other 10% for testing. The data set has 204 records in total.
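To make the split concrete, here is a small sketch of the arithmetic in Python. The function name `percentage_split` is made up for illustration, and rounding the training size to the nearest whole record is an assumption, not something confirmed from Weka itself.

```python
def percentage_split(total, train_percent):
    """Return (train_size, test_size) for a simple percentage split."""
    # Assumption: the training size is rounded to the nearest whole record.
    train_size = round(total * train_percent / 100)
    return train_size, total - train_size


train_size, test_size = percentage_split(204, 90)
print(train_size, test_size)  # 184 20

# Share of the accuracy carried by one test record, in percentage points.
print(100 / test_size)  # 5.0
```

Under that assumption the test set has only about 20 records, so a single prediction moves the accuracy by about 5 percentage points. Small differences between methods should be read with that in mind.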

Parameter :


Result :


From the above result, I only got 70% accuracy.

After trying other parameters, I got the best (maybe?) setting, which has 3 hidden layers.

Parameter :


Result :



From the result I got 85% accuracy.


Naive Bayes

Want to learn more :
https://en.wikipedia.org/wiki/Naive_Bayes_classifier

No parameters.

Result :


From the result I got 75% accuracy.

Random Forest 

Want to learn more :
http://www.listendata.com/2014/11/random-forest-with-r.html
https://en.wikipedia.org/wiki/Random_forest 

Parameter :





Result :


From the above result, I only got 70% accuracy.

After trying other parameters, I got the best (maybe?) setting, with numFeatures (number of features) set to 6 and seed set to 5. (I don't know the function of seed, but I think it is related to randomness.)
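A rough Python sketch of what these two settings usually mean in a random forest: at each split only a random subset of the attributes is considered, and the seed fixes the random number sequence so the same run can be repeated. This uses Python's own random generator, not Weka's, so the attributes it picks will not match Weka's. The function name `pick_features` is hypothetical.

```python
import random


def pick_features(num_attributes, num_features, seed):
    """Pick a random subset of attribute indices, reproducibly."""
    # The seed fixes the sequence of "random" numbers, so the same
    # seed always produces the same subset.
    rng = random.Random(seed)
    return sorted(rng.sample(range(num_attributes), num_features))


# 31 input attributes, 6 considered at a time, seed 5 (values from this post).
first = pick_features(31, 6, seed=5)
second = pick_features(31, 6, seed=5)
print(first == second)  # True
```

So changing the seed changes which random choices the forest makes, which on a small data set can be enough to change the reported accuracy.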

Parameter :


Result :


From the image above, we got 85% accuracy, which is the same as Multilayer Perceptron.

Logistic Regression

So why did I choose Logistic Regression for trial and error? Based on the definition in Wikipedia:

In statistics, logistic regression, or logit regression, or logit model[1] is a regression model where the dependent variable (DV) is categorical.

My output, which is Alive or Dead, is categorical. That's why I tried this method too.
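As a minimal sketch of how logistic regression turns numeric inputs into a categorical answer: it computes a weighted sum of the attributes, squashes it into a probability between 0 and 1 with the logistic function, and applies a threshold. The weights, the bias, the 0.5 threshold and the choice of "Alive" as the positive class are all assumptions for illustration, not values from my data or from Weka.

```python
import math


def logistic(z):
    """Logistic (sigmoid) function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_status(weights, bias, attributes, threshold=0.5):
    """Return (label, probability) for one record."""
    # Weighted sum of the input attributes plus a bias term.
    z = bias + sum(w * x for w, x in zip(weights, attributes))
    p = logistic(z)
    # Assumption: "Alive" is treated as the positive class.
    label = "Alive" if p >= threshold else "Dead"
    return label, p


# Hypothetical weights for a record with two attributes.
label, p = predict_status([1.0, -0.5], 0.0, [2.0, 1.0])
print(label, round(p, 3))
```

Training a real model means finding the weights that fit the training data, which is the part Weka does for me.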

Parameter :


Result :


Yeah! 85% accuracy. Same as the Multilayer Perceptron and Random Forest results.

So what now? I don't know. Lol. Maybe I have to study the algorithms before deciding which method is suitable and efficient for the data.
