Lucky Programmer !: March 2016

Different ordered data, different result?

The answer is yes!

Here is what I got when testing my altered state of consciousness data. I ran it in Weka using Multilayer Perceptron with the same parameters each time.

Here are the results, same data but in a different order:
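One possible reason, assuming a percentage split was used as the test option (the post does not say): the split depends on the position of each row, so reordering the rows changes which rows end up in the test set. A minimal Python sketch with made-up row ids, not Weka's actual code:

```python
import random

def percentage_split(rows, train_percent, seed=1):
    """Shuffle a copy of rows with a fixed seed, then cut it into train/test."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_percent / 100)
    return shuffled[:cut], shuffled[cut:]

original = list(range(1, 11))         # hypothetical row ids 1..10
reordered = list(reversed(original))  # same rows, different order

_, test_a = percentage_split(original, 70)
_, test_b = percentage_split(reordered, 70)

same_rows = sorted(original) == sorted(reordered)
same_test_set = sorted(test_a) == sorted(test_b)
print("same rows:", same_rows)
print("same test set:", same_test_set)
```

Both runs use the same seed and the same rows, yet the model is evaluated on different test rows. A neural network that updates its weights one row at a time can also be sensitive to row order during training.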

Weka - Attribute Selected Classifier

Weka has three techniques for performing attribute selection:

native approach, using the attribute selection classes directly
using a meta-classifier
the filter approach

This time, I will be using the meta-classifier. Basically, the meta-classifier (Attribute Selected Classifier) first reduces the attributes, then the reduced attributes are used by another method.

For example:

Say you have a data set with these columns:

name
age
smoking
heart rate
tel. no.

After applying the Attribute Selected Classifier to the data, the attributes are reduced to:

age
smoking
heart rate

These remaining attributes are then used by another method such as Multilayer Perceptron, Naive Bayes, or any other method. That's it.

Practical Session:
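The idea can be sketched in a few lines of Python. This is only an illustration: the rows and labels are made up, the list of kept attributes is hard-coded (Weka chooses them itself), and `MajorityClassifier` is a hypothetical stand-in for the real base method.

```python
from collections import Counter

def select_attributes(rows, keep):
    """Reduce every row (a dict) to just the attributes named in keep."""
    return [{name: row[name] for name in keep} for row in rows]

class MajorityClassifier:
    """Stand-in base method: always predicts the most common label."""
    def fit(self, rows, labels):
        self.label = Counter(labels).most_common(1)[0][0]
        self.seen_attributes = sorted(rows[0].keys())
        return self

    def predict(self, row):
        return self.label

class AttributeSelectedSketch:
    """Select attributes first, then pass the reduced data to the base method."""
    def __init__(self, keep, base):
        self.keep = keep
        self.base = base

    def fit(self, rows, labels):
        self.base.fit(select_attributes(rows, self.keep), labels)
        return self

    def predict(self, row):
        reduced = select_attributes([row], self.keep)[0]
        return self.base.predict(reduced)

# Made-up example rows, for illustration only.
rows = [
    {"name": "A", "age": 40, "smoking": "yes", "heart rate": 90, "tel. no.": "111"},
    {"name": "B", "age": 25, "smoking": "no", "heart rate": 70, "tel. no.": "222"},
    {"name": "C", "age": 55, "smoking": "yes", "heart rate": 95, "tel. no.": "333"},
]
labels = ["yes", "no", "yes"]

model = AttributeSelectedSketch(["age", "smoking", "heart rate"], MajorityClassifier())
model.fit(rows, labels)
prediction = model.predict(rows[1])
print(model.base.seen_attributes, prediction)
```

The base method never sees name or tel. no., which is the whole point of the wrapper.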

Open Weka and load any data. Or you can try downloading data from here.

After that, go to the Classify tab.

Then click the Choose button -> meta -> AttributeSelectedClassifier

You can change the method; for example, I chose Linear Regression.

Just click OK, then choose any of the Test options. I chose Percentage split, with 70% for the training set and 30% for testing.

Thank you.

Source : https://weka.wikispaces.com/Performing+attribute+selection

Final Year Project Progress - Week 2

This week I just focused on choosing the best method for my FYP. My project is about predicting the altered state of consciousness. The data I was given has 31 source (input) attributes and 2 target (output) attributes.

Actually, this project has already been done by my senior, who used a neural network to predict both targets. So now I have to use another method to predict the targets. I have done some research on predictive modelling.

https://en.wikipedia.org/wiki/Predictive_modelling

Other sources that may be related:

https://www.quora.com/What-are-some-Machine-Learning-algorithms-that-you-should-always-have-a-strong-understanding-of-and-why

http://www.tutorialspoint.com/data_mining/dm_classification_prediction.htm

http://rayli.net/blog/data/top-10-data-mining-algorithms-in-plain-english/

Based on the research I've done, for example in the Quora link above, Sean Owen recommends Random Forest for classification/regression. Another method that caught my attention is Naive Bayes.

Using the data given to me by my supervisor (Shamimi A. Halim), I started playing with it in Weka.

The four methods I used in this research:

1) Multilayer Perceptron (Backpropagation)
2) Naive Bayes
3) Random Forest
4) Logistic Regression

Multilayer Perceptron

Want to learn more :
https://en.wikipedia.org/wiki/Multilayer_perceptron

For this trial-and-error experiment, the data has been preprocessed and I focus on just one output, status (Alive or Dead). 90% of the data set is used for training and the other 10% for testing. There are 204 instances in total.
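To get a feel for what this split means, here is the arithmetic as a small Python sketch. It assumes the training size is rounded to the nearest whole instance; the exact count Weka reports may differ by one.

```python
total = 204
train_percent = 90

train_size = round(total * train_percent / 100)  # 183.6 rounds to 184
test_size = total - train_size                   # remaining instances for testing
step = 100 / test_size                           # accuracy points per test instance

print(train_size, test_size, step)
```

With only about 20 test instances, each one is worth roughly 5 percentage points, so accuracy can only move in steps of that size and a single instance can separate two methods.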

Parameter :

Result :

From the above result, I only got 70% accuracy.

After trying other parameters, I found the best (maybe?) setting, which has 3 hidden layers.

Parameter :

Result :

From the result I got 85% accuracy.

Naive Bayes

Want to learn more :
https://en.wikipedia.org/wiki/Naive_Bayes_classifier

No parameter.

Result :

From the result I got 75% accuracy.

Random Forest

Want to learn more :
http://www.listendata.com/2014/11/random-forest-with-r.html
https://en.wikipedia.org/wiki/Random_forest

Parameter :

Result :

From the above result, I only got 70% accuracy.

After trying other parameters, I found the best (maybe?) setting: numFeatures (number of features) set to 6 and seed set to 5. (I don't know exactly what seed does, but I think it is related to randomness.)
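On the seed: a random forest makes random choices while building its trees (for example, which attributes to consider at a split), and the seed fixes the starting point of the random number generator so those choices can be repeated. A minimal Python sketch of that idea, with hypothetical attribute names and not Weka's implementation:

```python
import random

def pick_features(all_features, num_features, seed):
    """Pick num_features attributes at random, reproducibly for a given seed."""
    rng = random.Random(seed)
    return rng.sample(all_features, num_features)

features = [f"attr{i}" for i in range(1, 32)]  # 31 hypothetical attribute names

a = pick_features(features, 6, seed=5)
b = pick_features(features, 6, seed=5)  # same seed: same picks
c = pick_features(features, 6, seed=6)  # different seed: usually different picks

print(a == b)
print(a, c)
```

So changing the seed does not make the method smarter; it just gives a different random draw, which can move accuracy up or down on a small test set.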

Parameter :

Result :

From the image above, we got 85% accuracy, which is the same as Multilayer Perceptron.

Logistic Regression

So why did I choose Logistic Regression for trial and error? Based on the definition on Wikipedia:

In statistics, logistic regression, or logit regression, or logit model is a regression model where the dependent variable (DV) is categorical.

My output, Alive or Dead, is categorical. That's why I tried this method too.
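To show how a regression model can give a categorical answer, here is a minimal Python sketch of the logistic function. The weights, bias, and input values are made up for illustration; a real model learns them from the training data.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_status(features, weights, bias):
    """Return a label and the estimated probability of 'Alive'."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    p_alive = sigmoid(z)
    label = "Alive" if p_alive >= 0.5 else "Dead"
    return label, p_alive

weights = [0.8, -1.2]  # hypothetical weights for two input attributes
bias = 0.1             # hypothetical bias term

label, p = predict_status([1.0, 0.5], weights, bias)
print(label, p)
```

The model computes a weighted sum of the inputs, turns it into a probability, and the 0.5 threshold turns that probability into Alive or Dead.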

Parameter :

Result :

Yeah! 85% accuracy. Same as the Multilayer Perceptron and Random Forest results.

So what now? I don't know. Lol. Maybe I have to study the algorithms before deciding which method is suitable and efficient for the data.