### Machine Learning for Newbies

Saturday, June 10, 2006

In [1], Pedro Domingos compiles Machine Learning material that can be useful for a beginner in the field. There is a list of journals

A list of conferences
1. International Conference on Machine Learning
2. European Conference on Machine Learning
3. International Joint Conference on Artificial Intelligence
4. National Conference on Artificial Intelligence
5. European Conference on Artificial Intelligence
6. Annual Conference on Neural Information Processing Systems
7. International Workshop on Multistrategy Learning
8. International Workshop on Artificial Intelligence and Statistics
9. International Conference on Computational Learning Theory (COLT)
10. European Conference on Computational Learning Theory
Other resources
1. UCI repository of machine learning databases
2. Online bibliographies of several subareas of machine learning
3. Machine Learning List
4. AI and Statistics List
5. MLC++
6. Weka
There are more useful resources for a researcher; I add my favourites

References:
[1] P. Domingos, "E4 - Machine Learning". In W. Klosgen and J. Zytkow (eds.), Handbook of Data Mining and Knowledge Discovery (pp. 660-670). New York: Oxford University Press, 2002.

### Bias in Machine Learning


In statistics, the term bias is used in two different ways:

1. A biased sample is a statistical sample whose members do not all have the same probability of being chosen.
2. A biased estimator is an estimator that over- or underestimates the quantity it is estimating.

In Machine Learning the term bias is closer to the biased-estimator sense, as it is applied to classifiers. As can be seen in [1], the bias reflects the sensitivity of the estimate to the target function f(x): it represents "how closely on average the estimate is able to approximate the target". The bias has a direct effect on the prediction error, since the expected error can be decomposed into bias and variance terms [2].
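For the familiar squared-loss case, the bias and the error decomposition can be sketched as follows ([1] and [2] treat 0/1 and more general losses, where the decomposition is subtler). The bias of an estimator fitted on a training set D is

```latex
\mathrm{bias}(x) = \mathbb{E}_D\big[\hat{f}_D(x)\big] - f(x)
```

so that the expected squared error at a point x splits into irreducible noise, squared bias, and variance:

```latex
\mathbb{E}\big[(y - \hat{f}_D(x))^2\big]
  = \sigma^2
  + \big(f(x) - \mathbb{E}_D[\hat{f}_D(x)]\big)^2
  + \mathbb{E}_D\big[(\hat{f}_D(x) - \mathbb{E}_D[\hat{f}_D(x)])^2\big]
```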

References:
[1] J. H. Friedman, "On Bias, Variance, 0/1-Loss, and the Curse-of-Dimensionality", Data Mining and Knowledge Discovery, vol. 1, no. 1, pp. 55-77, 1997.
[2] G. M. James, "Variance and Bias for General Loss Functions", Machine Learning, vol. 51, no. 2, pp. 115-135, 2003.

### Mixture of Experts


Mixture of Experts is based on the "Divide and Conquer" principle. The problem is divided into manageable subproblems: each expert learns locally from one part of the problem domain, and the outputs of the experts are then combined to produce a global output.

Mixture of Experts is usually oriented to neural networks, each expert being a neural network that learns only from a part of the problem, with the outputs combined by human knowledge or by a gating network. But Mixture of Experts seems to be an abstract paradigm and could be applied with other classifiers.
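A minimal sketch of the idea, assuming two simple linear "experts" whose outputs are blended by a softmax gating function; all weights and data below are illustrative, not taken from the references:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract max for numerical stability
    return e / e.sum()

# Each expert is a linear model y = w . x, imagined as having been trained
# on its own region of the input space.
experts = [np.array([1.0, 0.0]),   # expert 1 responds to the first feature
           np.array([0.0, 1.0])]   # expert 2 responds to the second feature

# The gating network scores each expert given the input x.
gate_weights = np.array([[ 2.0, -2.0],   # score for expert 1
                         [-2.0,  2.0]])  # score for expert 2

def mixture_predict(x):
    g = softmax(gate_weights @ x)                 # gating probabilities
    outputs = np.array([w @ x for w in experts])  # local expert outputs
    return float(g @ outputs)                     # convex combination = global output

x = np.array([1.0, 0.2])
print(mixture_predict(x))  # dominated by expert 1, whose gate score is higher
```

The global output is always a convex combination of the experts' local outputs, so the gate effectively decides which region of the input space each expert is responsible for.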

Basic References
[1] J.-H. Oh and K. Kang, "Experts or an Ensemble? A Statistical Mechanics Perspective of Multiple Neural Network Approaches".
[2] M. I. Jordan and R. A. Jacobs, "Hierarchical Mixtures of Experts and the EM Algorithm", Neural Computation, vol. 6, pp. 181-214, 1994.

### Linear Transformation Methods


I'm interested in Linear Transformation Methods that allow us to transform an initial representation into another one whose components are, in some way, independent. Our work with FBL (a wrapper that improves Naive Bayes by deleting dependent attributes) tries to do something similar, and I would like to compare all those methods with FBL. The Linear Transformation Methods I've found are

1. Principal Component Analysis
2. Factor Analysis
3. Projection Pursuit
4. Independent Component Analysis
5. Independent Factor Analysis
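As a small illustration of the first method on the list, here is a hypothetical PCA sketch on synthetic 2-D data: the two raw features are strongly correlated, and projecting onto the principal components yields transformed features that are linearly uncorrelated.

```python
import numpy as np

# Synthetic data: the second feature is (almost) a multiple of the first.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 1))
data = np.hstack([x, 2 * x + 0.1 * rng.normal(size=(200, 1))])

centered = data - data.mean(axis=0)            # PCA requires centred data
U, s, Vt = np.linalg.svd(centered, full_matrices=False)
components = centered @ Vt.T                   # scores on the principal axes

print(np.cov(components.T))                    # off-diagonal entries ~ 0
```

Note that PCA only removes linear correlation between components; methods such as Independent Component Analysis aim at full statistical independence, which is a stronger property.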