A Gentle Introduction to Logistic Regression With Maximum Likelihood Estimation
Logistic regression is a model for binary classification predictive modeling. The parameters of a logistic regression model can be estimated by the probabilistic framework called...
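To make the idea concrete, here is a minimal sketch (not the article's code) of estimating logistic regression coefficients by maximizing the likelihood, assuming a synthetic dataset and scipy's general-purpose optimizer:

    # Fit logistic regression by maximum likelihood: minimize the negative log-likelihood.
    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 2))
    true_w = np.array([2.0, -1.0])
    y = (rng.random(200) < 1.0 / (1.0 + np.exp(-(X @ true_w)))).astype(float)

    def neg_log_likelihood(w):
        z = X @ w
        # per-sample negative log-likelihood of the Bernoulli model: log(1 + exp(z)) - y*z
        return np.sum(np.logaddexp(0.0, z) - y * z)

    result = minimize(neg_log_likelihood, x0=np.zeros(2), method='BFGS')
    print('estimated coefficients:', result.x)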
Probabilistic Model Selection with AIC, BIC, and MDL
Model selection is the problem of choosing one from among a set of candidate models. It is common to choose a model that performs the best on a hold-out test dataset or to estimate...
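As a rough sketch of the scoring itself (the log-likelihood values below are hypothetical), AIC and BIC can be computed directly from a model's log-likelihood, its number of parameters, and the sample size:

    # AIC = 2k - 2*ln(L), BIC = k*ln(n) - 2*ln(L); lower scores are better.
    from math import log

    def aic(log_likelihood, k):
        return 2 * k - 2 * log_likelihood

    def bic(log_likelihood, k, n):
        return k * log(n) - 2 * log_likelihood

    n = 100  # hypothetical sample size
    candidates = {'model_a': (-120.0, 3), 'model_b': (-115.0, 7)}  # (log-likelihood, parameter count)
    for name, (ll, k) in candidates.items():
        print(name, 'AIC=%.1f' % aic(ll, k), 'BIC=%.1f' % bic(ll, k, n))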
A Gentle Introduction to Expectation-Maximization (EM Algorithm)
Maximum likelihood estimation is an approach to density estimation for a dataset by searching across probability distributions and their parameters. It is a general and effective...
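For a sense of the algorithm in practice, here is a small sketch (assuming a synthetic two-component sample) that uses scikit-learn's GaussianMixture, which fits mixture parameters via expectation-maximization:

    # Fit a two-component Gaussian mixture by EM and inspect the recovered parameters.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(7)
    sample = np.hstack([rng.normal(20, 5, 3000), rng.normal(40, 5, 7000)]).reshape(-1, 1)

    model = GaussianMixture(n_components=2, init_params='random', random_state=7)
    model.fit(sample)
    print('means:', model.means_.ravel())
    print('weights:', model.weights_)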
A Gentle Introduction to Monte Carlo Sampling for Probability
Monte Carlo methods are a class of techniques for randomly sampling a probability distribution. There are many problem domains where describing or estimating the probability...
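A minimal sketch of the idea, assuming we want the tail probability P(X > 3) for a standard Gaussian: draw random samples and count how often the event occurs, with the estimate improving as the sample size grows:

    # Monte Carlo estimate of P(X > 3) for X ~ N(0, 1).
    import numpy as np

    rng = np.random.default_rng(42)
    for n in (100, 10_000, 1_000_000):
        samples = rng.normal(loc=0.0, scale=1.0, size=n)
        print('n=%d  estimate=%.5f' % (n, np.mean(samples > 3.0)))
    # estimates approach the true value of roughly 0.00135 as n grows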
A Gentle Introduction to Markov Chain Monte Carlo for Probability
Probabilistic inference involves estimating an expected value or density using a probabilistic model. Often, directly inferring values is not tractable with probabilistic models, and...
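As an illustrative sketch (not the article's example), a tiny Metropolis sampler can draw from a target density known only up to a normalizing constant, which is the typical setting where direct sampling is intractable:

    # Random-walk Metropolis: accept a proposed move with probability min(1, target(new)/target(old)).
    import numpy as np

    rng = np.random.default_rng(0)

    def unnormalised_target(x):
        # a bumpy, two-mode density known only up to a constant
        return np.exp(-0.5 * (x - 2) ** 2) + 0.5 * np.exp(-0.5 * (x + 2) ** 2)

    samples, x = [], 0.0
    for _ in range(20_000):
        proposal = x + rng.normal(scale=1.0)
        if rng.random() < unnormalised_target(proposal) / unnormalised_target(x):
            x = proposal
        samples.append(x)

    print('approximate mean under the target:', np.mean(samples[1000:]))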
A Gentle Introduction to Maximum a Posteriori (MAP) for Machine Learning
Density estimation is the problem of estimating the probability distribution for a sample of observations from a problem domain. Typically, estimating the entire distribution is...
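As a small worked sketch with made-up numbers: for a Bernoulli success probability with a Beta prior, the MAP estimate has a closed form that can be compared directly with the maximum likelihood estimate:

    # MAP vs. maximum likelihood for a Bernoulli parameter with a Beta(alpha, beta) prior.
    k, n = 6, 10                # hypothetical data: 6 successes in 10 trials
    alpha, beta = 2.0, 2.0      # weak prior that favours values near 0.5

    mle = k / n
    map_estimate = (alpha + k - 1) / (alpha + beta + n - 2)  # mode of the Beta posterior
    print('MLE: %.3f  MAP: %.3f' % (mle, map_estimate))      # 0.600 vs. 0.583; the prior pulls toward 0.5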
14 Different Types of Learning in Machine Learning
Machine learning is a large field of study that overlaps with and inherits ideas from many related fields such as artificial intelligence. The focus of the field is learning, that...
How to Save a NumPy Array to File for Machine Learning
Developing machine learning models in Python often requires the use of NumPy arrays. NumPy arrays are efficient data structures for working with data in Python, and machine learning...
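A quick sketch of the common options, assuming a small in-memory array and write access to the working directory:

    # Save a NumPy array as CSV, as a binary .npy file, and as a compressed .npz archive.
    import numpy as np

    data = np.array([[1, 2, 3], [4, 5, 6]])
    np.savetxt('data.csv', data, delimiter=',')    # plain-text CSV
    np.save('data.npy', data)                      # binary, single array
    np.savez_compressed('data.npz', data=data)     # compressed archive of named arrays

    print(np.loadtxt('data.csv', delimiter=','))
    print(np.load('data.npy'))
    print(np.load('data.npz')['data'])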
How to Connect Model Input Data With Predictions for Machine Learning
Fitting a model to a training dataset is so easy today with libraries like scikit-learn. A model can be fit and evaluated on a dataset in just a few lines of code. It is so easy that...
What Does Stochastic Mean in Machine Learning?
The behavior and performance of many machine learning algorithms are referred to as stochastic. Stochastic refers to a variable process where the outcome involves some randomness and...
How to Save and Reuse Data Preparation Objects in Scikit-Learn
It is critical that any data preparation performed on a training dataset is also performed on a new dataset in the future. This may include a test dataset when evaluating a model or...
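A minimal sketch of one way to do this, assuming a scaler fit on synthetic training data and pickle for persistence:

    # Fit a scaler on training data, save it, and reload it later to transform new data identically.
    from pickle import dump, load
    from sklearn.datasets import make_blobs
    from sklearn.preprocessing import MinMaxScaler

    X, y = make_blobs(n_samples=100, centers=2, n_features=2, random_state=1)
    scaler = MinMaxScaler().fit(X)
    dump(scaler, open('scaler.pkl', 'wb'))

    # ... later, in another script ...
    scaler = load(open('scaler.pkl', 'rb'))
    print(scaler.transform(X[:5]))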
3 Ways to Encode Categorical Variables for Deep Learning
Machine learning and deep learning models, like those in Keras, require all input and output variables to be numeric. This means that if your data contains categorical data, you must...
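As a brief sketch of two of the encodings (a learned embedding layer would be the third option), using scikit-learn on a toy color variable:

    # Ordinal (integer) encoding and one-hot encoding of a single categorical column.
    import numpy as np
    from sklearn.preprocessing import OrdinalEncoder, OneHotEncoder

    colors = np.array([['red'], ['green'], ['blue'], ['green']])
    print(OrdinalEncoder().fit_transform(colors))             # one integer per category
    print(OneHotEncoder().fit_transform(colors).toarray())    # one binary column per category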
How to Perform Feature Selection with Categorical Data
Feature selection is the process of identifying and selecting a subset of input features that are most relevant to the target variable. Feature selection is often straightforward...
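One common approach is chi-squared scoring of encoded categorical inputs against a categorical target; the sketch below uses synthetic data shifted to be non-negative, as chi2 requires:

    # Keep the k best features by chi-squared score.
    from sklearn.datasets import make_classification
    from sklearn.feature_selection import SelectKBest, chi2

    X, y = make_classification(n_samples=200, n_features=8, n_informative=4, random_state=1)
    X = (X - X.min()).round()   # stand-in for ordinal-encoded, non-negative categories

    selector = SelectKBest(score_func=chi2, k=4).fit(X, y)
    print('scores:', selector.scores_.round(2))
    print('selected shape:', selector.transform(X).shape)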
How to Choose a Feature Selection Method For Machine Learning
Feature selection is the process of reducing the number of input variables when developing a predictive model. It is desirable to reduce the number of input variables to both reduce...
How to Use an Empirical Distribution Function in Python
An empirical distribution function provides a way to model and sample cumulative probabilities for a data sample that does not fit a standard probability distribution. As such, it is...
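A short sketch with a synthetic bimodal sample, using the ECDF class from statsmodels to query cumulative probabilities:

    # Build an empirical CDF from data and evaluate cumulative probabilities at chosen values.
    import numpy as np
    from statsmodels.distributions.empirical_distribution import ECDF

    rng = np.random.default_rng(3)
    sample = np.hstack([rng.normal(20, 5, 300), rng.normal(40, 5, 700)])  # bimodal sample

    ecdf = ECDF(sample)
    for value in (20, 30, 40):
        print('P(x <= %d) = %.3f' % (value, ecdf(value)))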
A Gentle Introduction to Model Selection for Machine Learning
Given easy-to-use machine learning libraries like scikit-learn and Keras, it is straightforward to fit many different machine learning models on a given predictive modeling dataset....
A Gentle Introduction to the Bayes Optimal Classifier
The Bayes Optimal Classifier is a probabilistic model that makes the most probable prediction for a new example. It is described using the Bayes Theorem that provides a principled...
How to Use Out-of-Fold Predictions in Machine Learning
Machine learning algorithms are typically evaluated using resampling techniques such as k-fold cross-validation. During the k-fold cross-validation process, predictions are made on...
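A compact sketch of collecting those predictions with scikit-learn, so every training row receives a prediction from a model that never saw it during fitting:

    # Gather out-of-fold predictions across 10 folds and score them against the known labels.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_predict
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score

    X, y = make_classification(n_samples=500, n_features=10, random_state=1)
    oof_pred = cross_val_predict(LogisticRegression(), X, y, cv=10)
    print('out-of-fold accuracy: %.3f' % accuracy_score(y, oof_pred))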
Develop an Intuition for Bayes Theorem With Worked Examples
Bayes Theorem provides a principled way for calculating a conditional probability. It is a deceptively simple calculation, providing a method that is easy to use for scenarios where...
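As one hypothetical worked example (the numbers are illustrative, not from the article), consider a diagnostic test and apply P(A|B) = P(B|A) * P(A) / P(B):

    # Probability of having a condition given a positive test result.
    p_disease = 0.01              # P(A): prior probability of the condition
    p_pos_given_disease = 0.95    # P(B|A): test sensitivity
    p_pos_given_healthy = 0.05    # false positive rate

    p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * (1 - p_disease)  # P(B)
    p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
    print('P(disease | positive test) = %.3f' % p_disease_given_pos)   # about 0.161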
How to Develop Super Learner Ensembles in Python
Selecting a machine learning algorithm for a predictive modeling problem involves evaluating many different models and model configurations using k-fold cross-validation. The super...
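A related, simpler sketch: scikit-learn's StackingClassifier, a close relative of the super learner, trains a meta-model on out-of-fold predictions from several base models (the models and data below are illustrative):

    # Stacking ensemble: base models' out-of-fold predictions feed a final logistic regression.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import StackingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import cross_val_score

    X, y = make_classification(n_samples=500, n_features=10, random_state=1)
    ensemble = StackingClassifier(
        estimators=[('knn', KNeighborsClassifier()), ('tree', DecisionTreeClassifier())],
        final_estimator=LogisticRegression(),
        cv=5,
    )
    print('mean accuracy: %.3f' % cross_val_score(ensemble, X, y, cv=5).mean())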