In this post we will look at logistic regression. It continues from the linear regression post we saw before, so make sure to read that one in detail before reading this.
Previously we saw what linear regression is, i.e. predicting an outcome based on input data. Logistic regression is used when we want the probability of an event happening, i.e. a yes/no outcome. For example: will it rain today or not? Is an email spam or not?
To calculate a probability, our model needs an output between 0 and 1. This is achieved using the sigmoid function,
i.e. 1 / (1 + e^(-value)), which plots as an S-shaped curve.
Whatever the input, negative or positive, the output will always be between 0 and 1.
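To see this behaviour in numbers, here is a minimal Python sketch of the sigmoid function. The function name `sigmoid` and the sample inputs are just illustrative choices, not something from a particular library.

```python
import math

def sigmoid(value):
    """Map any real number to a value strictly between 0 and 1."""
    if value >= 0:
        return 1.0 / (1.0 + math.exp(-value))
    # For negative inputs use the equivalent form e^v / (1 + e^v),
    # which avoids overflow in exp() for very negative values.
    e = math.exp(value)
    return e / (1.0 + e)

# Sample inputs chosen only for illustration
for v in (-5, -1, 0, 1, 5):
    print(v, round(sigmoid(v), 4))
```

Large negative inputs give values close to 0, large positive inputs give values close to 1, and an input of 0 gives exactly 0.5.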
Based on this, our regression model becomes:
y = e^(b0 + b1*x) / (1 + e^(b0 + b1*x))
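The formula above is the same sigmoid, applied to the linear part b0 + b1*x. A small sketch in Python shows that both ways of writing it give the same answer. The coefficient values b0 = -3.0 and b1 = 0.5 are made up for illustration and are not fitted to any data.

```python
import math

def predict_probability(x, b0, b1):
    """Sigmoid form: 1 / (1 + e^-(b0 + b1*x))."""
    z = b0 + b1 * x
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability_alt(x, b0, b1):
    """Form used in the post: e^(b0 + b1*x) / (1 + e^(b0 + b1*x))."""
    z = b0 + b1 * x
    return math.exp(z) / (1.0 + math.exp(z))

# Hypothetical coefficients, chosen only for this example
b0, b1 = -3.0, 0.5

for x in (0, 4, 6, 10):
    print(x, round(predict_probability(x, b0, b1), 4),
          round(predict_probability_alt(x, b0, b1), 4))
```

With these made-up coefficients the probability crosses 0.5 at x = 6, because that is where b0 + b1*x equals 0.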
Fitting the model to data follows steps broadly similar to linear regression: we look for the values of b0 and b1 that best match the data. For a more detailed explanation, read https://machinelearningmastery.com/logistic-regression-for-machine-learning/
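As a rough illustration of what fitting can look like, here is a sketch that finds b0 and b1 using plain gradient descent on the log loss. This is one common approach and an assumption on my part; the post does not say which fitting method is used, and real libraries usually use more refined optimizers. The toy data, the function name `fit_logistic`, the learning rate, and the number of epochs are all made up for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Estimate b0 and b1 by gradient descent on the average log loss."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_b0 = 0.0
        grad_b1 = 0.0
        for x, y in zip(xs, ys):
            # Difference between predicted probability and the actual 0/1 label
            error = sigmoid(b0 + b1 * x) - y
            grad_b0 += error
            grad_b1 += error * x
        # Step against the average gradient
        b0 -= learning_rate * grad_b0 / n
        b1 -= learning_rate * grad_b1 / n
    return b0, b1

# Made-up toy data: x is the input, y is 1 if the event happened, else 0
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(xs, ys)
print("b0 =", round(b0, 3), "b1 =", round(b1, 3))
print("P(event | x=2) =", round(sigmoid(b0 + b1 * 2), 3))
print("P(event | x=7) =", round(sigmoid(b0 + b1 * 7), 3))
```

The toy labels overlap on purpose (a 1 at x = 4 and a 0 at x = 5), so the coefficients settle at finite values instead of growing without bound.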
In the end, the output we get will be between 0 and 1, let's say 0.3. This means there is a 30% chance of the event happening.