Naive Bayes Algorithm: An easy way to understand it like an expert

Bird’s eye view

In this article, you will learn the Naive Bayes algorithm the easy way, well enough to talk about it with anyone like an expert.

Algorithm

Naive Bayes is a classification algorithm built on Bayes' theorem, which describes how to find the probability of an event based on prior knowledge of conditions related to that event.
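For reference, Bayes' theorem can be written as follows. The symbols A (the event we care about) and B (the evidence we observed) are notation chosen here for illustration, not taken from the article.

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
```

Here P(A) is the prior knowledge about A, P(B | A) is how likely the evidence is when A holds, and P(A | B) is the updated probability of A once B has been seen.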

For example, an object may be considered a ball if it is round and about 5 inches in diameter. Both features help describe the ball, but the algorithm assumes that each one contributes independently to the probability that the object is a ball, regardless of the other features. That assumption is why we call it NAIVE.
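The independence assumption can be stated as a formula. In this sketch, c stands for a class (such as "ball") and x_1, ..., x_n for the observed features (such as "round" and "about 5 inches"); these symbols are assumptions introduced here for illustration.

```latex
P(c \mid x_1, \dots, x_n) \;\propto\; P(c) \prod_{i=1}^{n} P(x_i \mid c)
```

Because the features are treated as independent given the class, the joint likelihood breaks into a simple product of per-feature probabilities, and the predicted class is the one with the largest value of this product.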

How it works

All you have to do is convert your data into frequency counts for each class, compute the probability of each possible case from those counts, and then apply Bayes' theorem to make the prediction. It is as simple as that: you only have to apply the formula.
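The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the toy weather data, the function names `train` and `predict`, and the choice to skip smoothing for unseen values are all assumptions made here for the example.

```python
from collections import Counter, defaultdict


def train(rows, labels):
    """Turn the data into frequency counts per class."""
    class_counts = Counter(labels)
    # feature_counts[(feature_index, label)][value] -> number of occurrences
    feature_counts = defaultdict(Counter)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[(i, label)][value] += 1
    return class_counts, feature_counts


def predict(row, class_counts, feature_counts):
    """Apply Bayes' theorem with the naive independence assumption."""
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(class)
        for i, value in enumerate(row):
            # likelihood P(value | class), estimated from the counts
            score *= feature_counts[(i, label)][value] / count
        scores[label] = score
    norm = sum(scores.values())
    if norm == 0:
        # Every class scored zero (an unseen value and no smoothing).
        return scores
    # Normalise so the scores sum to 1 and can be read as probabilities.
    return {label: s / norm for label, s in scores.items()}


# Hypothetical toy data: (outlook, temperature) -> play outside?
rows = [
    ("sunny", "hot"),
    ("sunny", "mild"),
    ("rainy", "mild"),
    ("rainy", "hot"),
    ("sunny", "mild"),
    ("rainy", "mild"),
]
labels = ["no", "yes", "yes", "no", "yes", "no"]

class_counts, feature_counts = train(rows, labels)
posterior = predict(("sunny", "mild"), class_counts, feature_counts)
prediction = max(posterior, key=posterior.get)
print(prediction, posterior)
```

Training is just counting, and prediction is a product of ratios of those counts. A real implementation would usually add smoothing so that a feature value never seen with a class does not force that class's probability to zero.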

When you should use this

Suppose you have to classify data points into classes and you have a very large number of points. In that situation, Naive Bayes is a good first choice, because it is typically much faster to train than many other classification algorithms: training amounts to counting frequencies in the data.


Mindful Machines
