## Cost Function

A cost function measures how far a learning algorithm's predictions are from the correct output values for the given inputs. In other words, it quantifies prediction error: the lower the cost, the better the predictions fit the data. The goal of any learning algorithm is to find the set of parameters that minimizes the cost function.

There are many different cost functions that can be used, depending on the specific problem at hand. Here are two examples:

Mean squared error (MSE): This cost function is commonly used in regression problems, where the goal is to predict a continuous value. MSE is calculated as the average of the squared differences between the predicted values and the true values. Mathematically, it is represented as:

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

where $y_i$ is the true value and $\hat{y}_i$ is the predicted value for the $i$th example, and $n$ is the total number of examples.
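The MSE formula above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function name `mean_squared_error` and the sample values are assumptions made up for the example.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between true and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must be non-empty")
    n = len(y_true)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / n


# Hypothetical data: three examples with errors of 1, 0 and -2.
y_true = [3.0, 5.0, 2.0]
y_pred = [2.0, 5.0, 4.0]
mse = mean_squared_error(y_true, y_pred)  # (1 + 0 + 4) / 3
print(mse)
```

In practice a vectorized library routine would normally be used instead; the loop form is shown only because it mirrors the formula term by term.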

Cross-entropy: This cost function is commonly used in classification problems, where the goal is to predict a class label. Cross-entropy is calculated as the average negative log-likelihood of the true class labels under the predicted probabilities. For binary classification, it is represented as:

$$\mathrm{CE} = -\frac{1}{n} \sum_{i=1}^{n} \left[ y_i \log(\hat{y}_i) + (1 - y_i) \log(1 - \hat{y}_i) \right]$$

where $y_i \in \{0, 1\}$ is the true class label and $\hat{y}_i$ is the predicted probability that the $i$th example belongs to class 1, and $n$ is the total number of examples.
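The cross-entropy formula above applies to the binary case, and a minimal sketch of it in plain Python follows. The function name `binary_cross_entropy`, the clipping constant `eps`, and the sample data are assumptions introduced for illustration; clipping is a common safeguard against `log(0)` and is not part of the formula itself.

```python
import math


def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Average negative log-likelihood of binary labels under predicted probabilities."""
    if len(y_true) != len(p_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(y_true, p_pred):
        # Clip the probability so that log() never receives exactly 0 (assumption, see lead-in).
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)


# Hypothetical data: true labels and the predicted probability of class 1.
labels = [1, 0, 1]
probs = [0.9, 0.1, 0.8]
bce = binary_cross_entropy(labels, probs)
print(bce)
```

Note that the second argument holds probabilities in the open interval (0, 1), not hard class labels.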

In both of these examples, the cost function is the average of a per-example error over the whole dataset. This is because the goal is to minimize the overall error of the algorithm, not just the error on a single example.

The MSE cost function is useful for regression problems because squaring makes the penalty grow quadratically with the size of the error, so large errors are penalized far more than small ones. Minimizing MSE therefore pushes the algorithm to shrink its largest errors first, which is often the most important factor in improving the predictions. The same property also makes MSE sensitive to outliers.
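A small numeric sketch may make the effect of squaring concrete. The error values below are made up, and mean absolute error (MAE) is brought in purely as a point of comparison; it is not discussed elsewhere in this section.

```python
# Hypothetical prediction errors: three small ones and one large one.
errors = [1.0, 1.0, 1.0, 10.0]

squared = [e ** 2 for e in errors]
mse = sum(squared) / len(errors)                  # (1 + 1 + 1 + 100) / 4
mae = sum(abs(e) for e in errors) / len(errors)   # (1 + 1 + 1 + 10) / 4

# Fraction of the total squared error contributed by the single large error.
share_of_largest = squared[-1] / sum(squared)     # 100 / 103

print(mse, mae, share_of_largest)
```

Here the single large error accounts for roughly 97% of the squared-error total, so reducing it lowers MSE far more than reducing any of the small errors would.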

The cross-entropy cost function is useful for classification problems because it measures the mismatch between the true class probabilities and the predicted class probabilities. It is not a distance in the strict sense, since it is not symmetric, but it is smallest when the two distributions agree. Minimizing it therefore pushes the predicted class probabilities as close as possible to the true ones, and it penalizes confident wrong predictions especially heavily.
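To see how the per-example cross-entropy behaves, the sketch below evaluates it for a single example whose true label is 1 at several predicted probabilities. The helper name `example_loss` and the probabilities chosen are assumptions for illustration only.

```python
import math


def example_loss(y, p):
    """Cross-entropy contribution of one example with label y and predicted probability p."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


# True label is 1; the predicted probability of class 1 gets steadily worse.
losses = {p: example_loss(1, p) for p in (0.9, 0.5, 0.1, 0.01)}

for p, loss in losses.items():
    print(f"p = {p:<4}  loss = {loss:.4f}")
```

The loss grows without bound as the predicted probability of the true class approaches zero, which is why a confidently wrong prediction costs much more than an uncertain one.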

In both cases, the cost function measures how well the learning algorithm is doing, and the goal is to find the set of parameters that minimizes it. This is typically done using gradient descent, an optimization algorithm that iteratively updates the parameters in the direction of the negative gradient of the cost function.
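As a rough sketch of how gradient descent minimizes a cost function, the example below fits a one-variable linear model `y = w * x + b` by repeatedly stepping against the gradient of the MSE. The model, the function name `gradient_descent`, the learning rate, the step count, and the data are all assumptions chosen for illustration.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by minimizing MSE with plain batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals (prediction minus target) under the current parameters.
        residuals = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of MSE with respect to w and b.
        grad_w = (2.0 / n) * sum(r * x for r, x in zip(residuals, xs))
        grad_b = (2.0 / n) * sum(residuals)
        # Step in the direction that decreases the cost.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical noise-free data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = gradient_descent(xs, ys)
print(w, b)
```

The learning rate matters: a value that is too large makes the updates diverge, while one that is too small makes convergence slow, so the value here should be treated as specific to this toy dataset.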

Overall, cost functions are an essential part of any learning algorithm: they quantify prediction error and guide the optimization process toward a good set of parameters. With a cost function that suits the problem, a learning algorithm can be trained to make accurate predictions and to improve as training proceeds.