A neural network is a model loosely inspired by the way the human brain learns. It does this by assigning connection weights and a bias to each neuron in every layer.

In a neural network, information is passed to a layer, and the layer then computes the weighted sum of the…
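As a rough illustration of the weighted-sum step, here is a minimal sketch of one layer in plain Python. The sigmoid activation, the function names and all the numbers are assumptions made for the example; they are not taken from the article.

```python
import math

def sigmoid(z):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def dense_layer(inputs, weights, biases):
    """Compute the outputs of one layer.

    weights[j] holds the connection weights feeding neuron j,
    and biases[j] is that neuron's bias.
    """
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        # Weighted sum of the inputs, plus the neuron's bias.
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        # Assumed activation: sigmoid (any other activation could be used).
        outputs.append(sigmoid(z))
    return outputs

# Hypothetical numbers: 3 inputs feeding a layer of 2 neurons.
inputs = [0.5, -1.0, 2.0]
weights = [[0.1, 0.2, 0.3], [-0.4, 0.5, 0.6]]
biases = [0.0, 0.1]
layer_output = dense_layer(inputs, weights, biases)
```

Each entry of `layer_output` is one neuron's activation, which would be passed on as input to the next layer.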

Gradient descent is an algorithm that minimizes the error, or loss metric, in order to find the best possible set of parameters for your model. The technique is very flexible and has several hyperparameters that can be tuned for better optimization.
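To make the idea concrete, here is a minimal sketch of gradient descent on a one-dimensional loss. The loss function, the starting point and the hyperparameter values (`learning_rate`, `n_steps`) are assumptions chosen for illustration.

```python
def gradient_descent(grad, start, learning_rate=0.1, n_steps=100):
    """Repeatedly step against the gradient to reduce the loss."""
    x = start
    for _ in range(n_steps):
        # Move in the direction that decreases the loss.
        x = x - learning_rate * grad(x)
    return x

# Assumed toy loss: loss(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
# Its minimum is at x = 3.
minimum = gradient_descent(lambda x: 2 * (x - 3), start=0.0)
```

The learning rate and the number of steps are two of the hyperparameters one would typically tune: too large a rate can overshoot the minimum, too small a rate converges slowly.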

Gradient descent technique can be…

Introduction to Skewness

Skewness is a measure of the asymmetry of a data distribution.
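One common way to put a number on this is the moment coefficient of skewness, the third central moment divided by the cube of the standard deviation. The sketch below uses the population (biased) form, and the sample data is made up for illustration.

```python
def skewness(values):
    """Population moment coefficient of skewness: m3 / m2 ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    # Second and third central moments.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical samples.
symmetric = [1, 2, 3, 4, 5]            # mirror image around its mean
right_tailed = [1, 2, 2, 3, 3, 3, 20]  # one large value stretches the right tail

symmetric_skew = skewness(symmetric)
right_tailed_skew = skewness(right_tailed)
```

A symmetric sample gives a skewness of zero, a long right tail gives a positive value, and a long left tail gives a negative one.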

If the data is positively skewed, the distribution has a longer right tail: typically most values lie below the mean, while a few large values pull the mean upward.

If the data is negatively skewed, then we can interpret…

In this kind of regression, we use multiple features to predict a single outcome. In other words, a single dependent variable is explained by multiple independent variables.
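A minimal sketch of such a fit using NumPy's least squares solver is shown below. The synthetic data, the coefficient values and the variable names are assumptions made for the example.

```python
import numpy as np

# Synthetic data (assumed): two features and one outcome.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))
true_coefs = np.array([1.5, -2.0])
y = 4.0 + X @ true_coefs + rng.normal(scale=0.1, size=n)

# Prepend a column of ones so the model can learn an intercept.
X_design = np.column_stack([np.ones(n), X])

# Ordinary least squares: coefs = [intercept, slope_1, slope_2].
coefs, *_ = np.linalg.lstsq(X_design, y, rcond=None)
```

With this little noise, `coefs` should land close to the values used to generate the data (intercept 4.0, slopes 1.5 and -2.0).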

In this regression, we will use the Gauss-Markov setup, which has the following assumptions:

- Errors follow the normal distribution with mean…

A structural break is an unexpected change in the pattern of the data we are given to work with.

- Cyclic/seasonal — type of structural break where there are repeated patterns in the structural breaks
- Non Cyclic — type of structural break where there are no repeated patterns in the structural breaks

Multicollinearity is a condition in which two or more explanatory variables are correlated with one another, which can make the estimated coefficients unreliable and lead to misleading predictions.

Multicollinearity becomes an issue when the correlations between the columns change as conditions change.
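One widely used diagnostic, not named in the text above, is the variance inflation factor (VIF): regress each explanatory variable on the others and compute 1 / (1 - R²). The sketch below is a plain NumPy version on made-up data, and the helper name `vif` is hypothetical.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X."""
    n, k = X.shape
    factors = []
    for j in range(k):
        target = X[:, j]
        # Regress column j on an intercept plus all the other columns.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        fitted = others @ np.linalg.lstsq(others, target, rcond=None)[0]
        ss_res = np.sum((target - fitted) ** 2)
        ss_tot = np.sum((target - target.mean()) ** 2)
        r_squared = 1.0 - ss_res / ss_tot
        factors.append(1.0 / (1.0 - r_squared))
    return np.array(factors)

# Assumed synthetic data: x2 is nearly a copy of x1, x3 is unrelated.
rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = x1 + rng.normal(scale=0.1, size=500)
x3 = rng.normal(size=500)
vifs = vif(np.column_stack([x1, x2, x3]))
```

Here the first two columns should show very large factors while the third stays close to 1. A common rule of thumb treats values above roughly 5 to 10 as a warning sign.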

For example, let us take the scenario of the stock market before…

**F test**

This test checks whether all the coefficients of the regression are jointly equal to 0.

For this test, we define two models:

- Restricted model — In this model, the coefficients of all the explanatory variables are 0
- Unrestricted model — In this…
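As a rough sketch of how the two models are compared, the code below computes the F statistic from their residual sums of squares. It assumes the restricted model keeps only an intercept and the unrestricted model is the full regression with every explanatory variable. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def residual_sum_of_squares(X, y):
    """Fit y on X by least squares and return the sum of squared residuals."""
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.sum((y - X @ coefs) ** 2))

def overall_f_statistic(X, y):
    """F statistic for H0: all slope coefficients are 0.

    X holds the explanatory variables only (no intercept column).
    """
    n, k = X.shape
    intercept = np.ones((n, 1))
    # Restricted model (assumed): intercept only, all slopes forced to 0.
    rss_restricted = residual_sum_of_squares(intercept, y)
    # Unrestricted model (assumed): intercept plus all explanatory variables.
    rss_unrestricted = residual_sum_of_squares(np.hstack([intercept, X]), y)
    numerator = (rss_restricted - rss_unrestricted) / k
    denominator = rss_unrestricted / (n - k - 1)
    return numerator / denominator

# Assumed synthetic data: one outcome depends on X, the other does not.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y_related = 1.0 + X @ np.array([2.0, -1.0]) + rng.normal(size=100)
y_unrelated = rng.normal(size=100)
f_related = overall_f_statistic(X, y_related)
f_unrelated = overall_f_statistic(X, y_unrelated)
```

A large value such as `f_related` is evidence against the hypothesis that all coefficients are 0. To turn the statistic into a decision, compare it with the F distribution with k and n - k - 1 degrees of freedom.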

In regression, we fit a line through the data points to predict a continuous target variable from one or more independent explanatory variables.

Let the line be y = α + βx

We use the least squares approach. We square the error because:

- It makes all errors positive and eliminates nullification of positive and…
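For a single explanatory variable, minimizing the sum of squared errors has a closed-form solution: β is the sum of cross-deviations of x and y divided by the sum of squared deviations of x, and α is the mean of y minus β times the mean of x. A minimal sketch, with a hypothetical helper name and made-up points:

```python
def fit_line(xs, ys):
    """Least squares estimates (alpha, beta) for the line y = alpha + beta * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    beta = sxy / sxx
    alpha = y_mean - beta * x_mean
    return alpha, beta

# Hypothetical points that lie exactly on y = 1 + 2x.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
alpha, beta = fit_line(xs, ys)
```

Because the points lie exactly on a line, the fit recovers an intercept of 1 and a slope of 2. With noisy data the same formulas give the line with the smallest total squared error.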

Correlation is a measure of the strength of the relationship between two variables. The correlation coefficient is used in various statistical analyses and machine learning algorithms.
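As an illustration, the Pearson correlation coefficient can be computed directly from its definition: the covariance of the two variables divided by the product of their standard deviations. The data and names below are made up for the example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # The 1/n factors cancel, so plain sums of deviations are enough.
    cov = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    x_var = sum((x - x_mean) ** 2 for x in xs)
    y_var = sum((y - y_mean) ** 2 for y in ys)
    return cov / math.sqrt(x_var * y_var)

# Hypothetical data: hours studied and test scores.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]
r = pearson_r(hours, scores)
```

The result always lies between -1 and 1. Values near 1 or -1 indicate a strong linear relationship, and values near 0 indicate a weak one.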

For comparing two bivariate datasets, mean, median, mode, standard deviation and other measures of central tendencies could not be used as it was possible…

There is often debate about whether pandas or SQL is better for data analysis. Here we will generate random data and analyze it with both pandas and SQL to see which one works better for us.

In this case we shall be comparing both of them for ourselves by…
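As a small taste of what such a comparison can look like, the sketch below generates random rows and computes the same grouped average once with pandas and once with SQL through Python's built-in `sqlite3` module. The table name `sales`, the column names and the data are all assumptions made for the example.

```python
import random
import sqlite3

import pandas as pd

# Assumed random data: a category label and an integer value per row.
random.seed(0)
rows = [(random.choice(["A", "B", "C"]), random.randint(1, 100))
        for _ in range(1000)]

# pandas: group by category and average the values.
df = pd.DataFrame(rows, columns=["category", "value"])
pandas_means = df.groupby("category")["value"].mean().to_dict()

# SQL: the same aggregation on an in-memory SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (category TEXT, value INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
sql_means = dict(
    conn.execute("SELECT category, AVG(value) FROM sales GROUP BY category")
)
conn.close()
```

Both routes should produce the same averages per category, so the choice between them comes down to factors such as readability, data size and where the data already lives.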