If you are a regular Varsity reader, you have likely come across the Normal Distribution discussion in the Options Module. If you have not, I strongly recommend that you read that chapter on Normal distribution.
This is an important topic, and I recommend that you spend some time on it before you proceed. The Normal Distribution is used in both pair trading techniques (Mark Whistler's Pair Trading method being one of them), so it is important that you understand its central role.
I will reiterate the central theme of Normal distribution here. This should serve as a quick refresher if you are already familiar with it. For those who are not, I hope this does not discourage you from reading the Normal distribution chapter -
The theory of normal distribution that you need to know.
This image will help you to visualize the above (IMAGE 1)
There are many other ways data can be distributed, including uniform, binomial, and exponential distributions. This is just for your information.
We covered three fundamental statistical metrics, the Mean, Median, and Mode, in the previous chapter. We will now calculate these metrics on the pair data, i.e. the differential, spread, and ratio, which were computed in the preceding chapter. These calculations will be done using Excel functions.
Please note that I am continuing to work on the same Excel sheet we used in the previous chapters. You can download the updated Excel sheet from the link towards the end of this chapter.
The sheet is laid out as shown below.
These Excel functions can be used:
The numbers are as follows -
You will notice that the correlation numbers were calculated in Chapter 1.
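For readers who prefer code to spreadsheets, the mean, median, and mode calculations can be sketched in Python as well. This is only an illustration: the `differential` list below is made-up sample data, not the actual pair data from the Excel sheet, and the Excel function names in the comments are the usual counterparts rather than something confirmed by the sheet.

```python
import statistics

# Hypothetical differential values; the real series lives in the Excel sheet.
differential = [221.5, 230.0, 228.5, 235.0, 228.5, 240.0, 219.0]

mean_value = statistics.mean(differential)      # comparable to Excel's AVERAGE
median_value = statistics.median(differential)  # comparable to Excel's MEDIAN
mode_value = statistics.mode(differential)      # comparable to Excel's MODE

print(f"Mean:   {mean_value:.2f}")
print(f"Median: {median_value:.2f}")
print(f"Mode:   {mode_value:.2f}")
```

The same three calls can be repeated for the ratio and the spread series.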
We now have all the data. Next, we need to add one more variable, the standard deviation. Standard deviation has already been explained on Varsity, and I suggest you read that discussion to understand it better. Here is the summary -
Standard Deviation simply summarizes how far the data deviates from the average. Here is the textbook definition of SD: in statistics, the standard deviation (also represented by the Greek letter sigma, σ) is a measure used to quantify the amount of variation or dispersion in a set of data values.
In a sense, Standard Deviation provides us with a sense of the variability of the data. It also helps us to understand the spread of the data set. Let me explain this in relation to the pair data that we are working with.
This is the differential data that we calculated a while back.
There are 496 data points. Earlier in this chapter we also calculated the average of these data points, i.e., 228.52.
What if I asked you to help me understand how these data points vary from their average value? Better yet, ask yourself why you would need to understand the variability of these data points from their average value.
If we don't know how the data is distributed, we cannot make intelligent assessments of its behavior. Once we do know, we will be able to tell whether a new data point, say the 498th, lies within the expected range around the mean.
This is the core of pair trading.
This variation can be measured using the Standard Deviation.
Although I think the standard deviation is sufficient, traders may want to calculate an additional variable called the "Absolute Deviation". Both absolute deviation and standard deviation can help us understand the variability in the data. They differ in the way that data is handled.
I was trying to understand the difference between absolute and standard deviation when I came across an explanation on Investopedia. I found it quite interesting, so I am taking the liberty of reproducing the content here.
There are many ways to measure variability in a set of data. However, the two most common are the standard deviation and the average deviation. Although they are very similar, their calculation and interpretation differ in key ways. The finance industry is particularly concerned with determining range and volatility. Therefore, professionals in accounting, investing, and economics should be well-versed in both concepts.
The most common measure of variability, the standard deviation, is often used to calculate volatility in stock markets and other investments. To calculate the standard deviation, first determine the variance. You do this by subtracting the mean from each data point, then squaring, summing, and averaging the differences. Variance is in itself an excellent indicator of range and variability, since a larger variance indicates a wider spread in the data. The standard deviation is simply the square root of the variance. Squaring the differences between each point and the mean avoids the problem of negative differences for values below the mean, but it also means that the variance is not in the same unit as the original data. Taking the square root of the variance returns the measure to the original unit, making it easier to understand and use in subsequent calculations.
Another measure of variability is the average deviation, also known as the mean absolute deviation. To avoid the problem of negative differences between the data and the mean, the average deviation uses absolute values rather than squares. To calculate it, simply subtract the mean from each value, then sum and average the absolute values of the differences. The mean absolute deviation is used less often because absolute values make subsequent calculations more complicated and unwieldy than with the standard deviation.
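The two measures described above can be written compactly in symbols. The notation here is my own and does not come from the quoted text: N is the number of data points, x_i the i-th data point, and μ the mean. The variance divides by N (the population form), which matches the "average the differences" wording.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i
\qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2
\qquad
\sigma = \sqrt{\sigma^2}
\qquad
\text{AD} = \frac{1}{N}\sum_{i=1}^{N} \lvert x_i - \mu \rvert
```

Here σ² is the variance, σ the standard deviation, and AD the average (mean absolute) deviation.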
We'll compute "Standard Deviation" and "Absolute Deviation" using the three pair variables.
I am also rearranging the table so that Mean, Median, and Mode are on the Y-axis and Differential, Ratio, and Spread are on the X-axis. The snapshots below will therefore look slightly different from the ones above. I apologize for my poor data handling skills.
These variables can be calculated using the excel function -
Standard Deviation - '=STDEV.P()'
Absolute Deviation - '=AVEDEV()'
Together, the mean, median, mode, standard deviation, and absolute deviation are known as the basic descriptive statistics.
The standard deviation gives us an idea of the variation in the data. We will now take it a step further and try to quantify that variation. You may wonder why we should do this. It allows us to understand how far a given value lies from the mean. For example, we will know whether a value such as 275 is too high or too low relative to the mean.
This information in turn helps us decide whether to buy or short the pair. We will discuss those details later. For now, let's focus on quantifying the variation. To do this, we need to create a standard deviation table.
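As a rough cross-check outside Excel, the same two variables can be computed in Python. This is a minimal sketch under stated assumptions: the `spread` list is made-up sample data, and `average_deviation` is a hypothetical helper written here to mirror what Excel's AVEDEV computes. `statistics.pstdev` is the population standard deviation, the counterpart of STDEV.P.

```python
import statistics


def average_deviation(values):
    """Mean of the absolute deviations from the mean (as Excel's AVEDEV does)."""
    mean_value = statistics.mean(values)
    return sum(abs(v - mean_value) for v in values) / len(values)


# Hypothetical spread values, chosen only so the results are easy to verify.
spread = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

std_dev = statistics.pstdev(spread)   # population SD, like =STDEV.P()
abs_dev = average_deviation(spread)   # like =AVEDEV()

print(f"Standard deviation: {std_dev:.3f}")
print(f"Absolute deviation: {abs_dev:.3f}")
```

The absolute deviation comes out smaller than the standard deviation here because squaring gives extra weight to the points farthest from the mean.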
The table structure is as follows -
As you can see, we will now calculate the values that lie 1, 2, and 3 standard deviations above and below the mean for the spread, the differential, and the ratio.
Let's take, for example, the Spread data. The spread's mean is 0.064 and its standard deviation (SD) is 8.075.
The 1st SD above the mean would then be -
0.064 + 8.075 = 8.139
2nd SD -
0.064 + (2*8.075) = 16.214
3rd SD -
0.064 + (3*8.075) = 24.288
These are all values above the mean. You can find the values below the mean in the same way -
0.064 - 8.075 = -8.011
0.064 - (2*8.075) = -16.086
0.064 - (3*8.075) = -24.160
The mean and SD shown here are rounded, so the last decimal of a result can differ slightly from hand arithmetic.
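The arithmetic above can be tabulated in a few lines of Python. This is only a sketch: `sd_bands` is a hypothetical helper name, and the inputs are the rounded spread mean (0.064) and SD (8.075) quoted in the text, so the last decimal may differ slightly from the spreadsheet's figures.

```python
def sd_bands(mean_value, std_dev, levels=(1, 2, 3)):
    """Return {k: value} for k SDs above (+k) and below (-k) the mean."""
    bands = {}
    for k in levels:
        bands[k] = mean_value + k * std_dev
        bands[-k] = mean_value - k * std_dev
    return bands


# Rounded spread statistics taken from the text.
spread_bands = sd_bands(mean_value=0.064, std_dev=8.075)

# Print from +3 SD down to -3 SD.
for k in sorted(spread_bands, reverse=True):
    print(f"{k:+d} SD: {spread_bands[k]:.3f}")
```

Calling `sd_bands` again with the mean and SD of the ratio or the differential gives the remaining columns of the table.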
I have done the same math with Ratio and Differential. This is how the table looks.
The table shows that the 498th differential value is approximately at the +2 standard deviation level, around 315. Since about 95% of normally distributed values lie within 2 standard deviations of the mean, there is only about a 5% chance of a value falling outside that band, so a reading higher than 315 is unlikely.
At this point, we have all the information we need to assess the pair and identify whether a trading opportunity exists. We will do that in the next chapter. To make sure we are all on the same page, I will start the next chapter by briefly reviewing everything we have discussed so far.
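The probabilities behind statements like this can be checked with Python's standard library. The sketch below assumes the data is normally distributed, which is the working assumption of this chapter. `share_within` and `share_above` are hypothetical helper names introduced only for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k):
    """Probability that a value lies within k SDs of the mean (both sides)."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


def share_above(k):
    """Probability that a value lies more than k SDs above the mean."""
    return 1.0 - standard_normal.cdf(k)


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.4f}   above +{k} SD: {share_above(k):.4f}")
```

Under this assumption, roughly 95% of values fall within 2 SD of the mean. The remaining 5% or so is split between the two tails, so the chance of a value lying above +2 SD alone is about half of that.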