We are going to work mainly with R, a free software environment for statistical computing and graphics. When the data have a symmetrical distribution without outliers, the sample mean and the sample median are very close, as the following example shows:
consume <- c(6.9, 6.3, 6.2, 6.5, 6.4, 6.8, 6.6)
data.frame(mean = mean(consume), median = median(consume))
      mean median
1 6.528571    6.5
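To see why the "without outliers" condition matters, here is a minimal sketch that adds one extreme observation to the same sample. The extra value 16.9 and the names consume_out and comparison are assumptions made only for illustration; they are not part of the original data.

```r
# Original sample from the text
consume <- c(6.9, 6.3, 6.2, 6.5, 6.4, 6.8, 6.6)

# Hypothetical outlier: 16.9 is an invented value, not an observed one
consume_out <- c(consume, 16.9)

# Compare mean and median before and after adding the outlier
comparison <- data.frame(
  sample = c("original", "with outlier"),
  mean   = c(mean(consume), mean(consume_out)),
  median = c(median(consume), median(consume_out))
)
print(comparison)
```

With this invented value the mean moves from about 6.53 to 7.825, while the median only moves from 6.5 to 6.55.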
However, when the distribution is asymmetrical, the mean and the median do not coincide:
- Right asymmetry (long tail to the right): the mean is usually greater than the median
- Left asymmetry (long tail to the left): the mean is usually less than the median
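Both cases can be checked with a small sketch. The two samples below are hypothetical, chosen only so that the direction of the tail is obvious; left_skewed is simply the mirror image of right_skewed.

```r
# Hypothetical right-skewed sample: most values are small, one is large
right_skewed <- c(1, 2, 2, 3, 3, 4, 15)

# Mirror image of the same sample, so the long tail is on the left
left_skewed <- 16 - right_skewed

# Mean and median for both samples, side by side
skew_check <- data.frame(
  sample = c("right", "left"),
  mean   = c(mean(right_skewed), mean(left_skewed)),
  median = c(median(right_skewed), median(left_skewed))
)
print(skew_check)
```

In the right-skewed sample the mean (about 4.29) is above the median (3); in the mirrored sample the mean (about 11.71) is below the median (13).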
Here is another example that illustrates this. We are going to use the following data in R:
salaries=c(903, 2684, 550, 1571, 1190, 857, 547, 2401, 1257, 411, 3500, 284, 7537, 1666, 604, 692, 450, 770, 3013, 566)
This vector contains the monthly salaries of 20 workers at one company. We can calculate the mean and the median, and plot both on a histogram, with these commands in R:
mean(salaries)    # 1572.65
median(salaries)  # 880
hist(salaries)
# vertical lines: mean in blue, median in red
abline(v = c(mean(salaries), median(salaries)), col = c("blue", "red"))
We get the results that you see in Fig. 1.
It is important to note that the mean salary can be misleading here, because 70% of the workers earn less than the mean:
mean(salaries<mean(salaries))
[1] 0.7
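The command above works because a comparison returns a logical vector, and R treats TRUE as 1 and FALSE as 0 when it computes the mean. The following sketch spells out the steps; the name below_mean is an assumption introduced only for this illustration.

```r
salaries <- c(903, 2684, 550, 1571, 1190, 857, 547, 2401, 1257, 411,
              3500, 284, 7537, 1666, 604, 692, 450, 770, 3013, 566)

# TRUE for every worker who earns less than the mean salary
below_mean <- salaries < mean(salaries)

sum(below_mean)    # number of workers below the mean: 14
mean(below_mean)   # proportion of workers below the mean: 0.7

# The same idea with the median gives one half of the workers
mean(salaries < median(salaries))   # 0.5
```

By construction, about half of the workers are below the median, which is why it describes a "typical" salary better than the mean in this data set.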
We can also use a dot chart, which is shown in Fig. 2.
The code is:
dotchart(salaries, pch = 16, xlab = "monthly salary")
abline(v = mean(salaries), col = "red", lwd = 2)
abline(v = median(salaries), col = "blue", lty = 2, lwd = 2)
legend("bottomright", c("mean", "median"),
       col = c("red", "blue"), lty = c(1, 2), lwd = c(2, 2), box.lty = 0, cex = 1.5)
Therefore, the mean and the median are not the same. You must be precise about which one you report and use them wisely. There are several kinds of mean in mathematics, especially in statistics, and for a given data set one or another may be more appropriate.
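As a rough illustration of "several kinds of mean", the sketch below computes four of them for the salary data. The choice of these particular means and the 10% trimming level are assumptions made for the example, not a recommendation from the text; only mean() with its trim argument is built into base R, and the geometric and harmonic means are written out by hand.

```r
salaries <- c(903, 2684, 550, 1571, 1190, 857, 547, 2401, 1257, 411,
              3500, 284, 7537, 1666, 604, 692, 450, 770, 3013, 566)

# Arithmetic mean: the usual average
arithmetic <- mean(salaries)

# 10% trimmed mean: drops the 2 lowest and 2 highest of the 20 values
trimmed <- mean(salaries, trim = 0.1)

# Geometric mean: only defined for strictly positive values
geometric <- exp(mean(log(salaries)))

# Harmonic mean: reciprocal of the mean of the reciprocals
harmonic <- 1 / mean(1 / salaries)

round(c(arithmetic = arithmetic, trimmed = trimmed,
        geometric = geometric, harmonic = harmonic), 2)
```

For this data set the trimmed mean is 1232.5625, noticeably closer to the median (880) than the arithmetic mean (1572.65), because the two highest salaries no longer enter the calculation.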