Residuals in tables

Once you generate a contingency table in the 'ANALYSE' – 'Statistics' – 'Crosstabs' tab, the Chi-squared value is displayed. Within the table, cells will be displayed in different colours, based on residuals.

Residuals provide a simple and effective way to analyse what is happening in the table. Unlike the Chi-squared value, which gives only an overall diagnosis of the association in the table, residuals show in which cells the association arises. Chi-squared can prove to be statistically significant because of a deviation in a single cell, but it does not tell us which cell that is.

A residual is a term used in connection with the analysis of nominal variables. A residual is simply the difference between the actual (observed) frequency of a given cell and its theoretical (expected) frequency, i.e. the frequency the cell would have if the two variables of the table were not related (the assumption of the null hypothesis). The theoretical frequency is calculated very simply as the product of the cell's two margins (its row total and its column total), divided by the total size of the table.
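As an illustration of the calculation just described, the following minimal sketch computes expected frequencies and basic residuals for a small table. The function names and the 2x2 counts are hypothetical and chosen only for this example; they are not taken from 1KA.

```python
def expected_frequencies(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def raw_residuals(table):
    """Observed minus expected count for every cell."""
    expected = expected_frequencies(table)
    return [[o - e for o, e in zip(obs_row, exp_row)]
            for obs_row, exp_row in zip(table, expected)]


# Hypothetical data: rows = gender (male, female),
# columns = opinion (in favour, against).
observed = [[30, 10],
            [10, 30]]

expected = expected_frequencies(observed)  # every cell: 40 * 40 / 80 = 20.0
residuals = raw_residuals(observed)        # [[10.0, -10.0], [-10.0, 10.0]]
```

In every row and every column the basic residuals sum to zero, because the expected frequencies reproduce the margins of the table.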

Under the standard assumption, the cell frequencies follow the Poisson distribution. If we standardise them (subtract the expected value, which gives the basic residual, and divide by the standard deviation, i.e. the square root of the expected value), we get standardised residuals, which are asymptotically normally distributed. Standardised residuals can therefore be interpreted with the usual logic of hypothesis testing and the usual two-sided critical values, i.e. 1.65 or 1.96 at 10% or 5% risk.
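The link between critical values and risk levels can be checked with a short sketch. It assumes a two-sided test against the standard normal distribution; the function name `two_sided_p` is hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided p-value of a standard normal statistic z."""
    # 2 * (1 - Phi(|z|)) written with the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2))


for z in (1.65, 1.96, 2.0, 3.0):
    print(z, round(two_sided_p(z), 4))
```

The printed values show that 1.65 and 1.96 correspond to roughly 10% and 5% risk, and that a residual of 3.0 lies well below the 1% level.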

Adjusted residuals additionally correct for unequal margin sizes. Some researchers have shown that they are more suitable than conventional standardised residuals; they are therefore our recommendation and are the ones used in the analysis.
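A minimal sketch of both kinds of residual follows. It assumes the commonly used (Haberman) form of the adjusted residual, in which the standardised residual is divided by the square root of (1 - row share) * (1 - column share); whether 1KA uses exactly this formula is an assumption here, and the function name and counts are hypothetical.

```python
import math


def standardised_and_adjusted(table):
    """Return (standardised, adjusted) residual tables for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    standardised, adjusted = [], []
    for i, row in enumerate(table):
        std_row, adj_row = [], []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            # Standardised residual: basic residual over the Poisson standard deviation.
            std = (observed - expected) / math.sqrt(expected)
            # Margin correction (assumed Haberman form): shares of units
            # that lie outside row i and outside column j.
            correction = (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            std_row.append(std)
            adj_row.append(std / math.sqrt(correction))
        standardised.append(std_row)
        adjusted.append(adj_row)
    return standardised, adjusted


# Hypothetical 2x2 counts, e.g. gender by opinion.
observed = [[30, 10],
            [10, 30]]
standardised, adjusted = standardised_and_adjusted(observed)
print(round(standardised[0][0], 2))  # 2.24
print(round(adjusted[0][0], 2))      # 4.47
```

Because the correction factor never exceeds 1, the adjusted residual under this formula is never smaller in absolute value than the standardised one. The sketch does not guard against empty rows or columns, where the expected value would be zero.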

The 1KA application uses the thresholds 1.0, 2.0 and 3.0 for the absolute values of adjusted residuals and colours the cells accordingly. These thresholds roughly signal the strength of the association in a particular cell, i.e. the strength of the deviation from the assumptions of the null hypothesis. The values are interpreted in the same way as for standardised residuals:

• above 1.0 signals a slight deviation that deserves attention,
• above 2.0 (a simplification of the value 1.96) implies a statistically significant difference (p < 0.05), so the residual differs from zero with a relatively low risk of error,
• above 3.0 constitutes a strong deviation (p < 0.01), which means that the residual is almost certainly different from zero and, therefore, there is something "going on" in the cell.

Blue cells contain fewer units than expected, while red cells contain more units than expected.
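The colouring rule can be sketched as a small function. This is only an illustration of the thresholds and colours described above; the function name, the returned labels and the numeric levels are hypothetical, and how 1KA actually shades each level is not specified here.

```python
def classify_cell(adjusted_residual):
    """Return (colour, level) for one adjusted residual.

    level 0 means the cell is not highlighted; levels 1 to 3 mean that
    the thresholds 1.0, 2.0 and 3.0 were exceeded in absolute value.
    """
    size = abs(adjusted_residual)
    level = sum(size > threshold for threshold in (1.0, 2.0, 3.0))
    if level == 0:
        return ("none", 0)
    # Positive residual: more units than expected (red);
    # negative residual: fewer units than expected (blue).
    colour = "red" if adjusted_residual > 0 else "blue"
    return (colour, level)


print(classify_cell(2.24))   # ('red', 2)
print(classify_cell(-3.5))   # ('blue', 3)
print(classify_cell(0.4))    # ('none', 0)
```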

For example, if the cell contains 30 units and the expected value is 20, the basic residual is 10. Thus there are 10 more units in this cell than would have been expected if the two variables were not related. If we are looking at gender and opinion, we could say that more men are IN FAVOUR than would be expected if gender had no effect. If we divide the residual of 10 by the square root of the expected value (the root of 20 is about 4.47, since in the Poisson distribution the variance equals the expected value), we get a standardised residual larger than 2: (30 - 20) / 4.47 ≈ 2.24 > 2.0.

If we correct this on the basis of the formulas found in the annexes below, we get an adjusted residual. With the usual formula the correction divides by a factor no greater than 1, so the adjusted residual is at least as large in absolute value as the standardised one; the two are close when each category (YES: NO, male: female) holds only a small share of the units. A detailed example of calculating residuals is found here >>. In any case, we can conclude that in this cell there are statistically significant deviations, and on this basis we can form the substantive interpretation (e.g. reasons why men are more IN FAVOUR).

The colouring in 1KA is indicative and simplified; it serves only as a screening (exploratory) analysis. For a formal interpretation we take the exact value of the standardised or, even better, the adjusted residual and interpret it in the standard way, as indicated in the examples below.

The exact values of the residuals are obtained by enabling their calculation under the 'Settings' option, which is one of the horizontal links above the table.

We can of course interpret the entire table and its Chi-squared value. However, residuals are more informative than the overall Chi-squared value, since they point precisely to the individual cells where deviations occur. Further insight is obtained by testing the difference in shares (proportions) with the t-test.

Of course, all of this is only valid for nominal variables. If one of the variables has a 'good' ordinal arrangement, and even more so if it is measured on an unequivocal interval or ratio scale, we use the t-test or analysis of variance.