6.2.1 Rationale for Cluster Analysis
The rationale is simple enough. If you recall Exercise 6.4 and the answers
given in Appendix 1.9, you’ll remember that several relationships between
constructs stood out in the grid on choosing a computer:
Easy to set up – Difficult to set up
Good build quality – Flimsy build
Fast – Slow performer
with the last construct being reversed with respect to the second. It appears
that there is a group of constructs which stands out as receiving somewhat
different ratings from the others.
Exercise 6.2 had you working out the relationships between elements for this
same grid, and if you look at the answers given in Appendix 1.7, you’ll see that
elements can group together in a similar way. The sums of differences between
the iMac G4 and the Ideal, between the eMac and the Ideal, and between the
iMac G4 and the eMac are all small, ranging between 4 and 10. They’re
noticeably smaller than some of the other sums of differences, so these three
elements appear to stand out as a distinct cluster. Elements can cluster
together as well as constructs.
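
If you'd like to see the arithmetic behind those sums of differences, here is a minimal Python sketch. The element names and ratings are invented purely for illustration (they are not the ratings from the computer grid), and the 5-point scale is an assumption.

```python
from itertools import combinations

# Hypothetical ratings on an assumed 1-5 scale: one list per element
# (a column of the grid), one entry per construct (a row of the grid).
# These numbers are made up and are NOT taken from the computer grid.
ratings = {
    "Element A": [1, 2, 1, 5, 2],
    "Element B": [2, 2, 1, 4, 3],
    "Element C": [5, 4, 5, 1, 4],
}


def sum_of_differences(col_a, col_b):
    """Add up the absolute rating differences, construct by construct."""
    return sum(abs(a - b) for a, b in zip(col_a, col_b))


# Work out the sum of differences for every pair of elements.
pair_sums = {
    (a, b): sum_of_differences(ratings[a], ratings[b])
    for a, b in combinations(ratings, 2)
}

# List the pairs from most alike (smallest sum) to least alike.
for (a, b), total in sorted(pair_sums.items(), key=lambda item: item[1]):
    print(f"{a} vs {b}: {total}")
```

In this made-up grid, Elements A and B have a much smaller sum of differences than either has with Element C, which is the kind of pattern that marks out a cluster.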
Sometimes the relationships are obvious as soon as you carry out an eyeball
inspection. Just look at the ratings in the iMac G4 column and the Ideal
column! If it weren’t for the fact that the iMac G4 didn’t have an
enormous range of software available, these two columns would be very
similar, and very different from the other columns of ratings.
It’s asking a lot of your powers of observation to get you to look for such
groupings directly. You go cross-eyed as you try to look at which columns are
similar to which and different from others; and then, for the constructs, which
rows form one pattern, with other rows forming other patterns. Wouldn’t it be
so much easier, and make the patterns more obvious, if you could shuffle the
columns and rows around, so that the most similar values lay side by side?
Suppose you were to pick up the paper on which a grid is printed, take a pair
of scissors, and cut the grid into strips, one strip for each vertical column. Then
shuffle the columns about until the columns with the most similar ratings lie
side by side. Just as I’ve done in Figure 6.1, in fact. That is, in effect, what a
cluster analysis does, except that it repeats the procedure for the constructs as
well, snipping the grid into rows while checking for reversals, and then
shuffling the rows around until the constructs with the most similar ratings lie
side by side. This whole procedure is illustrated in Figure 6.1.
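
The scissors-and-shuffle idea can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions, not the routine used by any grid package: the grid is invented, the scale is assumed to run from 1 to 5, and the names `shuffle_strips` and `reverse` are hypothetical. The ordering rule is a simple greedy one (start with the two most similar strips, then keep attaching the best-fitting leftover strip to one end), and each construct is checked for reversal against the first construct only.

```python
from itertools import combinations

SCALE_MIN, SCALE_MAX = 1, 5  # assumed 5-point rating scale


def sum_of_differences(a, b):
    """Sum of absolute rating differences between two strips."""
    return sum(abs(x - y) for x, y in zip(a, b))


def reverse(row):
    """Flip a construct's ratings end for end (1 <-> 5, 2 <-> 4)."""
    return [SCALE_MIN + SCALE_MAX - r for r in row]


def shuffle_strips(strips):
    """Greedy ordering: begin with the closest pair of strips, then keep
    attaching the best-fitting leftover strip to the left or right end."""
    names = list(strips)
    if len(names) < 3:
        return names

    def dist(p, q):
        return sum_of_differences(strips[p], strips[q])

    order = list(min(combinations(names, 2), key=lambda pair: dist(*pair)))
    remaining = [n for n in names if n not in order]
    while remaining:
        # Try every leftover strip at both ends; the smallest sum wins.
        _, name, at_left = min(
            (dist(n, end), n, at_left)
            for n in remaining
            for at_left, end in ((True, order[0]), (False, order[-1]))
        )
        if at_left:
            order.insert(0, name)
        else:
            order.append(name)
        remaining.remove(name)
    return order


# Invented grid: one row of ratings per construct, one column per element.
elements = ["A", "B", "C", "D"]
grid = {
    "Construct 1": [1, 5, 2, 4],
    "Construct 2": [2, 4, 2, 4],
    "Construct 3": [5, 1, 5, 2],
    "Construct 4": [3, 2, 3, 3],
}

# Cut the grid into vertical strips and shuffle the elements.
columns = {e: [row[i] for row in grid.values()] for i, e in enumerate(elements)}
column_order = shuffle_strips(columns)

# Check each construct for reversal against the first one, then shuffle rows.
reference = next(iter(grid.values()))
oriented, reversed_constructs = {}, []
for name, row in grid.items():
    flipped = reverse(row)
    if sum_of_differences(flipped, reference) < sum_of_differences(row, reference):
        oriented[name] = flipped
        reversed_constructs.append(name)
    else:
        oriented[name] = row
row_order = shuffle_strips(oriented)

# Print the focused grid with similar rows and columns side by side.
print(" " * 14, *column_order)
for name in row_order:
    cells = [oriented[name][elements.index(e)] for e in column_order]
    flag = " (reversed)" if name in reversed_constructs else ""
    print(f"{name:14}", *cells, flag)
```

Reversing a construct doesn't change the differences between element columns, so the columns can be shuffled before or after the reversal check. A real package will use a more careful procedure than this greedy one, so treat the sketch as a way of seeing the idea rather than as a substitute for the software.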
Let me offer you an analogy for the whole procedure. It’s as if you and your
interviewee had been looking at the original grid through a camera which was
slightly out of focus, so that the structure in what you were looking at wasn’t
entirely distinct; and then you adjusted the lens until the relationships you
were viewing sprang into focus. Indeed, the particular statistical procedure
developed by Laurie Thomas (Thomas, 1977) and Mildred Shaw (Shaw, 1988)
for their grid software, and adapted for subsequent analysis packages, was
first called ‘Focus’ for this very reason.
Cluster analysis starts by working out % similarity scores exactly as we did in
Sections 6.1.1 and 6.1.2 when we looked at simple relationships. The remaining
computations are too tedious to go into here. (Use a software package; most of
them include a grid cluster analysis routine.) The results are fairly obviously
related to the original grid and, especially valuable if you’re working with
someone in a client capacity, readily explained to the interviewee by pointing
to the results on paper.
Figure 6.1 Cluster analysis – focusing the picture
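
As a reminder of the starting point, here is a minimal sketch of a % similarity score in Python. It assumes the usual approach of dividing the sum of differences by the largest sum the rating scale allows and subtracting the resulting percentage from 100; check it against the procedure in Sections 6.1.1 and 6.1.2 before relying on it. The function names, the ratings and the 5-point scale are all illustrative assumptions.

```python
def percent_similarity(a, b, scale_min=1, scale_max=5):
    """100 minus the sum of differences, taken as a percentage of the
    largest sum of differences the scale allows (assumed formula)."""
    total = sum(abs(x - y) for x, y in zip(a, b))
    largest_possible = (scale_max - scale_min) * len(a)
    return 100 - 100 * total / largest_possible


def construct_similarity(a, b, scale_min=1, scale_max=5):
    """For constructs, also try one of them reversed and keep the higher
    score. Returns (score, was_reversed)."""
    as_rated = percent_similarity(a, b, scale_min, scale_max)
    b_reversed = [scale_min + scale_max - r for r in b]
    when_reversed = percent_similarity(a, b_reversed, scale_min, scale_max)
    if when_reversed > as_rated:
        return when_reversed, True
    return as_rated, False


# Two invented element columns that differ by one scale point in total.
element_score = percent_similarity([1, 2, 5, 3], [2, 2, 5, 3])

# Two invented construct rows that only match once the second is reversed.
construct_score, was_reversed = construct_similarity([1, 5, 2, 4], [5, 1, 5, 2])

print(element_score, construct_score, was_reversed)
```

Identical strips score 100 and strips at opposite ends of the scale score 0, which makes the scores easier to compare across grids of different sizes than raw sums of differences.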