7.2.1 Bootstrapping Techniques
By this is meant any approach in which a category system is developed on the
hoof, in the course of categorising the items being dealt with. Holsti’s original
account (Holsti, 1968) is definitive and should be referred to for background
information about content analysis in general social research; Neuendorf’s
work (2002) looks set to replace Holsti in due course. The following core
procedure summarises the basics in as much detail as you require.
Firstly, a decision is taken on what constitutes the content unit, or basic unit of
analysis. What’s the basic unit being categorised: how large or small should
the basic idea be? And, less obviously but equally important, the context unit:
where should the basic idea be located? Should we scan the phrase, the
sentence, or the paragraph in a search for it? Should we stick to the same
context unit, such as the sentence, throughout, or can we use the sentence at
times, and a whole paragraph at other times (‘This sentence is basically about
X’; ‘These next few sentences are basically padding, but I can see that the
whole paragraph is about Y’)?
You’ll be relieved to hear that when you content-analyse a repertory grid,
these questions have already been resolved for you during construct
elicitation. Each construct is your basic unit of analysis and, in Holsti’s
terms, the construct is both the content unit and the context unit. In other
words, each and every construct is regarded as expressing a single unit of
meaning. Of course!
The Core-Categorisation Procedure
Each item being categorised is compared with the others, and then:
(1) If an item is in some way like the first item, the two are placed together
under a single category created for them at that very moment.
(2) If an item is different to the first item, they’re put into separate
categories.
(3) The remaining items are compared with each of the categories and
allocated to the appropriate one if an appropriate category exists.
(4) A new category is created if required; when a new category is created,
the possibility that existing categories need to be redefined (combined
or broken up, with their items reallocated accordingly) is considered
and effected as necessary.
(5) This process continues until all the items have been classified.
(6) However, a small number are usually unclassifiable without creating
categories of just one item, which would be pointless, so all unclassifiable
items are placed in a single category labelled ‘miscellaneous’.
(7) If more than 5% of the total are classified as miscellaneous, consider
redefining one or more existing categories so that, at the end, no more
than 5% of the total are categorised as ‘miscellaneous’.
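If it helps to see the shape of this procedure at a glance, here is a minimal sketch of steps 1 to 7 in Python. It is only an illustration of the logic: the similarity judgement is yours to make as you read each item, so the looks_similar() function, like the other names used here, is a hypothetical placeholder rather than part of the method.

```python
def categorise(items, looks_similar):
    """A sketch of core-categorisation steps 1 to 7. The real judgement of
    similarity is a human one; looks_similar(item, members) stands in for it."""
    categories = []                       # each category is a list of items

    for item in items:                    # steps 1 to 5: allocate or create
        for members in categories:
            if looks_similar(item, members):
                members.append(item)      # allocate to an existing category
                break
        else:                             # no appropriate category exists...
            categories.append([item])     # ...so create one (step 4); this is
                                          # also the moment to reconsider and
                                          # redefine existing categories

    # step 6: single-item categories are pointless, so pool them
    misc = [members[0] for members in categories if len(members) == 1]
    categories = [members for members in categories if len(members) > 1]
    if misc:
        categories.append(misc)           # the 'miscellaneous' category

    # step 7: the 5% rule of thumb
    if len(misc) > 0.05 * len(items):
        print("Over 5% miscellaneous: consider redefining some categories.")
    return categories
```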
Now, with one proviso, this caters for our first need. By recognising similarities
and dissimilarities in the interviewees’ items as they specified them, we’ve
aggregated the meanings in the whole sample while classifying some
individual meanings as similar, and others as different. We’ve accommodated
differences in meaning and intention on the part of individual interviewees.
(We have, however, lost a lot of information, for none of the ratings have been
taken into account. We’ll see that Section 7.3 deals with this difficulty.)
The proviso is obvious. The categories we have devised are simply our own
opinion: your category system is just your own way of construing your
interviewees’ constructs, and other people might not see the same kinds of
meaning in them. They might disagree! Yet the whole point of this
chapter, you’ll recall, is to make statements which communicate meanings
effectively to other people. Something’s wrong if we can’t agree on the
category to which a particular construct belongs.
To guard against this problem, every content analysis needs to incorporate a
reliability check. This is a procedure which ensures that (though one has every
right to a private interpretation – remember Section 5.3?) the category system
isn’t wildly idiosyncratic, so that it makes sense to other people. There’s
more on this later, in the section on reliability.
Don’t forget that there are several different forms of reliability. Hill (1995: 105–106)
reminds us that in content-analysis terms, three kinds exist. Firstly, there is stability,
the extent to which the results of a content analysis are invariant over time. Are your
category definitions robust enough that, if you were to repeat the core procedure all
over again, you would end up with the same categories, and with the same
constructs under each category? After all, you’re supposed to be recognising similarities
in meaning in a set of constructs. So long as you’re working with that particular
set, the meaning you recognise should be the same on both occasions.
Secondly, there is reproducibility, the extent to which other people make the same
sense of the constructs as you do. If meaning is socially defined, and if you are to avoid
laying your own idiosyncrasies onto the data, your content analysis needs to be reproducible.
Finally, there is sheer accuracy. How consistently are you applying your category
definitions, once you have fixed them as a standard to aim at?
In practice, it is sometimes difficult to distinguish between these three sources of
unreliability; however, you will notice as you use the procedures outlined below that
all three confront you as you make decisions about category definitions, and about
the allocation of constructs to categories. The procedures described have been
devised to reduce unreliability under all three headings.
This is all a little abstract. Let’s pin it all down. Firstly, what are the items we’re
talking about? In fact, the generic procedure can be applied to either elements
or constructs, though usually it’s the latter. Thus, a discussion of the elements
provided by all the interviewees in a group, getting the group members to
categorise them themselves by means of the procedure outlined above, is often
a good way of getting people to examine, debate, and challenge their ideas
about the topic, particularly in a training setting (see Section 9.2.1, step 2 of the
partnering procedure outlined there, for an example of this activity in a
personal change setting).
For general research purposes, though, it’s the constructs which are the items
being categorised, and the remainder of this chapter will deal with construct
content analysis only. First, let’s look at the generic procedure, and then deal
with the matter of reliability.
The Generic Content-Analysis Procedure
(1) Identify the categories.
(2) Allocate the constructs to the categories following the core procedural
steps 1 to 7 above. You’ll notice that this results in a set of categories which are
mutually exclusive, and completely exhaustive: all your constructs are
accounted for. A convenient way of doing this is to transcribe each
construct from all the grids onto its own file card, coding the card to
identify which interviewee provided the individual construct, and which of
his/her constructs it is, in order of appearance in that interviewee’s grid.
(Thus, the code 5.3 would indicate that the construct in question was the third
construct in the fifth interviewee’s grid.)
Now go through steps 1 to 7 above, placing the cards into heaps, each heap
constituting a different category. If you lay all the cards out on a large table,
you can see what you’re doing, and shuffle cards around as you identify
categories, allocate cards to them, change your mind and create new
categories, and so on.
(3) Tabulate the result. In other words, record which constructs have been
allocated to which categories. On a large sheet of paper (flip-chart paper,
especially if it has been ruled as graph paper, is ideal), create a set of rows, one
for each category. Create a column on the left, and in it, label each row with its
category name. Now create a new column and use it to record a short
definition of that category. In a third column, record the code numbers of all
the constructs that you allocated to that category.
(4) Establish the reliability of the category system (ignore this for the
moment; see the discussion below).
(5) Summarise the table; first, the meaning of the category headings. What
kinds of categories are these? What sorts of constructs have we here? Use the
column 2 information to report on the distinct meanings available in the whole
set of constructs.
(6) Summarise the table: next, find examples of each category heading. Are
there constructs under each category which stand for or exemplify that
category particularly well? Are there perhaps several such constructs, each
summarising a different aspect of the category? Highlight the code numbers of
these constructs among the list in column 3. You’ll want to remember these
and quote them in any presentation or report that you make, since they help
other people to understand the definitions you have proposed in step 4.
(7) Summarise the table; finally, find the frequency under the category
headings. In a fourth column, report the number of constructs in each
category. Which categories have more constructs and which have fewer? Is
this significant, given the topic of the grid? For reporting purposes, when you
have to list the categories, consider reordering them according to the number
of constructs allocated to them.
Table 7.1 provides an example, taken from a study I once did of fraud and
security issues in the Benefits Agency (Jankowicz, 1996). For the moment,
ignore the two rightmost columns (the ones headed ‘Prov.’ and ‘Met.’). Notice
how, in step 3, I’ve expressed the definitions of each category as bipolar
constructs (in the second column). Next, the codes which stand for each
construct are listed; I haven’t listed them all in this example since there isn’t
the space on the page! The ‘Sum’ column shows how many constructs were
categorised under each heading and, in brackets, the percentage of all the
constructs that this figure represents (for example, in the first row, 57
constructs, this being 19.1% of all the constructs, came under the first
category). This table is in fact a summary, and was accompanied by a set of
tables which listed the constructs themselves, as well as their codes. You might
argue that it’s the constructs themselves, and not the code numbers standing
for them, that matter, and you’d be perfectly right. However, a table like
Table 7.1 provides a useful summary, and the third column is, in fact, a
necessary discipline in checking the reliability of the content-analysis process
(see steps (4.1) to (4.7) below), so you might as well take this column seriously!
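If you keep this table on a computer rather than on flip-chart paper, a very simple data structure will do. Here is a minimal sketch in Python, populated from the first row of Table 7.1 (only the first few construct codes are shown, and the class and field names are mine, not part of the procedure):

```python
from dataclasses import dataclass, field

@dataclass
class Category:
    name: str                                         # column 1 of the table
    definition: str                                   # column 2: the bipolar definition
    constructs: list = field(default_factory=list)    # column 3: codes such as '2.1'

    @property
    def total(self) -> int:                           # the 'Sum' figure of step 7
        return len(self.constructs)

deliberateness = Category(
    name="Deliberateness of action and intent",
    definition=("Knowing what's right and ignoring it; lawbreaking; deliberate "
                "fraud; errors of commission versus bypassing procedures; making "
                "technical errors; mistakes and errors of omission"),
    constructs=["2.1", "17.1", "18.2", "35.1"],       # etc.: 57 codes in the full study
)

# The full category held 57 of the study's 298 constructs:
print(round(100 * 57 / 298, 1))                       # 19.1, the % shown in Table 7.1
```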
Design Issues: Differential Analysis
If you have some hunch or hypothesis which you want to check by your
analysis, you need to create additional columns in Table 7.1 at step 7, one for
each subgroup into which you have divided the sample. You will then count
the number of constructs under each category, subgroup by subgroup, thereby
carrying out a ‘differential analysis’. This is very straightforward. It simply
reports whether the constructs from the members of one subgroup are
distributed differently across the categories than the constructs from other
subgroups, working with the percentage figures in each category. (Where the
total number of constructs provided by each subgroup varies, you’ll need to
change all the figures into percentages of each subgroup’s total, so that you
can compare between subgroups.)
For example, do younger interviewees think differently to the older ones (in
terms of the percentage of their constructs they allocated to particular
categories)? Does a sales force think differently about the price discounts
available to them than the sales office staff who set the discounts but never
meet real clients?
Table 7.1 Content-analysis procedure, Benefits Agency example

(For each category: Sum = the number of constructs in that category, with the percentage of all 298 constructs in brackets; Prov. and Met. = the corresponding counts and percentages for provincial and metropolitan employees respectively.)

Deliberateness of action and intent
Definition: Knowing what’s right and ignoring it; lawbreaking; deliberate fraud; errors of commission versus bypassing procedures; making technical errors; mistakes and errors of omission
Constructs: 2.1, 17.1, 18.2, 35.1, etc.
Sum: 57 (19.1%)   Prov.: 47 (21.1%)   Met.: 10 (13.3%)

Friendship and other external pressures
Definition: Divulging information to a 3rd party; collusion versus acting alone; no 3rd party involved
Constructs: 46.1, 31.1, etc.
Sum: 39 (13.3%)   Prov.: 29 (13.0%)   Met.: 10 (13.3%)

Pressure of work
Definition: Shortcuts to gain time or ease workflow; pressure of targets; negligence versus reasons other than workflow; breaches of confidence; deliberate wrongdoing
Constructs: 16.1, 44.1, 13.1, etc.
Sum: 34 (11.4%)   Prov.: 26 (11.7%)   Met.: 8 (10.7%)

Internal versus external involvement
Definition: Staff member is the agent in doing/condoning the offence versus claimant the agent in the offence
Constructs: 1.2, 17.2, 35.2, etc.
Sum: 33 (11.1%)   Prov.: 27 (12.1%)   Met.: 6 (8.0%)

Risk, proof and obviousness
Definition: Definite and easy to prove; clear feeling there’s something wrong; rules clearly broken versus unsure if fraud occurred; no rules broken
Constructs: 1.1, 4.5, 34.5, etc.
Sum: 32 (10.7%)   Prov.: 25 (11.2%)   Met.: 7 (9.3%)

Systems and security procedures
Definition: Using information improperly; cavalier attitude to checking; misuse of IT procedures versus accidental outcomes of IT system; inadequate clerical procedures
Constructs: 3.1, 12.2, 39.3, etc.
Sum: 31 (10.4%)   Prov.: 21 (9.4%)   Met.: 10 (13.3%)

Who gains
Definition: Employee reaps the benefit of fraud; less personal need or motive versus claimant reaps the benefit; personal problems provide a motive
Constructs: 10.5, 25.2, 47.6, etc.
Sum: 27 (9.1%)   Prov.: 20 (8.9%)   Met.: 7 (9.3%)

Money versus information
Definition: Personal cash gains; clear overpayments versus provision of information
Constructs: 4.1, 11.4, etc.
Sum: 21 (7.1%)   Prov.: 14 (6.3%)   Met.: 7 (9.3%)

Outcomes
Definition: Severe consequences or repercussions versus fewer/less severe consequences
Constructs: 33.5, 44.5, etc.
Sum: 8 (2.7%)   Prov.: 3 (1.3%)   Met.: 5 (6.7%)

Training issues
Definition: Not preventable by training versus preventable by improved training
Constructs: 7.7, 36.3, etc.
Sum: 4 (1.3%)   Prov.: 2 (0.9%)   Met.: 2 (2.7%)

Where it happens
Definition: Occurs in the agency office versus occurs in claimant’s home or similar
Constructs: 7.4, 9.6, etc.
Sum: 3 (1.3%)   Prov.: 0 (0%)   Met.: 3 (1.3%)

Miscellaneous
Constructs: 3.6, etc.
Sum: 9 (3.0%)   Prov.: 6 (2.7%)   Met.: 3 (4.0%)

Totals: 298 (100.2%)   Prov.: 223 (99.9%)   Met.: 75 (99.9%)

Source: Reproduced from the Analytical Services Division, Department of Social Security.
Take a third example. Perhaps, as a manager in a department of a municipal
administration, you suspect that clerical officers who have had private-sector
experience before deciding to work in local government have a systematically
different way of thinking about their jobs than clerical officers who have
always worked in local government. You feel there may be two distinct types
of clerical officer, and if this is the case, the implications (to do with their
attitude to the public as customers; to issues of supervision; and to the way
they exercise initiative) may be wide-ranging.
The following steps can now be completed.
(8) Complete any differential analysis which your investigation requires.
Create separate columns for each group of interviewees you’re interested in,
and record the constructs separately for each of them. Count the constructs
from each group in each category, and see! Does each group of respondents
think systematically differently?
In the Benefits Agency study reported in Table 7.1, the Agency wished to see
whether there were any differences in the construing of employees in busy
metropolitan offices as distinct from quieter provincial offices, and the
sampling was designed to pick this up. Table 7.1 shows the relevant
information in the columns headed ‘Prov.’ and ‘Met.’. Since there were
differing numbers of interviewees in each location, each providing grids
showing differing numbers of constructs, the entries in each category were
turned into percentages of the total number of metropolitan and provincial
constructs. Each entry in the ‘Prov.’ column shows the number of constructs in
that category mentioned by provincial employees and, in brackets, the corresponding
percentage of all the provincial employees’ constructs. Ditto for the
‘Met.’ column.
As you can see, though there were some differences, these weren’t dramatic.
‘Deliberateness of intent’ was mentioned especially frequently by provincial
employees (21.1% of their constructs in this category). Metropolitan employees
saw this as important (13.3% categorised under the same heading) but were
equally concerned about ‘friendship and other pressures’ (13.3%) and ‘systems
and security procedures’ (13.3%).
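The arithmetic behind those percentages is nothing more than re-expressing each subgroup’s counts against that subgroup’s own total, as in this small sketch (figures taken from the first row of Table 7.1; the variable names are mine):

```python
# 'Deliberateness of action and intent' counts from Table 7.1
provincial_count, provincial_total = 47, 223      # all provincial constructs
metropolitan_count, metropolitan_total = 10, 75   # all metropolitan constructs

# Express each count as a percentage of its own subgroup's total,
# so that the two subgroups can be compared directly
print(round(100 * provincial_count / provincial_total, 1))      # 21.1
print(round(100 * metropolitan_count / metropolitan_total, 1))  # 13.3
```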
(9) Complete any statistical tests on this differential analysis as required. If
you’re familiar with null hypothesis testing, you’ll have noticed that this table
consists of frequency counts under mutually exclusive headings. Any
differential analysis in which you’re involved turns the content analysis into
an X by Y table where X is the number of categories (rows) and Y is the
number of distinct groups (data columns; Y equals 2 in the case of the clerical
officers example above). This is exactly the kind of situation which lends itself
to the chi-square statistic (or Fisher’s Exact Probability Test, depending on the
size of the expected frequencies in each cell). Other tests dependent on a
comparison of the order of frequencies in each column may occur to you. If
you aren’t familiar with these statistical tests and this last step doesn’t convey
any meaning to you, just ignore it.
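If you do want to run the test, here is a minimal sketch using the provincial and metropolitan counts from Table 7.1 and the chi-square routine in scipy (just one convenient implementation, not part of the procedure itself); checking whether the sparser cells satisfy the test’s assumptions is left to you.

```python
from scipy.stats import chi2_contingency

# Rows: the categories of Table 7.1; columns: provincial, metropolitan counts
observed = [
    [47, 10],  # Deliberateness of action and intent
    [29, 10],  # Friendship and other external pressures
    [26,  8],  # Pressure of work
    [27,  6],  # Internal versus external involvement
    [25,  7],  # Risk, proof and obviousness
    [21, 10],  # Systems and security procedures
    [20,  7],  # Who gains
    [14,  7],  # Money versus information
    [ 3,  5],  # Outcomes
    [ 2,  2],  # Training issues
    [ 0,  3],  # Where it happens
    [ 6,  3],  # Miscellaneous
]

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p:.3f}")
# Several expected counts in the sparser rows will be small, which is exactly
# the situation in which an exact test, or combining categories, is preferable.
```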
Reliability
You may have noticed that we omitted one crucial step from the procedure. As
you remember from our earlier discussion, content analysis can’t be
idiosyncratic. It needs to be reproducible, and to make sense to other
people. And so, all content analyses should incorporate a reliability check.
This is an additional twist to the procedure, and takes place during the
tabulation stage.
Let’s work with a running example. This is an (invented) study of the factors
which a publishing company’s sales reps believe are important in achieving
sales. Imagine, if you will, that 50 sales reps have each completed a repertory
grid, taking eight recently published books as their elements.
Run through steps 1 to 3 as above, using the core-categorisation procedure
described earlier.
(1) Identify the categories.
(2) Allocate constructs to those categories.
(3) Tabulate the result.
(4) Establish the reliability of the category system.
(4.1) Involve a colleague: ask a colleague to repeat steps 1 to 3
independently, producing a table, like your own, which summarises
his/her efforts. Now, the extent to which these two tables, yours and your
collaborator’s, agree indicates how reliable your procedures have been.
(4.2) Identify the categories you both agree on, and those you disagree
on. You can assess this by drawing up a fresh table, whose rows stand for
the categories you identified, just as you did before; and the columns stand
for the categories your collaborator identified. This is a different table to
either of the content-analysis tables which you and your collaborator have
filled out. Its purpose is to compare the two separate ones. For the sake of
clarity, let’s call it the reliability table. Here’s our worked example, shown
as Table 7.2.
Jot down the definitions of the two category sets; discuss the categories,
and agree on which ones mean the same.
Table 7.2 Assessing reliability, step (4.2), before rearrangement

Collaborator’s categories (columns): 1 Sales price; 2 Nature of purchasers; 3 Current fashion; 4 Coverage; 5 Trade announcements; 6 Layout and design; 7 Competition; 8 Advertising budget

Interviewer’s categories (rows), with the constructs entered so far:
1 Popularity of topic: 5.8 in column 3 (current fashion)
2 Buyer characteristics: 6.1 in column 6 (layout and design)
3 Pricing decisions: –
4 Design: –
5 Contents: –
6 Competitors: –
7 Promotion: 7.4 in column 5 (trade announcements); 4.1 in column 8 (advertising budget)

Example of initial content-analysis categories from a study of the factors which a publishing company’s sales reps believe to be related to the volume of sales they’re able to achieve. This example is developed in Tables 7.3 to 7.6.
The reliability table will be used to record how the interviewer, and the collaborator, have categorised all of the constructs. As an example, four constructs have been placed into the table. So, for example, the interviewer has put construct 6.1 into the ‘buyer characteristics’ category. The collaborator seems to disagree about its meaning, having put it under the ‘layout and design’ category.
Now rearrange the rows and columns of the reliability table so that
categories which you and your collaborator share are placed in the same
order: yours from top to bottom at the left of the table, and your
collaborator’s from left to right at the top of the table. In other words,
tuck the shared categories into the top left corner of the table, in the
same order across and down, with the categories that you don’t share
positioned in no particular order outside this area (see Table 7.3).
(4.3) Record your joint allocation of constructs. Working from your two
separate content-analysis tables prepared in step 3, record the position of
each of your constructs into the reliability table. How did you categorise
the construct, and how did your collaborator categorise it: which row and
column, respectively, was it put into? Write the construct number into the
appropriate cell of the table. Table 7.3 shows just four constructs which
have been allocated in this way, as an example, while Table 7.4 shows a
full data set of constructs recorded in this way.
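If the two content-analysis tables are held on a computer, the reliability table amounts to a simple cross-tabulation of the two sets of allocations. A minimal sketch, using the four constructs already placed in Tables 7.2 and 7.3 (the dictionaries here hold only those four, and the variable names are mine):

```python
from collections import defaultdict

# Construct code -> category name, one dictionary per analyst (four constructs only)
interviewer = {"5.8": "popularity of topic", "6.1": "buyer characteristics",
               "7.4": "promotion",           "4.1": "promotion"}
collaborator = {"5.8": "current fashion",     "6.1": "layout and design",
                "7.4": "trade announcements", "4.1": "advertising budget"}

# Each cell of the reliability table is identified by the pair
# (interviewer's category, collaborator's category)
cells = defaultdict(list)
for code in interviewer:
    cells[(interviewer[code], collaborator[code])].append(code)

for (row, column), codes in sorted(cells.items()):
    print(f"{row} / {column}: {', '.join(codes)}")
```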
As you can see, there are two parts to the reliability issue. Can you agree
on the category definitions; and, when you have agreed, are you both
agreed on the allocation of the constructs to the same categories? The
rearrangement of the reliability table, as shown in Table 7.3, is a useful
exercise, since it forces you both to think about your category definitions.
It also provides you with a measure of the extent to which you can allocate
constructs consistently, as follows.
(4.4) Measure the extent of agreement between you. Think about it. If you
were both in perfect agreement on the allocation of constructs to
categories, all of your constructs would lie along the diagonal of the
reliability table; to the extent that you disagree,
some constructs lie off the diagonal. So your overall measure of agreement
is as follows:
– the number of constructs which lie along the diagonal
– in all the categories you were both agreed on (the shared categories at the
top left of Table 7.3: rows 1 to 6 and columns 1 to 6)
– as a percentage of all of the constructs in the whole table.
Work out this figure, and call it index A. Now, repeat this calculation but
express the number as a percentage of the constructs, not in the whole
reliability table, but just those which have been allocated to categories you
both agree on. Call this index B.
Table 7.5 provides you with a complete worked example. Index A is 54%:
you’ve only agreed on just over half of what the constructs mean! Index B
is, as it must be, larger, at 64%: when you confine your attention to
categories which mean the same to both of you, you have a better result.
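Both indices are straightforward to compute once the rearranged reliability table is held as a matrix of cell counts. A minimal sketch (the function name and arguments are mine; it assumes, as in step (4.2), that the categories you both agree on come first, in the same order, in both the rows and the columns):

```python
def agreement_indices(table, n_shared):
    """table: the reliability table as a list of rows of cell counts
    (rows = interviewer's categories, columns = collaborator's), with the
    n_shared agreed categories placed first in matching order.
    Returns (index A, index B) as percentages."""
    diagonal = sum(table[i][i] for i in range(n_shared))   # agreed allocations
    total = sum(sum(row) for row in table)                 # all constructs
    # constructs falling in the agreed categories (the 42 of Table 7.5;
    # counted by rows here, as in the worked example)
    agreed = sum(sum(row) for row in table[:n_shared])
    return 100 * diagonal / total, 100 * diagonal / agreed
```

Fed the counts of Table 7.5, with six shared categories, this returns the 54% and 64% quoted above; fed those of Table 7.6 it returns 68% for both indices, since by then every category is shared.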
Table 7.3 Assessing reliability, step (4.2), after rearrangement

Collaborator’s categories (columns): 1 Current fashion; 2 Nature of purchasers; 3 Sales price; 4 Layout and design; 5 Coverage; 6 Competition; 7 Trade announcements; 8 Advertising budget

Interviewer’s categories (rows), with the constructs entered so far:
1 Popularity of topic: 5.8 in column 1 (current fashion)
2 Buyer characteristics: 6.1 in column 4 (layout and design)
3 Pricing decisions: –
4 Design: –
5 Contents: –
6 Competitors: –
7 Promotion: 7.4 in column 7 (trade announcements); 4.1 in column 8 (advertising budget)

1. Discussion of the definitions showed that the interviewer’s ‘popularity of topic’ category is the same as the collaborator’s ‘current fashion’; the interviewer’s ‘buyer characteristics’ is the same as the collaborator’s ‘nature of purchasers’ category; ‘pricing decisions’ is the same as ‘sales price’; ‘design’ is the same as ‘layout and design’; ‘contents’ is the same as ‘coverage’; and ‘competitors’ is the same as ‘competition’. The interviewer has one category not used by the collaborator, ‘promotion’; and the collaborator has two categories not used by the interviewer, ‘trade announcements’ and ‘advertising budget’.
2. The categories have now been reorganised so that the commonly shared ones are at the top left of the table. The way in which both the interviewer and collaborator have categorised the constructs is now recorded by placing construct codes into their appropriate cells; just four examples, the same ones which appeared in Table 7.2, are shown above.
3. Construct 5.8, ‘there’s a demand for a textbook like this – no demand for this topic’, was categorised under ‘popularity of topic’ in the interviewer’s analysis, and as ‘current fashion’ in the collaborator’s analysis, so it’s placed in row 1, column 1, in this table: a construct on which both are agreed.
Construct 6.1, ‘ring-bound covers: bookshop buyers don’t like – conventional cover: bookshop buyers will accept’, was categorised under ‘buyer characteristics’ by the interviewer but under ‘layout and design’ by the collaborator.
Construct 7.4, ‘advertised heavily in the trade press – not advertised in the trade press’, was placed in the ‘promotion’ category by the interviewer, but in the ‘trade announcements’ category by the collaborator.
Construct 4.1, ‘big advertising budget – small advertising budget’, was categorised under ‘promotion’ by the interviewer but under ‘advertising budget’ by the collaborator.
Table 7.4 Assessing reliability, step (4.3)

Collaborator’s categories (columns), as in Table 7.3: 1 Current fashion; 2 Nature of purchasers; 3 Sales price; 4 Layout and design; 5 Coverage; 6 Competition; 7 Trade announcements; 8 Advertising budget

Interviewer’s categories (rows), showing the constructs in each cell of the row; cells are separated by semicolons:
1 Popularity of topic: 2.3, 5.8, 3.5, 5.3; 1.4; 3.2
2 Buyer characteristics: 1.1, 2.5, 3.7, 5.7; 4.6, 6.1; 1.6, 6.4; 6.6
3 Pricing decisions: 1.2, 3.1, 4.4, 5.5, 6.3, 7.7; 7.1; 7.3; 6.2; 7.2
4 Design: 1.5, 4.5, 7.5
5 Contents: 4.3, 5.9, 7.6; 7.8; 3.6, 5.2
6 Competitors: 2.1, 1.3, 2.4, 3.3, 4.2, 5.1, 6.5; 7.9
7 Promotion: 7.4; 1.7, 5.6, 6.7; 2.2, 4.1, 3.4, 5.4

All of the constructs in the publisher’s example are shown here, identified by their code number.
Table 7.5 Assessing reliability, step (4.4)

Cell entries are the number of constructs in each cell of Table 7.4. The ‘Diagonal’ figure for each row is the number of constructs placed in the equivalent category by both interviewer and collaborator; the other off-diagonal cell counts for the row are listed alongside. Column totals run across the collaborator’s categories in the same order as Table 7.4 (1 Current fashion; 2 Nature of purchasers; 3 Sales price; 4 Layout and design; 5 Coverage; 6 Competition; 7 Trade announcements; 8 Advertising budget).

Interviewer’s categories      Diagonal   Other cells    Row total
1 Popularity of topic             4        1, 1             6
2 Buyer characteristics           4        2, 2, 1          9
3 Pricing decisions               6        1, 1, 1, 1      10
4 Design                          3        –                3
5 Contents                        3        1, 2             6
6 Competitors                     7        1                8
7 Promotion (not shared)          –        1, 3, 4          8
Column totals: 5, 5, 9, 6, 6, 11, 3, 5                Grand total: 50

1. Index A: number of constructs along the diagonal for the categories agreed on, as a percentage of all the constructs in the table:
4 + 4 + 6 + 3 + 3 + 7 = 27;
50 constructs in total;
100 × 27/50 = 54%
2. Index B: number of constructs along the diagonal for the categories agreed on, as a percentage of all the constructs in the categories agreed on:
4 + 4 + 6 + 3 + 3 + 7 = 27;
42 constructs in the categories agreed on (5 + 5 + 9 + 6 + 6 + 11, or, of course, 6 + 9 + 10 + 3 + 6 + 8; it’s the same!)
100 × 27/42 = 64%
But it’s still not good enough; a benchmark to aim at is 90% agreement, with
no categories on whose definition you can’t agree. So:
(4.5) Negotiate over the meaning of the categories. Look at which
categories in particular show disagreements, and try to arrive at a
redefinition of the categories as indicated by the particular constructs on
which you disagreed, so that you improve on the value of Indices A and B.
Argue, debate, quarrel, just so long as you don’t come to blows. Break for
lunch and come back to it if necessary!
For example, in Table 7.5, even without knowing what the constructs are,
you can hazard a guess that the interviewer and collaborator will be able
to agree on a single category, ‘promotion’, since announcements in the
trade press, and advertising, might both be regarded as forms of
promotion. This single redefinition would be sufficient to create a total
set of seven categories which accounted for all the constructs and on
which both were agreed.
Even if nothing else changed, a redrawing of Table 7.5 (see the result in
Table 7.6) shows an improvement to 68% agreement. It is likely that this
discussion will clarify the confusion which led to construct 7.2 being
categorised under ‘pricing decisions’ by the interviewer, raising the index
to 70%. Further discussion, concentrating on the areas of disagreement,
would tighten up the definitions of the other categories. The aim is to get
as many constructs onto the diagonal of the table as possible!
(4.6) Finalise a revised category system with acceptably high reliability.
The only way of knowing whether this negotiation has borne fruit is for
each of you, interviewer and collaborator, to repeat the procedure. Redo
your initial coding tables, working independently. Can you both arrive at
the same result, including the categorisation of the constructs to the
carefully redefined categories?
Repeat the whole analysis again. That’s right! Repeat step 2 using these
new categories. Repeat steps 3 and 4, including the casting of a new
reliability table, and the recomputation of the reliability index.
This instruction isn’t, in fact, as cruel as it may seem. The categorisation
activity is likely to be much quicker than before, since you will be clearer
on category definitions and you will be using only agreed categories. It is
still time-consuming, but there is no alternative if you care for the
reliability of your analysis.
(4.7) Report the final reliability figure. The improved figure you’re
aiming for is 90% agreement or better, and this is usually achievable.
There are more accurate measures of reliability, including ones which provide a
reliability coefficient ranging between −1.0 and +1.0, which may be an obscure
characteristic to anyone other than a psychologist or a statistician, who is used to
assessing reliability in this particular way. Probably the most commonly used
statistic in this context is Cohen’s Kappa (Cohen, 1968). However, if having a standard
error of the figure you have computed matters to you, then the Perrault–Leigh Index is
the appropriate measure to use: see Perrault & Leigh (1989).
The value of Cohen’s Kappa or the Perrault–Leigh Index which you would seek to
achieve would be 0.80 or better. This is the standard statistical criterion for a reliable
measure, but, if you’re conscientious about the way in which you negotiate common
meanings for categories, a highly respectable 0.90 is typical for repertory grid content
analyses.
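For the record, once interviewer and collaborator are using the same agreed set of categories (so that the reliability table is square), Cohen’s Kappa is only a few lines of arithmetic. A minimal sketch (the function name is mine):

```python
def cohens_kappa(table):
    """Cohen's Kappa for a square reliability table in which both analysts
    use the same categories in the same order: agreement corrected for the
    agreement you would expect by chance alone."""
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(len(table))) / n
    # chance agreement: the product of the two analysts' marginal proportions,
    # summed over the categories
    chance = sum((sum(table[i]) / n) * (sum(row[i] for row in table) / n)
                 for i in range(len(table)))
    return (observed - chance) / (1 - chance)
```

As the note above says, a value of 0.80 or better is the usual criterion to aim at.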
Table 7.6 Assessing reliability, step (4.5)

Cell entries are numbers of constructs, laid out as in Table 7.5. The collaborator’s categories (columns) are now: 1 Current fashion; 2 Nature of purchasers; 3 Sales price; 4 Layout and design; 5 Coverage; 6 Competition; 7 Promotion.

Interviewer’s categories      Diagonal   Other cells    Row total
1 Popularity of topic             4        1, 1             6
2 Buyer characteristics           4        2, 2, 1          9
3 Pricing decisions               6        1, 1, 1, 1      10
4 Design                          3        –                3
5 Contents                        3        1, 2             6
6 Competitors                     7        1                8
7 Promotion                       7        1                8
Column totals: 5, 5, 9, 6, 6, 11, 8                   Grand total: 50

1. The collaborator’s category no. 7, ‘promotion’, is the result of combining the previous two categories: 7, ‘trade announcements’ and 8, ‘advertising budget’.
2. Index A: number of constructs along the diagonal for the categories agreed on, as a percentage of all of the constructs in the table:
4 + 4 + 6 + 3 + 3 + 7 + 7 = 34
50 constructs in total
100 × 34/50 = 68%
3. (As all the categories are now agreed on, index A is identical to what was earlier called index B.)
And that’s that: with the completion of step 4 of our procedure, you’d continue
with the remaining steps, 5 to 9, taking comfort that the categories you use in
steps 5 to 9 are thoroughly reliable.
All of this seems very pedantic, and for day-to-day purposes, most people
would skip the reanalysis of step (4.6). However, if you were doing all this as
part of a formal research programme, especially one leading to a dissertation
of any kind, you’d have to include this step, and report the improvement in
reliability (it is conventional to report both the ‘before’ and the ‘after’ figure, by
the way!).
Well and good; but haven’t you forgotten something? When you present the final
results at steps 5 to 7 (the content-analysis table, with its subgroup columns for differential
analysis as required), whose content-analysis table do you present: yours, or
your collaborator’s? You’ve increased your reliability but, unless you’ve achieved a
perfect 100% match, the two tables, yours and your collaborator’s, will differ slightly.
Which should you use? Whose definition of reality shall prevail?
In fact, you should use your own (what we’ve been calling the interviewer’s content-analysis
table), rather than your collaborator’s. You designed the whole study and it’s
probably fair for any residual inaccuracies to be based on your way of construing the
study, rather than your collaborator’s. (Though if someone were to argue that you
should spin a coin to decide, I could see an argument for it based on Kelly’s constructive
alternativism: that one investigator’s understanding of the interviewees’ constructs
is as good as another’s, once the effort to minimise researcher idiosyncrasy has been
made!)
Okay, this is a long chapter: take a break! And
then, before you continue, please do Exercise 7.1.