Texts


In the area of network text analysis, previous research and development have provided

computer-supported solutions that enable analysts to gain a window into social structure

and meaning as represented in texts. Collectively these approaches enable the

analyst to extract networks of concepts and the connections between them from the texts.

These networks are sometimes referred to as maps (Carley, 1997b), networks of centering

words (Corman, Kuhn, McPhee & Dooley, 2002), semantic nets (Reimer, 1997), semantic

networks (Monge & Contractor, 2001, 2003; Popping, 2003; Ryan & Bernard, 2000),

networks of concepts (Popping, 2000), or networks of words (Danowski, 1993). Herein,

we refer to such techniques by the general term network text analysis (NTA)

(Carley, 1997b; Popping, 2000). NTA approaches vary on a number of dimensions such

as the level of automation, a focus on verbs or nouns, the level of concept generalization,

and so on. Nevertheless, in all cases, networks of relations among concepts are used to

reveal the structure of the text, meaning, and the views of the authors. Further, these

networks are windows into the structure of the groups, organizations and societies

discussed in these texts. This structure is implicit in the connections among people,

groups, organizations, resources, knowledge, tasks, events, and places.

NTA is a specific text analysis method that encodes the links between words in a text and

constructs a network of the linked words (Popping, 2000). The method is based on the

assumption that language and knowledge can be modeled as networks of words and the

relations between them (Sowa, 1984). NTA methodologically originates from traditional

techniques for indexing the relations between words, syntactic grouping of words, and

the hierarchical and non-hierarchical linking of words (Kelle, 1997). The method of NTA

enables the extraction, analysis, and concise representation of the complex network

structure that can be represented in texts. Furthermore, NTA covers the analytic

spectrum of classical content analysis by supporting the analysis of the existence,

frequencies, and covariance of words and themes (Alexa, 1997; Popping, 2000). Given

these functionalities, computer-supported NTA is a suitable method for analyzing large

collections of texts effectively and efficiently. Several NTA methods exist (for details,

see Popping, 2000; Popping & Roberts, 1997), and many have been applied in empirical

settings (see the discussion by Monge & Contractor, 2003). They include:

• Centering Resonance Analysis (Corman et al., 2002)

• Functional Depiction (Popping & Roberts, 1997)

• Knowledge Graphing (Bakker, 1987; James, 1992; Popping, 2003)

• Map Analysis (Carley, 1988, 1997b; Carley & Palmquist, 1992)

• Network Evaluation (Kleinnijenhuis, Ridder & Rietberg, 1996)

• Word Network Analysis (Danowski, 1982).
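
As a minimal illustration of what it means to encode the links between words and construct a network from them, the following Python sketch links words that occur within a fixed window of one another. The windowing rule, the function name extract_links, and the sample sentences are assumptions made for illustration only; none of the methods listed above is claimed to work exactly this way, and the sketch ignores sentence boundaries and stop words.

```python
from collections import Counter


def extract_links(text, window=2):
    """Count undirected links between words that fall inside the same window.

    `window` is the span size in tokens, counting the word itself, so
    window=2 links each word only to its immediate successor.
    """
    # Naive tokenization: split on whitespace, strip punctuation, lowercase.
    tokens = [t.strip(".,;:!?()\"'").lower() for t in text.split()]
    tokens = [t for t in tokens if t]

    links = Counter()
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != word:
                # Sort the pair so (a, b) and (b, a) count as the same link.
                links[tuple(sorted((word, other)))] += 1
    return links


# Hypothetical sample text.
links = extract_links("The analyst codes the text. The text names the analyst.")
for (a, b), weight in sorted(links.items()):
    print(a, b, weight)
```

The resulting counter is a weighted edge list: each key is a pair of linked words and each value is the number of times the link was observed.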

Besides the analysis of textual data, current work also focuses on the visualization of

networks extracted from texts (Batagelj, Mrvar & Zaveršnik, 2002).

In this research we concentrate on map analysis. Map analysis systematically extracts

and analyzes the links between words in texts in order to model the authors' “mental maps”

as networks of linked words. Coding texts as maps focuses analysts on investigating the

meaning of texts by detecting the relationships between and among words and themes

(Alexa, 1997; Carley, 1997a). Maps are a cognitively motivated representation of knowledge

(Carley, 1988). In map analysis, a concept is a single idea represented by a single

word or a phrase. A statement is two concepts and the relation between them. A map is

the network of the statements (Carley, 1997b).
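
The three definitions above (concept, statement, map) can be sketched as a small data structure. This is only one possible encoding: the class names, the field names, and the sample statements are hypothetical and are not taken from any map analysis software.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Statement:
    """Two concepts and the relation between them."""
    source: str    # first concept (a single word or a phrase)
    relation: str  # label for the relation (hypothetical field)
    target: str    # second concept


@dataclass
class Map:
    """The network formed by a set of statements."""
    statements: set = field(default_factory=set)

    def add(self, source, relation, target):
        # A set is used, so adding the same statement twice has no effect.
        self.statements.add(Statement(source, relation, target))

    def concepts(self):
        """All concepts that occur in at least one statement."""
        return {c for s in self.statements for c in (s.source, s.target)}


# Hypothetical statements coded from a text.
text_map = Map()
text_map.add("analyst", "codes", "text")
text_map.add("text", "mentions", "organization")
text_map.add("analyst", "codes", "text")  # duplicate, ignored

print(sorted(text_map.concepts()))
```

Storing statements rather than isolated concepts keeps the relation attached to the pair of concepts it connects, which is what distinguishes a map from a simple word list.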

Before continuing, it is worth noting that the terminology in this area is very diverse,

having come from a variety of disciplines. Thus to orient the reader and help avoid

confusion, Table 1 provides the basic terminology as we use it herein. This

will foreshadow the discussion of the procedure we are proposing in this chapter.

Table 1. Terminology and associated symbols

| Term | Definition | Alternative Terms | Examples |
|---|---|---|---|
| Text | A written work. | Sample | Newspaper article, abstract, Web site, interview |
| Text-level concept | Words that appear in text | Word, concept, phrase, named-entity | Rantissi, Palestine, Hamas, terrorism, captured |
| Higher-level concept | A word or phrase chosen by the analyst into which other words or phrases are generalized | Concept, node | Terrorist, Osama bin Laden |
| Concept | Single ideational kernel | Node | Terrorism, terrorist, Friday, 9-11 |
| Entity class | Objective category that can be used for classifying concepts; top level in the ontology | Meta-node, entity, category, concept type, node type | People, Organizations |
| Relation | Connection between concepts | Link, tie, edge, connection | Rantissi is in the Hamas |
| Relation class | Objective category that can be used for classifying relations connecting concepts in entity class “a” to concepts in entity class “b,” such that “a” and “b” may or may not be distinct. | Relation type, edge type, tie type, subnetwork | Social network, is a member of |
| Map | The network formed by the set of statements (two concepts and the relation between them) in a text. | Network, concept network, semantic network, network of concepts | See Figures 3 and 4 |
| Meta-matrix | Conceptual organization of concept networks into a set of networks defined by entity classes and relation classes | Ontology, classification scheme, meta-network | See Tables 2 and 3 |

Using the Meta-Matrix as an Ontology

Since NTA can be used to extract networks of concepts, we can leverage the methods

of social network analysis (SNA) to analyze, compare and combine the network of

concepts extracted from the texts (see e.g., Scott, 2000; Wasserman & Faust, 1994 for SNA

techniques). This provides the analyst with tremendous analytical power (see Hill &

Carley, 1999 for an illustrative study). If, in addition, we cross-classify the extracted concepts

into an ontology, particularly one designed to capture the core elements of social and

organizational structure, we gain the added theoretical power of extracting in a systematic

fashion an empirical description of the social and organizational structure. The key would

be to design a useful ontology.
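
To make the idea of applying SNA methods to an extracted concept network concrete, the sketch below computes normalized degree centrality, a standard SNA measure, on a small undirected network. The edge list is hypothetical, and the choice of this particular measure is ours for illustration; the text does not single it out.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Normalized degree centrality for an undirected network.

    `edges` is an iterable of (a, b) concept pairs. Each node's degree is
    divided by (n - 1), the largest degree possible among n nodes.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)

    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical concept network: "a" is linked to every other concept.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
centrality = degree_centrality(edges)
for node in sorted(centrality, key=centrality.get, reverse=True):
    print(node, round(centrality[node], 2))
```

Because every extracted map has the same node-and-link form, the same function can be run on maps from different texts and the results compared.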

Such an ontology is implicit in the meta-matrix approach (Carley, 2002, 2003; Krackhardt

& Carley, 1998) to organizational design. Krackhardt and Carley defined an approach to

represent the state of an organizational structure at a particular point in time as the set

of entities (people, resources, and tasks) and the relations among them. The meta-matrix

approach is a representational framework and a set of derived methods for the computational

analysis of multi-dimensional data that represents social and organizational

systems. The concept of the meta-matrix originates from the combination of:

1. Information processing and knowledge management (Carley & Hill, 2001; Galbraith,

1977; March & Simon, 1973).

2. The PCANS approach (Krackhardt & Carley, 1998), which was later generalized by

Carley and Hill to include knowledge, events, and organizations (Carley, 2002;

Carley & Hill, 2001).

3. Operations research (Carley & Krackhardt, 1999; Carley, Ren & Krackhardt, 2000).

4. Social network analytic techniques and measures (see e.g., Scott, 2000; Wasserman

& Faust, 1994).

The meta-matrix enables the representation of team or organizational structure in terms

of entity classes and relations. In principle, this is an extensible ontology such that new

entity classes and new classes of relations can be added as needed. Each entity class

represents an ontologically distinct category of concepts (or in the social network

language, nodes). Each relation class is a type of link between the concepts in one entity

class and the concepts in another, or the same, entity class. For the sake of illustration,

we use a simple form of the ontology with four entity classes: People, Resources (or

Knowledge/Skills), Tasks or Events, and Groups or Organizations (see Table 2 headers).

We choose these entity classes because they are sufficient for illustration and are critical for understanding the

structure of teams, groups and organizations. The reader should keep in mind that it is

possible to use different entity classes and still think in terms of the meta-matrix

conceptualization (as we do in this chapter). The key aspect for our purposes is that the

meta-matrix defines a set of entity classes and a set of relation classes. This facilitates

thinking systematically about organizational structure and provides a limited hierarchy

for structuring the network of concepts.

Between any two entity classes there can be one or more classes of relations. For example,

between people and people we can think of a number of relations including, but not limited

to, communication relations, friendship relations, or money/exchange relations. To

orient the reader, Table 2 identifies common labels for the networks formed by linking

the row and column entity classes. The data in a meta-matrix represent the

structure of the group or organization at a particular time. They can be analyzed to locate

vulnerabilities, strengths, and other features of the group, to identify key actors, and to assess

potential performance. In summary, the meta-matrix approach allows analysts to model

and analyze social systems according to a theoretically and empirically founded schema

(Carley, 2003). By employing this approach as an ontology, we enable the analyst to

extract and analyze social systems as described in texts.
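
The meta-matrix described in this section can be sketched as a lookup structure: each cell is identified by a pair of entity classes and holds one network per relation class. The sketch below is a minimal illustration under our own assumptions. The concept names, the relation class labels, and the function build_meta_matrix are hypothetical, and the assignment of concepts to entity classes is assumed to have been made by the analyst beforehand.

```python
from collections import defaultdict

# The four entity classes used for illustration in the text.
ENTITY_CLASSES = ("People", "Resources", "Tasks/Events", "Groups/Organizations")

# Hypothetical assignment of concepts to entity classes.
concept_class = {
    "alice": "People",
    "bob": "People",
    "budget": "Resources",
    "audit": "Tasks/Events",
    "finance team": "Groups/Organizations",
}

# Hypothetical statements: (concept, relation class, concept).
statements = [
    ("alice", "communication", "bob"),
    ("alice", "friendship", "bob"),
    ("alice", "access", "budget"),
    ("bob", "assignment", "audit"),
    ("alice", "membership", "finance team"),
]


def build_meta_matrix(statements, concept_class):
    """Group links by (row class, column class) and then by relation class."""
    cells = defaultdict(lambda: defaultdict(list))
    for source, relation_class, target in statements:
        pair = (concept_class[source], concept_class[target])
        cells[pair][relation_class].append((source, target))
    # Convert to plain dicts so missing cells raise KeyError instead of
    # silently appearing.
    return {pair: dict(by_relation) for pair, by_relation in cells.items()}


meta_matrix = build_meta_matrix(statements, concept_class)
for pair, by_relation in meta_matrix.items():
    print(pair, by_relation)
```

Keying the inner dictionary by relation class reflects the point that a single pair of entity classes, such as People by People, can carry several distinct networks at once.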