The importance of cultural analysis in counterinsurgency operations is well established. Whether systematically compiled cultural knowledge is employed as a means of building institutions within the indigenous social fabric, or modeling the flow of information through the social system, there can be little doubt that warfighters directly benefit from a deeper understanding of the social context of their operating environment. In the tribal societies of the Middle East and Central Asia the salience of tribe/clan affiliations, and their often strict hierarchical nature, make cultural awareness particularly crucial, especially in light of technological developments in the fields of Visual Analytics and geospatial/relational data fusion. Data fusion is defined as "the integration of data and knowledge collected from disparate sources by different methods into a consistent, accurate, and useful whole." It is this capability (or previous lack thereof) to "fuse" operationally relevant relational and geospatial data, and its importance in the field of cultural research, that will enhance the way that both analysts and warfighters understand the battlefield. The realization of "multidimensional data fusion" allows the cultural analyst to not only work toward an understanding of the relational structure of the predominant social system (tribal or otherwise), but also to integrate these entities (persons, tribes, etc.) with their spatially fixed anchor points (homes, schools, territories, etc.), allowing for a contextually-enriched and actionable understanding of the operating environment.
The main challenge in the complex task of employing data fusion and visual analytics in cultural research stems not from a lack of cultural information, which is abundant in the unclassified literature, but from the difficulty of effectively cataloguing, fusing, and presenting various data (relational, geospatial, or temporal). This process of properly cataloguing and "fusing" data is an important part of Visual Analytics’ focus on creating "automated analysis techniques with interactive visualizations for an effective understanding, reasoning and decision making on the basis of very large and complex data sets," while also "foster[ing] the constructive evaluation, correction, and rapid improvement of our processes and models and—ultimately—the improvement of our knowledge and our decisions." The limiting factor is that data is not being "processed" in a way that is conducive to collaboration and "analytical discourse." It is precisely this lack of collaboration (and implicitly data fusion) that was discussed in the RAND Corporation’s 2008 report titled "Analytic Support to Intelligence in Counterinsurgencies." The report proposed the creation of COINCOP (Counter Insurgency Common Operational Picture), which would "provide displays of key information about insurgent networks" including "the insurgents, their assets and their personal relationships (including those with civilians)," and "the location of insurgent cells, their weapons caches, and supply chains for weapons and other war-related equipment." The need is well recognized; it is the means to fuse these data in a "Common Operational Picture" that has been, until now, lacking.
To illustrate the impact of this shortcoming, picture the industrious analyst attempting to unravel the complicated social system for a given province in eastern Afghanistan. Regardless of the possibility for outside collaboration, a talented cultural analyst will use one of several relational analysis platforms to synthesize an exhaustive model of the province’s social/tribal hierarchy, including relationships like organization membership, familial ties, and friendships. Until recently, no matter how great the analyst’s motivation, the level of data fusion possible between this extensive relational web of key individuals and organizations, and corresponding geospatial data for the province, showing notable places like compounds, schools, or hospitals was, at best, minimal. In the end, Montague and Capulet would be doomed to remain two distinct ways of understanding the same social system.
This lack of fusion between the geospatial and relational realms has crippled our analytical potential, but by leveraging advances in software development, it is possible to fuse these two data sets to produce an analytical product much more meaningful than the sum of its geospatial and relational parts. The fused analytical product allows placement of individuals and organizations within their geospatial and relational context, which, in turn, allows for the inference of individual relational information (tribal affiliation for example) based upon geospatial context and vice versa. For example, after geospatially and relationally mapping the tribal "skeleton" of the aforementioned Afghan province, one may infer an individual’s tribal identity based on his geospatial location or vice versa. Other examples of contextual analysis using a fused social model will follow.
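The inference described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a set of points with known tribal affiliation has already been plotted, and it guesses an unknown individual's affiliation by majority vote among the nearest known points. The coordinates are abstract, the tribe names and the `infer_tribe` function are hypothetical, and distance is measured naively in degrees, which is only reasonable over a small area.

```python
import math
from collections import Counter

# Hypothetical plotted points with known affiliation: (lat, lon, tribe).
# Coordinates are abstract and do not correspond to real places.
KNOWN_POINTS = [
    (0.00, 0.00, "Tribe A"),
    (0.01, 0.00, "Tribe A"),
    (0.00, 0.01, "Tribe A"),
    (1.00, 1.00, "Tribe B"),
    (1.01, 1.00, "Tribe B"),
    (1.00, 1.01, "Tribe B"),
]


def infer_tribe(lat, lon, known_points, k=3):
    """Guess affiliation by majority vote among the k nearest known points.

    Returns (tribe, share_of_votes). Distance is planar, in degrees.
    """
    nearest = sorted(
        known_points,
        key=lambda p: math.hypot(p[0] - lat, p[1] - lon),
    )[:k]
    votes = Counter(tribe for _, _, tribe in nearest)
    tribe, count = votes.most_common(1)[0]
    return tribe, count / len(nearest)


guess, confidence = infer_tribe(0.005, 0.005, KNOWN_POINTS)
```

The returned share of votes is a crude indicator of how mixed the neighborhood is; a low share suggests the location alone is a weak basis for inference and that relational evidence should carry more weight.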
The Challenge of Data Fusion
While intricate "link charts" showing relational networks like the one shown in Figure 1 may prove informative, they remain disparate from the real-time, spatially-oriented operating environment, wherein geospatial data generally resides outside its relational context and loses meaning as part of the broader "social skeleton."
Figure 1: "Link chart" showing the relational network or "nexus topography" for the Omani tribal system. Boxes represent tribes and the lines connecting them represent relationships like descendency, leadership, or hierarchy. While informative, the data remains apart from its geospatial context.
To use an example from southern Oman, suppose that relational analysis tells us that the Ja`bub and Tabuk tribes are descendants of a common ancestor, and that each tribe is composed of eight specific clans. We also have the names of the clans and the names of their respective leaders. In terms of geospatial information, we have a list of coordinates for the homes of the leaders, and a map overlay showing which clans are present in each of the area’s fifty villages. Even in this relatively simple example the limits of analysis become apparent as one imagines the analyst attempting to integrate the relational and geospatial data manually. One can picture the dizzying task of plotting this simple data onto a map overlay, with a myriad of colored pieces of yarn and thumbtacks connecting persons to villages, villages to clans, clans to tribes, etc. Now imagine adding another ten thousand inter-related data points ranging from religious leaders to shipping companies including attributes like address, sect, friendships, political affiliations, or lineage for each!
The task quickly moves from complicated to impossible. Even if this sort of "manual" integration were possible, the data would likely only be useful as a system overview since effectively parsing the various properties and relationships and examining them would be impossible. Instead we should benefit from the advantages conferred by the analytical science of Visual Analytics. One of the recommendations of this field is a situation wherein "visualization becomes the medium of a semi-automated analytical process, where humans and machines cooperate using their respective distinct capabilities for the most effective results." As shown in Figure 2 the result is an "analytical process" in which the cognitive heavy lifting is assumed by the machine, with data then being presented in a way which allows the analyst to concentrate on more nuanced analysis.
Figure 2: Tight integration of visual and automatic data analysis methods with database technology for a scalable interactive decision support.
Moving from "One Dimensional" to "Multi-dimensional" Data Fusion
Suppose that two commanders decide to fuse their relational datasets outlining the organizational structure of religious leadership for two overlapping areas of Mosul, Iraq. Regardless of the medium of information collaboration (huge whiteboard or otherwise), they would undoubtedly encounter the issue of how to determine whether "Ahmed al Burghouthi" of Sector A is the same person as "Ahmad Bourghuthi" of Sector B, and if so, how to merge or "resolve" these two identities into a single individual. Although names are probably the most often used resolution criterion in relational models, additional information like address, ration card number, or license plate number will often aid in the resolution process. Thus, in order to properly identify and resolve the thousands or even millions of individual relational nodes, we require an engine capable of conducting the "automated analytical process" mentioned in the previous section. The resolution process employs a series of preprogrammed guidelines for what constitutes a justified resolution, based upon combinations of identifying information (name, address, blood-type, etc.). If the resolution is found to be justified (they are indeed the same person), the properties for both "Ahmed Bourghuthis" like aliases, skills, friendships, and identifying information will be knitted together to form a new "Ahmed" integrated within the fused relational network, and based upon the combined information of both commanders. Even this "one-dimensional" resolution process can get rather dicey when it involves multiple datasets, each with their own terminologies and naming conventions, but software advances have made the process more accessible. Following the development of this automated resolution capability, the next logical step in developing the fusion process is its extension to different types of data, meaning relational and geospatial information.
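The resolution step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the method of any particular platform: the record fields (`name`, `address`, `ration_card`, `friends`), the 0.7 similarity threshold, and the rule that a name match must be corroborated by one other identifier are all assumptions chosen for the example, as are the sample values.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Rough string similarity between two transliterated names (0.0 to 1.0)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def should_resolve(rec_a: dict, rec_b: dict, threshold: float = 0.7) -> bool:
    """Hypothetical rule: similar names AND at least one shared identifier."""
    if name_similarity(rec_a["name"], rec_b["name"]) < threshold:
        return False
    for key in ("address", "ration_card"):
        if rec_a.get(key) and rec_a.get(key) == rec_b.get(key):
            return True
    return False


def merge(rec_a: dict, rec_b: dict) -> dict:
    """Knit two records into one, keeping both spellings as aliases."""
    merged = {**rec_b, **rec_a}  # rec_a wins on conflicting scalar fields
    merged["aliases"] = sorted(
        {rec_a["name"], rec_b["name"]}
        | set(rec_a.get("aliases", []))
        | set(rec_b.get("aliases", []))
    )
    merged["friends"] = sorted(
        set(rec_a.get("friends", [])) | set(rec_b.get("friends", []))
    )
    return merged


# Invented sample records from the two commanders' datasets.
sector_a = {"name": "Ahmed al Burghouthi", "ration_card": "RC-1001", "friends": ["Khalid"]}
sector_b = {"name": "Ahmad Bourghuthi", "ration_card": "RC-1001", "friends": ["Yusuf"]}

fused = merge(sector_a, sector_b) if should_resolve(sector_a, sector_b) else None
```

Requiring a second identifier is a deliberately conservative choice: similar transliterations alone are common among unrelated people, so a name match is treated here as a candidate rather than a conclusion.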
Although recognition of the importance of creating the fused "Common Operating Picture" is widespread, until now efforts to do so have generally taken the form of crude "side by side" fusion and data presentation.
For example, during its 2007 deployment to Southeast Baghdad, Task Force Dragon of the 3rd Infantry Division clearly demonstrated the benefits of "Human Terrain Mapping" by creating a shared database of relevant cultural and demographic information that proved crucial in its counterinsurgency operations. The effort "created a common human-terrain picture that enabled more proactive initiatives and faster, much more effective responses to events," but still relied almost exclusively upon the geospatial framework of analysis, thereby neglecting the cumulative benefits of relational integration. Although relational data like the "location and contact information for each sheik or village mukhtar" was inserted into the geospatial interface showing "the boundaries of each tribal area," "locations of mosques schools and markets," and "nearest locations and checkpoints of Iraqi security forces," these two types of information were not combined in an optimal way. The main medium of analysis remained geospatial, with relational information being simply layered upon it in the way that photos or dossiers of leaders in a given neighborhood may be added to the folder containing the neighborhood’s tribal map overlay. This distinction between "layered" data and fused data is critical. Whereas data layering can be described as an "additive process," data fusion "goes beyond merely looking at a problem through different lenses; it collects the respective lenses, and looks through them all at the same time for an in-depth view of the problem."
Although the addition of relational information is clearly valuable, these two data media remain distinct, and the actual information fusion (among photos, dossiers, and maps) remains the onerous task of the analyst. Surely efforts have been made to combine these data manually (recall the colored string and thumbtacks mentioned earlier), but this type of crude integration does not leverage the true utility of data fusion. Although crude integration may add value, the analyst requires true fusion of the two data types in a way that allows for fluid analysis of both data sets simultaneously, with each set of entities and nodes, both geospatially and relationally based, being resolved into the same conceptual "space." A solution to this tenuous problem of conflicting data types is an analytical methodology wherein geospatial and relational data are kept in their respectively optimal presentation formats (geospatial as a map overlay and relational as a link chart), but may be analyzed using geographical tools in conjunction with relational tools in order to move between these two data formats smoothly, thus achieving "multi-dimensional data fusion." Figure 3 shows an example of this type of fluid analysis, where a relational filter isolating a particular set of clans interacts simultaneously with the geospatial display showing where these clans reside. An inverse process could also be performed, with a geospatially bounded search being used to isolate the Beni Ruwahah, with the tribes then being highlighted in their relational context.
Figure 3: A relational filter is applied to a selected set of Omani tribes and clans of the Beni Ruwahah confederation (highlighted in yellow on right), and simultaneously highlighted on the geospatial display (highlighted in yellow on left).
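The two-way movement between the link chart and the map can be sketched as a pair of lookups over a shared set of entities. The following is a minimal sketch under stated assumptions: the `Village` record, the clan names, and the coordinates are invented for illustration and do not reflect the actual distribution of the Beni Ruwahah or the data behind Figure 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Village:
    name: str
    lat: float
    lon: float
    clan: str


# Hypothetical relational data: clan -> confederation it belongs to.
CLAN_TO_CONFEDERATION = {
    "Clan A": "Beni Ruwahah",
    "Clan B": "Beni Ruwahah",
    "Clan C": "Another confederation",
}

# Hypothetical geospatial data: villages tagged with the resident clan.
VILLAGES = [
    Village("Village 1", 23.10, 57.90, "Clan A"),
    Village("Village 2", 23.20, 58.00, "Clan B"),
    Village("Village 3", 22.50, 58.80, "Clan C"),
]


def villages_of(confederation):
    """Relational filter -> map: villages to highlight for a confederation."""
    clans = {c for c, conf in CLAN_TO_CONFEDERATION.items() if conf == confederation}
    return [v.name for v in VILLAGES if v.clan in clans]


def confederations_in_box(south, west, north, east):
    """Map selection -> link chart: confederations present inside a bounding box."""
    found = {
        CLAN_TO_CONFEDERATION[v.clan]
        for v in VILLAGES
        if south <= v.lat <= north and west <= v.lon <= east
    }
    return sorted(found)


highlighted = villages_of("Beni Ruwahah")
selected = confederations_in_box(23.0, 57.5, 23.5, 58.5)
```

The point of the sketch is that both functions read the same records: neither view is a copy layered on the other, so a selection made in one is immediately meaningful in the other.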
An intuitive concern with the fusion of these two data types stems from the fact that our basic unit of relational analysis is generally the mobile (and often highly elusive) individual, while our unit of geospatial analysis is the fixed point in space at a given time (i.e., place), and yet this focus on the individual is an integral part of our evolving COIN doctrine. "Man-hunting" expert John Dodson (2006) aptly captures both the importance of this level of analysis and the need for further analytical capabilities at the level of individual:
The fluid, dynamic and surreptitious nature of the HVI [High Value Individual targeting] differs significantly from the monolithic nation state threat most analysts were trained for and previously experienced working. The HVI requires intelligence collection and analysis at the lowest level, the individual. This granularity is outside the norm for most current collection systems.
We have come to the crux of the matter: how can we preserve the individual as the root of analysis while marrying it to fixed points in space?
Solving the Problem of "Fixing" an Individual in Space
While the concern regarding the incompatibility of these two data types is certainly justified (people tend to move around), there are two reasons why it need not derail the analytical windfalls of "multi-dimensional" data fusion. The first is simply that, while individuals do move, on the aggregate level these swarms of individuals tend toward relatively static spatial distributions, especially in the case of traditional societies. Furthermore, in areas where drastic demographic change is occurring, the model may be updated periodically, and in fact this change may be tracked through time to add a third "dimension" to the analysis. The second is that, while the individual is certainly not "fixed" in space, individuals do tend to orbit certain set "anchor points," which may be used to infer position. In the case of "man-hunting" operations these "anchor points" limit "the vast majority of the HVI’s hiding locations ... because of an unwillingness to depart from his normative behavior." An example from Quantum Chemistry provides a useful metaphor on how this connection may be operationalized.
An electron is an ethereal, intangible thing, difficult to detect directly, and not unlike Mao’s guerillas, being "of the people as a fish is of the sea." However, Molecular Orbital Theory provides us with a method for divining "the probability of finding an electron in any specific region [of a molecule]." It is nearly impossible to identify the exact location of an electron orbiting around the perimeter of an atom at any given time, but we can give a probability that the electron will be within specific areas near the atom’s positive core. Likewise, persons tend to orbit certain familiar points locked in space, and although we may be unable to geo-locate persons with any real meaning (the point would constantly be changing!), we can define the "orbital patterns" for the individual and give an indication of the places that the individual is likely to be near. This is done by linking the individual to points in space that he/she is in turn related to (school, apartment, employer, favorite night club, etc.), all of which can be plotted in space. Based upon this cluster of points we may gain an idea of the "orbital sphere" of the person, and take action based upon this educated guess as to his or her whereabouts. The result is a fused relational and geospatial method for locating the individual.
Figure 4: Orbital diagram showing the probable locations of particular electrons orbiting the positively charged nucleus of an atom. A useful metaphor for the habits of persons to "orbit" spatially fixed points (home, work, school) in daily life.
Figure 4 shows the probable positions, or orbitals, of an electron in orbit around a nucleus. To illustrate the metaphor, imagine that the electron represents an individual named "Bob," with each orbital representing Bob’s probability of being near a particular point in space. For example, the 1s orbital might represent Bob’s home, where he spends most of his time; the 2s orbital, a two mile radius around his home; the 2p orbitals, Bob’s six favorite restaurants; and finally the 3s orbital, a fifty mile radius around Bob’s home. Although we cannot be certain of where Bob is at any specific time without watching him constantly (impractical when scaled up), based on Bob’s daily routine we can estimate that at any given time, we are thirty percent likely to find Bob in or around his home (1s orbital), fifty percent likely to find Bob within a two mile radius of his home, seventy percent likely to find Bob "orbiting" to, from, or around a favorite restaurant (2p orbitals), and ninety-five percent likely to find Bob somewhere within a fifty mile radius of his home. The point of this metaphor is not that we need to closely study the routines of the individuals in our social system, but rather to show how individuals may theoretically be "anchored" geospatially, based upon their relationships to points in space. This concept of relationally "tethering" the individual to points in space provides the critical linkage between our geospatial unit of analysis, the fixed point in space and time, and our basic relational unit of analysis, the person.
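One way to make the "tethering" idea concrete is to attach a weight to each anchor point and ask how wide a circle around the home must be drawn to capture a given share of that weight. The sketch below is a crude, discrete stand-in for the continuous probabilities of the orbital metaphor. The anchor labels, abstract coordinates, and weights are invented, and the function names are hypothetical.

```python
import math

# Hypothetical anchor points for one individual: (label, lat, lon, weight).
# The weight is an assumed relative share of time spent at each point.
ANCHORS = [
    ("home", 0.00, 0.00, 6),
    ("workplace", 0.00, 0.05, 3),
    ("restaurant", 0.02, 0.00, 1),
]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def presence_shares(anchors):
    """Normalize the weights into shares that sum to 1."""
    total = sum(w for _, _, _, w in anchors)
    return {label: w / total for label, _, _, w in anchors}


def radius_for_coverage(anchors, center_label, coverage):
    """Smallest radius (km) around one anchor that encloses anchors
    holding at least `coverage` of the total weight."""
    center = next(a for a in anchors if a[0] == center_label)
    total = sum(w for _, _, _, w in anchors)
    ranked = sorted(
        (haversine_km(center[1], center[2], lat, lon), w)
        for _, lat, lon, w in anchors
    )
    running = 0
    for distance, weight in ranked:
        running += weight
        if running / total >= coverage:
            return distance
    return ranked[-1][0]  # coverage above 1.0: fall back to the widest radius


shares = presence_shares(ANCHORS)
radius_70 = radius_for_coverage(ANCHORS, "home", 0.7)
```

In practice the weights would come from whatever relational evidence ties the person to each place; the sketch only shows that, once such ties exist, a mobile individual can be given a bounded and rankable geospatial footprint.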
The Utility of "Multi-Dimensional" Data Fusion
Suppose that the analyst is interested in whether IED attacks occurring in an Afghan valley are related to a blood feud declared against coalition forces. Since the valley does not have a particular address and corresponding boundary, the attacks occurring in that valley must be separated from others in the sector using a geographic filter, i.e., drawing a circle around the perimeter of the valley using a mapping tool. Now that the attacks have been isolated geospatially, we could apply a data layering technique by simply overlaying a map of tribal boundaries onto the map of IED attacks, and use any one of several methods to sleuth a significant relationship between the two layers. Although this tried and true method of data layering may offer valuable insights into the relationship between attacks and tribes in the valley, its use is limited by its lack of truly integrated relational information, as shown in a second example.
Now consider the same question of relationship between IEDs and tribes using a fused dataset. Although several paths are available to detect a relationship, we would probably begin with the isolation of the IED "events" using a geospatially bounded search, but at this point we might slip into the relational realm, and display the "IED Events" as a series of nodes on a link chart. Perhaps then we would link each "IED Event" to the persons involved in the attack by things like arrests, license plate numbers, fingerprints, etc., then identify the tribal affiliations of these persons using their surnames, and finally determine the tribal leaders that likely authorized the attacks. We may then return to the geospatial interface, to plot the known addresses for each of the tribal leaders likely to have authorized the attacks.
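The path just described (geospatial filter, then relational links, then back to the map) can be sketched as a chain of lookups. Everything in the example is invented for illustration: the event identifiers, the abstract coordinates, the link tables, and the function names. Tribal affiliation is represented as a simple lookup table rather than inferred from surnames, which is a simplification.

```python
import math

# All identifiers and coordinates below are invented for illustration.
EVENTS = {
    "IED-1": (10.010, 10.010),
    "IED-2": (10.020, 10.015),
    "IED-3": (10.500, 10.900),  # outside the valley
}
# Links derived from arrests, fingerprints, license plates, etc.
EVENT_TO_PERSONS = {
    "IED-1": ["Person 1"],
    "IED-2": ["Person 1", "Person 2"],
    "IED-3": ["Person 3"],
}
PERSON_TO_TRIBE = {"Person 1": "Tribe X", "Person 2": "Tribe Y", "Person 3": "Tribe Z"}
TRIBE_TO_LEADER = {"Tribe X": "Leader X", "Tribe Y": "Leader Y", "Tribe Z": "Leader Z"}
LEADER_ADDRESS = {
    "Leader X": (10.05, 10.02),
    "Leader Y": (10.00, 10.05),
    "Leader Z": (10.60, 10.90),
}


def approx_km(lat1, lon1, lat2, lon2):
    """Equirectangular distance; adequate for a valley-sized area."""
    dy = (lat2 - lat1) * 111.32
    dx = (lon2 - lon1) * 111.32 * math.cos(math.radians((lat1 + lat2) / 2))
    return math.hypot(dx, dy)


def leaders_linked_to_events(center, radius_km):
    """Geofilter events, walk the relational links, return leader -> address."""
    in_valley = [
        event
        for event, (lat, lon) in EVENTS.items()
        if approx_km(center[0], center[1], lat, lon) <= radius_km
    ]
    persons = {p for e in in_valley for p in EVENT_TO_PERSONS.get(e, [])}
    tribes = {PERSON_TO_TRIBE[p] for p in persons if p in PERSON_TO_TRIBE}
    leaders = {TRIBE_TO_LEADER[t] for t in tribes if t in TRIBE_TO_LEADER}
    return {
        leader: LEADER_ADDRESS[leader]
        for leader in sorted(leaders)
        if leader in LEADER_ADDRESS
    }


result = leaders_linked_to_events(center=(10.015, 10.012), radius_km=5.0)
```

Each hop in the chain is an inference with its own uncertainty, so the output is best read as a list of leads to examine, not as a finding about who authorized anything.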
Although the first example yielded the relationship between the tribal system and a particular set of IED attacks using geospatial analysis, the layering method does not harness the elegance of the relational data, limits the analyst to system-wide trends, and leads to an analytical dead-end. In contrast, consider the use of the fused dataset. In the second example the analyst is free to sift through relationships between the attacks and their environment geospatially (using a tribal area overlay for example), then examine the filtered data relationally (perhaps through arrests), in order to leverage linking data like vehicle identifiers or fingerprints to determine the persons or organizations involved in the attacks, and how they may be located and affected. The true value-added of data fusion is this ability to present entities (events, persons, organizations, places, etc.) within their spatial, relational, or even temporal contexts, and to move between these data domains seamlessly. This analytical flexibility creates the type of virtuous circle of knowledge refinement that is one of the key concepts of Visual Analytics (see Figure 5).
Figure 5: The sense-making loop for Visual Analytics
Consider the example of the infamous Jemaah Islamiyah bomb-maker and Malaysian terrorist leader Noordin Mohammed Top. Suppose that the analyst is tasked with determining possible locations where Top may be hiding on the island of Java based upon Dodson’s man-hunting principles of familiarity, survivability, safety, and vulnerability, which effectively limit the target’s location based upon past experience and practical constraints. The analyst will likely begin by plotting Noordin’s past and present relationships to persons or organizations onto the link-chart. Relationships may then be presented geospatially by showing the locations of related individuals on a map based upon address information, thus making the relational information actionable. Should the analyst opt to continue the investigation further, by perhaps delving deeper into Noordin’s friendship ties in the Jaipur area, these ties may be isolated using a geo-filter and reexamined more closely relationally. The important element of these examples is that data fusion empowers the analyst with the freedom to move between data dimensions and refine knowledge freely.
In addition to the geospatial and relational data dimensions, mention should also be given to the fusion of temporal data. By fusing this third data dimension, the researcher may not only leverage the previously discussed power of "bidimensional" data fusion (geospatial and relational), but also observe trends in the social system over time. To give another example, recall the case of Noordin Top and suppose that, in addition to isolating his relationships and plotting them geospatially, we are also able to display changes in these relationships and other events (like movements) over time. Perhaps we will notice that Noordin’s past movements tend to follow a seasonal pattern, or that police crackdowns on his followers tend to precipitate Top’s relocation. These types of observations and complex analyses are only possible through the integration of temporal attributes within the already fused geo-relational system, and are likely to present the next frontier in data fusion.
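A temporal observation of the kind suggested above, that crackdowns tend to be followed by relocation, can be checked with a simple windowed comparison of two event lists. The dates below are invented and carry no factual claim about any real individual; the 14-day window and the function name are likewise assumptions made for the example.

```python
from datetime import date, timedelta

# Invented dates, for illustration only.
CRACKDOWNS = [date(2001, 1, 10), date(2001, 3, 5), date(2001, 6, 1)]
RELOCATIONS = [date(2001, 1, 18), date(2001, 3, 9), date(2001, 9, 20)]


def relocations_after(crackdowns, relocations, window_days=14):
    """Pair each crackdown with the first relocation within window_days after it."""
    window = timedelta(days=window_days)
    pairs = []
    for crackdown in crackdowns:
        followers = [
            r for r in relocations if timedelta(0) <= r - crackdown <= window
        ]
        if followers:
            pairs.append((crackdown, min(followers)))
    return pairs


pairs = relocations_after(CRACKDOWNS, RELOCATIONS)
share_followed = len(pairs) / len(CRACKDOWNS)
```

A high share of crackdowns followed by a move is only a prompt for closer analysis; with so few events, coincidence cannot be ruled out, and the pattern says nothing by itself about cause.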
The Suitability of Tribes for Fusion Analysis
Now that the value of and need for "multi-dimensional" data fusion have been established, I will discuss why this method of social modeling is especially appropriate for use in understanding tribal social systems. An important element of this suitability lies in what French sociologist Emile Durkheim refers to as "Mechanical Solidarity." Grahame Thompson (2003) explains:
Mechanical solidarity typifies a segmentary community, often small in scale, in which there are clearly separated roles between its members and clear standards by which their behavior can be assessed. This produces collective conscience in a "mechanical" way as the members of the community interact along these strictly demarcated lines. People "know their place" and then act accordingly.
Now contrast this idea with the "Organic Solidarity" prevalent in "advanced industrial economies," which
refers to a functionally differentiated society of a more complex character. In this case, solidarity is more difficult to generate. The complexity of the functions in a differentiated society implies a greater variability of social relations, where the social roles members are called upon to play are less clear-cut (and often multiple) and the behavioral norms associated with those roles equally complex.
The Mechanical Solidarity prevalent in tribal social systems in turn reinforces the effects of social "structure" (those factors such as social class, religion, gender, ethnicity, customs, etc., which seem to limit or influence the opportunities that individuals have) while detracting from "agency" (the capacity of individual humans to act independently and to make their own free choices). From the point of view of the cultural analyst, this situation effectively simplifies the relational system by increasing the importance of social hierarchy and often disaggregating segmentary (tribal) communities in a system (province) socially as well as spatially (i.e., tribesmen tend to associate with other tribe members and live within tribal territories). The same reasons that author and historian Steven Pressfield gives for tribes being a "natural-born warfighting unit," including "obedience, respect for elders, hostility to all outsiders, loyalty, fidelity, the obligation for revenge and blood payback," also effectively simplify the social system through increased mechanical solidarity.
Representing Tribal Systems: Points Not Lines
In order to exploit the theoretical suitability of tribal systems for fused analysis we must first identify effective ways of representing tribal territories. There are several ways to determine tribal authority within a given territory, but among the simplest is to represent the presence of tribal elements by plotting individual data-points geospatially, and then relate these points to one another. Through the use of source language ethnographic or historical documents, survey/census data, or personal interviews, the researcher may assemble a compilation of data points corresponding to tribally linked persons, villages, companies, neighborhoods, etc. When displayed collectively geospatially, these points effectively show the distribution of the tribal group as the sum of its individual "parts."
The advantage of this method is that it allows a more realistic and meaningful geospatial representation based upon the most granular unit of analysis available. Ideally, this unit would be the residences and other geospatial "anchor points" of individual tribesmen, but where this data is unavailable, tribal identification to the village level is still much more meaningful than the traditional boundary line method. Instead of presenting a tribe geospatially as a polygon, the tribe would be shown as clusters of points among populated areas throughout the map, with points corresponding to villages, valleys, or neighborhoods inhabited or controlled by members of the tribe. This form of presentation allows for the realistic blending of tribal boundaries as several tribes may be present in a particular urban center or grazing area, and drift within these boundaries may be tracked incrementally over time, as demographics change.
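The point-based representation lends itself to a very simple data structure: a list of tribally linked points, each tagged with a place and a tribe, from which the mix at each place can be computed. The sketch below uses invented place and tribe names, and the function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical point data: one row per tribally linked entity
# (residence, shop, etc.), given as (place, tribe).
POINTS = [
    ("Village 1", "Tribe A"),
    ("Village 1", "Tribe A"),
    ("Village 1", "Tribe B"),
    ("Village 2", "Tribe B"),
    ("Village 2", "Tribe B"),
    ("Market town", "Tribe A"),
    ("Market town", "Tribe B"),
    ("Market town", "Tribe C"),
]


def tribal_mix(points):
    """Return {place: {tribe: share of that place's points}}."""
    counts = defaultdict(Counter)
    for place, tribe in points:
        counts[place][tribe] += 1
    return {
        place: {tribe: n / sum(c.values()) for tribe, n in c.items()}
        for place, c in counts.items()
    }


def shared_places(points):
    """Places where more than one tribe is present (the blended boundary)."""
    return sorted(p for p, mix in tribal_mix(points).items() if len(mix) > 1)


mix = tribal_mix(POINTS)
blended = shared_places(POINTS)
```

A polygon would force each place into a single tribe's territory; the shares computed here preserve the overlap, and recomputing them from dated snapshots of the points would show the incremental drift described above.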
The point of this essay has been to outline ways that data fusion may be achieved, and how it can dramatically enhance the analytical capabilities of cultural analysts, especially in tribal social systems. By using Visual Analytics theory and technology to conduct the labor intensive aspects of data fusion, and accepting the theoretical justification of fusion between the geospatial, relational, and temporal data dimensions, the field of Cultural Analysis seems poised to make a major contribution to COIN doctrine. The software developers racing to fill this technological need include I2, Access Pro, and a company called Palantir Technologies, which has proven especially well suited for data fusion during the author’s ongoing analysis of the Omani tribal system, and is discussed in detail by Hartunian and Germann (2008).
However, these software advances must also be accompanied by two caveats. The first is that, no matter how powerful or versatile the technology, a deep understanding of the social system will always depend on "expert opinion familiar with the culture, indoctrination procedures, and institutional foundations" that lend significance to relationships, as well as the skill, intuition, and innovation of the analyst/collector. This is especially true in the case of tribal social systems, where linguistic skill, cultural knowledge, and analytical experience are not a luxury, but a requirement. Without the skills required to accomplish tasks like embedding source language information in the analysis, proper transliteration to English, or understanding the nuance of complex tribal systems, the analysis is best left undone. With this in mind, we must also refrain from attempting to reinvent the wheel, by tapping existing sources of social data ranging from deployed company intelligence officers to Civil Affairs teams operating outside the combat zone. While the need for effective Human Terrain Analysis is especially acute in the combat zone, as a colleague put it, "building these models in the war zone is like trying to build a bike while running beside it." Just as we have accumulated a wealth of geospatial data for use in any future deployment throughout the globe, we must have the strategic foresight to match and fuse this information with its relational context. In the end, by harnessing technology to fuse geospatial, relational, and temporal data in a meaningful way, we may drastically enhance the field of cultural analysis, and further empower the warfighter in his mission of defeating contemporary and future insurgency.
1. Montgomery McFate, "The Military Utility of Understanding Adversary Culture." Joint Forces Quarterly 38 (Third Quarter 2005): 42-48.
2. Encarta World English Dictionary, North American Edition, 2009.
3. Daniel Keim, Gennady Andrienko, Jean-Daniel Fekete, Carsten Gorg, Jorn Kohlhammer, and Guy Melancon, "Visual Analytics: Definition, Process, and Challenges," in Information Visualization: Human-Centered Issues and Perspectives (Berlin, Heidelberg: Springer, 2008), 154-175.
4. Walter L. Perry and John Gordon IV, Analytic Support to Intelligence in Counterinsurgencies (Santa Monica: RAND National Defense Research Institute, 2008).
5. Figure based upon unpublished research of Omani tribes conducted by the author, and generated using Palantir Analytical Platform.
6. Keim et al., Op. Cit., 154-175.
7. Keim et al., Op. Cit., 154-175.
8. J. Marr, B. G. Johncushing, and R. Thompson, "Human Terrain Mapping: A Critical First Step to Winning the COIN Fight," Military Review 88 (March/April 2008), 37-51.
9. Eric Hartunian and Wade A. Germann, "Data Integration to Explore the Dynamics of Conflict: A Preliminary Study," Master's Thesis, Naval Postgraduate School, Monterey, CA, December 2008.
10. Figure based upon unpublished research of Omani tribes conducted by the author, and generated using Palantir Analytical Platform.
11. John R. Dodson, "Man-Hunting, Nexus Topography, Dark Networks, and Small Worlds," IO Sphere (Winter 2006).
13. John Daintith, ed., Oxford Dictionary of Chemistry (New York: Oxford University Press, 2004).
14. Keim et al., Op. Cit.
15. Dodson, Op. Cit.
16. Grahame F. Thompson, Between Hierarchies and Markets: The Logic and Limits of Network Forms of Organization (Oxford: Oxford University Press, 2003).
17. Peter L. Berger and Thomas Luckmann, The Social Construction of Reality: A Treatise in the Sociology of Knowledge (Garden City, NY: Anchor Books, 1966).
18. Steven Pressfield, "How to Win in Afghanistan," blog.stevenpressfield.com, 2009.
19. Hartunian and Germann, Op. Cit.
20. Todd J. Hamill, Dr. Richard F. Drecko, Dr. James W. Chrissis, and Dr. Robert F. Mills, "Analysis of Layered Social Networks," IO Sphere (Winter 2008).