Using geographical data to explore the history of militarized inter-state disputes throughout the world.

**NOTE: This page may take several moments to load and may become unresponsive while rendering. Please be patient.**

This project explores the relationship between power (the resources at a country's disposal) and fatalities suffered during militarized inter-state disputes. All data comes from the Correlates of War Project datasets.

All source code for this project is available here.

As nation-states obtain more power (money, population, technological advantages, competitive advantages resulting in favorable international trade, etc.), they have more incentive to resolve conflicts peacefully, as they have more to lose by going to war. When such powerful nation-states do choose to enter a militarized inter-state dispute, they have certain advantages over less powerful nations, such as significant investments in military capabilities, or political power (built on trading partnerships and treaty alliances) that might lead to a diplomatic resolution. Powerful countries should have access to advanced military capabilities (drones, missiles, aircraft, intelligence, assistance from treaty allies, etc.) that less powerful states might lack. With this asymmetry in mind, we can begin to form a hypothesis regarding fatalities suffered during militarized disputes:

Countries with increasing levels of resources experience decreasing levels of fatalities per war.

From Stratasan blog post

### Why this is a bad visual

This visual seems to have a good map projection and dataset, but the implementation is poor. The legend gives no indication of the units used for population. At first, I thought the units were millions of people; however, after some quick scrutiny, that is clearly not the case. The use of circle overlays is not bad in itself, but the way this visual has moved the overlays aside to allow labeling is problematic. If the intent is to show population density, then keeping the circle overlays geographically accurate is key. Moving them aside only distorts the data being displayed, while not adding much value to the labeling. Instead, the circles should have been kept centered over the city they represent, with labeling done inside the circle when needed. A different scale for the circles might also be beneficial (a logarithmic scale, perhaps?). These changes would have shown population density per geographic region more clearly.

See also: Gapminder's Wealth and Health Of Nations and Mike Bostock's Recreation in D3

Mike Bostock's Airport Example, showing Voronoi overlay

These contrast the bad visual example to show how circle overlays can be used effectively. A small amount of interaction resolves the problem of labeling. Animation is used to introduce the dimension of time to the visualizations. The Gapminder visualization contains a stunning number of indicators that can be visualized, resulting in a great exploratory visual analysis tool.

The airport example again shows circle overlays used effectively, in contrast with the poor implementation of circle overlays in the Bad Visual example. This visual also shows how a hidden Voronoi overlay can be used to select an observation on mouseover. This technique, especially effective for small observation points on a circle overlay, is commonly used in D3 visuals to add an interactive component.
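
As a rough illustration of why the hidden Voronoi overlay works: a mouse position lies inside a point's Voronoi cell exactly when that point is the nearest one, so the overlay behaves like a nearest-neighbor lookup. The sketch below shows that equivalent lookup in plain Python; the function name and the sample coordinates are hypothetical, and this is not how D3 computes the cells.

```python
import math


def nearest_point(points, x, y):
    """Return the index of the point closest to (x, y).

    Being inside a point's Voronoi cell means that point is the nearest
    one, so this lookup selects the same observation the overlay would.
    """
    return min(
        range(len(points)),
        key=lambda i: math.hypot(points[i][0] - x, points[i][1] - y),
    )


# Hypothetical screen positions of three small markers
markers = [(0, 0), (10, 0), (5, 8)]

# The cursor at (9, 1) is closest to the second marker (index 1)
selected = nearest_point(markers, 9, 1)
```

The user never has to hit the small marker itself; hovering anywhere in its cell is enough.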

This visual makes use of the Correlates of War Militarized Interstate Dispute Dataset v3.10 (MID) and the Militarized Interstate Dispute Location Dataset v1.1 (MIDLOC). The MID dataset contains data on inter-state disputes from 1816-2001, including the level of battle-related fatalities suffered by all states (combined). The levels are None, 1-25, 26-100, 101-250, 251-500, 501-999 and more than 999 deaths. Precise fatality data is sparse for this dataset and therefore was not used. The MIDLOC dataset contains the precise longitude and latitude of the onset of most disputes in the MID dataset (location data is available for 2240 of the 2332 disputes contained in the MID dataset). Observations lacking location data and/or fatality data are excluded.
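
To make the fatality levels concrete, here is a minimal sketch that maps a death count onto the seven buckets listed above. The integer codes 0 through 6 are an assumption made for illustration; the dataset's own coding should be checked before relying on them.

```python
import bisect

# Upper bound of each bucket except the last, in the order listed in the text
UPPER_BOUNDS = [0, 25, 100, 250, 500, 999]
LABELS = ["None", "1-25", "26-100", "101-250", "251-500", "501-999", "more than 999"]


def fatality_level(deaths):
    """Return an ordinal level (0-6, assumed coding) for a death count."""
    if deaths < 0:
        raise ValueError("deaths must be non-negative")
    # Index of the first bucket whose upper bound is >= deaths
    return bisect.bisect_left(UPPER_BOUNDS, deaths)


# Two very different disputes land in the same top bucket
same_bucket = fatality_level(1_000) == fatality_level(1_000_000)
```

The last line shows the cost of a bucketed scale: every dispute above 999 deaths is indistinguishable from the others in the top level.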

This visual uses a heatmap overlay to highlight battle-related fatality levels throughout the world. The value of each pixel displayed on the map is determined by its distance to the points that hold data (in this case, the fatality level of each dispute): \(v=\sum_{i}\left(f_{i} - s \cdot d_{i}^{w}\right)\) where \(v\) = value of the pixel, \(f_i\) = fatality level for observation \(i\), \(d_i\) = distance from observation \(i\), \(s\) = 0.1, and \(w\) = 3. The color of the pixel is then determined from the value \(v\) by a color scheme function.
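
The pixel formula can be sketched in a few lines of Python. This is an illustration of the formula as stated, not the heatmap library's code. One assumption is added and flagged in the comments: a point's contribution is clamped at zero, so distant observations do not subtract from a pixel. Function and variable names are hypothetical.

```python
import math

S = 0.1  # s in the formula
W = 3    # w in the formula


def pixel_value(px, py, observations, s=S, w=W):
    """Sum the contribution of every observation to the pixel (px, py).

    observations: iterable of (x, y, fatality_level) in pixel coordinates.
    """
    total = 0.0
    for ox, oy, level in observations:
        d = math.hypot(px - ox, py - oy)
        # Assumption: clamp at zero so far-away points contribute nothing
        total += max(0.0, level - s * d ** w)
    return total


# A single dispute at the top fatality level (6), located at the origin
obs = [(0, 0, 6)]
at_point = pixel_value(0, 0, obs)   # full value at the observation itself
nearby = pixel_value(2, 0, obs)     # 6 - 0.1 * 2**3
far_away = pixel_value(10, 0, obs)  # clamped to zero
```

With \(s\) = 0.1 and \(w\) = 3, a level-6 observation stops contributing at a distance of roughly 3.9 pixels, which is why the hot spots stay tightly localized.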

Map annotations are plotted for each dispute showing summary information. An annotation clustering library is used to help manage overlapping map annotations.

This visual serves as a good exploratory visual analysis tool. The map is interactive and allows the user to zoom/scroll and view summary information for each dispute. From the initial full world view, we can see distinct "hot spots" that overshadow other areas of the world: Central America, the Middle East, and the Korean peninsula. This is tentatively consistent with our hypothesis: the countries in these areas have significantly less "power" than those in other regions of the world. This analysis is weak, however: we have no measure of "power". Also, as mentioned above, the fatality scale tops out at the "more than 999" level, so in the heatmap an observation with 1,000,000 fatalities has the same influence as one with 1,000. Finally, MID v3.10 does not allocate fatalities per state, only per dispute. It is clear, therefore, that the current data is insufficient for our analysis.

This visual was created using the following open-source libraries:

- Leaflet.js - a JavaScript library for interactive maps
- heatcanvas - a JavaScript library for creating pixel based heatmaps with html5 canvas.
- Leaflet.markercluster - provides beautiful animated marker clustering functionality for Leaflet

This visual makes use of v4.0 of the Correlates of War inter-state war data. Unlike v3.10 (used in Visual #1), this dataset contains the number of battle-related fatalities per country for interstate militarized disputes from 1816-2007. We also introduce a measure of the "power" of a country: the National Material Capabilities Index. This index is a composite of six variables for each country: iron and steel production, military expenditures, military personnel, energy consumption, total population and urban population. This index will serve as a proxy for the power of a country for the remainder of our analysis.
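
The text says only that the index is a composite of six variables. A common way to build such a composite, and the way the index is usually described, is to take each country's share of the world total for every component and average the shares. The sketch below assumes that construction; the field names and numbers are hypothetical, and the published index values should be used for any real analysis.

```python
COMPONENTS = (
    "iron_steel",
    "military_expenditure",
    "military_personnel",
    "energy",
    "total_population",
    "urban_population",
)


def capabilities_index(countries):
    """Average each country's share of the world total across components.

    countries: dict mapping a country name to a dict of component values
    for a single year. Assumed construction, not taken from the source.
    """
    totals = {c: sum(v[c] for v in countries.values()) for c in COMPONENTS}
    scores = {}
    for name, values in countries.items():
        # Skip components with a zero world total to avoid dividing by zero
        shares = [values[c] / totals[c] for c in COMPONENTS if totals[c] > 0]
        scores[name] = sum(shares) / len(shares) if shares else 0.0
    return scores


# Two made-up countries: A has three times B's resources on every component
world = {
    "A": {c: 3.0 for c in COMPONENTS},
    "B": {c: 1.0 for c in COMPONENTS},
}
scores = capabilities_index(world)
```

Under this construction the scores of all countries in a given year sum to 1, so a score reads as a share of world capabilities.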

This visual consists of an interactive map with a colored overlay corresponding to one of three variables: the National Capabilities Index score, average fatalities per war, or average fatalities per war per military personnel. The crossfilter box for each variable shows the distribution of the variable among the observations and also allows the user to filter the observations by selecting a range for each variable. This provides an interactive tool to explore our hypothesis: by selecting the lower range of the National Capabilities Index in its crossfilter box, we should expect the remaining observations to have relatively high fatalities. Conversely, by filtering at the high end of average fatalities, we should expect to see countries at the low end of the National Capabilities Index distribution. And indeed, this filtering shows some evidence that supports our hypothesis.
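
The range filtering described above can be sketched without any library. The example below is not Crossfilter itself; it is a plain-Python stand-in with hypothetical field names and values, using half-open ranges (lower bound included, upper bound excluded).

```python
def filter_by_range(rows, **ranges):
    """Keep rows whose value falls in [low, high) for every named field."""
    def keep(row):
        return all(low <= row[field] < high for field, (low, high) in ranges.items())
    return [row for row in rows if keep(row)]


# Hypothetical per-country averages, for illustration only
rows = [
    {"country": "X", "index": 0.002, "avg_fatalities": 9000},
    {"country": "Y", "index": 0.150, "avg_fatalities": 1200},
    {"country": "Z", "index": 0.004, "avg_fatalities": 300},
]

# Select the low end of the index, then look at the fatalities that remain
low_power = filter_by_range(rows, index=(0.0, 0.01))

# Ranges on several fields combine, as with several crossfilter boxes
low_power_high_loss = filter_by_range(
    rows, index=(0.0, 0.01), avg_fatalities=(1000, float("inf"))
)
```

Combining ranges this way mirrors brushing two boxes at once: only observations inside every selected range stay on the map.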

While this visual provides some evidence consistent with our hypothesis, this method of analysis does not quantify the correlation we are attempting to examine. A different visual analysis technique should be applied to the data. Also problematic is that the time dimension is removed from this analysis: all data has been averaged across observations. Since the size of a country's military and its score on the National Capabilities Index change through time, it would be much more useful to observe the relationship between fatalities and the National Capabilities Index at a specific time, rather than across a time average. We explore such an analysis in the next two visuals.

The following open-source libraries were used:

- Leaflet.js - a JavaScript library for interactive maps
- Crossfilter - fast multidimensional filtering for coordinated views
- D3 - Data Driven Documents
- Sift.js - MongoDB-inspired array filtering

| Regression Summary Stats | |
|---|---|
| p-value | 0.0003 |
| r^2 | 0.0403 |
| Slope | 1742839.37 |
| Intercept | 31011.72 |

This visual makes use of the same data shown in Visual #2, but in the form of a scatter plot. A data point is plotted for each state involved in each dispute. Summary information can be viewed by hovering over a data point.

A quick examination of the regression summary stats table shows the results are inconsistent with our hypothesis: fatalities are positively correlated with the National Capabilities Index score. The regression is statistically significant (p-value less than 0.05); however, very little of the variation in fatalities is explained by the National Capabilities Index (low r-squared statistic). There is a significant flaw in this visual, however: the number of fatalities is not adjusted for the size of the military. It is likely that larger militaries experience greater fatalities per dispute simply because more members of the military are involved in the fighting. Visual #4 takes this into account. It is also important to note that the sizes of the plotted data points (military personnel) show that military personnel is highly correlated with the National Capabilities Index.

This visual was created using D3. Regression analysis was done using Python SciPy.
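
For readers who want to see what the slope, intercept and r^2 in the table mean, here is a minimal ordinary least-squares fit in plain Python. It is an illustrative stand-in, not the script used for this project: `scipy.stats.linregress` reports the same slope and intercept together with a p-value, which this sketch omits. The sample data is made up.

```python
def linear_regression(xs, ys):
    """Fit y = slope * x + intercept by least squares.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Share of the variance in y explained by the fitted line
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Made-up points lying exactly on y = 2x + 1
slope, intercept, r_squared = linear_regression([0, 1, 2, 3], [1, 3, 5, 7])
```

A low r^2 alongside a small p-value, as in the table above, means the trend is unlikely to be zero but explains little of the spread in the data.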

| Regression Summary Stats | |
|---|---|
| p-value | 0.1107 |
| r^2 | 0.0079 |
| Slope | -1.9471 |
| Intercept | 0.4499 |

This visual is similar to Visual #3; however, the number of fatalities is adjusted by dividing by the size of the state's military at the time of the dispute.

We can now see a slight negative correlation between fatalities per military personnel and the National Capabilities Index. However, the regression statistics show this result is not statistically significant. There is also an important problem: by adjusting fatalities by the size of the military, we have introduced cross correlation into our regression. Since the size of the military is highly correlated with the National Capabilities Index (as we saw in Visual #3, and indeed the number of military personnel is a component of the Index), this cross correlation may have contaminated the results. Overall, the results seem to be inconclusive; a better model should be used for this analysis.

Visuals #1 and #2 were both displayed using a Mercator projection. The Mercator projection has a long history (first presented in 1569) and gained initial popularity because of properties that made it particularly convenient for nautical navigation: it represents rhumb lines (lines crossing all meridians of longitude at the same angle, in nautical terms a constant bearing) as straight lines. The Mercator projection significantly distorts areas toward the poles, however.

**Projection formulas**

- \(x = \lambda - \lambda_0\)
- \(y = \ln(\tan(\phi)+\sec(\phi)) \)
- where: \(\phi\) is the latitude and \(\lambda\) is the longitude

- \(\phi = \tan^{-1}(\sinh(y))\)
- \(\lambda = x + \lambda_0 \)

The polar aspect of the Lambert azimuthal equal-area projection resembles the map on the United Nations emblem (which itself uses an azimuthal equidistant projection). The entire Earth can be shown; however, shape distortion becomes severe toward the edges.

- \(x = k^{'}\cos(\phi)\sin(\lambda-\lambda_0)\)
- \(y = k^{'}(\cos(\phi_1)\sin(\phi)-\sin(\phi_1)\cos(\phi)\cos(\lambda-\lambda_0)) \)
- where: \( k^{'}=\sqrt{\frac{2}{(1+\sin(\phi_1)\sin(\phi)+\cos(\phi_1)\cos(\phi)\cos(\lambda-\lambda_0))}}\)

- \(\phi = \sin^{-1}(\cos(c)\sin(\phi_1)+ \frac{(y\sin(c)\cos(\phi_1))}{\rho}) \)
- \(\lambda = \lambda_0+\tan^{-1}(\frac{(x\sin(c))}{(\rho\cos(\phi_1)\cos(c)-y\sin(\phi_1)\sin(c))}) \)
- where \( \rho = \sqrt{x^2+y^2} \)
- and \( c = 2\sin^{-1}(\frac{1}{2}\rho)\)
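
A sketch of the Lambert azimuthal formulas above, assuming radians and a unit sphere. It uses `atan2` in place of the plain arctangent so the recovered longitude lands in the correct quadrant; that substitution and the function names are choices made for this example.

```python
import math


def lambert_forward(lat, lon, lat1=0.0, lon0=0.0):
    """Project (lat, lon) with the projection centered on (lat1, lon0)."""
    dlon = lon - lon0
    # The denominator is zero at the point opposite the center (not handled)
    denom = 1.0 + math.sin(lat1) * math.sin(lat) + math.cos(lat1) * math.cos(lat) * math.cos(dlon)
    k = math.sqrt(2.0 / denom)
    x = k * math.cos(lat) * math.sin(dlon)
    y = k * (math.cos(lat1) * math.sin(lat) - math.sin(lat1) * math.cos(lat) * math.cos(dlon))
    return x, y


def lambert_inverse(x, y, lat1=0.0, lon0=0.0):
    """Recover (lat, lon) from projected x, y."""
    rho = math.hypot(x, y)
    if rho == 0.0:
        return lat1, lon0  # the center maps to itself
    c = 2.0 * math.asin(rho / 2.0)
    lat = math.asin(math.cos(c) * math.sin(lat1) + y * math.sin(c) * math.cos(lat1) / rho)
    lon = lon0 + math.atan2(
        x * math.sin(c),
        rho * math.cos(lat1) * math.cos(c) - y * math.sin(lat1) * math.sin(c),
    )
    return lat, lon


# Round trip with an off-center projection
x, y = lambert_forward(0.5, 0.3, lat1=0.2, lon0=-0.1)
lat_back, lon_back = lambert_inverse(x, y, lat1=0.2, lon0=-0.1)
```

Setting `lat1` to 90 degrees (in radians) gives the polar aspect discussed above.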

The Hammer projection is an extension of the Lambert azimuthal projection that attempts to reduce the distortion toward the outer edges, where angular distortions are particularly bad. This extension involves an intermediate transformation: halving the vertical coordinates and doubling the values of the meridians from the center.

- \(x = \frac{2\sqrt{2}\cos(\phi)\sin(\frac{\lambda}{2})}{\sqrt{1+\cos(\phi)\cos(\frac{\lambda}{2})}}\)
- \( y = \frac{\sqrt 2\sin(\phi)}{\sqrt{1 + \cos(\phi) \cos\left(\frac\lambda 2\right)}}\)
- where: \(\phi\) is the latitude and \(\lambda\) is the longitude

- \(z = \sqrt{1 - \left(\frac1 4 x\right)^2 - \left(\frac1 2 y\right)^2}\)
- \(\lambda = 2 \arctan \left[\frac{zx}{2(2z^2 - 1)}\right]\)
- \(\phi = \arcsin(zy) \)
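
The Hammer formulas can be sketched the same way, again assuming radians and a unit sphere. As in the previous sketches, the function names are illustrative, and `atan2` stands in for the arctangent of the quotient so that the edge of the map, where \(2z^2 - 1\) reaches zero, does not cause a division by zero.

```python
import math


def hammer_forward(lat, lon):
    """Project latitude/longitude (radians) to Hammer x, y."""
    d = math.sqrt(1.0 + math.cos(lat) * math.cos(lon / 2.0))
    x = 2.0 * math.sqrt(2.0) * math.cos(lat) * math.sin(lon / 2.0) / d
    y = math.sqrt(2.0) * math.sin(lat) / d
    return x, y


def hammer_inverse(x, y):
    """Recover latitude/longitude (radians) from Hammer x, y."""
    z = math.sqrt(1.0 - (x / 4.0) ** 2 - (y / 2.0) ** 2)
    lon = 2.0 * math.atan2(z * x, 2.0 * (2.0 * z * z - 1.0))
    lat = math.asin(z * y)
    return lat, lon


# Round trip for an arbitrary point
x, y = hammer_forward(0.6, -1.2)
lat_back, lon_back = hammer_inverse(x, y)
```

The whole globe fits inside an ellipse with semi-axes \(2\sqrt{2}\) and \(\sqrt{2}\), which is where the 2:1 outline of the projection comes from.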

- http://en.wikipedia.org/wiki/Mercator_projection
- http://mathworld.wolfram.com/MercatorProjection.html
- http://www.progonos.com/furuti/MapProj/Normal/CartHow/HowAzEqDA/howAzEqDA.html
- http://lazarus.elte.hu/~guszlev/vet/plane.htm
- http://en.wikipedia.org/wiki/Lambert_azimuthal_equal-area_projection
- http://mathworld.wolfram.com/LambertAzimuthalEqual-AreaProjection.html
- http://mathworld.wolfram.com/Hammer-AitoffEqual-AreaProjection.html
- http://en.wikipedia.org/wiki/Hammer_projection