Analysis


After compiling the dataset, the testing approach is straightforward. To predict the occurrence of successful coercion, I first run a stepwise analysis with mixed elimination and a convergence measure, which controls for the independent variables by separating their interrelations from their effects. I then run a logistic regression to determine their significance. For the reader unfamiliar with this method of statistical regression, a brief explanation of the method and of its value over others that have been used follows.

Logistic regression predicts the probability of an event occurring, given other related factors, with the equation P = 1 / [1 + e^-(a + bX)], where a is the intercept, X is an independent variable, and b is the coefficient estimated for that variable. With several independent variables, the exponent becomes a + b1X1 + b2X2 + ... + bkXk. In this case, I predict the occurrence of a successful coercion by the United States for the years 1800-2000 given the independent variables listed previously.
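As a minimal sketch of the equation above, the following Python function evaluates P for a given intercept, set of coefficients, and set of variable values. The function name and every number in it are assumptions chosen for illustration; none are estimates from this study.

```python
import math

def logistic_probability(intercept, coefficients, values):
    """Return P = 1 / (1 + e^-(a + b1*x1 + ... + bk*xk))."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical inputs for illustration only; not results from the data.
# With a = 0 and b = 0 the exponent is zero, so P is exactly one half.
p_zero = logistic_probability(0.0, [0.0], [1.0])

# Two hypothetical dichotomous variables, the first present (1) and the
# second absent (0): the exponent is -0.5 + 1.2*1 + (-0.8)*0 = 0.7.
p_example = logistic_probability(-0.5, [1.2, -0.8], [1.0, 0.0])
```

Whatever the inputs, the result stays between 0 and 1, which is what allows it to be read as a probability of success.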

To calculate the effects of the dichotomous variables, a form of model parameterization is employed: dummy variables are assigned to independent variables that take on categorical rather than ordinal values. The program runs an initial stepwise regression to eliminate variables that are clearly insignificant or strongly interrelated with other variables, leaving a model that contains only the relevant variables. Each of these variables and/or values is added to the model, and the output is compared to the intercept[1]. After this first round of results, the least significant variables with the least convergence (modular interdependence) are eliminated, and the regression is run again until only significant independent and interdependent variables remain. From these variables one can select key variables and produce confidence intervals around the size of the effect each key variable has on the outcome event[2].
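The repeated eliminate-and-refit cycle can be sketched in Python as follows. This is a simplified illustration that shows only the backward half of the procedure and omits the convergence measure; the function names, the variable names, and the p-values are all hypothetical, and the stand-in `fake_fit` function takes the place of an actual logistic regression fit.

```python
def backward_eliminate(variables, fit_p_values, threshold=0.20):
    """Drop the least significant variable and refit until all remaining pass.

    fit_p_values is any callable that takes a list of variable names and
    returns a dict mapping each name to its p-value in the refitted model.
    """
    kept = list(variables)
    while kept:
        p_values = fit_p_values(kept)
        worst = max(kept, key=lambda name: p_values[name])
        if p_values[worst] <= threshold:
            break  # every remaining variable meets the threshold
        kept.remove(worst)
    return kept

# Stand-in for a real model fit: fixed, made-up p-values (assumption).
def fake_fit(names):
    made_up = {"var_a": 0.03, "var_b": 0.45, "var_c": 0.18, "var_d": 0.61}
    return {name: made_up[name] for name in names}

selected = backward_eliminate(["var_a", "var_b", "var_c", "var_d"], fake_fit)
```

In a real analysis the p-values would change after each refit, which is why the loop refits before every elimination rather than ranking the variables once.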

Variables will be considered significant at the .2 level (p ≤ .20), a less restrictive threshold that is used because of the varied nature of the social sciences. Variables found not to be significant cannot be said not to “matter,” but they lose the simplified predictive status outlined in the introduction to this design. Variables found to be significant will be carefully examined to rule out cross-variable contamination and will be redefined as appropriate. Significance indicates that the variable in question makes some contribution to the success or failure of coercive diplomacy.
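To make the .2 threshold concrete, the sketch below computes a two-sided p-value from a coefficient and its standard error and compares it with the cutoff. It assumes significance is judged with a Wald z-test, which the text does not specify; the function names and the example numbers are likewise hypothetical.

```python
import math

def wald_p_value(coefficient, standard_error):
    """Two-sided p-value for z = b / SE under a standard normal reference."""
    z = coefficient / standard_error
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))

def is_significant(coefficient, standard_error, alpha=0.20):
    """Apply the .2 significance level described in the text."""
    return wald_p_value(coefficient, standard_error) <= alpha

# Hypothetical coefficient/standard-error pairs, for illustration only.
weak = is_significant(1.0, 1.0)     # z = 1.0, p is roughly 0.32
passing = is_significant(1.5, 1.0)  # z = 1.5, p is roughly 0.13
```

Under the conventional .05 level neither hypothetical variable would pass; under the .2 level the second one does, which illustrates how the looser threshold retains more variables.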

A total of four data sets are considered: 1) instances of coercive diplomacy employed by the United States only; 2) instances of coercive diplomacy employed by the United States, the USSR, the United Kingdom and Germany; 3) instances of coercive diplomacy employed by the United States, the USSR and the United Kingdom (excluding Germany); and 4) instances of coercive diplomacy employed by the United States (1900-2000), the USSR (1940-2000), and the United Kingdom (1800-1940).

The first data set allows us to test the research question: are there factors that allow a state to predetermine success when employing coercive diplomacy? It falls within the scope of the paper (the United States from 1800 to 2000) and includes 123 cases with a 60/40 split (failure/success) in the outcome.

The second data set incorporates two other world hegemons and a regional hegemon. The model created for the United States will be applied to this second data set to see whether it has cross-country applicability. The second data set will then be analyzed independently, and a model will be built that maximizes its combined predictive power.

The third data set removes the regional hegemon and tests only the world hegemons. We again test the United States' model for applicability and then generate a unique model to maximize combined predictive power. The fourth data set follows the approach of the third but limits the years analyzed for each country to the period in which it was considered a global hegemon. The same model test and model generation are applied to the fourth data set.






[1] The intercept is a zero-value relation to the outcome in question. It is the same concept as an intercept on a graph, where one of the two values in the Cartesian coordinate pair always equals zero.

[2] No R^2 value is reported for the logistic regression. R^2 is based on the concept of sums of squares and is employed primarily in linear models. Because the logistic model is non-linear, the theoretical conditions that sums of squares require are not met in the parametric logistic model, and non-parametric models (with the exception of a simple chi-square) are beyond the scope of my training. In addition, there are no exact tests available for this type of procedure.