Steps for the Impatient

  1. Import your calibrated data. Kirq can import plain text (CSV) and Excel.

  2. Specify your outcome in the drop-down box at the upper-left of the program.

  3. Specify your causal conditions by checking off those that you wish to include in your analysis.

  4. Specify the type of analysis (necessity or sufficiency) and the associated parameters in the lower-right tab.

Differences between Kirq and fs/QCA

Kirq and fs/QCA employ the same underlying algorithm and, therefore, will produce the same results. However, there are a few important differences:

Using Kirq

We've designed Kirq to be extremely user-friendly, but some of its features are not immediately obvious:

Introduction to QCA

Qualitative comparative analysis (QCA) is a comparative method used for identifying necessary and/or sufficient conditions. A distinguishing feature of QCA is that it is based on Boolean algebra (the algebra of sets) and, therefore, is not subject to degrees-of-freedom limitations. This means that QCA can be used for small- and medium-N datasets, in addition to large-N analysis.

There are three steps to conducting a QCA: dataset calibration, necessity testing, and sufficiency testing.

Dataset Calibration

The QCA analogue of a variable is a "condition." A condition identifies a set of cases. The process of transforming a conventional variable into a condition is referred to as "calibration." Calibration involves adjusting one's measuring instruments so that they conform to established standards; it makes measures meaningful. Uncalibrated measures permit assessment of the positions of cases relative to one another. Calibrated measures, on the other hand, are directly interpretable. For example, calibration would permit one to classify a country as democratic or autocratic, rather than merely more or less democratic than other countries.

Observations may have full, partial, or no membership in a given condition, with membership scores ranging between 1.0 and 0.0. A score of 1.0 indicates full membership in the condition; for example, a country that fully belongs to the set of democratic countries. A score of 0.0 indicates full non-membership in the condition; for example, a country that is completely out of the set of democratic countries. Scores between 0.0 and 1.0 indicate partial degrees of membership; for example, a country that is mostly-but-not-fully democratic.

Observe that calibrated conditions are asymmetric and that a score of 0.0 merely indicates that an observation does not belong to the target set. For example, to be fully out of the set of rich people does not mean that a person is poor; it merely means that the person is "not rich."

Calibrated conditions may be "crisp" (scores are either 0 or 1) or "fuzzy" (scores may range between 0.0 and 1.0, inclusive). Most users find it easier to begin with crisp sets because the calibration process is easier; observations are either in or out of the target set.

Kirq does not provide a data editor, nor any functionality for calibrating conditions. Instead, you keep your data in whatever format you prefer (e.g., CSV or Excel) and import it into Kirq for analysis. Kirq can read plain text and Excel. For plain text files, Kirq automatically determines the delimiter (comma, tab, etc.) by looking at the second line of data.
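To make the idea of second-line delimiter detection concrete, here is a minimal Python sketch. It is not Kirq's actual code: the function name guess_delimiter, the candidate list, and the "most frequent candidate wins" rule are all assumptions made for illustration.

```python
def guess_delimiter(text, candidates=(",", "\t", ";", "|")):
    """Guess the delimiter of a plain text table from its second line.

    Hypothetical helper: picks the candidate character that occurs most
    often in the second line. This rule is an assumption, not Kirq's rule.
    """
    lines = text.splitlines()
    if len(lines) < 2:
        raise ValueError("need a header line and at least one data line")
    second_line = lines[1]
    best = max(candidates, key=second_line.count)
    if second_line.count(best) == 0:
        raise ValueError("no candidate delimiter found in the second line")
    return best


# Made-up example files: one comma-delimited, one tab-delimited.
csv_text = "case,DEMOCRATIC,RICH\nNorway,1.0,0.9\n"
tsv_text = "case\tDEMOCRATIC\tRICH\nNorway\t1.0\t0.9\n"

csv_delimiter = guess_delimiter(csv_text)
tsv_delimiter = guess_delimiter(tsv_text)
```

Looking at a data line rather than the header is a reasonable design choice because header labels may contain stray punctuation, whereas a row of numeric scores rarely does.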

I provide a LibreOffice Calc macro for the "direct method of calibration" on my website.

I also maintain a QCA add-on for Google Sheets (Google's cloud-based spreadsheet). The add-on provides the direct method of calibration, as well as functions for computing consistency, coverage, and Boolean functions. (It does not provide routines for truth table construction and reduction, however, as these routines are too computationally expensive.)

When applying the direct method of calibration, you will need to assign three parameters:

For a complete discussion of the calibration process, see chapter 5 of Ragin (2008).

Necessity Testing

A necessary condition is a cause that must be present, all or most of the time, for the outcome to occur. The presence of a necessary condition does not mean that the outcome will occur; however, the absence of a necessary condition means that the outcome won't occur, all or most of the time.

The qualifier "all or most of the time" is important because necessary conditions can be imperfect. For example, it is almost always necessary to have many publications in order to receive tenure. That is, most faculty who don't have many publications won't receive tenure; however, some will.

The goodness-of-fit of necessary conditions is assessed through two measures: consistency and coverage, both of which range from 0.0 to 1.0. Consistency measures the strength of the necessity relationship. A score of 1.0 indicates that whenever the outcome is present, the necessary condition is also present. Scores less than 1.0 indicate a corresponding degree of inconsistency. For example, a score of 0.95 would indicate that whenever the outcome is present, the necessary condition is "almost always" present. A generally-accepted rule-of-thumb is that necessity is indicated when consistency is equal to or greater than 0.90.

A coverage score of 1.0 indicates that whenever the necessary condition is present, the outcome is present. Scores of less than 1.0 indicate that there are instances when the necessary condition is present but the outcome is not. This provides a measure of empirical importance. For example, almost all heroin users previously used marijuana. It is also the case that almost all heroin users previously drank milk. For both of these relationships, consistency would be close to 1.0. However, the corresponding coverage scores would differ: there are many more milk drinkers who do not use heroin than there are marijuana users who do not use heroin. Based upon the coverage scores, a researcher would conclude that milk drinking is not a necessary condition for heroin use. There is no established rule-of-thumb for acceptable coverage scores. This is because coverage only measures empirical relevance: a necessary condition with relatively low coverage may still be important. For example, the number of people who have unprotected sex dwarfs the number of people who contract HIV. Nevertheless, it is clear that not having unprotected sex substantially reduces the likelihood of contracting HIV.
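The heroin example can be worked through numerically. The sketch below uses the set-theoretic formulas commonly given in the QCA literature (e.g., Ragin 2008): necessity consistency is the sum of min(x, y) divided by the sum of y, and necessity coverage is the same numerator divided by the sum of x. That Kirq computes its scores this way is an assumption, and the ten observations are invented for illustration.

```python
def necessity_fit(condition, outcome):
    """Return (consistency, coverage) for 'condition is necessary for outcome'.

    Assumed formulas (standard in the QCA literature):
      consistency = sum(min(x, y)) / sum(y)
      coverage    = sum(min(x, y)) / sum(x)
    """
    overlap = sum(min(x, y) for x, y in zip(condition, outcome))
    consistency = overlap / sum(outcome)   # how much of the outcome lies inside the condition
    coverage = overlap / sum(condition)    # how much of the condition lies inside the outcome
    return consistency, coverage


# Invented crisp data for ten people: 1 = member of the set, 0 = non-member.
heroin    = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
marijuana = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
milk      = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

marijuana_fit = necessity_fit(marijuana, heroin)  # (1.0, 0.5)
milk_fit = necessity_fit(milk, heroin)            # (1.0, 0.2)
```

Both conditions are perfectly consistent with necessity, but milk drinking has much lower coverage because so many milk drinkers do not use heroin.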

To test for necessary conditions using Kirq, specify your outcome condition in the upper-left drop-down list. Next, check off the conditions that you want to test. Click the "Necessity" tab in the lower-right panel. Specify your minimum consistency and coverage thresholds in the same panel. Click "Analyze."

Kirq then tests all combinations of those conditions to see which may indicate necessity, at the specified consistency and coverage levels. When a condition is rendered in UPPERCASE, it indicates that the condition is present. When rendered in lowercase, it indicates that it is absent. The results are presented in a "product-of-sums" format, using Boolean notation. In Boolean notation, addition (+ symbol) is read as "OR" and multiplication (* symbol) is read as "AND."

For example, the necessary condition "A+B" is read as "the presence of A or the presence of B is necessary for the outcome to occur." Similarly, the condition "B+c" is read as "the presence of B or the absence of C is necessary for the outcome to occur."
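For readers who want to see how such expressions translate into membership scores, here is a small Python sketch. It assumes the conventional fuzzy-set operators (OR as the maximum, AND as the minimum, absence as one minus the score); the helper names and the example scores are hypothetical.

```python
def absent(score):
    """Membership in the absence of a condition (lowercase in Boolean notation)."""
    return 1.0 - score


def any_of(*scores):
    """Boolean addition (+), read as OR: the maximum membership score."""
    return max(scores)


def all_of(*scores):
    """Boolean multiplication (*), read as AND: the minimum membership score."""
    return min(scores)


# One hypothetical observation with fuzzy membership in conditions A, B, and C.
case = {"A": 0.25, "B": 0.5, "C": 0.125}

a_or_b = any_of(case["A"], case["B"])                 # "A+B" -> 0.5
b_or_not_c = any_of(case["B"], absent(case["C"]))     # "B+c" -> 0.875
a_and_not_b = all_of(case["A"], absent(case["B"]))    # "A*b" -> 0.25
```

With crisp (0 or 1) scores, the same functions reduce to ordinary Boolean OR, AND, and NOT.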

Typically, Kirq will present a long list of necessary conditions. Think of these as "candidate" necessary conditions. It is up to the researcher to decide if any of these conditions makes sense as a necessary condition. Just because some combination of conditions is mathematically consistent with necessity doesn't mean that it makes sense substantively or theoretically.

At the bottom of the output is a solution term, which provides the consistency and coverage scores for all of the necessary conditions ANDed together. This can be used for assessing compound necessary conditions (e.g., the presence of A and the absence of B are, in combination, necessary for the outcome to occur).

Always evaluate consistency prior to coverage: if a relationship isn't consistent with necessity, there's no point in assessing its empirical relevance. As a rule, simpler necessary conditions (with just one or two terms) are preferable to more complex ones. Kirq can easily "discover" necessity by ORing many conditions together; i.e., when Y is present, A or b or c or D is also present.

Sufficiency Testing

A sufficient condition is one which, when present, ensures that the outcome will occur, all or most of the time. Like necessary conditions, sufficiency can be imperfect. For example, the proper use of condoms is almost always sufficient to prevent HIV transmission. "Almost always" is an important qualifier because even when properly used, there is a small chance of transmission still occurring (for example, because the condom breaks). A distinguishing feature of QCA is that it can detect equifinality: when there are different paths to achieving the same outcome. For example, elective abortion and miscarriage are two different and distinct causes of pregnancy termination. Note that with QCA there is no claim that any identified sufficient conditions are the only ways of achieving the outcome. For example, abortion and miscarriage aren't the only causes of pregnancy termination; a fetus can also be killed via physical assault on the mother. In recognition of QCA's sensitivity to equifinality, sufficiency solutions are often referred to as paths or "recipes" (as there can be many recipes for making the same cake).

Sufficient conditions are also assessed using consistency and coverage. Their interpretation is also the same: consistency indicates the strength of the sufficiency relationship and coverage gauges empirical relevance. For sufficient conditions, the rule-of-thumb is to look for consistency scores equal to or greater than 0.80. Scores of less than 0.80 generally indicate substantial inconsistency and that a sufficiency relationship does not exist. As with necessary conditions, there is no corresponding rule of thumb for acceptable coverage scores, again because a sufficient condition might only explain a small number of cases but nevertheless be very important.
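Although the interpretation is the same, the arithmetic for sufficiency swaps the two denominators used for necessity. The sketch below assumes the formulas commonly given in the QCA literature (e.g., Ragin 2008): sufficiency consistency is the sum of min(x, y) divided by the sum of x, and sufficiency coverage divides by the sum of y. The four fuzzy observations are invented, and whether Kirq uses exactly these formulas is an assumption.

```python
def sufficiency_fit(condition, outcome):
    """Return (consistency, coverage) for 'condition is sufficient for outcome'.

    Assumed formulas (standard in the QCA literature):
      consistency = sum(min(x, y)) / sum(x)
      coverage    = sum(min(x, y)) / sum(y)
    """
    overlap = sum(min(x, y) for x, y in zip(condition, outcome))
    consistency = overlap / sum(condition)  # how much of the condition lies inside the outcome
    coverage = overlap / sum(outcome)       # how much of the outcome the condition accounts for
    return consistency, coverage


# Invented fuzzy membership scores for four observations.
condition = [1.0, 0.75, 0.5, 0.25]
outcome   = [1.0, 1.0, 0.25, 0.75]

consistency, coverage = sufficiency_fit(condition, outcome)  # 0.9 and 0.75
meets_rule_of_thumb = consistency >= 0.80
```

Here only the third observation, whose membership in the condition exceeds its membership in the outcome, detracts from consistency.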

The sufficiency test proceeds in two steps. The calibrated dataset is first sorted into a "truth table," which is then reduced to a set of Boolean equations. The truth table lists all possible combinations of the causal conditions--whether or not actually present in your data--and will therefore have 2^k rows, where k equals the number of causal conditions. That is, a truth table with 2 causal conditions will have 4 rows, one with 3 causal conditions will have 8 rows, one with 4 causal conditions will have 16 rows, and so forth. Each observation from the dataset will be sorted into one, and only one, row of the truth table, based on its membership scores. The truth table therefore provides a typology, grouping similar observations together. The rows of the truth table are often referred to as "configurations."
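The sorting step can be sketched in a few lines of Python. The sketch assumes the usual fuzzy-set convention that an observation belongs to the row in which each condition is "present" when its membership score exceeds 0.5; the function name, the case names, and the scores are hypothetical, and Kirq's own implementation may differ.

```python
from itertools import product


def build_truth_table(conditions, cases):
    """Sort cases into the 2^k rows of a truth table.

    Each row is keyed by a tuple of booleans (True = condition present).
    Assumption: a condition counts as present when membership > 0.5.
    """
    rows = {combo: [] for combo in product((True, False), repeat=len(conditions))}
    for name, scores in cases.items():
        key = tuple(scores[c] > 0.5 for c in conditions)
        rows[key].append(name)
    return rows


conditions = ["A", "B", "C"]
cases = {
    "case1": {"A": 0.9, "B": 0.8, "C": 0.1},
    "case2": {"A": 0.7, "B": 0.6, "C": 0.3},
    "case3": {"A": 0.2, "B": 0.9, "C": 0.8},
}

table = build_truth_table(conditions, cases)
row_count = len(table)                                    # 2^3 = 8 rows
empty_rows = [row for row, members in table.items() if not members]
```

With three conditions and three cases, two of the eight rows are populated and the remaining six have no observations.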

It is generally fruitful to analyze the truth table itself in order to understand which configurations are (and are not) associated with the presence and absence of the outcome. The filter dialog can be helpful here; for example, to hide remainders so that you can focus on the configurations that empirically exist. Open the filter dialog by clicking the binoculars icon. After arriving at a satisfactory truth table, the truth table may be reduced to a set of Boolean equations that describe the different paths ("recipes") associated with the outcome occurring.

Kirq provides four levels of truth table reduction. The least reductive, reducing to primitive expressions, retains all complexity; the most reductive, reducing to the parsimonious solution, uses strong simplifying assumptions to arrive at the simplest possible solution. Note that these solutions are related to one another; the simpler solutions are generalized versions of the more complex ones. It is up to the researcher, based on their theoretical and substantive knowledge, to choose the solution that makes the most sense and strikes the best balance between complexity and parsimony.

Testing for sufficient conditions using Kirq is similar to testing for necessary conditions. First, specify your outcome condition in the upper-left drop-down list and check off the causal conditions that you wish to include in the analysis. Use the lower-right panel to specify the parameters for the analysis. There are four parameters to specify:

Prime implicants are the first level of simplification. Prime implicants remove basic logical redundancies. For example: assume that there are two truth table rows that are associated with the presence of the outcome: "A and B" and "A and not-B." Since both rows lead to the outcome, and since B is present in one configuration but not the other, A must logically be the cause. The resulting prime implicant would be A.
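The "A and B" plus "A and not-B" example corresponds to a single merge step of the kind used in Quine-McCluskey-style minimization. The following Python sketch shows only that one step, under the assumption that rows are represented as tuples of True/False values; the function names are hypothetical, and a full reduction would repeat the step until no further merges are possible.

```python
def merge(row1, row2):
    """Merge two rows that differ in exactly one condition.

    Rows are tuples of True (present), False (absent), or None (dropped).
    Returns the merged row, or None if the rows cannot be merged.
    """
    differing = [i for i, (a, b) in enumerate(zip(row1, row2)) if a != b]
    if len(differing) != 1:
        return None
    merged = list(row1)
    merged[differing[0]] = None  # the differing condition is logically redundant
    return tuple(merged)


def render(row, names):
    """Render a row in Boolean notation: UPPERCASE = present, lowercase = absent."""
    terms = [n.upper() if value else n.lower()
             for n, value in zip(names, row) if value is not None]
    return "*".join(terms)


names = ["a", "b"]
a_and_b = (True, True)
a_and_not_b = (True, False)

implicant = merge(a_and_b, a_and_not_b)
implicant_text = render(implicant, names)      # "A"
unmergeable = merge((True, True), (False, False))
```

Rows that differ in more than one condition, such as "A and B" and "not-A and not-B", cannot be combined in this way.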

The complex solution is equivalent to that produced by Ragin's fs/QCA software. It eliminates logically-redundant prime implicants. For example, Ragin (1987:96-7) provides a solution S=AC+AB+Bc, where the observations conforming to AB also conform to either AC or Bc. The prime implicant AB is, therefore, logically redundant, and would therefore be eliminated by the complex solution. It is up to the researcher to decide if this makes sense theoretically and empirically. An alternative interpretation might be that the observations explained by AB are "overdetermined."
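The redundancy of AB in S=AC+AB+Bc can be checked by brute force over all eight combinations of A, B, and C. This is only an illustrative Python check of the logic, not a description of how Kirq or fs/QCA identifies redundant prime implicants.

```python
from itertools import product


def term_ac(a, b, c):
    return a and c


def term_ab(a, b, c):
    return a and b


def term_b_not_c(a, b, c):
    return b and not c


rows = list(product((True, False), repeat=3))

# Every row that conforms to AB also conforms to AC or to Bc.
ab_is_covered = all(term_ac(*r) or term_b_not_c(*r) for r in rows if term_ab(*r))

# Dropping AB therefore leaves the set of rows described by the solution unchanged.
same_solution = all(
    (term_ac(*r) or term_ab(*r) or term_b_not_c(*r))
    == (term_ac(*r) or term_b_not_c(*r))
    for r in rows
)
```

The reason is simple: any row with A and B present either has C present (so it conforms to AC) or has C absent (so it conforms to Bc).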

The parsimonious solution is also equivalent to that produced by Ragin's fs/QCA software. It further simplifies the solution by using "remainders" as simplifying assumptions. A remainder is a truth table row with no associated observations (or, as described below, one in which the number of observations falls below the specified frequency threshold). Remainders are logically-possible combinations of conditions that lack empirical instances; that is, they may be thought of as potential counterfactuals. It is common in QCA for a truth table to have many remainders.

The parsimonious solution uses remainders as counterfactuals to arrive at the simplest possible solution. The parsimonious solution is often overly simplistic. But it can be understood to provide the lower bound of causal complexity: the terms that make up the parsimonious solution will be present at the other levels of reduction--the complex solution, prime implicants, and primitive expressions. Some researchers interpret the parsimonious solution as identifying the "fundamentally explanatory" conditions of the solution.

If the computed consistency value for a truth table row falls below the consistency threshold, the outcome for that row will be set to False. This indicates that the presence of this combination of causal conditions is not consistently associated with the presence of the outcome.

The proportion threshold identifies the proportion of cases that need to be consistent (or inconsistent) in order to classify a truth table row as True or False. The default setting, 1.0, means that a truth table row's outcome will only be set to True if (a) the row's consistency meets or exceeds the specified consistency threshold and (b) all of the observations belonging to that row are consistent with the outcome. (Likewise, a row will only be set to False if the row's consistency falls below the specified consistency threshold and all of the corresponding observations are inconsistent with the outcome.)

It is impossible to reduce to Boolean equations a truth table that contains contradictions. Therefore, any contradictions must first be resolved. There are four ways of resolving contradictions:

  1. Manually set the outcome to True or False. If, based on your theoretical and substantive knowledge, you have a good reason for determining that the combination of causal conditions described by this truth table row should (or should not) consistently lead to the outcome, you can manually change the outcome to True or False. To do this, double-click the cell that you wish to change and a drop-down list will appear. (The outcome column is the only column that you can recode in this manner.)

  2. Return to your cases and identify a missing variable. The presence of contradictions often indicates an insufficiently-specified model. That is, if you have some observations that exhibit the outcome and some that do not, it may be that there exists an as-of-yet-unidentified condition that distinguishes the two groups. Returning to your cases and figuring out what is different about the two groups is often the most productive way of resolving contradictions.

  3. Return to your cases and reexamine your calibrations and consistency threshold. Contradictions can also arise when your calibrations are off, or when your consistency threshold is too high. Adjusting these can produce a truth table free of contradictions.

Notice that all three of these strategies involve the researcher returning to the cases and immersing him or herself in the data. This is one of the characteristic strengths of QCA--that it encourages close interaction with one's data, guided by the application of theoretical and empirical knowledge. The fourth strategy, listed below, avoids this:

  4. If you set the proportion threshold to 0.5, the software will not produce any contradictory rows. Doing so, therefore, mimics the behavior of Ragin's fs/QCA software, which does not include support for fuzzy-set contradictions.

Having specified the four parameters, click "Reduce" to execute the sufficiency analysis. The software will construct the truth table and attempt to reduce it. If there are any contradictions present in the truth table, the software will pop up an error and drop you into the truth table so that you may proceed with resolving the contradiction(s). (If you wish to simply generate the truth table without Kirq attempting to automatically reduce it, you may click "Truth Table.")

Once the truth table is successfully reduced, you will be presented with a consistency/coverage (concov) table similar to that produced for the necessity analysis. The concov table lists the different recipes that produce the outcome, along with their consistency, raw coverage, and unique coverage. Consistency is interpreted as described above. For sufficient conditions, coverage may be partitioned in a manner analogous to partitioning explained variance. Hence Kirq distinguishes between raw and unique coverage, which allows one to assess the degree to which solution recipes overlap, explaining the same cases. Kirq further facilitates this by listing in the concov table the observations associated with each recipe. Note that it is possible for recipes to report 0.0 unique coverage, if the observations explained by that combination of conditions are also explained by one (or more) other recipes.
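The distinction between raw and unique coverage can be illustrated with a short Python sketch. It assumes the definitions commonly used in the QCA literature: raw coverage is a recipe's own coverage of the outcome, and unique coverage is the coverage of the whole solution minus the coverage of the solution with that recipe left out. The recipe names and scores are invented, and Kirq's exact computation is not confirmed by this tutorial.

```python
def coverage(membership, outcome):
    """Sufficiency coverage: sum(min(m, y)) / sum(y)."""
    return sum(min(m, y) for m, y in zip(membership, outcome)) / sum(outcome)


def union(memberships, n):
    """Membership in the OR of several recipes (element-wise maximum)."""
    if not memberships:
        return [0.0] * n
    return [max(scores) for scores in zip(*memberships)]


def coverage_table(recipes, outcome):
    """Return {recipe: (raw coverage, unique coverage)}."""
    n = len(outcome)
    total = coverage(union(list(recipes.values()), n), outcome)
    table = {}
    for name, scores in recipes.items():
        others = [s for other, s in recipes.items() if other != name]
        without = coverage(union(others, n), outcome)
        table[name] = (coverage(scores, outcome), total - without)
    return table


# Invented crisp data: five observations, three hypothetical recipes.
outcome = [1, 1, 1, 1, 0]
recipes = {
    "R1": [1, 1, 0, 0, 0],
    "R2": [0, 1, 1, 0, 0],
    "R3": [0, 1, 0, 0, 0],
}

concov = coverage_table(recipes, outcome)
# R1 -> (0.5, 0.25), R2 -> (0.5, 0.25), R3 -> (0.25, 0.0)
```

R3 reports 0.0 unique coverage because the single observation it explains is also explained by R1 and R2.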

Related Software

Acknowledgments

Thanks to Roberto Franzosi and Doug Fernald for their comments and feedback on earlier drafts of this tutorial. I've shamelessly stolen at least a couple of sentences originally written by Lee Green for a paper that we're co-authors on. All errors and omissions are the author's responsibility.