Ackerman and Pinson (2016) ‘Speaking Truth to Sources.’
Citation: Ackerman, Gary A. and Pinson, Lauren E. (2016) ‘Speaking Truth to Sources: Introducing a Method for the Quantitative Evaluation of Open Sources in Event Data,’ Studies in Conflict & Terrorism, 39:7-8, pp. 617-640.
Abstract: Open-source event data sets frequently used for social science analysis rarely provide any transparent explanation of the credibility of sources or the validity of data thereby obtained. We develop a sample Source Evaluation Schema for the purpose of operationalizing measures of open-source event validity at the case, source, and variable levels. Based on our findings, we argue that explicitly incorporating and disclosing credibility and validity levels allows for greater flexibility in tailoring the inclusion of cases for researchers’ specific analytical requirements. By facilitating more transparent analyses, the inclusion of such measures in similar datasets can result in more defensible conclusions, especially in highly charged political and security contexts such as those surrounding terrorism.
Ackerman and Pinson (2016:618): “Particularly in the domain of terrorism, official reporting is often insufficient for scientific analysis. First, many countries and localities might lack the resources to properly collect and record statistics on terrorist events occurring within their jurisdiction, resulting in the lack of official data in many areas of the world. Second, even where these resources exist, many governments, especially those of the less democratic ilk, might find it politically expedient to obfuscate the incidence of terrorism to serve the regime’s interests, either censoring terrorism data that the government fears could undermine its legitimacy, or alternatively inflating the rate of terrorism in order to provide a scapegoat or justification for pursuing particular enemies. Third, even in those cases where official sources do not consciously dissemble, governments adopt varying definitions of terrorism that reflect parochial concerns, making it problematic, especially for global data sets, to obtain consistent data from official reports.” Offer open sources as a potential antidote to these three problems.
Ackerman and Pinson (2016:619-621): Identify selection and description bias as problems facing databases. Selection bias relates to events not being reported on, which can sometimes be mitigated by using multiple sources [which presumes those sources themselves don’t use the same sources – ignoring the problem of broader media environments]; thwarted or failed events are particularly likely to be overlooked. Description bias relates to partial or slanted reporting. Suggest that assessing and coding the credibility of sources can help mitigate these biases. Define validity as “the level of confidence that the information that is recorded objectively reflects the reality of what is being measured.” Credibility is defined as “the likelihood that an additional completely competent and disinterested source would report the same information based on the same event.” [Validity, however, requires that an objective truth be available (something the authors themselves acknowledge as a complication), and it is unclear where this competent and disinterested source can come from in a politicised media environment.]
Ackerman and Pinson (2016:621): In the absence of a record of absolute truth, authors propose a single measure aggregating credibility for a source.
Ackerman and Pinson (2016:623): Note that many IR event datasets either use a single source or, having started with more than one, reduce their scope.
Ackerman and Pinson (2016:638-640): Source Evaluation Schema

Individual source credibility

1. Institutional and Author Objectivity (source_object)
Rating the objectivity of a source provides a subjective measure of the extent to which the provided information reflects bias. If either the author or the institutional publisher is biased, the source is regarded as biased. Owing to the coder subjectivity involved, a relatively broad (3-point) scale was used.
  - –99 = Inherited: If the source is not independent, objectivity is based on the measure attributed to the original source.
  - 0 = Low: Author and/or institutional publisher have consistently demonstrated systematic bias (extrinsic evaluation) or the source document clearly reflects a lack of objectivity (intrinsic evaluation), signified by such characteristics as overly emotive writing. Examples include: a newspaper directly affiliated with a terrorist group; a reporter with a history of advocating for or against a particular group without any use of facts; or a passage overtly sanitizing or exaggerating certain violent behaviors.
  - 1 = Potential: No intrinsic indications of bias, but author and/or institution have demonstrated bias in some cases but not others (i.e., non-systematically). Examples include a newspaper that is generally measured in its approach to reporting but is known on occasion to take a very pro-Israeli (or pro-Palestinian) stance on the Israeli–Palestinian issue, a media outlet owned by a member of a royal family that espouses his family’s views, or a state-run news source reporting on continuing conflict between rebels and the government.
  - 2 = High: Neither the author nor the institution has a reputation for systematic bias and there are no intrinsic indications of bias, i.e., the document itself shows no overt or easily recognizable signs of bias. To code “High” without prior knowledge of the author/publisher, the coder must research the history and reputation of the author and institutional publisher.

2. Institutional and Author Competence (source_comp)
  - –99 = Inherited: If the source is not independent, competence is based on the original source.
  - 0 = Low: Extrinsic evaluation reveals that the author and/or institutional publisher (a) have had serious and widespread questions raised about their reporting skills or (b) obviously lacked the resources or skills to have adequately reported on the event. Alternatively, intrinsic analysis of the source document indicates (c) substantive inconsistencies or errors. Examples might include a tabloid with no reputation for serious journalism or an original-language document with multiple overt typographical errors, misspellings, or grammatical errors.
  - 1 = Questionable: At least one of the institutional publisher or author has failed to develop a general reputation for high-quality output (extrinsic evaluation), but the source document itself shows a prima facie level of competence (intrinsic evaluation). Conversely, the institutional publisher and author are generally regarded as competent reporters producing high-quality output, but the source document itself shows inconsistencies and/or errors of a non-negligible number or nature. An example would be a seemingly well-written and researched source from an institution that has previously published unique stories that have been heavily disputed by other media sources and never substantiated.
  - 2 = General: The author and institutional publisher are generally regarded as producing high-quality output (extrinsic evaluation of reputation) and the source document itself shows a prima facie level of competence (intrinsic evaluation). However, neither the author nor the institutional publisher covers the subject-matter area (often referred to as a “beat” in journalism circles) or geographic region on a regular basis.
  - 3 = Full: Both the author and institution have demonstrated prior competence with respect to the geographical and substantive domain on which they are reporting and there are no intrinsic indications of a lack of competence.
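To make the coding scheme concrete, the following minimal sketch (mine, not the authors’) encodes the two source-level scales in Python. The variable names source_object and source_comp come from the schema; the IntEnum classes, the Source record, and the bias_label field are illustrative assumptions.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional

INHERITED = -99  # schema sentinel: value inherited from the original source

class Objectivity(IntEnum):
    """source_object: the schema's 3-point objectivity scale."""
    LOW = 0        # systematic bias, or overt intrinsic bias in the document
    POTENTIAL = 1  # non-systematic bias demonstrated in some cases
    HIGH = 2       # no reputation for bias and no intrinsic signs of it

class Competence(IntEnum):
    """source_comp: the schema's 4-point competence scale."""
    LOW = 0           # widespread doubts, or substantive errors in the document
    QUESTIONABLE = 1  # reputation and document quality point in opposite directions
    GENERAL = 2       # competent in general, but off their usual beat or region
    FULL = 3          # demonstrated competence in this domain and region

@dataclass
class Source:
    """A single open source describing an event (hypothetical record layout)."""
    name: str
    source_object: int                # an Objectivity value, or INHERITED
    source_comp: int                  # a Competence value, or INHERITED
    bias_label: Optional[str] = None  # e.g., "pro-government"; None if unbiased
```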
Overall event validity determination
- Source-Derived Validity (sdv): Sources with either source_object = 0 or source_comp = 0 are excluded.
  - –99 = Unknown: No usable sources describing the event could be located.
  - –88 = Not Obtained: Source(s) exist but have not been accessed (includes translation and legal issues).
  - 1 = Single: A single source or multiple non-independent sources describe the event.
  - 2 = Two Independent Sources: Two independent sources describe the event and agree on the broad nature of the event. A vital check at this stage is whether these two ostensibly independent sources both display potential bias (i.e., source_object = 1). In this case the sources are revisited and each source’s bias is compared. If both sources display the same bias, the sources are not regarded as independent for the purposes of source-derived validity.
  - 3 = Three Plus Independent; Two With Competing Bias: Three or more independent sources describe the event and agree on the broad nature of the event; or two independent sources with competing biases describe the event and agree on the broad nature of the event. If two or more of the sources display any level of common bias, they are dealt with as in the instruction above.
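The sdv rules amount to a small decision procedure. The sketch below is one possible reading of them, reusing the Source record from the previous snippet; it represents each source’s bias as a shared label so that sources with the same bias collapse into one effective source, and it assumes the inherited sentinels have already been resolved and that agreement on the broad nature of the event has been checked by the coder.

```python
SDV_UNKNOWN = -99       # no usable sources located
SDV_NOT_OBTAINED = -88  # sources exist but were not accessed

def source_derived_validity(sources: list[Source]) -> int:
    """One reading of the sdv rules, given otherwise-independent sources."""
    # Exclude sources that fail the minimal credibility bar.
    usable = [s for s in sources
              if s.source_object != Objectivity.LOW
              and s.source_comp != Competence.LOW]
    if not usable:
        return SDV_UNKNOWN

    # Sources sharing the same bias are not independent for sdv purposes:
    # collapse each shared bias label into a single effective source.
    unbiased = sum(1 for s in usable if s.bias_label is None)
    biases = {s.bias_label for s in usable if s.bias_label is not None}
    effective = unbiased + len(biases)

    if effective >= 3:
        return 3  # three-plus independent sources
    if effective == 2:
        # Two independent sources with *competing* biases also rate a 3.
        return 3 if len(biases) == 2 else 2
    return 1  # single source, or only non-independent sources
```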
Inherent event uncertainty
- Inherent Event Uncertainty (uncert_event)
  - 0 = None: All sources agree that the event occurred in such a way that it warrants inclusion.
  - 1 = Some: Most observers believe that the event occurred in such a way that it warrants inclusion, but the sources portray some uncertainty regarding this.
  - 2 = Considerable: The event most likely occurred in a way that would not warrant inclusion in the data set (in our example, either an accident or the result of natural causes), but the possibility has been raised by one or more observers that it may have constituted a genuine inclusion event (in our example, an attack).
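Since the stated payoff of the schema is letting researchers tailor case inclusion to their own validity requirements, a conservative inclusion filter might look like the sketch below; the Event record and the particular thresholds are illustrative assumptions, not the authors’ prescription.

```python
from dataclasses import dataclass

@dataclass
class Event:
    """Event-level validity codes (hypothetical record layout)."""
    event_id: str
    sdv: int           # source-derived validity (-99, -88, 1, 2, or 3)
    uncert_event: int  # inherent uncertainty: 0 = none, 1 = some, 2 = considerable

def high_confidence_events(events: list[Event]) -> list[Event]:
    """Keep events backed by at least two independent sources with no
    inherent uncertainty - one conservative inclusion rule among many."""
    return [e for e in events if e.sdv >= 2 and e.uncert_event == 0]
```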
Event detail evaluation
- Detail Uncorroborated (X_Uncorroborated)
  - 0 = None: More than one reasonably credible source provides information on the detail variable.
  - 1 = Uncorroborated: Only a single reasonably credible source provides a value for that detail variable.
- Detail Discrepancy (X_discr)
  - 0 = None: All reasonably credible sources provide the same information on the detail variable.
  - 1 = Some: Two (or more) reasonably credible sources have a discrepancy in the information provided for a particular variable.
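Unlike the judgment-heavy measures above, the two detail-level flags are mechanical once the per-source values are assembled. A hypothetical helper (the function name is mine, not the paper’s) could compute both at once:

```python
def detail_flags(values: list) -> tuple[int, int]:
    """Given the values that each reasonably credible source reports for one
    detail variable, return (X_Uncorroborated, X_discr) per the schema.
    Assumes at least one source reported a value."""
    x_uncorroborated = 1 if len(values) == 1 else 0
    x_discr = 1 if len(set(map(str, values))) > 1 else 0
    return x_uncorroborated, x_discr

# e.g., three sources report fatality counts of 4, 4, and 5:
# detail_flags([4, 4, 5]) -> (0, 1), i.e., corroborated but discrepant
```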