what do you think is the biggest threat to secondary data validity and why
Identifying, categorizing and mitigating threats to validity in software engineering secondary studies
Abstract
Context
Secondary studies are vulnerable to threats to validity. Although mitigating these threats is crucial for the credibility of these studies, we currently lack a systematic approach to identify, categorize and mitigate threats to validity for secondary studies.
Objective
In this paper, we review the corpus of secondary studies, with the aim to identify: (a) the trend of reporting threats to validity, (b) the most common threats to validity and respective mitigation actions, and (c) possible categories in which threats to validity can be classified.
Method
To attain this goal we use the tertiary study research method, which is used for synthesizing knowledge from existing secondary studies. In particular, we collected data from more than 100 studies, published until December 2016 in top quality software engineering venues (both journals and conferences).
Results
Our results suggest that in recent years, secondary studies are more likely to report their threats to validity. Yet, the presentation of such threats is rather ad hoc, e.g., the same threat may be presented with a different name, or under a different category. To alleviate this problem, we propose a classification schema for reporting threats to validity and possible mitigation actions. Both the classification of threats and the associated mitigation actions have been validated by an empirical study, i.e., Delphi rounds with experts.
Conclusion
Based on the proposed schema, we provide a checklist, which authors of secondary studies can utilise for identifying and categorizing threats to validity and corresponding mitigation actions, while readers of secondary studies can use the checklist for assessing the validity of the reported results.
Introduction
Empirical Software Engineering (ESE) research focuses on the application of empirical methods on any phase of the software development lifecycle. The three predominant types of empirical research are [44], [47]: (a) surveys, which are performed through questionnaires or interviews on a sample in order to obtain characteristics of a population [36]; (b) case studies, which study phenomena in a "real-world" context, especially when the boundaries between phenomenon and context are not clear [51]; and (c) experiments, which have a limited scope and are most often run in a laboratory setting, with a high level of control [47]. During the last years, and mainly due to the rise of the Evidence-Based Software Engineering (EBSE) paradigm [22], two other types of studies have become quite popular [15]:
- Systematic Literature Reviews (SLRs) use data from previously published studies for the purpose of research synthesis, which is the collective term for a family of methods for summarizing, integrating and, when possible, combining the findings of different studies on a topic or research question. Such synthesis can also identify crucial areas and questions that have not been addressed adequately by past empirical research. It is built upon the observation that no matter how well-designed and executed, empirical findings from individual studies are limited in the extent to which they may be generalized [18].
- Systematic Mapping Studies (SMSs) employ the same basic methodology as SLRs but aim to identify and classify all research related to a broad software engineering topic rather than answering questions about the relative merits of competing technologies that conventional SLRs address. They are intended to provide an overview of a topic area and identify whether there are sub-topics with sufficient primary studies to conduct conventional SLRs, and also to identify sub-topics where more primary studies are needed [21].
The strength of evidence produced by ESE research depends largely on the use of systematic, rigorous guidelines on how to conduct and report empirical results (see e.g., for experiments [47], for SLRs [18], for mapping studies [34], for surveys [36], and for case studies [38]). One of the most crucial parts of conducting an empirical study is the management of threats to validity, i.e., possible aspects of the research design that in some way compromise the credibility of results. Despite this crucial role, we currently lack guidelines on how to identify, mitigate, and categorize threats to validity in secondary studies; this is in contrast to experiments, case studies and surveys, where mature guidelines exist. For this reason, researchers either do not report threats to validity for secondary studies, or report them in an ad hoc manner (see Section 5). Specifically, the most common issues found in practice are that threats to validity are:
- Completely missing from certain studies. Thus, such studies do not provide any mitigation actions for them;
- Incorrectly categorized. The same threat is classified in different categories by different researchers (e.g., study selection bias is categorized in some studies as a threat to internal validity and in others as a threat to conclusion validity). Also, in some cases threats are inefficiently categorized based on guidelines for other types of empirical research (e.g., for experiments [45], or for case studies [38]), or under a custom categorization, which is not uniform. One possible reason for this problem is the fact that threat categories are not orthogonal, especially in cases where they stem from different schools of thought or guidelines (see Section 2.1). For example, reliability examines if the results of a study depend highly on the involved researchers. In turn, this relates to conclusion validity, in the sense that people are prone to biases (e.g., due to previous experiences, preferences on research, etc.);
- Inconsistently named. The same threat is reported with a different name by different researchers (e.g., the terms publication bias and researcher bias are used for describing the same threats);
- Inconsistently mitigated. The same threat is mitigated differently by different researchers. Although this provides a variety of available mitigation actions, some mitigation actions are ineffective and cause confusion to readers who consider following them.
These issues, in turn, make it difficult to evaluate the validity of the reported results and hinder a uniform comparison between secondary studies. In addition, the lack of guidance for mitigating threats to validity, which could serve as a reference point, makes it more difficult to reuse mitigation strategies, as well as to consistently identify and categorize both threats and mitigation actions.
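The "incorrectly categorized" issue can be made concrete with a minimal Python sketch. It assumes that each reviewed study has been reduced to a (study, threat name, category) record; the records, the `REPORTS` name and both helper functions are hypothetical illustrations, not data or tooling from the paper. The sketch groups records by threat name and flags any threat that different studies placed under more than one category.

```python
from collections import defaultdict

# Hypothetical extraction records: (study id, threat name, category as reported).
# These rows are illustrative only and are not data from the reviewed studies.
REPORTS = [
    ("S1", "study selection bias", "internal validity"),
    ("S2", "study selection bias", "conclusion validity"),
    ("S3", "Study Selection Bias", "internal validity"),
    ("S4", "data extraction bias", "construct validity"),
]


def categories_per_threat(reports):
    """Map each normalized threat name to the set of categories it was reported under."""
    seen = defaultdict(set)
    for _study, threat, category in reports:
        # Normalize case and whitespace so trivial spelling variants are merged.
        seen[threat.strip().lower()].add(category.strip().lower())
    return dict(seen)


def inconsistently_categorized(reports):
    """Return threats that different studies placed in more than one category."""
    return {
        threat: sorted(cats)
        for threat, cats in categories_per_threat(reports).items()
        if len(cats) > 1
    }


if __name__ == "__main__":
    for threat, cats in inconsistently_categorized(REPORTS).items():
        print(f"{threat}: {', '.join(cats)}")
```

A check like this only catches threats that share a name. Threats reported under different names (the "inconsistently named" issue) would first need a manually curated synonym table.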
To address this problem, we conducted a tertiary study (i.e., an SLR on secondary studies), so as to retrieve and analyze how software engineering secondary studies identify, categorize and mitigate threats to validity. The objective of this tertiary study is: "to summarize secondary studies that report threats to validity, with the aim of identifying: (a) the frequency of reporting threats to validity over the years, (b) the most common threats to validity and (c) the respective mitigation actions, and (d) a possible classification schema of threats to validity". The main outcomes of the study are a classification schema for threats to validity and a checklist that can be used while conducting or evaluating secondary studies. The outcomes are expected to contribute towards establishing a standard and consistent way of identifying, categorizing and mitigating threats to validity of secondary studies. In addition, in order to enrich the outcomes of this work we explored existing literature in two related research sub-fields: (a) secondary studies in medical science (i.e., the area from which the Evidence-Based paradigm emerged), and (b) guidelines for conducting secondary studies. Related studies from medical science and the guidelines for performing secondary studies have led to the identification of best practices in secondary studies that can be applied as mitigation actions for minimizing the effects of a validity threat, enriching the provided checklist that has been derived from the classification schema. Finally, acknowledging the subjectivity in the qualitative nature of this work, we validated the outcomes through a Delphi method based on the opinion of experts in secondary studies and empirical studies in general. The Delphi method was iterated in three rounds and provided preliminary evidence for the merits of the classification schema and checklist.
We note that literature reviews had been performed long before the advent of the terms 'Systematic Mapping Study' and 'Systematic Literature Review' and the respective guidelines. We also acknowledge that secondary studies can be performed without following the guidelines of SMSs and SLRs (especially before the two terms became popular). Nevertheless, such non-systematic literature reviews have not reported (in the vast majority of the cases) threats to their conclusions. Reporting of threats became popular once specific guidelines were proposed and adopted in the context of the EBSE paradigm. Thus, for a study aiming at systematically analyzing the reported threats, we consider it proper to focus on the studies that have adopted the corresponding guidelines. For the rest of the study, when we refer to secondary studies, we refer to Systematic Mapping Studies and Systematic Literature Reviews.
The rest of the paper is organized as follows: Section 2 presents related work, i.e., categories of threats to validity in other empirical methods; Section 3 presents our tertiary study protocol; Section 4 reports on the results; and Section 5 discusses the proposed guidelines for identifying, categorizing and mitigating threats to validity for secondary studies in software engineering. In Section 6, we present the design and results of our validation study, whereas in Sections 7 and 8 we present threats to validity and conclude the paper.
Section snippets
Related work
The empirical software engineering literature points out the relevance and importance of identifying and recording validity threats, as an aspect of research quality [12], [32] and [35]. According to Perry et al. [32] the structure of an empirical study in SE should include a section of threats to validity. This section should discuss the influences that may limit the authors' and readers' ability to interpret or draw conclusions from the study's data. In addition, Jedlitschka et al. [17]
Methodology
This section outlines the protocol used to perform this tertiary study. The protocol consists of five activities [22], namely defining the research objectives and questions, the search process (terms and resources), inclusion/exclusion criteria, information extraction strategy, and synthesis of the extracted data.
Results
This section presents the results of this tertiary study, aiming at providing an overview of how threats to validity are identified, categorized and mitigated in secondary studies. The remainder of the sub-sections are organized by research question: Section 4.1 presents the frequency with which secondary studies report on threats to validity; Section 4.2 reports the most common threats to validity; Section 4.3 lists the most common mitigation actions for the most threats identified in the previous
Discussion
The identification, categorization and mitigation of threats to validity is an important part of secondary studies. During the last decade, the ratio of secondary studies managing threats to validity has continuously increased. However, our results suggest that considerable confusion still exists in terms of terminology, mitigation strategies, and classification. We further focus on the classification of threats to validity and consider the example of the study selection bias threat, which
Validation of classification schema and checklist
In this section, we present the validation of the proposed classification schema and checklist, by applying the Delphi technique with secondary study experts. This validation is necessary due to the nature of this study (i.e., the synthesized results provide guidelines for conducting future secondary studies); thus we want to make explicit the potential limitations and strengths of the classification schema and checklist, as identified by experts. In Section 6.1, we present the design of our
Threats to validity
In this section we present the threats to validity that we have identified for this tertiary study. In order for this section to act as a proof of concept for the classification proposed in this work, we structure this section based on the checklist provided in Section 5.2. Specifically, in Section 7.1, we report threats to validity related to study selection (TV1-TV7), in Section 7.2, we report threats related to data validity (TV8-TV14), and in Section 7.3, we report threats related to
Conclusion
In the last decade, secondary studies (i.e., systematic literature reviews and mapping studies) have emerged as a popular research methodology for summarizing existing literature. Despite their popularity and the thorough guidelines for conducting them, the research state-of-the-art lacks support on how to identify, report and mitigate threats to validity for secondary studies. To alleviate this problem we have conducted a tertiary study on the software engineering research corpus, i.e., a
© 2018 Elsevier B.V. All rights reserved.
Source: https://www.sciencedirect.com/science/article/abs/pii/S0950584918302106