Abstract: Pattern discovery, the process of discovering previously unrecognized patterns, is usually performed as an ad-hoc process with little resulting certainty in the quality of the proposed patterns. Pattern validation, the process of validating the accuracy of proposed patterns, has rarely gone beyond the simple heuristic of “the rule of three”. This article shows how to use established scientific research methods for the purpose of pattern discovery and validation. The result is an approach to pattern discovery and validation that can provide the same certainty that traditional scientific research methods can provide for the theories they are used to validate. This article describes our approach and explores its usefulness for pattern discovery and evaluation in a series of studies.
Keywords: Patterns, pattern discovery, pattern validation, theory codification, theory building and evaluation, research design
Reference: Riehle, D., Harutyunyan, N., & Barcomb, A. (2020). Pattern Discovery and Validation Using Scientific Research Methods. Friedrich-Alexander-Universität Erlangen-Nürnberg, Dept. of Computer Science, Technical Reports, CS-2020-01, February 2020.
Many computer science degree programs do a lousy job of teaching science. A high school student, entering university, often has a good idea of what science is about, based on their physics and chemistry classes. At least, it involves controlled experiments. At university, this is rarely picked up, and computer science students are given the idea that programming something novel constitutes science. With that idea, they are often bewildered when I teach them rigorous research methods, in particular if those originated in the social sciences (like qualitative interviews or hypothesis-testing surveys).
Pay-walled publications are just that: Publications that nobody reads unless someone pays the publisher’s fee. I have no problem with that, because I don’t read pay-walled work and don’t consider it published research and prior art that I should care about.
The real problem starts with researchers and editors who expect me to find, read, and consider pay-walled work as prior art. That’s an unacceptable proposition to me and an unfair one to the world.
Most people believe that scientists first perform basic (“fundamental”) research and then perform applied research. Basic research delivers the fundamental insights that then get detailed and refined as they hit reality in applied research. Along with this comes the request that basic research funding should be provided by the country (because few companies would ever pay for it) before industry kicks in and supports applied research. Nothing could be further from the situation in my engineering process research.
When planning a publication strategy for a dissertation, the question of where to submit your papers invariably comes up. Ph.D. students are naturally biased towards conferences, because if a paper gets accepted to a conference they get to travel to a (usually) nice place. I nip this bias in the bud right away: For a journal paper, every Ph.D. student gets a conference to attend for free. This then lets us focus on the economic value of a journal vs. a conference paper and how to best reap the benefits of hard research work. Here, I’m a contrarian (to most colleagues): I’m in favor of journals. It is also the economically smart choice for a Ph.D. student.
Research should be presented to the world with an appropriate choice of words. So it bugs me if researchers, maybe unknowingly, overreach and call the evaluation of a theory a validation thereof. I don’t think you can ever fully validate a theory; you can only validate individual hypotheses.
The following figure shows how I think key terms should be used.
From my excursion into qualitative research land (the aforementioned Berliner Methodentreffen) I took away some rather confusing impressions about the variety of what people consider science. I’m well aware of different philosophies of science (from positivism to radical constructivism) and their impact on research methodology (from controlled experiments to action research, ethnographies, etc.). I did not expect, however, that people would be so divided about fundamental assumptions about what constitutes good science.
One of the initial surprises for me was to learn that it is acceptable for a dissertation to apply only one method and for that method to only deliver descriptive results (and thereby not really make a contribution to theory). In computer science, it is difficult to publish solely theory development research (let alone purely descriptive results) without any theory validation attempt, even if only selective. The limits of what can be done in 3-5 Ph.D. student years are clear, but this shouldn’t lead anyone to lower expectations.
A researcher-friend recently complained to me that her research paper had been rejected because the reviewers considered it “boring”. I recommended that she complain to the editor-in-chief of the journal, because in my book, “boring” is not an acceptable reason to reject a paper. (“Trivial” may be, but that is a different issue.)
The reasoning behind rejecting a paper because it is “boring” is as follows: Research should be novel and provide new and perhaps even unintuitive insights. Results that are not surprising (in the mind of the reviewer, at least) are not novel and therefore not worthy of publication.
Traditional science has a clear idea of how research is to progress, rationally speaking: First you build a theory, for example, by observation, expert interviews, and the like, and then you generate hypotheses to test the theory. Over time, some theories will stand the test of time and will be considered valid.
Sadly, software engineering research today, even that published in top journals and conferences, often skips the theory building process and jumps straight to hypothesis testing. Vicky the Viking, from the TV series of the same name from way back, comes to mind: Out of the blue, some genius idea pops into the researcher’s mind. This idea forms the hypothesis to be tested in research.