How the Lack of Theory Building in Software Engineering Research is Hurting Us

Traditional science has a clear idea of how research is to progress, rationally speaking: First you build a theory, for example, by observation, expert interviews, and the like, and then you generate hypotheses to test the theory. Over time, some theories will stand the test of time and will be considered valid.

Sadly, most software engineering research today, even that published in top journals and conferences, often skips the theory-building process and jumps straight to hypothesis testing. Vicky the Viking, from the TV series of the same name from way back, comes to mind: Out of the blue, some genius idea pops into the researcher’s mind. This idea forms the hypothesis to be tested in research.


On the Misuse of the Terms Qualitative and Quantitative Research

Researchers often use the term “qualitative research” to mean research without substantial empirical data, and use “quantitative research” to mean research with substantial empirical data. That doesn’t make sense to me, and most “qualitative researchers” will quickly point out why: qualitative research uses as much data as it can, and it does so in a structured way. Anything else would not be research.


Why “Soft” Research is “Hard”

Some of my colleagues like to talk about how research that involves programming is “hard”, while research that involves human subjects is “soft”. Similarly, some colleagues like to call exploratory (qualitative) research “soft” and confirmatory (quantitative) research “hard”. Soft and hard are often used as synonyms for easy and difficult, and this is plain wrong.

Pretty much any research worth its salt is difficult in some way, and working with human subjects makes it even more difficult. I find research methods like qualitative surveys, involving interview analyses, for example, much harder than the statistical analysis of some data or some algorithm design. The reason is the lack of immediate feedback.

While you can (and should) put quality assurance measures in place for your interview analyses, ranging from basic member checking to complex forms of triangulation, it might take a long time until you learn whether what you did was any good. So you have to focus hard in your analysis without knowing whether you are on the right track. It doesn’t get more difficult than this.

Research Papers vs. Blog Posts

A short Twitter thread:

Scientific research papers cite other research papers for related or prior knowledge they build on; they cite blog posts as primary material to work with in theory building; two very different things 1/4

A blog post needs to (a) properly use the scientific method and (b) be socially validated by peer review to be considered a scientific research article; for (b) it needs a traditional journal 2/4

The problem with research papers is for-profit publishers and the pay-to-publish open access model; publishers are bottom feeding to increase revenue, hurting the reputation of science with low quality papers 3/4

Please don’t start another journal unless you have an answer that solves the conflict between for-profit publishing and academic publishing incentives @timoreilly 4/4

Challenges to Making Software Engineering Research Relevant to Industry

I just attended FSE 2016, a leading academic conference on software engineering research. As is en vogue, it had a session on why so much software engineering research seems so removed from reality. One observation was that academics toil in areas of little interest to practice, publishing one incremental paper of little relevance after another. Another observation was that as empirical methods have taken hold, much research has become as rigorous as it has become irrelevant.

My answer to why so much software engineering research is irrelevant to practice is as straightforward as it is hard to change. The problem rests in the interlocking of three main forces that conspire to keep academics away from doing interesting and ultimately impactful research. These forces are:

  • Academic incentive system
  • Access to relevant data
  • Research methods competence


Lost Over Call for Open Access for all Scientific Papers

I’m at a loss over the recent reports on the requirement for all research publications to be open access by 2020. Open access means that research papers are openly accessible without a fee. There are plenty of confusing if not outright wrong statements in the press, but I’m less concerned with poor journalism than with the actual proposed policies.

Sadly, I couldn’t find more than this one sentence on page 12 of the report linked to from the meeting’s website:

Delegations committed to open access to scientific publications as the option by default by 2020.

I’d like to understand what this means and then how this is supposed to work. Specifically, I’d like to know how this is not going to either break free enterprise or make predatory publishers like Elsevier laugh all the way to the bank.


Follow-up on the Discussions about Knowledge for Knowledge’s Sake

I’ve been enjoying the discussion around Patek’s recent video argument for knowledge for knowledge’s sake in several forums. I thought I’d summarize my arguments here. To me it all looks pretty straightforward.

From a principled stance, when it comes to funding research, it is the funder’s prerogative whom to fund. Often, grant proposals (funding requests) exceed available funds, so the funder needs to rank-order the grant proposals and typically will fund those ranked highest until the funds are exhausted. A private funder may use whatever criteria they deem appropriate. Public funding, i.e. taxpayer money, is trickier, as it is typically government agencies that set the policies by which funding proposals for a particular fund are rank-ordered. It seems rather obvious to me that taxpayer money should be spent on something that benefits society. Hence, a grant proposal must promise some of that benefit. How it does this can vary. I see at least two dimensions along which to argue: Immediacy (or risk) and impact. Something that believably provides benefits sooner is preferable to something that provides benefits later. Something that believably promises a higher impact is preferable to something that promises a lower impact.

Thus, research that promises to cure cancer today is preferable to research that explains why teenage girls prefer blue over pink on Mondays and are generally unapproachable that day. Which is not to say that the teenage girl question won’t get funded: Funders and funding are broad and deep, and for everything that public agencies won’t fund there is a private funder whose pet project would be solving that question.

The value of research is always relative, never absolute, and always to be viewed within a particular evaluation framework.


The Downside of the “Knowledge for Knowledge’s Sake” Argument

On the PBS Newshour, Duke University biologist Sheila Patek just made a passionate plea for “why knowledge for the pure sake of knowing is good enough to justify scientific research”, using her own research into mantis shrimp as an example. While I support public funding for basic research, Patek makes an argument that is convoluted and ultimately harmful to her own case.


Why You Should Not Cite Research Work on Wikipedia That is Not Freely Available

I recommend that Wikipedia articles not reference research papers that are not freely available, just as research papers should not cite research work that is not freely available. Anyone who cites non-open-access, non-free research bases their work and argument on materials not accessible to the vast majority of people on this planet. By doing so, authors exclude almost everyone else from verifying and critiquing their work. They thereby stop science and progress dead in their tracks.

My advice is that authors need to understand that non-open-access, non-free research articles have not been published; they have been buried behind a paywall. With the vast majority of people not having access to such paid-for materials, any such buried article is not a contribution to the progress of science and should be ignored.
