Why are there only two research groups working on inner source?

I got asked the other day why there are only two research groups working on inner source world-wide. Inner source is the use of open source best practices within companies, and it is a hot topic with many companies that want to go beyond agile. There has been varied research around the world over the past 15 years, but only two groups have been working on this consistently: Brian’s group at LERO and my research group at FAU.

Continue reading “Why are there only two research groups working on inner source?”

On the misuse of the terms qualitative and quantitative research

Researchers often use the term “qualitative research” to mean research without substantial empirical data and “quantitative research” to mean research with substantial empirical data. That doesn’t make sense to me, as most “qualitative researchers” will be quick to point out, because qualitative research uses as much data, in as structured a way, as it can. Everything else would not be research.

Continue reading “On the misuse of the terms qualitative and quantitative research”

Why “soft” research is “hard”

Some of my colleagues like to talk about how research that involves programming is “hard”, while research that involves human subjects is “soft”. Similarly, some colleagues like to call exploratory (qualitative) research “soft” and confirmatory (quantitative) research “hard”. Soft and hard are often used as synonyms for easy and difficult, and this is plain wrong.

Pretty much any research worth its salt is difficult in some way, and working with human subjects makes it even more difficult. I find research methods like qualitative surveys, which involve interview analyses, for example, much harder than the statistical analysis of some data set or the design of some algorithm. The reason is the lack of immediate feedback.

While you can (and should) put quality assurance measures in place for your interview analyses, ranging from basic member checking to complex forms of triangulation, it might take a long time until you learn whether what you did was any good. So you have to focus hard on your analysis without knowing whether you are on the right track. It doesn’t get more difficult than this.

Research papers vs. blog posts

A short Twitter thread:

Scientific research papers cite other research papers for related or prior knowledge they build on; they cite blog posts as primary material to work with in theory building; two very different things 1/4

A blog post needs to (a) properly use the scientific method and (b) be socially validated by peer review to be considered a scientific research article; for (b) it needs a traditional journal 2/4

The problem with research papers is for-profit publishers and the pay-to-publish open access model; publishers are bottom feeding to increase revenue, hurting the reputation of science with low-quality papers 3/4

Please don’t start another journal unless you have an answer that solves the conflict between for-profit publishing and academic publishing incentives @timoreilly 4/4

Research Questions on Product Management of Open Source in Commercial Products

I’m seeking advice on how to frame the research question for a research project (Ph.D. thesis) on software product management and open source. The simple heuristic “non-differentiating -> open source it, competitively differentiating -> keep it closed” doesn’t cut it because of secondary effects like development efficiency resulting from open sourcing, market opportunities resulting from platform compatibility, etc.

The best I have come up with so far are three different but related questions:

  1. For a non-differentiating function, which open source component to use?
  2. For a chosen open source component, how to manage this dependency?
  3. For a competitively differentiating function, when to open source?

Questions 1 and 2 are well defined; question 3 remains unwieldy. The heuristic mentioned above would answer “never”, but as explained, this is not true: the overall competitive situation and compatibility considerations may still lead to open sourcing unique intellectual property.

I’m seeking comments as to how practitioners (or other researchers) would look at this question. Any comments are appreciated.

The Downside of the “Knowledge for Knowledge’s Sake” Argument

On the PBS NewsHour, Duke University biologist Sheila Patek just made a passionate plea for “why knowledge for the pure sake of knowing is good enough to justify scientific research,” using her own research into mantis shrimp as an example. While I support public funding for basic research, Patek’s argument is convoluted and ultimately harmful to her own case.

Continue reading “The Downside of the “Knowledge for Knowledge’s Sake” Argument”

Having Fun Thinking About AI Challenges

“AI” (or just smart algorithms, if you will, where smart will be plain in a few years and dumb in 10) is on the rise, no doubt about it. As a consequence, I’ve been having fun with “AI challenges” of the sort: could a computer figure this out? As an example, take a look at the advertisement below. It is for a conference of university chancellors in Germany (the administrative leaders of their universities). Could a computer figure out the disconnect between the depicted young people, presumably students, and the more advanced-in-years chancellors of their universities?

Continue reading “Having Fun Thinking About AI Challenges”

Internal vs. External Validity of Research Funding

So far, most of my research funding has been from industry. Sometimes, I have to defend myself against colleagues who argue that their public funding is somehow superior to my industry funding. This is only a sentiment; they have not been able to give any particular reason for their position.

I disagree with this assessment, and for good reason. The two types of funding are not comparable, and ideally you have both.

In research, there are several quality criteria, of which the so-called internal and external validity of a result are two important ones.

  • Internal validity, simplifying, is a measure of how consistent a result is within itself. Are there any contradictions within the result itself? If not, then you may have high internal validity (which is good).
  • External validity, simplifying, is a measure of how predictive or representative a result is of reality outside the original research data. If it is, your result may have high external validity (which is also good).

Research funded by public grants tends to have high internal validity but low external validity; research funded by industry, in contrast, tends to have high external validity but low internal validity. The following figure illustrates this point:

Continue reading “Internal vs. External Validity of Research Funding”

Fraudulent Publishers not Missing a Beat in 2015

Unbelievable. Just about everything in this call for papers and the website it links to screams fraud. However, it is so badly done that I can only assume that someone is turning the SCIgen experiment on its head.

Once Again Natural vs Engineering Sciences Struggling over Definitions #FSE2014

I’m in Hong Kong, attending FSE 2014. I had signed up for the Next-Generation Mining-Software-Repositories workshop at HKUST but missed it for (undisclosed) reasons. Apparently, there were two main topics of discussion:

  • Calls by colleagues to make mining work “useful” rather than “just” interesting
  • Calls by colleagues to build tools rather than “just” generate insight

Both issues are joined at the hip and are an expression of a struggle over the definition of “what is good science” in software engineering. As someone who started out as a student of physics, I have an idea of science that views “interesting insights” as useful in their own right: you don’t need to build a tool to show that your insight improves the world. On the other end is the classic notion of engineering science, where there is no (publishable) research if you don’t improve the world in some tangible way.

Continue reading “Once Again Natural vs Engineering Sciences Struggling over Definitions #FSE2014”