Do Engineering Researchers Care About Truth?

So ICSE, the top software engineering conference, rejected our paper, again. The reviewers were actually quite positive: high-quality work, few or no flaws, interesting. One of the reviewers found the paper’s results surprising, asked for more details, and suggested new research directions. The final conclusion of both reviews, however, was the same: The work has no merit because it only explains the world; it does not improve it.

Our paper provides a high-quality model of a key aspect of programming behavior in open source, basically the modeling behind this earlier empirical paper. As such, it is a descriptive empirical paper. It takes a large amount of data and provides an analytically closed model of that data so that we can better explain the data or predict the future. That’s pretty standard operating procedure in most of the natural and social sciences.


Developer Belief vs. Reality: The Case of the Commit Size Distribution

Abstract: The design of software development tools follows from what the developers of such tools believe is true about software development. A key aspect of such beliefs is the size of code contributions (commits) to a software project. In this paper, we show that what tool developers think is true about the size of code contributions is different by more than an order of magnitude from reality. We present this reality, called the commit size distribution, for a large sample of open source and selected closed source projects. We suggest that these new empirical insights will help improve software development tools by aligning underlying design assumptions closer with reality.

Reference: Dirk Riehle, Carsten Kolassa, Michel A. Salim. “Developer Belief vs. Reality: The Case of the Commit Size Distribution.” In Proceedings of Software Engineering 2012 (SE ’12). Springer Verlag, 2012.

The paper is available as a PDF file. The survey used in the paper is also available as a PDF file.

Business Risks and Governance of Open Source in Software Products (in German)

Title: Geschäftsrisiken und Governance von Open-Source in Softwareprodukten (Business Risks and Governance of Open Source in Software Products)

Abstract: Almost every software product today, including large standard software, contains open source components. The vendors of this software must understand and sensibly manage the business risks associated with integrating open source software into commercial products. This article presents a model of the various legal, technical, and social risks that arise from the uncontrolled use of open source software, and explains selected best practices of open source governance applied by leading companies. The model is the result of analyzing five interviews conducted with large German software vendors, along with additional literature research.


Call for Papers: OSS 2012

For your convenience, the OSS 2012 call for papers (I’m on the program committee).


THE 8th INTERNATIONAL CONFERENCE ON OPEN SOURCE SYSTEMS

Hammamet, Tunisia, 10-13 September 2012

Scope of OSS 2012

Over the past two decades, Free/Libre Open Source Software (FLOSS) has introduced new successful models for creating, distributing, acquiring and using software and software-based services. Inspired by the success of FLOSS, other forms of open initiatives have been gaining momentum. Open source systems (OSS) now extend beyond software to include open access, open documents, open science, open education, open government, open cloud, open hardware, open artworks and museum exhibits, open innovation and more. On the one hand, the openness movement has created new kinds of opportunities such as the emergence of new business models, knowledge exchange mechanisms, and collective development approaches. On the other hand, the movement has introduced new kinds of challenges, especially as different problem domains embrace openness as a pervasive problem solving strategy. OSS can be complex yet widespread and often cross-cultural. Consequently, they require an interdisciplinary understanding of their technical, economic, legal and socio-cultural dynamics.


Call for Papers: ECOOP 2012

For your convenience, the ECOOP 2012 call for papers (I’m on the program committee).


Call for Papers 征稿启事

The European Conference on Object-Oriented Programming (ECOOP) is the premier international conference covering all areas of object technology and related software development technologies. ECOOP 2012 will take place 11-16 June 2012 in Beijing, China, only the second time ECOOP has been held outside Europe. ECOOP 2012 embraces a broad range of topics related to object-orientation, including:


Call for Papers: Software Product Lines (SPLC 2012)

For your convenience, the SPLC 2012 call for papers (I’m on the program committee).


Call for Contributions (SPLC 2012)

We invite the following classes of contributions:

  1. Research papers (max. 10 pages; 5 for short papers) describe original research contributions (theoretical, conceptual) to the field of software product line engineering. We also call for short research papers, which are intended to report ideas in their early stages. Submission deadline: Feb. 20th, 2012.

Design and Implementation of the Sweble Wikitext Parser: Unlocking the Structured Data of Wikipedia

Abstract: The heart of each wiki, including Wikipedia, is its content. Most machine processing starts and ends with this content. At present, such processing is limited, because most wiki engines today cannot provide a complete and precise representation of the wiki’s content. They can only generate HTML. The main reason is the lack of well-defined parsers that can handle the complexity of modern wiki markup. This applies to MediaWiki, the software running Wikipedia, and most other wiki engines. This paper shows why it has been so difficult to develop comprehensive parsers for wiki markup. It presents the design and implementation of a parser for Wikitext, the wiki markup language of MediaWiki. We use parsing expression grammars where most parsers used no grammars or grammars poorly suited to the task. Using this parser it is possible to directly and precisely query the structured data within wikis, including Wikipedia. The parser is available as open source from http://sweble.org.

Keywords: Wiki, Wikipedia, Wiki Parser, Wikitext Parser, Parsing Expression Grammar, PEG, Abstract Syntax Tree, AST, WYSIWYG, Sweble.

Reference: Hannes Dohrn and Dirk Riehle. “Design and Implementation of the Sweble Wikitext Parser: Unlocking the Structured Data of Wikipedia.” In Proceedings of the 7th International Symposium on Wikis and Open Collaboration (WikiSym 2011). ACM Press, 2011.

The paper is available as a PDF file (preprint).

Technical Report on WOM: An Object Model for Wikitext

Abstract: Wikipedia is a rich encyclopedia that is not only of great use to its contributors and readers but also to researchers and providers of third party software around Wikipedia. However, Wikipedia’s content is only available as Wikitext, the markup language in which articles on Wikipedia are written, and whoever needs to access the content of an article has to implement their own parser or has to use one of the available parser solutions. Unfortunately, those parsers which convert Wikitext into a high-level representation like an abstract syntax tree (AST) define their own format for storing and providing access to this data structure. Further, the semantics of Wikitext are only defined implicitly in the MediaWiki software itself. This situation makes it difficult to reason about the semantic content of an article or exchange and modify articles in a standardized and machine-accessible way. To remedy this situation we propose a markup language, called XWML, in which articles can be stored and an object model, called WOM, that defines how the contents of an article can be read and modified.

Keywords: Wiki, Wikipedia, Wikitext, Wikitext Parser, Open Source, Sweble, MediaWiki, MediaWiki Parser, XWML, HTML, WOM

Reference: Hannes Dohrn and Dirk Riehle. WOM: An Object Model for Wikitext. University of Erlangen, Technical Report CS-2011-05 (July 2011).

The technical report is available as a PDF file.

On the Open Cloud Principles: Every Real-World Specification is an Underspecification

Trying to wrap my head around the Open Cloud Principles put out by the revamped Open Cloud Initiative, I’m happy to note that software engineering research has something to say about the challenges these principles will face.

Every real-world specification is an underspecification.

So, well, I say that, but I doubt that I’m the first one to have learned this from 30+ years of software engineering research. This principle leads us directly to the challenges facing anyone who is trying to stay true to the intentions behind the Open Cloud Principles.
