Why Open Source is Good for German Software Businesses

I’m on the expert advisory committee of one of the German parties for the current “Internet Enquete”, a commission tasked by the German parliament with suggesting future directions for Germany’s stance toward the Internet and everything digital. At a meeting this evening, a lobbyist confided in me: “Open source is bad for German software vendors!” I gasped. He couldn’t be further from the truth. If this were mechanical engineering or electrical engineering, he’d be right. ME? EE? Germany is top. Software? Not so. Beyond a few selected highlights, Germany is an also-ran internationally. When it comes to software product businesses, German companies would benefit significantly if the dice were rolled again. Anything that upsets the current order can only be an improvement. Open source does just that. More power to open source business models!

Developer Belief vs. Reality: The Case of the Commit Size Distribution

Abstract: The design of software development tools follows from what the developers of such tools believe is true about software development. A key aspect of such beliefs is the size of code contributions (commits) to a software project. In this paper, we show that what tool developers think is true about the size of code contributions is different by more than an order of magnitude from reality. We present this reality, called the commit size distribution, for a large sample of open source and selected closed source projects. We suggest that these new empirical insights will help improve software development tools by aligning underlying design assumptions closer with reality.

Reference: Dirk Riehle, Carsten Kolassa, Michel A. Salim. “Developer Belief vs. Reality: The Case of the Commit Size Distribution.” In Proceedings of Software Engineering 2012 (SE ’12). Springer Verlag, 2012.

The paper is available as a PDF file. The survey used in the paper is also available as a PDF file.
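
To make the notion of a commit size distribution concrete, here is a minimal sketch. It assumes that commit size is measured as a single line count per commit (for example, lines added plus lines removed) and that sizes are grouped into order-of-magnitude buckets. The function name, the bucket edges, and the sample numbers are hypothetical and for illustration only; they are not the measure or the data used in the paper.

```python
from collections import Counter


def commit_size_distribution(sizes, bucket_edges):
    """Return the share of commits that fall into each [low, high) bucket.

    Sizes outside all buckets are ignored (an assumption of this sketch).
    """
    buckets = list(zip(bucket_edges, bucket_edges[1:]))
    counts = Counter()
    for size in sizes:
        for low, high in buckets:
            if low <= size < high:
                counts[(low, high)] += 1
                break
    total = sum(counts.values()) or 1  # avoid division by zero on empty input
    return {bucket: counts[bucket] / total for bucket in buckets}


# Hypothetical commit sizes in lines of code, not data from the paper.
sample_sizes = [1, 3, 4, 8, 12, 15, 40, 95, 300, 2500]
edges = [1, 10, 100, 1000, 10000]
dist = commit_size_distribution(sample_sizes, edges)

for (low, high), share in dist.items():
    print(f"{low:>5} to {high - 1:>5} lines: {share:.0%}")
```

Comparing such an empirical distribution with the commit sizes that survey respondents expect is one simple way to put belief and reality side by side.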

Business Risks and Governance of Open Source in Software Products (in German)

Title: Geschäftsrisiken und Governance von Open-Source in Softwareprodukten (Business Risks and Governance of Open Source in Software Products)

Summary: Almost every software product today, including large standard software, contains open source components. The vendors of this software must understand and sensibly manage the business risks associated with integrating open source software into commercial products. This article presents a model of the various legal, technical, and social risks that arise from the uncontrolled use of open source software and explains selected best practices of open source governance applied by leading companies. The model is the result of analyzing five interviews conducted with large German software vendors, along with additional literature research.


Call for Papers: OSS 2012

For your convenience, here is the OSS 2012 call for papers (I’m on the program committee).


THE 8th INTERNATIONAL CONFERENCE ON OPEN SOURCE SYSTEMS

Hammamet, Tunisia, 10-13 September 2012

Scope of OSS 2012

Over the past two decades, Free/Libre Open Source Software (FLOSS) has introduced new successful models for creating, distributing, acquiring and using software and software-based services. Inspired by the success of FLOSS, other forms of open initiatives have been gaining momentum. Open source systems (OSS) now extend beyond software to include open access, open documents, open science, open education, open government, open cloud, open hardware, open artworks and museum exhibits, open innovation and more. On the one hand, the openness movement has created new kinds of opportunities such as the emergence of new business models, knowledge exchange mechanisms, and collective development approaches. On the other hand, the movement has introduced new kinds of challenges, especially as different problem domains embrace openness as a pervasive problem solving strategy. OSS can be complex yet widespread and often cross-cultural. Consequently, they require an interdisciplinary understanding of their technical, economic, legal and socio-cultural dynamics.


Cloud Computing is not a Business Model

I’m at the Dagstuhl Seminar “Information Management in the Cloud”, where I gave a keynote on cloud computing business models. Given that I’m hardly a cloud computing expert, this may seem like a stretch. However, the organizers had asked me to talk about my open source experience and relate it to cloud computing. This perspective turned out to be surprisingly fruitful. By realizing that both open source and cloud computing are disruptive innovations that enable a new generation of business models, I believe I was able to draw reasonable conclusions about the future of cloud computing from the history of open source. I reason by analogy, and here are the main conclusions:

Continue reading

Design and Implementation of the Sweble Wikitext Parser: Unlocking the Structured Data of Wikipedia

Abstract: The heart of each wiki, including Wikipedia, is its content. Most machine processing starts and ends with this content. At present, such processing is limited, because most wiki engines today cannot provide a complete and precise representation of the wiki’s content. They can only generate HTML. The main reason is the lack of well-defined parsers that can handle the complexity of modern wiki markup. This applies to MediaWiki, the software running Wikipedia, and most other wiki engines. This paper shows why it has been so difficult to develop comprehensive parsers for wiki markup. It presents the design and implementation of a parser for Wikitext, the wiki markup language of MediaWiki. We use parsing expression grammars where most parsers used no grammars or grammars poorly suited to the task. Using this parser it is possible to directly and precisely query the structured data within wikis, including Wikipedia. The parser is available as open source from http://sweble.org.

Keywords: Wiki, Wikipedia, Wiki Parser, Wikitext Parser, Parsing Expression Grammar, PEG, Abstract Syntax Tree, AST, WYSIWYG, Sweble.

Reference: Hannes Dohrn and Dirk Riehle. “Design and Implementation of the Sweble Wikitext Parser: Unlocking the Structured Data of Wikipedia.” In Proceedings of the 7th International Symposium on Wikis and Open Collaboration (WikiSym 2011). ACM Press, 2011.

The paper is available as a PDF file (preprint).
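
To illustrate the idea behind parsing expression grammars mentioned in the abstract, here is a minimal hand-written sketch of PEG-style ordered choice for a tiny subset of wiki markup (bold, italic, internal links, plain text). It is not the Sweble grammar or API; all function names are hypothetical, and real Wikitext is far more complex than these four rules.

```python
def _delimited(kind, open_tok, close_tok):
    """Build a rule matching open_tok, a non-empty body, then close_tok."""
    def rule(src, pos):
        if not src.startswith(open_tok, pos):
            return None
        start = pos + len(open_tok)
        end = src.find(close_tok, start)
        if end <= start:  # no closing token (-1) or empty body
            return None
        return (kind, src[start:end]), end + len(close_tok)
    return rule


_bold = _delimited("bold", "'''", "'''")
_italic = _delimited("italic", "''", "''")
_link = _delimited("link", "[[", "]]")


def _text(src, pos):
    """Fallback rule: consume at least one character, then stop before markup."""
    end = pos + 1
    while end < len(src) and not src.startswith(("''", "[["), end):
        end += 1
    return ("text", src[pos:end]), end


def parse_inline(src):
    """Parse src into a flat list of (node_type, content) tuples."""
    nodes, pos = [], 0
    while pos < len(src):
        # Ordered choice: try the rules in order, the first match wins.
        for rule in (_bold, _italic, _link, _text):
            result = rule(src, pos)
            if result is not None:
                node, pos = result
                nodes.append(node)
                break
    return nodes


ast = parse_inline("See [[Wikipedia]] for '''bold''' and ''italic'' text.")
for node in ast:
    print(node)
```

Because the bold rule is tried before the italic rule, the three-apostrophe token is never misread as an italic opener when a bold span matches. The fallback text rule always consumes at least one character, so unmatched markup ends up as plain text rather than causing a parse failure.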

Technical Report on WOM: An Object Model for Wikitext

Abstract: Wikipedia is a rich encyclopedia that is not only of great use to its contributors and readers but also to researchers and providers of third party software around Wikipedia. However, Wikipedia’s content is only available as Wikitext, the markup language in which articles on Wikipedia are written, and whoever needs to access the content of an article has to implement their own parser or has to use one of the available parser solutions. Unfortunately, those parsers which convert Wikitext into a high-level representation like an abstract syntax tree (AST) define their own format for storing and providing access to this data structure. Further, the semantics of Wikitext are only defined implicitly in the MediaWiki software itself. This situation makes it difficult to reason about the semantic content of an article or exchange and modify articles in a standardized and machine-accessible way. To remedy this situation we propose a markup language, called XWML, in which articles can be stored and an object model, called WOM, that defines how the contents of an article can be read and modified.

Keywords: Wiki, Wikipedia, Wikitext, Wikitext Parser, Open Source, Sweble, MediaWiki, MediaWiki Parser, XWML, HTML, WOM

Reference: Hannes Dohrn and Dirk Riehle. WOM: An Object Model for Wikitext. University of Erlangen, Technical Report CS-2011-05 (July 2011).

The technical report is available as a PDF file.

On the Open Cloud Principles: Every Real-World Specification is an Underspecification

While trying to wrap my head around the Open Cloud Principles put out by the revamped Open Cloud Initiative, I’m happy to note that software engineering research has something to say about the challenges these principles will face.

Every real-world specification is an underspecification.

So, well, I say that, but I doubt that I’m the first to have learned this from 30+ years of software engineering research. This principle leads us directly to the challenges faced by anyone trying to stay true to the intentions behind the Open Cloud Principles.


Controlling and Steering Open Source Projects

The IEEE just published a short version of the “control points and steering mechanisms” article. Here is the abstract. Please see the original for more details.

Abstract: Open source software has become an important part of the software business. In a 2009 survey, Forrester Research found that 46 percent of all responding enterprises were using or implementing open source software. Moreover, in 2009, the Gartner Group estimated that by 2012, at least 80 percent of all software product firms will use open source software. Thus, it’s important to understand how software product firms depend on open source and how they manage that dependency to meet their business goals. There are three main types of software product firms. [...]
