Springer Verlag, through its inability to edit manuscripts properly, has been a royal pain in my butt for a long time. In the most egregious example, one of their editors changed the title of what was the crowning paper of many years of research work. He turned “open source” into “open course”, completely altering the focus of the paper as suggested by the title. I was not given a final chance to proofread after that change and only found out about it when I saw the paper on their website. When I complained, Springer steadfastly refused to change the title back to the original wording and only filed an Erratum, which everybody thereafter of course ignored.
Abstract: Detecting and understanding changes between document revisions is an important task. The acquired knowledge can be used to classify the nature of a new document revision or to support a human editor in the review process. While purely textual change detection algorithms offer fine-grained results, they do not understand the syntactic meaning of a change. By representing structured text documents as XML documents, we can apply tree-to-tree correction algorithms to identify the syntactic nature of a change. Many algorithms for change detection in XML documents have been proposed, but most of them focus on the intricacies of generic XML data and emphasize speed over the quality of the result. Structured text requires a change detection algorithm to pay close attention to the content in text nodes; however, recent algorithms treat text nodes as black boxes. We present an algorithm that combines the advantages of the purely textual approach with the advantages of tree-to-tree change detection by redistributing text from non-overlapping common substrings to the nodes of the trees. This allows us to spot changes not only in the structure but also in the text itself, thus achieving higher quality and a fine-grained result in linear time on average. The algorithm is evaluated by applying it to the corpus of structured text documents that can be found in the English Wikipedia.
Keywords: XML, WOM, structured text, change detection, tree matching, tree differencing, tree similarity, tree-to-tree correction, diff
Reference: Hannes Dohrn, Dirk Riehle. “Fine-grained Change Detection in Structured Text Documents.” In Proceedings of the 2014 Symposium on Document Engineering (DocEng 2014). To appear.
The paper is available as a PDF file.
A German court ordered Uber to stop offering its taxi services (for now). The argument was as expected: Uber taxi drivers and cars are not fit for the job. This is definitely the right decision under the assumption that the German taxi approval rules make sense. Even if the court decision stands, this is not the end of Uber (nor of Lyft, AirBnB, or Wimdu, companies with the same or similar business models).
I see two distinct innovations in Uber’s model:
- Higher service quality (mostly improved convenience) through the Uber app (and the system behind it)
- Lower costs of operations by utilizing drivers and cars who couldn’t or wouldn’t become taxi drivers
I think #1 is a justified and sustainable advantage: It simply is easier to use an app than the phone, and the system behind the app is more efficient and feedback-based. If Uber were a regular taxi service accessible through this app, it would already kill the market.
In the most recent CACM editor’s letter, Moshe Vardi, the CACM’s editor-in-chief, addresses the question of open access from the perspective of the ACM. The ACM is a non-profit organization for (mostly) computer scientists, and a publisher of conference proceedings and journals.
I find the editorial rather disconcerting. Vardi views “the open access movement” as being in “the IP communist camp”. There are so many things wrong with this terminology. For one, I didn’t know there was one open access movement; I see many different streams of activity. Then, using 19th-century terminology like communists and capitalists isn’t really going to help either; if meant as a provocation it probably achieves its goal, but to what end does this provocation help us? I’m a proponent of open access and most certainly don’t consider myself an “IP communist”. Finally, by pigeonholing well-intentioned efforts as a communist endeavor, the editorial wholly ignores the struggle for new and innovative models of publishing research.
Call for submissions to the Software Engineering Ideas Track of SE 2015. (I am the program committee chair.)
About SE 2015 and the SEI Track
The Software Engineering conference is the most important annual meeting of the software engineering community in the German-speaking countries. A particular focus of SE 2015 is the topic “Safe and secure cyber-physical systems”.
The goal of the Software Engineering Ideas Track is to provide a forum for presenting promising ideas and innovations in software engineering that have not yet been fully implemented or evaluated. Submissions may present a research idea, first results of a dissertation, or a formatively conducted case study; in each case, the connection to a future field of research, a novel tool, a new method, or a novel collaboration with other disciplines should be apparent.
I’ll be keynoting the European Conference on Software Engineering Education on Nov 28, 2014, at 11:00, at Seeon Monastery, Germany. Here is the abstract. See you at the conference!
Over the last few years, we have shifted most of our courses from traditional upfront lecturing to project-based learning. Each course consists of multiple projects with three main stakeholders: students, teachers, and industry. Using AMOS, our “agile methods and open source” software engineering course as the example, we review our course concept and discuss our experiences. We take the perspectives of the three stakeholders in turn: Achieving learning goals and performing meaningful work (students), fulfilling both an educational and an economic mission (university), and receiving a return on time and monetary investment (industry). The perhaps surprising result is that these three perspectives can work together well and make reaching each stakeholder’s goal easier.
We recently discussed our approach on my research group’s website as the “Lehrkonzept der Praktischen Softwaretechnik an der FAU” (in German).
I’ll be keynoting the 16th KKIO Software Engineering Conference on Sept 22, 2014, in Poznań, Poland. Here is the abstract. See you at the conference!
MySQL was sold for one billion US dollars. Red Hat is worth a multiple of that. The Eclipse Foundation has pushed many software tool vendors out of business. How come open source, a phenomenon dubbed “temporary”, has not only become sustainable but also the business strategy of choice? In this talk, I discuss the four main business models, two for-profit and two not-for-profit, that have made open source sustainable. These models are changing the business of software and the future of our industry.