Challenges to Open Collaborative Data Engineering [HICSS 2023]

Abstract: Open data is data that can be used, modified, and passed on, for free, similar to open-source software. Unlike open-source, however, there is little collaboration in open data engineering. We perform a systematic literature review of collaboration systems in open data, specifically for data engineering by users, taking place after data has been made available as open data. The results show that open data users perform a wide range of activities to acquire, understand, process and maintain data for their projects without established best practices or standardized tools for open collaboration. We identify and discuss technical, community, and process challenges to collaboration in data engineering for open data.

The Benefits of Pre-Requirements Specification Traceability [RE 2022]

Abstract: Requirements traceability is the ability to trace requirements to other software engineering artifacts. Traceability can be classified as either pre- or post-requirements specification (RS) traceability. Pre-RS traceability is the ability to trace between requirements and their origin. However, the benefits of pre-RS traceability are often not clear. In this article, we systematically lay out the benefits of pre-RS traceability. We present results from both a literature review and a qualitative survey of practitioners involved with documenting and utilizing such trace links. We find that the benefits strongly depend on the practitioners, their tasks, and the project environment. Awareness of these relationships supports a clearer understanding of the benefits of pre-RS traceability and thus motivates successful implementation of the required practices. The results of our research motivate the adoption of pre-RS traceability and present problem areas for future research.

Calculating the Costs of Inner Source Collaboration by Computing the Time Worked [HICSS 2022]

Abstract: A key part of taxation, controlling, and management of international collaborative programming workflows is determining the costs of a supplied software artifact. The OECD suggests the use of the Cost Plus method for calculating these costs. However, in the past, this method has been implemented using only coarse-grain data from the costs of whole organizational units. Due to the move to inner source software development, we need a much more fine-grain solution for computing the detailed time spent on programming specific components. This is necessary, because a more accurate work time distribution is required to fulfill the fiscal and administrative challenges posed by collaborating across organizational boundaries. In this article, we present a novel method to determine the time spent on an individual code contribution (commit) to a software component for use within cost calculation, especially for taxation purposes. We demonstrate the usefulness of our approach by application to a real-world data set gathered at a large multi-national corporation. We evaluate our work through feedback received from this corporation and from the German Ministry of Finance.

A Validation of QDAcity‑RE for Domain Modeling Using Qualitative Data Analysis [RE Journal]

Abstract: Using qualitative data analysis (QDA) to perform domain analysis and modeling has shown great promise. Yet, the evaluation of such approaches has been limited to single-case case studies. While these exploratory cases are valuable for an initial assessment, the evaluation of the efficacy of QDA to solve the suggested problems is restricted by the common single-case case study research design. Using our own method, called QDAcity-RE, as the example, we present an in-depth empirical evaluation of employing qualitative data analysis for domain modeling using a controlled experiment design. Our controlled experiment shows that the QDA-based method leads to a deeper and richer set of domain concepts discovered from the data, while also being more time efficient than the control group using a comparable non-QDA-based method with the same level of traceability.

Creating and Growing Healthy Community Open Source Projects [PLoP 2020]

Abstract: This article presents a succinct and minimal handbook of best practices of how to create and grow community open source projects. We start with the assumption that the handbook’s user has a minimal but useful piece of software at hand that they want to open source and build a community around.

Keywords: Open source, open source projects, open source communities, creating open source projects, growing open source projects

Reference: Riehle, D. (2020). Creating and Growing Community Open Source Projects. In Proceedings of the 27th Conference on Pattern Languages of Programs (PLoP 2020). ACM, 14 pages.

The paper can be downloaded as a PDF file.

Inner Source and Financial Compliance

Inner source is the use of open source practices within companies. Engineers generally love it, but any open-source-style collaboration across business unit boundaries will usually get stopped dead in its tracks by the financial compliance department. That’s because financial compliance is likely to worry that to the tax authorities such inner source collaboration will look like attempts at profit shifting.

Below, please find a 20-minute presentation on inner source and transfer pricing that I prepared for a workshop at the German Ministry of Finance. It is aimed at non-technical people.

You can also skim the slides, though the video offers significantly more information. Feel free to shoot any questions you might have my way.

How to Make Finding Inner Source Projects Easy

In 2006, we set up SAP forge to make finding and collaborating on inner source projects easy. The advice on how to design a forge or portal for this purpose hasn’t really changed over the years. The most important advice is:

Make the forge available at one place (and one place only) with a memorable URL like forge.acme.corp

The second most important advice is on the design of the home page of the forge. There are a couple of independent mechanisms that should be present. In order of descending importance (read: prominence of screen real estate given):

Continue reading “How to Make Finding Inner Source Projects Easy”

Industry Best Practices for Component Approval in Open Source Governance [EuroPLoP 2020]

Abstract: Increasingly, companies realize the value of using free/libre and open source software (FLOSS) in their products, but need to manage the associated risks. Leading companies introduce open source governance as a solution. A key aspect of corporate FLOSS governance deals with choosing and evaluating open source components for use in products. Following an industry-based research approach, we present 13 best practices in the pattern format of context-problem-solutions paired with consequences. In this paper, we cover an excerpt of the Component Approval section of our FLOSS governance handbook. This article builds upon our previous EuroPLoP publication covering Component Reuse in FLOSS governance processes, as well as other publications on the topic. Analyzing qualitative data gathered from 15 expert interviews, we derive and interconnect the common industry recommendations for reviewing, tracking, and approving open source components in a company environment. We conclude by presenting workflow templates that put various best practices in relation to each other.

Keywords: Commercial use of open source, component approval, FLOSS, FOSS, industry best practice, open source software, open source governance, pattern language

Reference: Harutyunyan, N. & Riehle, D. (2020). Industry Best Practices for Component Approval in FLOSS Governance. In Proceedings of the 25th European Conference on Pattern Languages of Programs (EuroPLoP ’20). ACM, article 33.

The paper can be downloaded as a PDF file.