In this post, I summarize the work done during all the coding periods of the Google Summer of Code 2020 program, along with the results and future work. This post also serves as the final report for the project and covers all my contributions during the program.
It has been more than a year since I got involved with CHAOSS. I applied for GSoC last year too, but I didn't get selected. I continued contributing to the organization even after that. I was in the final year of my Bachelor's, so this was probably my last chance to apply for the program. I decided to stick with the same org and started working early.
The mentors (Valerio) were supportive during the application period, and I submitted the proposal for the QM idea (287). On May 4, the results were out, and I got selected for the project. I prepared myself for an exciting summer ahead.
You can read more about it here: GSoC acceptance.
GrimoireLab is a powerful open-source platform that provides support for monitoring and in-depth analysis of software projects. It produces a rich set of dashboards, which can be easily inspected by decision-makers to help them understand the evolution and health of their projects. Despite the large set of dashboards available in GrimoireLab, comparing projects with each other is not straightforward, since it requires navigating and drilling down into the data across different dashboards.
On the other hand, Quality Models are well-known tools useful to evaluate a project, since they allow decision-makers to understand its status without diving into low-level details. Nevertheless, due to the lack of tooling and missing properties not covered by the standard quality models, operationalizing a quality model and applying it to a specific scenario may turn into a difficult experience.
Prosoul is an open-source tool that lets users create, manage, and visualize their quality models, operationalizing the latter with software metrics coming from a set of target projects. Prosoul's quality model is composed of coarse-level goals (e.g., sustainability), each one defined as a set of fine-grained attributes (e.g., activity, community), which can be derived from one or more low-level metrics (e.g., number of commits, number of active users).
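To make the goal, attribute, and metric hierarchy concrete, here is a minimal sketch in Python. The class names and the `metric_names` helper are hypothetical and only illustrate the nesting described above; they are not Prosoul's actual data model or file format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Metric:
    name: str  # low-level metric, e.g. "number of commits"


@dataclass
class Attribute:
    name: str  # fine-grained attribute, e.g. "activity"
    metrics: List[Metric] = field(default_factory=list)


@dataclass
class Goal:
    name: str  # coarse-level goal, e.g. "sustainability"
    attributes: List[Attribute] = field(default_factory=list)


@dataclass
class QualityModel:
    name: str
    goals: List[Goal] = field(default_factory=list)

    def metric_names(self) -> List[str]:
        """Flatten the hierarchy into the list of metric names it relies on."""
        return [
            metric.name
            for goal in self.goals
            for attribute in goal.attributes
            for metric in attribute.metrics
        ]


# Hypothetical model built from the examples mentioned in the text.
model = QualityModel(
    name="Demo",
    goals=[
        Goal(
            "sustainability",
            [
                Attribute("activity", [Metric("number of commits")]),
                Attribute("community", [Metric("number of active users")]),
            ],
        )
    ],
)
```

Flattening the model this way shows which low-level metrics the data side has to provide before an assessment can run.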
The main aim of the project is to design an approach to shape the GrimoireLab data in a format that can easily be consumed by Prosoul and implement it on the data obtained from a few data sources like git, github and mailing list repositories to obtain simple quality models.
Before the project started, I didn't have much understanding of how Prosoul works, although I had good experience working with GrimoireLab and its components. So, I spent some time learning how Prosoul performs assessments based on the quality models and the CROSSMINER data.
I had to design a data structure that Prosoul could consume easily without much change to its architecture. I took the CROSSMINER data already supported by Prosoul as a reference and came up with an initial data format, which was refined over time.
I worked on GrimoireLab-ELK and created a few enrichers (gitqm and a few metrics in gitlabqm) that generate the enriched data from the raw data in the required format. I integrated them with the toolchain.
I performed a few test assessments using the enriched data and demo quality models built from the implemented metrics. I also worked on a dashboard that shows different visualizations of the enriched data, to get a better understanding of it.
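As a rough illustration of what such an enricher does, the sketch below turns raw commit timestamps into one metric document per month. The function name and the field names (`project`, `metric_name`, `interval`, `metric_value`) are assumptions made for this example and are not the actual format used by the gitqm enricher.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, Iterable, List


def commits_per_month(project: str, commit_dates: Iterable[str]) -> List[Dict]:
    """Aggregate raw commit dates (ISO 8601 strings) into monthly metric documents."""
    counts = Counter(
        datetime.fromisoformat(date).strftime("%Y-%m") for date in commit_dates
    )
    # One document per interval, sorted chronologically.
    return [
        {
            "project": project,
            "metric_name": "number of commits",  # hypothetical field names
            "interval": month,
            "metric_value": counts[month],
        }
        for month in sorted(counts)
    ]


docs = commits_per_month(
    "demo",
    ["2020-06-01T10:00:00", "2020-06-15T09:30:00", "2020-07-02T12:00:00"],
)
```

Keeping one flat document per project, metric, and interval is what would make the data easy to aggregate later in an assessment or a dashboard.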
I made the results dashboard. The assessment results are stored as indexes, so I took inspiration from the existing CROSSMINER dashboard and built a new dashboard on top of them. It has bar and heat map visualizations. I performed the first pilot study on the projects of amFOSS, which included two data sources, git & gitlab, and used the Developer Quality Model.
We discussed the metrics of the enrichers, and I completed implementing them in the respective enrichers (gitlabqm, pipermailqm, and meetup). I also worked on adding tests to the existing enrichers.
We decided to perform the second pilot study on GitLab.org, involving the gitlab & meetup data sources and a customized Developer Quality Model. As the projects were not related, I had to perform two separate studies.
After the study, we found a few implementation flaws in the enrichers while handling large repositories. I spent some time debugging the issues and fixed them.
We checked the results of the pilot study and evaluated the usage of the existing visualizations. We decided to remove the project pie chart visualization.
We planned to add support for another data source, github. It would have the same metrics as gitlab, so I worked on the githubqm enricher and implemented all the metrics.
With the github enricher ready, we planned a third pilot study on the CHAOSS software projects. It involved the git & github data sources and used the Developer Quality Model.
Each metric has a characteristic called a threshold, which determines the score assigned to the project for that metric. I analyzed the data range of each metric and chose a specific threshold for each one.
From this study, we found a limitation of the approach: the results may not be accurate for forked projects. Kibiter is a soft-fork of the Kibana project, so it contains the commits from the upstream project too. In this case, the evaluation can differ from the actual work done on Kibiter.
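One way to picture threshold-based scoring is sketched below. It assumes that a metric carries an ordered list of threshold values and that the score is the number of thresholds the measured value reaches. Both the rule and the numbers are illustrative assumptions, not necessarily how Prosoul computes its scores.

```python
from bisect import bisect_right
from typing import Sequence


def score(value: float, thresholds: Sequence[float]) -> int:
    """Return how many thresholds `value` has reached (0 .. len(thresholds))."""
    # bisect_right counts the thresholds that are <= value.
    return bisect_right(sorted(thresholds), value)


# Hypothetical thresholds for a "number of commits" metric.
commit_thresholds = [10, 50, 100, 500, 1000]
low = score(5, commit_thresholds)      # below every threshold
mid = score(75, commit_thresholds)     # reaches 10 and 50
high = score(2000, commit_thresholds)  # reaches all five
```

Under this assumption, picking thresholds from the observed data range matters: if every project falls below the first threshold or above the last one, all projects get the same score and the comparison tells you nothing.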
Towards the end, I worked on the documentation of the project. The right places for it were the Prosoul repository and the qm folder in GrimoireLab-ELK. I revamped the "Prosoul with GrimoireLab" doc and updated it with the latest workflow of the project.
Even after GSoC ends, I look forward to continuing to work with the community and contribute to GrimoireLab.
I created a project tracker, vchrombie/gsoc, for storing the information about the project. We had a meeting every Thursday at the #grimoirelab channel on Freenode IRC. You can find the meeting logs in the meetings directory. You can find the weekly work reports in the work directory. All the discussions happened using the GitHub issues, and I used the project board to track the progress of the work.
You can check out all my weekly blog updates on my blog, GSoC related posts.
Please feel free to comment below if you have any opinions/suggestions. :)