Now that Hacktoberfest 2022 is over, it’s time to celebrate our contributors, look back at what we’ve achieved together, and share what we’ve learned during this month of giving back to the community through contributing to open source.
What is Hacktoberfest?
Hacktoberfest is an annual event, organized by DigitalOcean, that celebrates open source and encourages contributions. For the 9th year in a row, the digital community came together for the whole of October to contribute to open-source projects. Whether you are a first-timer or an experienced contributor, Hacktoberfest is an excellent reason to open a pull request! You can read more about the event here.
How a data scientist can contribute to open source
Though many Data Scientists are eager to contribute to open source, they often have fewer opportunities due to the nature of contributor tasks. There are simply fewer bugs or issues that require expertise in data, algorithms, or statistics than ones that call for broader software engineering skills.
For Hacktoberfest, we prepared special issues allowing Data Scientists to play to their strengths and dip their toes into open-source contribution. We asked contributors to help us add new statistical metrics and tests to detect data drift for production ML models. We developed guidelines and clear examples to ensure that both first-timers and seasoned contributors were welcome.
Call for contributions: data drift methods
We invited contributors to help us add new data drift metrics and tests to the open-source Evidently library. In addition to tests already available in the library, we proposed five more methods to add:
- Anderson-Darling stat test
- Cramér-von-Mises (CVM) stat test
- Fisher’s Exact test
- Hellinger distance
- 2-sample t-test for means comparison
We also encouraged contributors to be creative and opened a bonus issue for any other drift detection method they find useful and prefer over existing methods in their daily practice.
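To give a feel for what one of the methods listed above involves, here is a minimal sketch of Hellinger distance for a categorical feature, written with the Python standard library only. It is an illustration, not the implementation that was merged into Evidently: the function name `hellinger_distance`, the sample data, and the 0.1 threshold are all assumptions made for this example.

```python
import math
from collections import Counter


def hellinger_distance(reference, current):
    """Hellinger distance between two samples of categorical values.

    Returns a value in [0, 1]: 0 when both samples have the same
    category frequencies, 1 when they share no categories at all.
    """
    if not reference or not current:
        raise ValueError("both samples must be non-empty")

    ref_counts = Counter(reference)
    cur_counts = Counter(current)
    categories = set(ref_counts) | set(cur_counts)

    # Sum of squared differences between the square roots
    # of the relative frequencies in each sample.
    total = 0.0
    for category in categories:
        p = ref_counts[category] / len(reference)
        q = cur_counts[category] / len(current)
        total += (math.sqrt(p) - math.sqrt(q)) ** 2

    return math.sqrt(total / 2)


# Hypothetical threshold, chosen only for this example.
DRIFT_THRESHOLD = 0.1

reference_sample = ["a"] * 50 + ["b"] * 50  # balanced categories
current_sample = ["a"] * 90 + ["b"] * 10    # heavily skewed towards "a"

distance = hellinger_distance(reference_sample, current_sample)
drift_detected = distance > DRIFT_THRESHOLD
```

A distance of this kind is bounded, which makes a fixed threshold easier to reason about than an unbounded statistic. What threshold is sensible depends on the data and the use case.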
Whatever the choice of data drift method, each contribution had to include the following:
- Test implementation
- Unit test
- Documentation update
- Notebook example update
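As a rough idea of how the first two items fit together, the sketch below pairs a small test implementation with unit tests, using a two-sided Fisher's Exact test on a 2x2 table as the example. It does not reproduce Evidently's code or its test layout: the function name `fisher_exact_p_value`, the table layout (rows are the reference and current samples, columns are counts of the two values of a binary feature), and the pytest-style test functions are assumptions made for illustration.

```python
from math import comb


def fisher_exact_p_value(table):
    """Two-sided Fisher's Exact test p-value for a 2x2 contingency table.

    `table` is [[a, b], [c, d]] with non-negative integer counts.
    """
    (a, b), (c, d) = table
    if min(a, b, c, d) < 0:
        raise ValueError("counts must be non-negative")

    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    if n == 0:
        raise ValueError("table must contain at least one observation")

    def weight(k):
        # Unnormalised hypergeometric probability of a table whose
        # top-left cell equals k, with all margins held fixed.
        return comb(row1, k) * comb(row2, col1 - k)

    observed = weight(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    weights = [weight(k) for k in range(lowest, highest + 1)]

    # Sum the tables that are at most as likely as the observed one.
    # Integer arithmetic keeps this comparison exact.
    extreme = sum(w for w in weights if w <= observed)
    return extreme / comb(n, col1)


# Unit tests, written as plain functions that a runner such as pytest can collect.
def test_identical_proportions_give_p_value_of_one():
    assert fisher_exact_p_value([[5, 5], [5, 5]]) == 1.0


def test_completely_different_samples_give_small_p_value():
    assert fisher_exact_p_value([[10, 0], [0, 10]]) < 0.05


def test_negative_counts_are_rejected():
    try:
        fisher_exact_p_value([[-1, 2], [3, 4]])
    except ValueError:
        return
    raise AssertionError("expected ValueError for negative counts")
```

The tests cover the two obvious extremes (no difference and complete separation) plus invalid input, which is usually a reasonable minimum for a new statistical test.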
What happened next?
We were astonished by the enthusiasm and dedication of our contributors! We are proud to say that we received 22 pull requests and merged 16 of them.
Thanks to our amazing contributors, Evidently now has 8 new methods to detect data drift. To celebrate our contributors and support the Hacktoberfest tradition of giving back to the global community, we planted 24 trees on 5 continents 🌳
A HUGE thank you to our contributors!
Though we could not merge all of the PRs, we are grateful to everyone who participated! Every pull request and every contribution counts. Evidently is built with the help of the community, and thanks to you, it becomes a better library, one contribution at a time.
Meet Evidently contributors
To express our gratitude, we invite you to celebrate every contribution with us. Please give a warm welcome to our contributors! We asked some of them what drives them to contribute to open source, and here’s what they answered:
Applied Machine Learning and Explainability specialist
Added 4 tests to the Evidently library
“These are the things I can think of right now:
1. Understanding the tools and software deeply that we use regularly in the lifecycle.
2. Get to interact with people smarter than me and learn from them.
3. Understand smarter ways of implementing things from reading code written by others.
4. There are open-source libraries that keep up with the most recent changes in ways to implement and implement them, so we get to see it first hand in action and understand the pros and cons.
5. Personal satisfaction.”
Added 3 tests to the Evidently library
“For me, there are many things:
1. Learning and becoming a better Data Scientist
2. Supporting Open-source tools
3. Having fun working with Data (and the community) on Deploying concepts, especially Monitoring. The more I work with monitoring concepts, the more I like them.”
Machine Learning Engineer
Contributed to Fisher's Exact test in the Evidently library
“I'm a Community-Driven person. I like open-source communities, and I love what we — as a community — are doing: improving tools that help data scientists do a daily job, learning together, becoming more professional, and writing clean code!”
Inderpreet Singh Chhabra
Data Scientist, Ex-Scientist @ Indian Space Research Organisation (ISRO)
Added Hellinger distance to the Evidently library
“Two principles from the physicist Richard Feynman resonate with me the most and I try to follow them:
1. What I cannot create I do not understand.
2. Know how to solve every problem that has been solved.
Working as a data scientist, I regularly come across many algorithms. I believe that to ensure I have understood them completely, writing a working code is important. Contributing to open source provides me with such opportunities and allows me to become a better data scientist through feedback from people with more experience than me. [It also allows] keeping up to date with the latest tech and frameworks. Also, it gives the satisfaction of contributing to something useful.”
Can I contribute after Hacktoberfest?
Sure! Evidently is an open-source project and is always open to contributions. Check our contribution guide to learn where to start.
Join our Discord community
Jump to our Discord #-evidently-contribute channel to chat with fellow contributors and get support from maintainers.
Want to know more about Evidently?
Evidently is an open-source Python library for data scientists and ML engineers. It helps evaluate, test, and monitor the performance of ML models from validation to production. You can check it out on GitHub or explore the documentation.