We host a regular Ask-Me-Anything series, where ML practitioners share their experiences with the Evidently Community.
Last week, our guest was Matt Squire, CTO and Co-founder of Fuzzy Labs, a company that helps productionise ML models using open-source MLOps tools. We chatted about MLOps maturity, tools, pain points, and further development trajectories.
Sounds interesting? Read on for the recap of the AMA with Matt.
From speaking to teams, what do you think is the average MLOps maturity? And can you break it down per industry or use case?
I feel like it's pretty nascent right now. It feels very much like when DevOps was a new thing, and developers were starting to figure out things like infrastructure-as-code or orchestration.
And at that time, Docker and Terraform didn't even exist! So that makes me wonder what MLOps things are waiting to be invented.
As for industry breakdown, I think businesses whose product is ML are further along in maturity out of necessity. Tech-focused industries are too, inevitably.
Who do you see, anecdotally, as the runner-up in ML adoption after tech industries? Are these financial companies? Retail?
We've had interesting conversations with a few retail-based companies who all have implemented or planned AI use cases, so maybe that. It seems like regulation is a slowing factor for finance and also medical.
The whole "maturity" conversation is currently defined by Microsoft's and Google's somewhat famous (but slightly abandoned) writeups. Do you think there's room for non-hyperscale companies and initiatives to take back some of the ground and redefine the terms of admission in terms of 'maturity'?
I definitely sympathise with this. Reminds me of this blog.
I'm scared of open-source-code-based business models. We have examples of companies that have survived and succeeded with open-source products. However, some more insight would be helpful. What kind of companies/products do you think can afford to be open source, and what can't?
This is a great question! Looking at it from the point of view of the tech industry as a whole, I think open source always wins out. Almost everything we rely on today is built on open source.
So there's perhaps an education piece in making non-techies more aware that this is very much a norm in the tech world.
There are definitely unique challenges to building a business on top of open source, and people would be right to fear that an open-source company would disappear. That said, there are some pretty well-proven examples of where it works, including big companies such as Red Hat.
It seems like support + enterprise offerings work well. The open-source product becomes famous through the community of people using it, and enough enterprises will then want to pay for support.
Is there space for open-core models that do not specifically target the technical user? For example, feature stores with business analysts in mind, or similar?
I see them as being as much for data scientists as business analysts. I'm curious whether other people see it differently, though?
What makes a good open-source tool great? E.g., is it great documentation, a lot of examples, elegant API, large community?
All of those, but also I think it's important for the tool to have a clear purpose, and not try to do too many different things. Things should be Unix-like, in my view.
Which tool do you think we as an MLOps community need but do not have?
Rather than a tool, one thing I'd like to see is a set of good bare-bones examples. That is, a Git repo that I can clone and, by following the steps, end up with a model training pipeline that trains and deploys "hello world", with monitoring and data versioning. Using open source, of course :)
It is not just documentation. These examples can be code representing actual infrastructure, pipelines, etc.
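To make the idea concrete, here is a rough sketch of what the smallest possible "hello world" pipeline could look like, using only the Python standard library. Everything in it is an illustrative assumption rather than something Matt described: the function names (`data_version`, `train`, `deploy`, `monitor`, `run_pipeline`), the mean-predictor "model", and the file-based "deployment" are all stand-ins for real open-source tools.

```python
import hashlib
import json
import statistics
import tempfile
from pathlib import Path


def data_version(rows):
    """Version a dataset by hashing its canonical JSON form (hypothetical scheme)."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


def train(rows):
    """'Train' the simplest possible model: always predict the mean label."""
    return {"prediction": statistics.mean(r["y"] for r in rows)}


def deploy(model, version, directory):
    """'Deploy' by writing the model and its data version to a JSON file."""
    path = Path(directory) / "model.json"
    path.write_text(json.dumps({"model": model, "data_version": version}))
    return path


def monitor(deployed_path, live_values, tolerance=1.0):
    """Flag drift when the live mean moves away from the trained mean."""
    deployed = json.loads(Path(deployed_path).read_text())
    gap = abs(statistics.mean(live_values) - deployed["model"]["prediction"])
    return {"drift": gap > tolerance, "gap": gap}


def run_pipeline(rows, live_values, directory):
    """Chain the four steps: version the data, train, deploy, monitor."""
    version = data_version(rows)
    model = train(rows)
    path = deploy(model, version, directory)
    return version, monitor(path, live_values)


if __name__ == "__main__":
    training_rows = [{"x": 1, "y": 2.0}, {"x": 2, "y": 4.0}, {"x": 3, "y": 6.0}]
    with tempfile.TemporaryDirectory() as tmp:
        version, report = run_pipeline(training_rows, [3.5, 4.5], tmp)
        print(version, report)
```

In a real bare-bones repo, each of these functions would presumably be replaced by a dedicated open-source tool, but the overall shape of the pipeline would stay the same.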
What do you think is a reasonable number of tools to stitch together when making an internal MLOps platform? Is it 3-4 (e.g., one for workflow management, one for experiment management, one for deployment, one for monitoring), or could it be 10+?
If I'm honest, this is something we're still experimenting with. But my intuition is that it's not a big number. I think the role of MLOps engineer includes making everybody else better and more efficient by sharing the principles and patterns with the wider tech team, so data scientists and other engineers. That is to say, bring everybody else into the MLOps fold, instead of building a big MLOps team separately.
I imagine there is some ideal "ELK" stack the industry would converge to. How many categories will it have? What's your bet?
The MLOps ELK probably still includes the K :) I reckon the ingredients need to include:
- Pipelines / workflows / orchestration
- Deployment / serving
- Logging / observability / monitoring
What is your take on Notebooks within the MLOps landscape? Many claim they do not fit into production and foster bad practices. At the same time, we see more and more tools trying to wrap Notebooks into microservices so they can serve production services.
My own opinion is notebooks are a net negative for MLOps purposes. Turning them into microservices just seems to hide the problems. I'm largely in agreement with Joel Grus on this question here!
For some MLOps tasks (like annotation), there are feature-rich closed-source tools and not many open-source alternatives (if any). What would you recommend for those trying to get their models into production in this case? Use sub-par open-source tools, or connect an open-source stack with proprietary platforms that solve one piece of workflow really well?
Well, I suppose there are feature-rich closed-source tools in all sorts of categories, besides annotation, too. But I get the same impression that annotation is an area that's lacking a bit.
I think it's important not to be locked into a vendor. So, using something proprietary in some cases might be the best thing, but it's best done in a way that avoids lock-in. For example, SageMaker, but with ZenML, because you're not tied into SageMaker that way.
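One way to picture "avoiding lock-in" is to keep the pipeline code dependent on a small interface rather than on any vendor's SDK. The sketch below is only an illustration of that pattern: `TrainingBackend`, `LocalBackend`, `FakeManagedBackend`, and `run_training` are hypothetical names, and the "managed" backend is a stand-in that records jobs instead of calling any real SageMaker or ZenML API.

```python
from typing import Callable, List, Protocol


class TrainingBackend(Protocol):
    """The only contract the pipeline code depends on (hypothetical)."""

    def run(self, step: Callable[[], float]) -> float: ...


class LocalBackend:
    """Runs the training step in the current process."""

    def run(self, step: Callable[[], float]) -> float:
        return step()


class FakeManagedBackend:
    """Stand-in for a proprietary service; it only records what was submitted."""

    def __init__(self) -> None:
        self.submitted: List[str] = []

    def run(self, step: Callable[[], float]) -> float:
        self.submitted.append(step.__name__)
        return step()


def train_step() -> float:
    """Placeholder training step that returns a made-up metric."""
    return 0.9


def run_training(backend: TrainingBackend) -> float:
    """Pipeline code: unchanged no matter which backend is plugged in."""
    return backend.run(train_step)


if __name__ == "__main__":
    print(run_training(LocalBackend()))
    print(run_training(FakeManagedBackend()))
```

Swapping the proprietary piece then means writing one new backend class, not rewriting the pipeline.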
On labeling in particular, I've got my eye on Label Studio but I haven't had a chance to try it yet!
I have used a few data version control tools in the past. Although it is a hyped topic on social media, I find them lagging, and practitioners don't seem to use them. I wonder if it is because we are in early days or, potentially, the niche is small, and the added value is not significant yet. What's your take on the maturity of Data Version Control as a segment?
I think one issue with data version control is that the most suitable tool depends a lot on the nature of the data. Is it big or small data? Tabular, images, text? I suppose that could be a barrier to entry for many people.
There are really good, mature, open-source tools for data version control. Adoption is a different matter.
On our projects, we always recommend data version control. It's as fundamental as source code version control and has implications for everything else in the MLOps solution. But it's really important to work closely with the people who will actually use these tools, so that we understand what they really need, and they understand how to use the tool effectively.
MLOps challenges and trends
In your opinion, what are the main pain points in MLOps right now?
A few things come to mind:
- Many tools are heavy-weight and require a lot of work to deploy and run. For example, Kubeflow.
- Monitoring is hard, so there is lots of room for innovation here (getting better, thanks to Evidently!)
- People: how do we enable data scientists to get their stuff into production without burdening them with so many extra tools, etc.?
The "people point" is critical but often overlooked. It's no good to have a great stack of tools if nobody can actually use them.
How does MLOps fit in with the data-centric AI movement?
I say they're a natural fit for each other. A lot of what we do in MLOps is about keeping track of things: given a model or an experiment, what data, at what version, was used here? Given an inference, can we explain it in terms of the data? And can we use that to make a decision about what to do next with that data, and then run the whole train, test, deploy process again?
So I suppose I'm saying that MLOps enables that data-centric vision.
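The "keeping track of things" part can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a description of any particular tool: `LineageLog`, `ModelRecord`, and `fingerprint` are hypothetical names, and the data version is simply a truncated SHA-256 hash of the raw bytes.

```python
import hashlib
from dataclasses import dataclass
from typing import List


def fingerprint(data: bytes) -> str:
    """Derive a short data version from the content itself (assumed scheme)."""
    return hashlib.sha256(data).hexdigest()[:12]


@dataclass(frozen=True)
class ModelRecord:
    model_name: str
    model_version: int
    data_version: str


class LineageLog:
    """In-memory log linking each trained model version to its data version."""

    def __init__(self) -> None:
        self._records: List[ModelRecord] = []

    def record_training(self, model_name: str, data: bytes) -> ModelRecord:
        # Model versions count up from 1, separately for each model name.
        version = 1 + sum(r.model_name == model_name for r in self._records)
        record = ModelRecord(model_name, version, fingerprint(data))
        self._records.append(record)
        return record

    def data_for(self, model_name: str, model_version: int) -> str:
        """Answer: given a model, what data, at what version, was used?"""
        for r in self._records:
            if (r.model_name, r.model_version) == (model_name, model_version):
                return r.data_version
        raise KeyError((model_name, model_version))

    def models_trained_on(self, data_version: str) -> List[ModelRecord]:
        """Answer the reverse question: which models saw this data version?"""
        return [r for r in self._records if r.data_version == data_version]


if __name__ == "__main__":
    log = LineageLog()
    first = log.record_training("churn", b"a,b\n1,2\n")
    second = log.record_training("churn", b"a,b\n1,2\n3,4\n")
    print(log.data_for("churn", 2) == second.data_version)
    print(log.models_trained_on(first.data_version))
```

Being able to answer both questions is what lets a team decide what to do next with the data and then rerun the train, test, deploy loop.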
Where do you see MLOps fitting within the Modern Data Stack? And are there tools trying to bridge both worlds? In ML, we are used to having our data just there, ready to use, in a CSV or S3, and rarely or never do we think about data warehousing, metric stores, ETL, rETL, etc.
This is something we've been chatting about inside Fuzzy Labs quite a bit this week actually. If we want things like CI/CD for training models, then some data plumbing needs to be done somewhere. So is that plumbing part of MLOps, or a different set of skills?
My current take is that the parts that are ops-like do cross over with MLOps. So, the ETL pipeline needs to run somewhere, the data needs to live somewhere, the pipeline needs to be monitored, and all that is ops-type stuff. On the other hand, designing those ETL pipelines is really a data science + engineering task.
Our views on this are evolving, though, and I'm curious to hear other points of view.
Future of MLOps
I do agree, that's the next big thing. As a comparison, when people started containerising their applications, each cloud provider had its own way of coordinating containers. But they all ended up supporting managed Kubernetes (alongside their own offerings), because that open-source tool became the de facto standard.
I can absolutely picture an open-source tool and the company behind it taking centre stage in the future, such that it ends up being adopted by all the big players.
More broadly, does the future lie in open source? I say yes, because in almost all areas of tech, that's already true. I wrote about this on our blog too :)
In your opinion, what does the future of MLOps tools look like? Will there be solutions laser-focused on one particular function (e.g., monitoring), or shall we expect one-stop-shop solutions?
Predicting the future in public seems a bit dangerous. But I'll take a stab at it!
- Open source wins
- Each tool does one thing well (laser-focused, as you say)
- A small set of tools that everyone uses because they work well together
- Tools are lightweight, with little maintenance overhead.
I definitely think the all-in-one MLOps platforms will find themselves obsolete. If I'm wrong, hi people in the future!
* The discussion was lightly edited for better readability.
Want to join the next AMA session?
Join our Discord community! Connect with maintainers, ask questions, and join AMAs with ML experts.