I already use DVC to track and store the datasets in my machine learning pipeline. I have also started using MLflow to track my experiments. However, I still don't know whether to use DVC for my model files or to use the MLflow artifact store for this purpose. Or maybe the two serve different purposes, and it would be good to do both! Can anyone help, please?
I personally think MLflow does a great job at experiment tracking, but if you've already set up DVC and are using it, it makes more sense to me to keep data, code, and model together under the same commit. The alternative is having dangling files in another system, where you have to track down the commit in the UI and then fetch a link to the model manually. Artifact logging is very useful if, for example, you need to see generated photos in real time and stop training midway, or if you don't already have a data versioning system set up. By the way, DAGsHub lets you combine both very easily.
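If it helps, here is a rough sketch of one way to do both: the model file stays in DVC (for example `dvc add models/model.pkl` followed by a Git commit), and the MLflow run only records a pointer back to it. The helper names, the tag names, and the `models/model.pkl` path are all made up for illustration, and this is just one possible setup, not an official DVC or MLflow workflow.

```python
import hashlib
import subprocess


def file_md5(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def current_git_commit():
    """Return the HEAD commit hash, or 'unknown' if Git is unavailable."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()
    except (subprocess.CalledProcessError, FileNotFoundError):
        return "unknown"


def build_model_tags(model_path, commit=None):
    """Build tags that point from an MLflow run back to a DVC-tracked model.

    The tag names are hypothetical; pick whatever convention suits you.
    """
    return {
        "git_commit": commit if commit is not None else current_git_commit(),
        "dvc_model_path": str(model_path),
        "model_md5": file_md5(model_path),
    }


def log_model_reference(model_path, metrics):
    """Log metrics plus the model pointer to MLflow, without uploading the file."""
    import mlflow  # imported here so the helpers above work without MLflow

    with mlflow.start_run():
        mlflow.set_tags(build_model_tags(model_path))
        mlflow.log_metrics(metrics)
```

After training you would call something like `log_model_reference("models/model.pkl", {"accuracy": 0.93})` (the path and metric are placeholders). From the MLflow UI you can then read the commit off the run, check it out, and `dvc pull` the matching model, so the model file itself lives in only one place.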