Testing ML functionality primarily involves creating and uploading a training data-set (the developers may already have one), uploading a smaller, more specific testing data-set, and then confirming that the algorithm is learning and reacting correctly.
For example, the following steps could be taken if this were an ML algorithm used for job recommendations. You’d first enter a couple of thousand jobs as the training data-set. Then you’d set up a profile on the application containing specific data, such as preferences and location. Afterwards, you’d upload a testing data-set containing jobs that should be recommended to you and jobs that shouldn’t. Finally, you’d check that all of the ‘good’ jobs are recommended and none of the ‘bad’ jobs are.
This process of loading a large training data-set, followed by a smaller testing data-set, should apply to most cases of black-box ML testing.
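The job-recommendation walkthrough above can be sketched in Python. This is only an illustration under stated assumptions: `ToyRecommender` is a hypothetical stand-in for the real application (which a black-box tester would drive through its UI or API), and the `Job`, `Profile`, `check_recommendations`, and `run_example` names, along with all of the sample data, are invented for this example. The toy matching rule is not how a real ML recommender works; the part that carries over is the final check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    job_id: str
    title: str
    location: str


@dataclass(frozen=True)
class Profile:
    preferred_keyword: str
    location: str


class ToyRecommender:
    """Hypothetical stand-in for the application under test."""

    def __init__(self):
        self.jobs = []

    def upload(self, jobs):
        self.jobs.extend(jobs)

    def recommend(self, profile):
        # Toy rule (an assumption): same location and keyword in the title.
        return {
            job.job_id
            for job in self.jobs
            if job.location == profile.location
            and profile.preferred_keyword in job.title.lower()
        }


def check_recommendations(recommended_ids, good_ids, bad_ids):
    """Return (good jobs that were missed, bad jobs that leaked through)."""
    recommended = set(recommended_ids)
    missed = set(good_ids) - recommended
    leaked = set(bad_ids) & recommended
    return missed, leaked


def run_example():
    app = ToyRecommender()

    # Step 1: upload a large training data-set (made-up jobs).
    cities = ["Leeds", "London", "Bristol"]
    titles = ["Nurse", "Driver", "Chef", "Accountant"]
    training = [
        Job(f"train-{i}", titles[i % 4], cities[i % 3]) for i in range(2000)
    ]
    app.upload(training)

    # Step 2: set up a profile with preference and location data.
    profile = Profile(preferred_keyword="tester", location="Leeds")

    # Step 3: upload a small testing data-set with known good and bad jobs.
    good = [
        Job("good-1", "Software Tester", "Leeds"),
        Job("good-2", "QA Tester", "Leeds"),
    ]
    bad = [
        Job("bad-1", "Software Tester", "London"),  # wrong location
        Job("bad-2", "Chef", "Leeds"),  # wrong preference
    ]
    app.upload(good + bad)

    # Step 4: every good job is recommended and no bad job is.
    recommended = app.recommend(profile)
    return check_recommendations(
        recommended,
        {job.job_id for job in good},
        {job.job_id for job in bad},
    )


if __name__ == "__main__":
    missed, leaked = run_example()
    print("missed good jobs:", missed)
    print("leaked bad jobs:", leaked)
```

In a real test run, `check_recommendations` would receive the IDs the application actually returned. Reporting the missed and leaked jobs separately makes it easier to tell whether the algorithm is too strict or too lenient.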