I think we can in fact improve the performance of the multithreaded solution using user-level threads in a multiprocessor environment.
In a multiprocessor environment, the kernel can run several kernel-level threads at the same time, one on each processor.
Since the question does not say which mapping is used between user-level threads and kernel-level threads, we can assume the mapping is one-to-one (the most popular model, used by Linux and Windows).
With such a mapping, true parallelism is possible!
Have a look at the lecture series on operating systems by Georgia Tech (https://classroom.udacity.com/courses/ud923/lessons/3065538763/concepts/31631188030923).
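However, if you assume the mapping is many-to-one, the performance would be the same as on a single-processor system: all user-level threads share one kernel-level thread, so only one of them can run at any moment no matter how many processors are available.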