Testing and scholarly research are similar in some ways. In both, you have a problem and want to understand why it is occurring, and both rely on quantitative and qualitative data. In research, though, you want to be conclusive, exhaustive, and categorical.
In testing, you just want to make the problem better. So you don’t choose every possible way of understanding the problem, only a few methods. The key is to choose methods that actually help you assess the problem accurately. Success rate, for example, can tell you whether people are accomplishing a particular task. But if your goal is for users to explore more of your site, then you would instead measure how many pages they view.
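To make those two metrics concrete, here is a minimal sketch in Python, using made-up session records (the field names and numbers are hypothetical, just for illustration):

```python
# Hypothetical session records: did the user complete the task,
# and how many pages did they view?
sessions = [
    {"completed_task": True,  "pages_viewed": 3},
    {"completed_task": False, "pages_viewed": 1},
    {"completed_task": True,  "pages_viewed": 7},
    {"completed_task": True,  "pages_viewed": 5},
]

# Success rate: the share of sessions in which the task was completed.
success_rate = sum(s["completed_task"] for s in sessions) / len(sessions)

# Breadth of use: the average number of pages viewed per session.
avg_pages = sum(s["pages_viewed"] for s in sessions) / len(sessions)

print(f"Success rate: {success_rate:.0%}")    # 75%
print(f"Avg pages per session: {avg_pages}")  # 4.0
```

Which of the two numbers matters depends entirely on your goal, which is the point: the metric has to match the question you are asking.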
There is a useful diagram on the Nielsen Norman Group website that shows how particular testing tools map to behavioral or attitudinal data. The article also shows which questions are best answered by quantitative data, such as how much or how often.
Quantitative data should usually be paired with qualitative data. After all, if you know that most people using your app stop at a certain point, you still don’t know why. It might be because it is so terribly boring, or because it is so terribly interesting. Or the text might become illegible. Or…well, it could be anything. So pairing quantitative data, often found in analytics, with qualitative data gives you the information you need to understand the problem.
To go back to my original statement, testing helps you know enough to fix a particular app or website. You can make the situation better for the user. Quantitative and qualitative data are the tools you use to make these improvement decisions. In scholarship, by contrast, you would need many, many more points of feedback to make a categorical assessment. So while a small study might be enough to fix a particular mobile app, it doesn’t necessarily let you make broad generalizations about all mobile apps.