Vittana.org

23 Advantages and Disadvantages of Qualitative Research

Investigating methodologies. Taking a closer look at ethnographic, anthropological, or naturalistic techniques. Data mining through observer recordings. This is what the world of qualitative research is all about. It is the comprehensive and complete data that is collected by having the courage to ask an open-ended question.

Print media has used the principles of qualitative research for generations. Now more industries are seeing the advantages that come from the extra data that is received by asking more than a “yes” or “no” question.

The advantages and disadvantages of qualitative research are quite unique. On one hand, you have the perspective of the data that is being collected. On the other hand, you have the techniques of the data collector and their own unique observations that can alter the information in subtle ways.

That’s why these key points are so important to consider.

What Are the Advantages of Qualitative Research?

1. Subject materials can be evaluated with greater detail. There are many time restrictions that are placed on research methods. The goal of a time restriction is to create a measurable outcome so that metrics can be in place. Qualitative research focuses less on the metrics of the data that is being collected and more on the subtleties of what can be found in that information. This allows for the data to have an enhanced level of detail to it, which can provide more opportunities to glean insights from it during examination.

2. Research frameworks can be fluid and based on incoming or available data. Many research opportunities must follow a specific pattern of questioning, data collection, and information reporting. Qualitative research offers a different approach. It can adapt to the quality of information that is being gathered. If the available data does not seem to be providing any results, the research can immediately shift gears and seek to gather data in a new direction. This offers more opportunities to gather important clues about any subject instead of being confined to a limited and often self-fulfilling perspective.

3. Qualitative research data is based on human experiences and observations. Humans have two very different operating systems. One is a subconscious method of operation, which is the fast and instinctual observations that are made when data is present. The other operating system is slower and more methodical, wanting to evaluate all sources of data before deciding. Many forms of research rely on the second operating system while ignoring the instinctual nature of the human mind. Qualitative research doesn’t ignore the gut instinct. It embraces it and the data that can be collected is often better for it.

4. Gathered data has a predictive quality to it. One of the common mistakes that occurs with qualitative research is an assumption that a personal perspective can be extrapolated into a group perspective. This is only possible when individuals grow up in similar circumstances, have similar perspectives about the world, and operate with similar goals. When these groups can be identified, however, the gathered individualistic data can have a predictive quality for those who are in a like-minded group. At the very least, the data has a predictive quality for the individual from whom it was gathered.

5. Qualitative research operates within structures that are fluid. Because the data being gathered through this type of research is based on observations and experiences, an experienced researcher can follow up on interesting answers with additional questions. Unlike other forms of research that require a specific framework with zero deviation, researchers can follow any data tangent which makes itself known and enhance the overall database of information that is being collected.

6. Data complexities can be incorporated into generated conclusions. Although our modern world tends to prefer statistics and verifiable facts, we cannot simply remove the human experience from the equation. Different people will have remarkably different perceptions about any statistic, fact, or event. This is because our unique experiences generate a different perspective of the data that we see. These complexities, when gathered into a singular database, can generate conclusions with more depth and accuracy, which benefits everyone.

7. Qualitative research is an open-ended process. When a researcher is properly prepared, the open-ended structures of qualitative research make it possible to get underneath superficial responses and rational thoughts to gather information from an individual’s emotional response. This is critically important to this form of research because it is an emotional response which often drives a person’s decisions or influences their behavior.

8. Creativity becomes a desirable quality within qualitative research. It can be difficult to analyze data that is obtained from individual sources because many people subconsciously answer in a way that they think someone wants. This desire to “please” another reduces the accuracy of the data and suppresses individual creativity. By embracing the qualitative research method, it becomes possible to encourage respondent creativity, allowing people to express themselves with authenticity. In return, the data collected becomes more accurate and can lead to predictable outcomes.

9. Qualitative research can create industry-specific insights. Brands and businesses today need to build relationships with their core demographics to survive. The terminology, vocabulary, and jargon that consumers use when looking at products or services is just as important as the reputation of the brand that is offering them. If consumers are receiving one context, but the intention of the brand is a different context, then the miscommunication can artificially restrict sales opportunities. Qualitative research gives brands access to these insights so they can accurately communicate their value propositions.

10. Smaller sample sizes are used in qualitative research, which can save on costs. Many qualitative research projects can be completed quickly and on a limited budget because they typically use smaller sample sizes than other research methods. This allows for faster results to be obtained so that projects can move forward with the confidence that only good data can provide.

11. Qualitative research provides more content for creatives and marketing teams. When your job involves marketing, or creating new campaigns that target a specific demographic, then knowing what makes those people tick can be quite challenging. By going through the qualitative research approach, it becomes possible to gather authentic ideas that can be used for marketing and other creative purposes. This allows communication between the two parties to be handled with more accuracy, leading to a greater level of happiness for all parties involved.

12. Attitude explanations become possible with qualitative research. Consumer patterns can change on a dime sometimes, leaving a brand out in the cold as to what just happened. Qualitative research allows for a greater understanding of consumer attitudes, providing an explanation for events that occur outside of the predictive matrix that was developed through previous research. This allows the optimal brand/consumer relationship to be maintained.

What Are the Disadvantages of Qualitative Research?

1. The quality of the data gathered in qualitative research is highly subjective. This is where the personal nature of data gathering in qualitative research can also be a negative component of the process. What one researcher feels is important and necessary to gather can be data that another researcher feels is pointless and won’t spend time pursuing. Having individual perspectives and including instinctual decisions can lead to incredibly detailed data. It can also lead to data that is generalized or even inaccurate because of its reliance on researcher subjectivity.

2. Data rigor is more difficult to assess and demonstrate. Because individual perspectives are often the foundation of the data that is gathered in qualitative research, it is more difficult to prove that there is rigor in the information that is collected. The human mind tends to remember things in the way it wants to remember them. That is why memories are often looked at fondly, even if the actual events that occurred may have been somewhat disturbing at the time. This innate desire to look at the good in things makes it difficult for researchers to demonstrate data validity.

3. Mining data gathered by qualitative research can be time consuming. The number of details collected while performing qualitative research is often overwhelming. Sorting through that data to pull out the key points can be a time-consuming effort. It is also a subjective effort because what one researcher feels is important may not be pulled out by another researcher. Unless there are some standards in place that cannot be overridden, data mining through a massive number of details can almost be more trouble than it is worth in some instances.

4. Qualitative research creates findings that are valuable, but difficult to present. Presenting the findings which come out of qualitative research is a bit like listening to an interview on CNN. The interviewer will ask the interviewee a question, but the goal is to receive an answer that helps present a specific outcome to the viewer. The goal might be to have a viewer watch an interview and think, “That’s terrible. We need to pass a law to change that.” The subjective nature of the information, however, can cause the viewer to think, “That’s wonderful. Let’s keep things the way they are right now.” That is why findings from qualitative research are difficult to present. What a researcher gleans from the data can be very different from what an outside observer gleans from the data.

5. Data created through qualitative research is not always accepted. Because of the subjective nature of the data that is collected in qualitative research, findings are not always accepted by the scientific community. A second independent qualitative research effort which can produce similar findings is often necessary to begin the process of community acceptance.

6. Researcher influence can have a negative effect on the collected data. The quality of the data that is collected through qualitative research is highly dependent on the skills and observation of the researcher. If a researcher has a biased point of view, then their perspective will be included with the data collected and influence the outcome. There must be controls in place to help remove the potential for bias so the data collected can be reviewed with integrity. Otherwise, it would be possible for a researcher to make any claim and then use their bias through qualitative research to prove their point.

7. Replicating results can be very difficult with qualitative research. The scientific community wants to see results that can be verified and duplicated to accept research as factual. In the world of qualitative research, this can be very difficult to accomplish. Not only do you have the variability of researcher bias for which to account within the data, but there is also the informational bias that is built into the data itself from the provider. This means the scope of data gathering can be extremely limited, even if the structure of gathering information is fluid, because of each unique perspective.

8. Difficult decisions may require repetitive qualitative research periods. The smaller sample sizes of qualitative research may be an advantage, but they can also be a disadvantage for brands and businesses which are facing a difficult or potentially controversial decision. A small sample is not always representative of a larger population demographic, even if there are deep similarities with the individuals involved. This means a follow-up with a larger quantitative sample may be necessary so that data points can be tracked with more accuracy, allowing for a better overall decision to be made.

9. Unseen data can disappear during the qualitative research process. The amount of trust that is placed on the researcher to gather, and then draw together, the unseen data that is offered by a provider is enormous. The research is dependent upon the skill of the researcher being able to connect all the dots. If the researcher can do this, then the data can be meaningful and help brands progress forward with their mission. If not, there is no way to alter course until after the first results are received. Then a new qualitative process must begin.

10. Researchers must have industry-related expertise. You can have an excellent researcher on-board for a project, but if they are not familiar with the subject matter, they will have a difficult time gathering accurate data. For qualitative research to be accurate, the interviewer involved must have specific skills, experiences, and expertise in the subject matter being studied. They must also be familiar with the material being evaluated and have the knowledge to interpret responses that are received. If any piece of this skill set is missing, the quality of the data being gathered can be open to interpretation.

11. Qualitative research is not statistically representative. The one disadvantage of qualitative research which is always present is its lack of statistical representation. It is a perspective-based method of research only, which means the responses given are not measured. Comparisons can be made and this can lead toward the duplication which may be required, but for the most part, quantitative data is required for circumstances which need statistical representation and that is not part of the qualitative research process.

The advantages and disadvantages of qualitative research make it possible to gather and analyze individualistic data on deeper levels. This makes it possible to gain new insights into consumer thoughts, demographic behavioral patterns, and emotional reasoning processes. When a researcher can connect the dots of each information point that is gathered, the information can lead to personalized experiences, better value in products and services, and ongoing brand development.

How representative is that qualitative data anyway?

7/8/16 / Mollie Boettcher


When we do qualitative research, our clients often wonder how representative the qualitative data is of the target population they are working with.  It’s a valid question.  To answer, I have to go back to the purpose of conducting qualitative research in the first place.

The purpose of qualitative research is to understand people’s perceptions, opinions, and beliefs, as well as what is causing them to think in this way.  Unlike quantitative research, the purpose is not to generalize the results to the population of interest.  If eight out of ten participants in a focus group share the same opinion, can we say that 80% of people believe that particular opinion?  No, definitely not, but you can be pretty confident that it will be a prevalent opinion in the population.

qualitative research is not statistically representative

Still not sure which methodology will best be able to answer your research questions? We can help you choose!



Chapter 5. Sampling

Introduction.

Most Americans will experience unemployment at some point in their lives. Sarah Damaske (2021) was interested in learning about how men and women experience unemployment differently. To answer this question, she interviewed unemployed people. After conducting a “pilot study” with twenty interviewees, she realized she was also interested in finding out how working-class and middle-class persons experienced unemployment differently. She found one hundred persons through local unemployment offices. She purposefully selected a roughly equal number of men and women and working-class and middle-class persons for the study. This would allow her to make the kinds of comparisons she was interested in. She further refined her selection of persons to interview:

I decided that I needed to be able to focus my attention on gender and class; therefore, I interviewed only people born between 1962 and 1987 (ages 28–52, the prime working and child-rearing years), those who worked full-time before their job loss, those who experienced an involuntary job loss during the past year, and those who did not lose a job for cause (e.g., were not fired because of their behavior at work). (244)

The people she ultimately interviewed compose her sample. They represent (“sample”) the larger population of the involuntarily unemployed. This “theoretically informed stratified sampling design” allowed Damaske “to achieve relatively equal distribution of participation across gender and class,” but it came with some limitations. For one, the unemployment centers were located in primarily White areas of the country, so there were very few persons of color interviewed. Qualitative researchers must make these kinds of decisions all the time—who to include and who not to include. There is never an absolutely correct decision, as the choice is linked to the particular research question posed by the particular researcher, although some sampling choices are more compelling than others. In this case, Damaske made the choice to foreground both gender and class rather than compare all middle-class men and women or women of color from different class positions or just talk to White men. She leaves the door open for other researchers to sample differently. Because science is a collective enterprise, it is most likely someone will be inspired to conduct a similar study as Damaske’s but with an entirely different sample.
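
Screening criteria like the ones in the quoted passage can be written down as an explicit eligibility check, which makes it easier to see exactly who is in and who is out of the sample. The sketch below is only an illustration: the `Candidate` record, its field names, and the four example people are hypothetical and are not drawn from Damaske's study. Only the four criteria themselves come from the quotation.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # Hypothetical record; the field names are assumptions for illustration.
    name: str
    birth_year: int
    worked_full_time: bool
    months_since_job_loss: int
    job_loss_involuntary: bool
    fired_for_cause: bool


def is_eligible(c: Candidate) -> bool:
    """Apply the four screening criteria from the quoted passage."""
    return (
        1962 <= c.birth_year <= 1987        # born between 1962 and 1987
        and c.worked_full_time              # worked full-time before the job loss
        and c.job_loss_involuntary          # involuntary job loss...
        and c.months_since_job_loss <= 12   # ...during the past year
        and not c.fired_for_cause           # did not lose the job for cause
    )


candidates = [
    Candidate("A", 1975, True, 6, True, False),
    Candidate("B", 1990, True, 3, True, False),   # born after 1987
    Candidate("C", 1980, False, 2, True, False),  # did not work full-time
    Candidate("D", 1965, True, 4, True, True),    # fired for cause
]
eligible = [c.name for c in candidates if is_eligible(c)]
print(eligible)
```

Writing the criteria out this way also makes the trade-off visible: every condition added narrows the sample toward the research question and, at the same time, excludes someone.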

This chapter is all about sampling. After you have developed a research question and have a general idea of how you will collect data (observations or interviews), how do you go about actually finding people and sites to study? Although there is no “correct number” of people to interview, the sample should follow the research question and research design. You might remember studying sampling in a quantitative research course. Sampling is important here too, but it works a bit differently. Unlike quantitative research, qualitative research involves nonprobability sampling. This chapter explains why this is so and what qualities instead make a good sample for qualitative research.

Quick Terms Refresher

  • The population is the entire group that you want to draw conclusions about.
  • The sample is the specific group of individuals that you will collect data from.
  • Sampling frame is the actual list of individuals that the sample will be drawn from. Ideally, it should include the entire target population (and nobody who is not part of that population).
  • Sample size is how many individuals (or units) are included in your sample.

The “Who” of Your Research Study

After you have turned your general research interest into an actual research question and identified an approach you want to take to answer that question, you will need to specify the people you will be interviewing or observing. In most qualitative research, the objects of your study will indeed be people. In some cases, however, your objects might be content left by people (e.g., diaries, yearbooks, photographs) or documents (official or unofficial) or even institutions (e.g., schools, medical centers) and locations (e.g., nation-states, cities). Chances are, whatever “people, places, or things” are the objects of your study, you will not really be able to talk to, observe, or follow every single individual/object of the entire population of interest. You will need to create a sample of the population. Sampling in qualitative research has different purposes and goals than sampling in quantitative research. Sampling in both allows you to say something of interest about a population without having to include the entire population in your sample.

We begin this chapter with the case of a population of interest composed of actual people. After we have a better understanding of populations and samples that involve real people, we’ll discuss sampling in other types of qualitative research, such as archival research, content analysis, and case studies. We’ll then move to a larger discussion about the difference between sampling in qualitative research generally versus quantitative research, then we’ll move on to the idea of “theoretical” generalizability, and finally, we’ll conclude with some practical tips on the correct “number” to include in one’s sample.

Sampling People

To help think through samples, let’s imagine we want to know more about “vaccine hesitancy.” We’ve all lived through 2020 and 2021, and we know that a sizable number of people in the United States (and elsewhere) were slow to accept vaccines, even when these were freely available. By some accounts, about one-third of Americans initially refused vaccination. Why is this so? Well, as I write this in the summer of 2021, we know that some people actively refused the vaccination, thinking it was harmful or part of a government plot. Others were simply lazy or dismissed the necessity. And still others were worried about harmful side effects. The general population of interest here (all adult Americans who were not vaccinated by August 2021) may be as many as eighty million people. We clearly cannot talk to all of them. So we will have to narrow the number to something manageable. How can we do this?


First, we have to think about our actual research question and the form of research we are conducting. I am going to begin with a quantitative research question. Quantitative research questions tend to be simpler to visualize, at least when we are first starting out doing social science research. So let us say we want to know what percentage of each kind of resistance is out there and how race or class or gender affects vaccine hesitancy. Again, we don’t have the ability to talk to everyone. But harnessing what we know about normal probability distributions (see quantitative methods for more on this), we can find this out through a sample that represents the general population. We can’t really address these particular questions if we only talk to White women who go to college with us. And if you are really trying to generalize the specific findings of your sample to the larger population, you will have to employ probability sampling, a sampling technique where a researcher sets a few selection criteria and chooses members of a population randomly. Why randomly? If truly random, all the members have an equal opportunity to be a part of the sample, and thus we avoid the problem of having only our friends and neighbors (who may be very different from other people in the population) in the study. Mathematically, there is going to be a certain number that will be large enough to allow us to generalize our particular findings from our sample population to the population at large. It might surprise you how small that number can be. Election polls of no more than one thousand people are routinely used to predict actual election outcomes of millions of people. Below that number, however, you will not be able to make generalizations. Talking to five people at random is simply not enough people to predict a presidential election.
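
The idea that about a thousand randomly chosen respondents can stand in for millions may be easier to accept with the standard margin-of-error approximation for a proportion. The sketch below assumes a simple random sample from a large population and the usual normal approximation (which is itself shaky for very small samples, so the five-person figure is only indicative). The function names and the frame of ID numbers are hypothetical.

```python
import math
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct members from the sampling frame, each equally likely."""
    rng = random.Random(seed)
    return rng.sample(frame, n)


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion.

    p=0.5 is the worst case; z=1.96 is the usual 95% normal quantile.
    """
    return z * math.sqrt(p * (1 - p) / n)


# Hypothetical sampling frame: ID numbers standing in for people.
frame = list(range(100_000))
sample = simple_random_sample(frame, 1_000, seed=42)

print(len(sample), len(set(sample)))      # 1000 distinct people
print(round(margin_of_error(1_000), 3))   # roughly +/- 3 percentage points
print(round(margin_of_error(5), 3))       # roughly +/- 44 points: far too wide
```

Under these assumptions the margin shrinks with the square root of the sample size, which is why a thousand respondents can be informative while five cannot.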

In order to answer quantitative research questions of causality, one must employ probability sampling. Quantitative researchers try to generalize their findings to a larger population. Samples are designed with that in mind. Qualitative researchers ask very different questions, though. Qualitative research questions are not about “how many” of a certain group do X (in this case, what percentage of the unvaccinated hesitate for concern about safety rather than reject vaccination on political grounds). Qualitative research employs nonprobability sampling. By definition, not everyone has an equal opportunity to be included in the sample. The researcher might select White women they go to college with to provide insight into racial and gender dynamics at play. Whatever is found by doing so will not be generalizable to everyone who has not been vaccinated, or even all White women who have not been vaccinated, or even all White women who have not been vaccinated who are in this particular college. That is not the point of qualitative research at all. This is a really important distinction, so I will repeat in bold: Qualitative researchers are not trying to statistically generalize specific findings to a larger population. They have not failed when their sample cannot be generalized, as that is not the point at all.

In the previous paragraph, I said it would be perfectly acceptable for a qualitative researcher to interview five White women with whom she goes to college about their vaccine hesitancy “to provide insight into racial and gender dynamics at play.” The key word here is “insight.” Rather than use a sample as a stand-in for the general population, as quantitative researchers do, the qualitative researcher uses the sample to gain insight into a process or phenomenon. The qualitative researcher is not going to be content with simply asking each of the women to state her reason for not being vaccinated and then draw conclusions that, because one in five of these women were concerned about their health, one in five of all people were also concerned about their health. That would be, frankly, a very poor study indeed. Rather, the qualitative researcher might sit down with each of the women and conduct a lengthy interview about what the vaccine means to her, why she is hesitant, how she manages her hesitancy (how she explains it to her friends), what she thinks about others who are unvaccinated, what she thinks of those who have been vaccinated, and what she knows or thinks she knows about COVID-19. The researcher might include specific interview questions about the college context, about their status as White women, about the political beliefs they hold about racism in the US, and about how their own political affiliations may or may not provide narrative scripts about “protective whiteness.” There are many interesting things to ask and learn about and many things to discover. Where a quantitative researcher begins with clear parameters to set their population and guide their sample selection process, the qualitative researcher is discovering new parameters, making it impossible to engage in probability sampling.

Looking at it this way, sampling for qualitative researchers needs to be more strategic. More theoretically informed. What persons can be interviewed or observed that would provide maximum insight into what is still unknown? In other words, qualitative researchers think through what cases they could learn the most from, and those are the cases selected to study: “What would be ‘bias’ in statistical sampling, and therefore a weakness, becomes intended focus in qualitative sampling, and therefore a strength. The logic and power of purposeful sampling lie in selecting information-rich cases for study in depth. Information-rich cases are those from which one can learn a great deal about issues of central importance to the purpose of the inquiry, thus the term purposeful sampling” (Patton 2002:230; emphases in the original).

Before selecting your sample, though, it is important to clearly identify the general population of interest. You need to know this before you can determine the sample. In our example case, it is “adult Americans who have not yet been vaccinated.” Depending on the specific qualitative research question, however, it might be “adult Americans who have been vaccinated for political reasons” or even “college students who have not been vaccinated.” What insights are you seeking? Do you want to know how politics is affecting vaccination? Or do you want to understand how people manage being an outlier in a particular setting (unvaccinated where vaccinations are heavily encouraged if not required)? More clearly stated, your population should align with your research question. Think back to the opening story about Damaske’s work studying the unemployed. She drew her sample narrowly to address the particular questions she was interested in pursuing. Knowing your questions or, at a minimum, why you are interested in the topic will allow you to draw the best sample possible to achieve insight.

Once you have your population in mind, how do you go about getting people to agree to be in your sample? In qualitative research, it is permissible to find people by convenience. Just ask for people who fit your sample criteria and see who shows up. Or reach out to friends and colleagues and see if they know anyone that fits. Don’t let the name convenience sampling mislead you; this is not exactly “easy,” and it is certainly a valid form of sampling in qualitative research. The more unknowns you have about what you will find, the more convenience sampling makes sense. If you don’t know how race or class or political affiliation might matter, and your population is unvaccinated college students, you can construct a sample of college students by placing an advertisement in the student paper or posting a flyer on a notice board. Whoever answers is your sample. That is what is meant by a convenience sample. A common variation of convenience sampling is snowball sampling. This is particularly useful if your target population is hard to find. Let’s say you posted a flyer about your study and only two college students responded. You could then ask those two students for referrals. They tell their friends, and those friends tell other friends, and, like a snowball, your sample gets bigger and bigger.
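
The way a snowball sample grows can be pictured as following referrals outward from the first few respondents. The following is a minimal sketch, assuming each person can name a few others. The `referrals` mapping, the names, and the `snowball_sample` function are all hypothetical, and real recruitment is of course messier than a lookup table.

```python
from collections import deque


def snowball_sample(seeds, referrals, target_size):
    """Grow a sample by following referrals outward from the first respondents.

    referrals maps each person to the people they would refer (made-up data).
    """
    sample = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sample) < target_size:
        person = queue.popleft()
        sample.append(person)
        for friend in referrals.get(person, []):
            if friend not in seen:   # never recruit the same person twice
                seen.add(friend)
                queue.append(friend)
    return sample


referrals = {
    "Ana": ["Ben", "Cy"],
    "Dee": ["Cy", "Eli"],
    "Ben": ["Fay"],
    "Eli": ["Ana", "Gus"],
}
# Two students answered the flyer; everyone else arrives by referral.
print(snowball_sample(["Ana", "Dee"], referrals, target_size=6))
```

Note that everyone in the resulting sample is connected to the two original respondents, which is exactly why a snowball sample cannot be treated as a random draw from the population.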

Researcher Note

Gaining Access: When Your Friend Is Your Research Subject

My early experience with qualitative research was rather unique. At that time, I needed to do a project that required me to interview first-generation college students, and my friends, with whom I had been sharing a dorm for two years, just perfectly fell into the sample category. Thus, I just asked them and easily “gained my access” to the research subject; I know them, we are friends, and I am part of them. I am an insider. I also thought, “Well, since I am part of the group, I can easily understand their language and norms, I can capture their honesty, read their nonverbal cues well, will get more information, as they will be more open with me because they trust me.” All in all, easy access with rich information. But, gosh, I did not realize that my status as an insider came with a price! When structuring the interview questions, I began to realize that rather than focusing on the unique experiences of my friends, I mostly based the questions on my own experiences, assuming we have similar if not the same experiences. I began to struggle with my objectivity and even questioned my role; am I doing this as part of the group or as a researcher? I came to know later that my status as an insider or my “positionality” may impact my research. It not only shapes the process of data collection but might heavily influence my interpretation of the data. I came to realize that although my inside status came with a lot of benefits (especially for access), it could also bring some drawbacks.

—Dede Setiono, PhD student focusing on international development and environmental policy, Oregon State University

The more you know about what you might find, the more strategic you can be. If you wanted to compare how politically conservative and politically liberal college students explained their vaccine hesitancy, for example, you might construct a sample purposively, finding an equal number of both types of students so that you can make those comparisons in your analysis. This is what Damaske (2021) did. You could still use convenience or snowball sampling as a way of recruitment. Post a flyer at the conservative student club and then ask for referrals from the one student that agrees to be interviewed. As with convenience sampling, there are variations of purposive sampling as well as other names used (e.g., judgment, quota, stratified, criterion, theoretical). Try not to get bogged down in the nomenclature; instead, focus on identifying the general population that matches your research question and then using a sampling method that is most likely to provide insight, given the types of questions you have.
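
Building a sample with equal numbers from each comparison group can be sketched as a simple quota rule: keep accepting volunteers until each group is full. This is one possible way to operationalize the idea, not a prescribed procedure. The student IDs, the group labels, and the `quota_sample` function are hypothetical.

```python
def quota_sample(volunteers, group_of, quota):
    """Accept volunteers in arrival order until each group reaches its quota."""
    counts = {}
    selected = []
    for person in volunteers:
        group = group_of[person]
        if counts.get(group, 0) < quota:
            counts[group] = counts.get(group, 0) + 1
            selected.append(person)
    return selected


# Hypothetical volunteers, listed in the order they responded to a flyer.
group_of = {
    "S1": "liberal", "S2": "liberal", "S3": "conservative",
    "S4": "liberal", "S5": "conservative", "S6": "conservative",
}
volunteers = ["S1", "S2", "S3", "S4", "S5", "S6"]
print(quota_sample(volunteers, group_of, quota=2))
```

Recruitment here is still by convenience (whoever shows up), but the quota keeps the final sample balanced enough to support the comparison the research question calls for.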

There are all kinds of ways of being strategic with sampling in qualitative research. Here are a few of my favorite techniques for maximizing insight:

  • Consider using “extreme” or “deviant” cases. Maybe your college houses a prominent anti-vaxxer who has written about and demonstrated against the college’s policy on vaccines. You could learn a lot from that single case (depending on your research question, of course).
  • Consider “intensity”: people and cases and circumstances where your questions are more likely to feature prominently (but not extremely or deviantly). For example, you could compare those who volunteer at local Republican and Democratic election headquarters during an election season in a study on why party matters. Those who volunteer are more likely to have something to say than those who are more apathetic.
  • Maximize variation, as with the case of “politically liberal” versus “politically conservative,” or include an array of social locations (young vs. old; Northwest vs. Southeast region). This kind of heterogeneity sampling can capture and describe the central themes that cut across the variations: any common patterns that emerge, even in this wildly mismatched sample, are probably important to note!
  • Rather than maximize the variation, you could select a small homogeneous sample to describe some particular subgroup in depth. Focus groups are often the best form of data collection for homogeneity sampling.
  • Think about which cases are “critical” or politically important—ones that “if it happens here, it would happen anywhere” or a case that is politically sensitive, as with the single “blue” (Democratic) county in a “red” (Republican) state. In both, you are choosing a site that would yield the most information and have the greatest impact on the development of knowledge.
  • On the other hand, sometimes you want to select the “typical”: the typical college student, for example. You are not trying to generalize from the typical but to illustrate aspects that may be typical of this case or group. When selecting for typicality, be clear with yourself about why the typical matches your research questions (and who might be excluded or marginalized in doing so).
  • Finally, it is often a good idea to look for disconfirming cases: if you are at the stage where you have a hypothesis (of sorts), you might select those who do not fit your hypothesis; you will surely learn something important there. They may be “exceptions that prove the rule” or exceptions that force you to alter your findings in order to make sense of these additional cases.

In addition to all these sampling variations, there is the theoretical approach taken by grounded theorists in which the researcher samples comparative people (or events) on the basis of their potential to represent important theoretical constructs. The sample, one can say, is by definition representative of the phenomenon of interest. It accompanies the constant comparative method of analysis. In the words of the founders of Grounded Theory, “Theoretical sampling is sampling on the basis of the emerging concepts, with the aim being to explore the dimensional range or varied conditions along which the properties of the concepts vary” (Strauss and Corbin 1998:73).

When Your Population is Not Composed of People

I think it is easiest for most people to think of populations and samples in terms of people, but sometimes our units of analysis are not actually people. They could be places or institutions. Even so, you might still want to talk to people or observe the actions of people to understand those places or institutions. Or not! In the case of content analyses (see chapter 17), you won’t even have people involved at all but rather documents or films or photographs or news clippings. Everything we have covered about sampling applies to other units of analysis too. Let’s work through some examples.

Case Studies

When constructing a case study, it is helpful to think of your cases as sample populations in the same way that we considered people above. If, for example, you are comparing campus climates for diversity, your overall population may be “four-year college campuses in the US,” and from there you might decide to study three college campuses as your sample. Which three? Will you use purposeful sampling (perhaps [1] selecting three colleges in Oregon that are different sizes or [2] selecting three colleges across the US located in different political cultures or [3] varying the three colleges by racial makeup of the student body)? Or will you select three colleges at random, out of convenience? There are justifiable reasons for all approaches.

As with people, there are different ways of maximizing insight in your sample selection. Think about the following rationales: typical, diverse, extreme, deviant, influential, crucial, or even embodying a particular “pathway” (Gerring 2008). When choosing a case or particular research site, Rubin (2021) suggests you bear in mind, first, what you are leaving out by selecting this particular case/site; second, what you might be overemphasizing by studying this case/site and not another; and, finally, whether you truly need to worry about either of those things: “that is, what are the sources of bias and how bad are they for what you are trying to do?” (89).

Once you have selected your cases, you may still want to include interviews with specific people or observations at particular sites within those cases. Then you go through possible sampling approaches all over again to determine which people will be contacted.

Content: Documents, Narrative Accounts, And So On

Although not often discussed as sampling, your selection of documents and other units to use in various content/historical analyses is subject to similar considerations. When you are asking quantitative-type questions (percentages and proportionalities of a general population), you will want to follow probabilistic sampling. For example, I created a random sample of accounts posted on the website studentloanjustice.org to delineate the types of problems people were having with student debt ( Hurst 2007 ). Even though my data was qualitative (narratives of student debt), I was actually asking a quantitative-type research question, so it was important that my sample was representative of the larger population (debtors who posted on the website). On the other hand, when you are asking qualitative-type questions, the selection process should be very different. In that case, use nonprobabilistic techniques, either convenience (where you are really new to this data and do not have the ability to set comparative criteria or even know what a deviant case would be) or some variant of purposive sampling. Let’s say you were interested in the visual representation of women in media published in the 1950s. You could select a national magazine like Time for a “typical” representation (and for its convenience, as all issues are freely available on the web and easy to search). Or you could compare one magazine known for its feminist content versus one antifeminist. The point is, sample selection is important even when you are not interviewing or observing people.
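For the probabilistic case described above, a simple random draw from a list of documents can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the list `accounts` stands in for a sampling frame of posted narratives, and the sample size of 25 and the seed are arbitrary choices, not figures from the study cited.

```python
import random

def draw_random_sample(items, n, seed=None):
    """Draw n distinct items so that each item has an equal chance of selection."""
    rng = random.Random(seed)  # a seed makes the draw reproducible for your methods appendix
    return rng.sample(items, n)

# Hypothetical sampling frame: identifiers for 200 posted accounts.
accounts = [f"account_{i:03d}" for i in range(1, 201)]

sample = draw_random_sample(accounts, 25, seed=42)
print(len(sample), "accounts selected")
print(sorted(sample)[:5])
```

Recording the seed alongside the frame is one way to let someone else reconstruct exactly which documents were selected.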

Goals of Qualitative Sampling versus Goals of Quantitative Sampling

We have already discussed some of the differences in the goals of quantitative and qualitative sampling above, but it is worth further discussion. The quantitative researcher seeks a sample that is representative of the population of interest so that they may properly generalize the results (e.g., if 80 percent of first-gen students in the sample were concerned with costs of college, then we can say there is a strong likelihood that 80 percent of first-gen students nationally are concerned with costs of college). The qualitative researcher does not seek to generalize in this way . They may want a representative sample because they are interested in typical responses or behaviors of the population of interest, but they may very well not want a representative sample at all. They might want an “extreme” or deviant case to highlight what could go wrong with a particular situation, or maybe they want to examine just one case as a way of understanding what elements might be of interest in further research. When thinking of your sample, you will have to know why you are selecting the units, and this relates back to your research question or sets of questions. It has nothing to do with having a representative sample to generalize results. You may be tempted—or it may be suggested to you by a quantitatively minded member of your committee—to create as large and representative a sample as you possibly can to earn credibility from quantitative researchers. Ignore this temptation or suggestion. The only thing you should be considering is what sample will best bring insight into the questions guiding your research. This has implications for the number of people (or units) in your study as well, which is the topic of the next section.

What is the Correct “Number” to Sample?

Because we are not trying to create a generalizable representative sample, the guidelines for the “number” of people to interview or news stories to code are also a bit more nebulous. There are some brilliant insightful studies out there with an n of 1 (meaning one person or one account used as the entire set of data). This is particularly so in the case of autoethnography, a variation of ethnographic research that uses the researcher’s own subject position and experiences as the basis of data collection and analysis. But it is true for all forms of qualitative research. There are no hard-and-fast rules here. The number to include is what is relevant and insightful to your particular study.

That said, humans do not thrive well under such ambiguity, and there are a few helpful suggestions that can be made. First, many qualitative researchers talk about “saturation” as the end point for data collection. You stop adding participants when you are no longer getting any new information (or so very little that the cost of adding another interview subject or spending another day in the field exceeds any likely benefits to the research). The term saturation was first used here by Glaser and Strauss ( 1967 ), the founders of Grounded Theory. Here is their explanation: “The criterion for judging when to stop sampling the different groups pertinent to a category is the category’s theoretical saturation . Saturation means that no additional data are being found whereby the sociologist can develop properties of the category. As he [or she] sees similar instances over and over again, the researcher becomes empirically confident that a category is saturated. [They go] out of [their] way to look for groups that stretch diversity of data as far as possible, just to make certain that saturation is based on the widest possible range of data on the category” ( 61 ).

It makes sense that the term was developed by grounded theorists, since this approach is rather more open-ended than other approaches used by qualitative researchers. With so much left open, having a guideline of “stop collecting data when you don’t find anything new” is reasonable. However, saturation can’t help much when first setting out your sample. How do you know how many people to contact to interview? What number will you put down in your institutional review board (IRB) protocol (see chapter 8)? You may guess how many people or units it will take to reach saturation, but there really is no way to know in advance. The best you can do is think about your population and your questions and look at what others have done with similar populations and questions.
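One way to make the “stop when you don’t find anything new” guideline tangible is to track how many new codes each interview contributes. The sketch below is a mechanical heuristic, not Glaser and Strauss’s procedure: it assumes each interview has already been coded into a set of labels, and the `patience` threshold (how many consecutive interviews with nothing new you require before stopping) is an assumption you would have to justify yourself.

```python
def saturation_point(coded_interviews, patience=3):
    """Return the 1-based number of the interview at which `patience` consecutive
    interviews have added no new codes, or None if that never happens."""
    seen = set()
    streak = 0  # consecutive interviews that contributed nothing new
    for number, codes in enumerate(coded_interviews, start=1):
        new_codes = set(codes) - seen
        seen |= new_codes
        streak = 0 if new_codes else streak + 1
        if streak >= patience:
            return number
    return None

# Hypothetical codes assigned to six interviews, in the order they were conducted.
interviews = [
    {"cost", "family"},
    {"cost", "work"},
    {"belonging"},
    {"cost"},
    {"work", "family"},
    {"belonging"},
]

print(saturation_point(interviews))              # 6
print(saturation_point(interviews, patience=2))  # 5
```

A count like this can support a saturation claim, but it cannot replace the judgment about whether you have looked in diverse enough places for new information.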

Here are some suggestions to use as a starting point: For phenomenological studies, try to interview at least ten people for each major category or group of people. If you are comparing male-identified, female-identified, and gender-neutral college students in a study on gender regimes in social clubs, that means you might want to design a sample of thirty students, ten from each group. This is the minimum suggested number. Damaske’s (2021) sample of one hundred allows room for up to twenty-five participants in each of four “buckets” (e.g., working-class*female, working-class*male, middle-class*female, middle-class*male). If there is more than one comparative dimension (e.g., you are comparing students attending three different colleges, and you are comparing White and Black students in each), you can sometimes reduce the number for each group in your sample to five for, in this case, thirty total students (three colleges × two racial groups × five students). But that is really the bare minimum. A lot of people will not trust you with only “five” cases in a bucket. Lareau (2021:24) advises a minimum of seven or nine for each bucket (or “cell,” in her words). The point is to think about what your analyses might look like and how comfortable you will be with a certain number of persons fitting each category.
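The bucket arithmetic above is easy to check, and it grows quickly as you add comparisons. Here is a small Python sketch that lists the buckets for a design and multiplies by a per-bucket minimum. The function name `plan_sample` and the category labels are hypothetical; the numbers are the ones used in the paragraph above.

```python
from itertools import product

def plan_sample(dimensions, per_bucket):
    """List every bucket (combination of categories) and the total sample it implies."""
    buckets = list(product(*dimensions.values()))
    return buckets, len(buckets) * per_bucket

# One dimension, three groups, ten people each.
_, total = plan_sample(
    {"gender": ["male-identified", "female-identified", "gender-neutral"]}, 10
)
print(total)  # 30

# Three colleges by two racial groups, five people per bucket.
design = {"college": ["A", "B", "C"], "race": ["White", "Black"]}
buckets, total = plan_sample(design, 5)
print(len(buckets), total)  # 6 30

# The same design with seven or nine per bucket.
for per_bucket in (7, 9):
    print(per_bucket, plan_sample(design, per_bucket)[1])  # 7 42, then 9 54
```

Running a planned design through a calculation like this before recruiting shows early whether the comparisons you want are affordable.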

Because qualitative research takes so much time and effort, it is rare for a beginning researcher to include more than thirty to fifty people or units in the study. You may not be able to conduct all the comparisons you might want simply because you cannot manage a larger sample. In that case, the limits of who you can reach or what you can include may influence you to rethink an original overcomplicated research design. Rather than include students from every racial group on a campus, for example, you might want to sample strategically, choosing the groups whose contrast is likely to be most insightful, possibly excluding majority-race (White) students entirely, and simply using previous literature to fill in gaps in our understanding. For example, one of my former students was interested in discovering how race and class worked at a predominantly White institution (PWI). Due to time constraints, she simplified her study from an original sample frame of middle-class and working-class domestic Black and international African students (four buckets) to a sample frame of domestic Black and international African students (two buckets), allowing the complexities of class to come through individual accounts rather than from part of the sample frame. She wisely decided not to include White students in the sample, as her focus was on how minoritized students navigated the PWI. She was able to successfully complete her project and develop insights from the data with fewer than twenty interviewees. [1]

But what if you had unlimited time and resources? Would it always be better to interview more people or include more accounts, documents, and units of analysis? No! Your sample size should reflect your research question and the goals you have set yourself. Larger numbers can sometimes work against your goals. If, for example, you want to help bring out individual stories of success against the odds, adding more people to the analysis can end up drowning out those individual stories. Sometimes, the perfect size really is one (or three, or five). It really depends on what you are trying to discover and achieve in your study. Furthermore, studies of one hundred or more (people, documents, accounts, etc.) can sometimes be mistaken for quantitative research. Inevitably, the large sample size will push the researcher into simplifying the data numerically. And readers will begin to expect generalizability from such a large sample.

To summarize, “There are no rules for sample size in qualitative inquiry. Sample size depends on what you want to know, the purpose of the inquiry, what’s at stake, what will be useful, what will have credibility, and what can be done with available time and resources” (Patton 2002:244).

How did you find/construct a sample?

Since qualitative researchers work with comparatively small sample sizes, getting your sample right is rather important. Yet it is also difficult to accomplish. For instance, a key question you need to ask yourself is whether you want a homogeneous or heterogeneous sample. In other words, do you want to include people in your study who are by and large the same, or do you want to have diversity in your sample?

For many years, I have studied the experiences of students who were the first in their families to attend university. There is a rather large number of sampling decisions I need to consider before starting the study. (1) Should I only talk to first-in-family students, or should I have a comparison group of students who are not first-in-family? (2) Do I need to strive for a gender distribution that matches undergraduate enrollment patterns? (3) Should I include participants that reflect diversity in gender identity and sexuality? (4) How about racial diversity? First-in-family status is strongly related to some ethnic or racial identity. (5) And how about areas of study?

As you can see, if I wanted to accommodate all these differences and get enough study participants in each category, I would quickly end up with a sample size of hundreds, which is not feasible in most qualitative research. In the end, for me, the most important decision was to maximize the voices of first-in-family students, which meant that I only included them in my sample. As for the other categories, I figured it was going to be hard enough to find first-in-family students, so I started recruiting with an open mind and an understanding that I may have to accept a lack of gender, sexuality, or racial diversity and then not be able to say anything about these issues. But I would definitely be able to speak about the experiences of being first-in-family.

—Wolfgang Lehmann, author of “Habitus Transformation and Hidden Injuries”

Examples of “Sample” Sections in Journal Articles

Think about some of the studies you have read in college, especially those with rich stories and accounts about people’s lives. Do you know how the people were selected to be the focus of those stories? If the account was published by an academic press (e.g., University of California Press or Princeton University Press) or in an academic journal, chances are that the author included a description of their sample selection. You can usually find these in a methodological appendix (book) or a section on “research methods” (article).

Here are two examples from recent books and one example from a recent article:

Example 1. In It’s Not like I’m Poor: How Working Families Make Ends Meet in a Post-welfare World, the research team employed a mixed methods approach to understand how parents use the earned income tax credit, a refundable tax credit designed to provide relief for low- to moderate-income working people (Halpern-Meekin et al. 2015). At the end of their book, their first appendix is “Introduction to Boston and the Research Project.” After describing the context of the study, they include the following description of their sample selection:

In June 2007, we drew 120 names at random from the roughly 332 surveys we gathered between February and April. Within each racial and ethnic group, we aimed for one-third married couples with children and two-thirds unmarried parents. We sent each of these families a letter informing them of the opportunity to participate in the in-depth portion of our study and then began calling the home and cell phone numbers they provided us on the surveys and knocking on the doors of the addresses they provided.…In the end, we interviewed 115 of the 120 families originally selected for the in-depth interview sample (the remaining five families declined to participate). ( 22 )

Was their sample selection based on convenience or purpose? Why do you think it was important for them to tell you that five families declined to be interviewed? There is actually a trick here, as the names were pulled randomly from a survey whose sample design was probabilistic. Why is this important to know? What can we say about the representativeness or the uniqueness of whatever findings are reported here?

Example 2. In When Diversity Drops, Park (2013) examines the impact of decreasing campus diversity on the lives of college students. She does this through a case study of one student club, the InterVarsity Christian Fellowship (IVCF), at one university (“California University,” a pseudonym). Here is her description:

I supplemented participant observation with individual in-depth interviews with sixty IVCF associates, including thirty-four current students, eight former and current staff members, eleven alumni, and seven regional or national staff members. The racial/ethnic breakdown was twenty-five Asian Americans (41.6 percent), one Armenian (1.6 percent), twelve people who were black (20.0 percent), eight Latino/as (13.3 percent), three South Asian Americans (5.0 percent), and eleven people who were white (18.3 percent). Twenty-nine were men, and thirty-one were women. Looking back, I note that the higher number of Asian Americans reflected both the group’s racial/ethnic composition and my relative ease about approaching them for interviews. ( 156 )

How can you tell this is a convenience sample? What else do you note about the sample selection from this description?

Example 3. The last example is taken from an article published in the journal Research in Higher Education. Published articles tend to be more formal than books, at least when it comes to the presentation of qualitative research. In this article, Lawson (2021) is seeking to understand why female-identified college students drop out of majors that are dominated by male-identified students (e.g., engineering, computer science, music theory). Here is the entire relevant section of the article:

Method. Participants. Data were collected as part of a larger study designed to better understand the daily experiences of women in MDMs [male-dominated majors].…Participants included 120 students from a midsize, Midwestern University. This sample included 40 women and 40 men from MDMs—defined as any major where at least 2/3 of students are men at both the university and nationally—and 40 women from GNMs [gender-neutral majors]—defined as any major where 40–60% of students are women at both the university and nationally.… Procedure. A multi-faceted approach was used to recruit participants; participants were sent targeted emails (obtained based on participants’ reported gender and major listings), campus-wide emails sent through the University’s Communication Center, flyers, and in-class presentations. Recruitment materials stated that the research focused on the daily experiences of college students, including classroom experiences, stressors, positive experiences, departmental contexts, and career aspirations. Interested participants were directed to email the study coordinator to verify eligibility (at least 18 years old, man/woman in MDM or woman in GNM, access to a smartphone). Sixteen interested individuals were not eligible for the study due to the gender/major combination. (482ff.)

What method of sample selection was used by Lawson? Why is it important to define “MDM” at the outset? How does this definition relate to sampling? Why were interested participants directed to the study coordinator to verify eligibility?

Final Words

I have found that students often find it difficult to be specific enough when defining and choosing their sample. It might help to think about your sample design and sample recruitment like a cookbook. You want all the details there so that someone else can pick up your study and conduct it as you intended. That person could be yourself, but this analogy might work better if you have someone else in mind. When I am writing down recipes, I often think of my sister and try to convey the details she would need to duplicate the dish. We share a grandmother whose recipes are full of handwritten notes in the margins, in spidery ink, that tell us what bowl to use when or where things could go wrong. Describe your sample clearly, convey the steps required accurately, and then add any other details that will help keep you on track and remind you why you have chosen to limit possible interviewees to those of a certain age or class or location. Imagine actually going out and getting your sample (making your dish). Do you have all the necessary details to get started?

Table 5.1. Sampling Type and Strategies

Further Readings

Fusch, Patricia I., and Lawrence R. Ness. 2015. “Are We There Yet? Data Saturation in Qualitative Research.” Qualitative Report 20(9):1408–1416.

Saunders, Benjamin, Julius Sim, Tom Kingstone, Shula Baker, Jackie Waterfield, Bernadette Bartlam, Heather Burroughs, and Clare Jinks. 2018. “Saturation in Qualitative Research: Exploring Its Conceptualization and Operationalization.” Quality & Quantity 52(4):1893–1907.

[1] Rubin (2021) suggests a minimum of twenty interviews (but safer with thirty) for an interview-based study and a minimum of three to six months in the field for ethnographic studies. For a content-based study, she suggests between five hundred and one thousand documents, although some will be “very small” (243–244).

Sampling: The process of selecting people or other units of analysis to represent a larger population. In quantitative research, this representation is taken quite literally, as statistically representative. In qualitative research, in contrast, sample selection is often made based on potential to generate insight about a particular topic or phenomenon.

Sampling frame: The actual list of individuals that the sample will be drawn from. Ideally, it should include the entire target population (and nobody who is not part of that population). Sampling frames can differ from the larger population when specific exclusions are inherent, as in the case of pulling names randomly from voter registration rolls where not everyone is a registered voter. This difference in frame and population can undercut the generalizability of quantitative results.

Sample: The specific group of individuals that you will collect data from. Contrast population.

Population: The large group of interest to the researcher. Although it will likely be impossible to design a study that incorporates or reaches all members of the population of interest, this should be clearly defined at the outset of a study so that a reasonable sample of the population can be taken. For example, if one is studying working-class college students, the sample may include twenty such students attending a particular college, while the population is “working-class college students.” In quantitative research, clearly defining the general population of interest is a necessary step in generalizing results from a sample. In qualitative research, defining the population is conceptually important for clarity.

Probability sampling: A sampling strategy in which the sample is chosen to represent (numerically) the larger population from which it is drawn by random selection. Each person in the population has an equal chance of making it into the sample. This is often done through a lottery or other chance mechanisms (e.g., a random selection of every twelfth name on an alphabetical list of voters). Also known as random sampling.

Convenience sampling: The selection of research participants or other data sources based on availability or accessibility, in contrast to purposive sampling.

Snowball sampling: A sample generated non-randomly by asking participants to help recruit more participants, the idea being that a person who fits your sampling criteria probably knows other people who fit similar criteria.

Themes: Broad codes that are assigned to the main issues emerging in the data; identifying themes is often part of initial coding.

Disconfirming cases: A form of case selection focusing on examples that do not fit the emerging patterns. This allows the researcher to evaluate rival explanations or to define the limitations of their research findings. When disconfirming cases are found (rather than deliberately sought out), researchers should expand their analysis or rethink their theories to include/explain them.

Grounded Theory: A methodological tradition of inquiry and approach to analyzing qualitative data in which theories emerge from a rigorous and systematic process of induction. This approach was pioneered by the sociologists Glaser and Strauss (1967). The elements of theory generated from comparative analysis of data are, first, conceptual categories and their properties and, second, hypotheses or generalized relations among the categories and their properties: “The constant comparing of many groups draws the [researcher’s] attention to their many similarities and differences. Considering these leads [the researcher] to generate abstract categories and their properties, which, since they emerge from the data, will clearly be important to a theory explaining the kind of behavior under observation” (36).

Random sample: The result of probability sampling, in which a sample is chosen to represent (numerically) the larger population from which it is drawn by random selection. Each person in the population has an equal chance of making it into the random sample. This is often done through a lottery or other chance mechanisms (e.g., the random selection of every twelfth name on an alphabetical list of voters). This is typically not required in qualitative research but is essential for the generalizability of quantitative research.

Deviant case: A form of case selection or purposeful sampling in which cases that are unusual or special in some way are chosen to highlight processes or to illuminate gaps in our knowledge of a phenomenon. See also extreme case.

Saturation: The point at which you can conclude data collection because every person you are interviewing, interaction you are observing, or piece of content you are analyzing merely confirms what you have already noted. Achieving saturation is often used as the justification for the final sample size.

Generalizability: The accuracy with which results or findings can be transferred to situations or people other than those originally studied. Qualitative studies generally are unable to use (and are uninterested in) statistical generalizability where the sample population is said to be able to predict or stand in for a larger population of interest. Instead, qualitative researchers often discuss “theoretical generalizability,” in which the findings of a particular study can shed light on processes and mechanisms that may be at play in other settings. See also statistical generalization and theoretical generalization.

Recruitment materials: A term used by IRBs to denote all materials aimed at recruiting participants into a research study (including printed advertisements, scripts, audio or video tapes, or websites). Copies of this material are required in research protocols submitted to IRB.

Introduction to Qualitative Research Methods Copyright © 2023 by Allison Hurst is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License , except where otherwise noted.

Defining representativeness of study samples in medical and population health research

Volume 2, Issue 1

  • http://orcid.org/0000-0001-7177-1847 Jacqueline E Rudolph,
  • Yongqi Zhong,
  • Priya Duggal,
  • Shruti H Mehta and
  • Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, USA
  • Correspondence to Dr Jacqueline E Rudolph, Department of Epidemiology, Johns Hopkins University, Baltimore, MD 21205, USA; jacqueline.rudolph{at}jhu.edu

Medical and population health science researchers frequently make ambiguous statements about whether they believe their study sample or results are representative of some (implicit or explicit) target population. This article provides a comprehensive definition of representativeness, with the goal of capturing the different ways in which a study can be representative of a target population. It is proposed that a study is representative if the estimate obtained in the study sample is generalisable to the target population (owing to representative sampling, estimation of stratum specific effects, or quantitative methods to generalise or transport estimates) or the interpretation of the results is generalisable to the target population (based on fundamental scientific premises and substantive background knowledge). This definition is explored in the context of four covid-19 studies, ranging from laboratory science to descriptive epidemiology. All statements regarding representativeness should make clear the way in which the study results generalise, the target population the results are being generalised to, and the assumptions that must hold for that generalisation to be scientifically or statistically justifiable.

  • Public health
  • Epidemiology
  • Research design

Data availability statement

Data sharing not applicable as no datasets generated and/or analysed for this study.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

https://doi.org/10.1136/bmjmed-2022-000399

KEY MESSAGES

Researchers frequently refer to whether their sample is or is not representative without clarifying whether they mean that their sample is a simple random sample of their target population or that the results from their sample are merely reflective of what would be seen in the target population

This article provides a comprehensive definition of what it means for a study to be representative, and examines this definition in the context of examples with different study designs

When publishing research, researchers should critically assess whether a study sample is representative of a clearly defined target population, by carefully considering the manner in which they think the results generalise to the target population and the assumptions underlying that hypothesis

Introduction

It is common if not a requirement for medical and population health science researchers to consider the inferences from a study beyond the context of their analysis. Accordingly, many papers mention whether their study sample is representative of, or study results generalise to, some implicit or explicit target population; others refer to a lack of generalisability or representativeness as a limitation. Despite being frequently discussed and debated, 1–4 in common practice, the meaning of representativeness remains ambiguous. Here, we propose a comprehensive definition of representativeness and discuss it in the context of different study designs. We presume no bias in the study’s results; in any real world study, bias will need to be weighed alongside whether the sample is representative or its results are generalisable or applicable to a target population. 5

What is representativeness?

The ambiguity in meaning arises in part because the word “representative” has a broader meaning in English and a more technical definition, and the definition being used is not always clear. In a 2013 series of commentaries on representativeness, 1–4 the concept was defined as occurring when the study sample is a simple random sample of the target population (ie, the sample that arises through representative sampling). A second definition is that the study sample and the results obtained merely resemble what would be expected in the target population, perhaps based on a similarity in personal characteristics. 6 The first definition is more precise and implies a high standard for study design, while the second encompasses a variety of possible interpretations.

Here, we bridge these two uses of the word “representative” and attempt to concretise the second, broader definition. We define a study sample to be representative of a well defined target population if the results estimated in that sample are generalisable to the target population. We consider two ways in which study results can generalise to the target population: in estimate and in interpretation. Box 1 lists a summary of key terms used in this article and figures 1–2 show examples applying the definition of representativeness.

Effect measure modifiers: variables that influence (ie, weaken or strengthen) the relation between the treatment and outcome

Estimate: numerical result (eg, mean, risk, risk difference, risk ratio, odds ratio) obtained in the study sample

Generalisable: findings in the study sample apply to an overlapping target population (ie, study sample is at least a partial subset of the target population)

Interpretation: knowledge or information learnt from the numerical estimate, such as the direction of effect or other study conclusions

Key covariates: variables that affect the outcome, which might be effect measure modifiers on some scale (eg, risk difference, risk ratio, odds ratio) and must be considered if generalising the estimate to a target population

Representative: a study sample is representative of a well defined target population if either the estimate obtained in that sample or the interpretation of the results in that sample are generalisable to the target population

Target population: population to which the researcher seeks to make inference

Transportable: findings in the study sample apply to a non-overlapping target population (ie, study sample is not a subset of the target population)

Figure 1: Example where a study sample (which is a simple random sample of the target population) is representative because its results generalise in interpretation and in estimate. Shaded box=treatment group; hashed lines=outcome group. Colours represent different levels of an effect measure modifier on the risk difference scale, which did not affect selection into the study sample

Figure 2: Example where a study sample (which is a convenience sample of the target population) is representative because its results generalise in interpretation even though they do not generalise in estimate. Shaded box=treatment group; hashed lines=outcome group. Colours represent different levels of an effect measure modifier on the risk difference scale, which affected selection into the study sample

To help clarify these definitions, we use as an example a randomised controlled trial that was conducted to measure the efficacy of molnupiravir for treatment of covid-19. 7 Among unvaccinated adults with mild to moderate covid-19 who were not in the hospital, researchers found that the risk of hospital admission or death at 29 days among participants randomised to molnupiravir was 6.8%, compared with 9.7% of participants randomised to placebo. They concluded that treatment with molnupiravir within five days of infection reduced the risk of hospital admission or death. In boxes 2–4, we explore the definitions in the context of other study designs. 8–10 As with the trial, we use the question, study design, and sample description of these publications to construct a theoretical example. We do not delve into the specific details of the study or comment on whether the researchers fully achieved what we discuss.
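To make the notion of an "estimate" concrete, the two risks reported for the trial can be converted into the effect measures named in the key terms. The following is a minimal sketch that uses only the 6.8% and 9.7% figures quoted above; the function name `effect_measures` is hypothetical, and no confidence intervals are attempted because the group sizes are not given in this article.

```python
def effect_measures(risk_treated: float, risk_control: float) -> dict:
    """Return the risk difference, risk ratio, and odds ratio for two risks."""
    odds_treated = risk_treated / (1 - risk_treated)
    odds_control = risk_control / (1 - risk_control)
    return {
        "risk_difference": risk_treated - risk_control,
        "risk_ratio": risk_treated / risk_control,
        "odds_ratio": odds_treated / odds_control,
    }


# Risks of hospital admission or death at 29 days, as quoted in the text.
measures = effect_measures(risk_treated=0.068, risk_control=0.097)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Each of these numbers is an estimate on a different scale; whether any of them generalises to a target population is the question the rest of the article addresses.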

Defining representativeness in laboratory science studies

At the start of the covid-19 pandemic, no antiviral drugs for infection or disease had been approved. To assess whether antiviral drug molnupiravir was effective for treating covid-19, researchers in one animal model study gave molnupiravir to mice with human lung tissue before and after infection with SARS-CoV-2, using doses scaled from appropriate human levels to the mouse model. 8 They found that a 2 day course of treatment, started 24 hours after infection, significantly reduced SARS-CoV-2 viraemia in lung tissue.

Target population

All humans with recent SARS-CoV-2 infection.

Generalisability of interpretation

Yes. We likely can hypothesise that the beneficial effect of molnupiravir observed in the mice would be observed in humans, based on the validity of the human lung tissue model and on the observation of a similar pathological response to covid-19 in the lung tissue of the mice as has been seen in the lung tissue of patients with covid-19.

Generalisability of estimate

No. While the study used human lung tissue in mice and used an appropriately scaled dose, the lung tissue was otherwise isolated from human biology, and mouse immune responses differ from those seen in humans in a manner that would be very difficult to quantify.

In this animal model study, generalising the interpretation of the results was the primary goal. Generalisation of the estimate was not relevant, as the drug would be tested further in human studies. When it comes to animal model studies (or cell line studies), we need to recognise that an underlying assumption is that the strength of the unaccounted-for effect measure modifiers (which are likely unknown) and the difference in the distribution of the effect measure modifiers between the study sample and human target population are not large enough to change the inference being made. This strong assumption is why animal studies are followed up by clinical trials to ensure that the interpretation does indeed generalise and to obtain a quantifiable estimate of the effect in humans.

Defining representativeness in observational studies

Researchers used testing, hospital, and vaccine registry databases to build an observational cohort of vaccinated adults living in New York, with age matched unvaccinated controls. 9 Their goal was to assess the effectiveness of covid-19 vaccines for preventing SARS-CoV-2 infection and covid-19 related hospital admission in the general population. They found that vaccine effectiveness for preventing infection was highest in the week of 1 May 2021 (93.4%) (when prevalence of the delta variant was negligible), but that effectiveness declined as the delta variant became more prevalent, with a low of 73.5% in the week of 10 July 2021. By contrast, the effectiveness for preventing hospital admission did not wane during this same calendar period. The researchers concluded that their findings were evidence in support of booster vaccines.

Target population

Adult residents of New York state.

Generalisability of interpretation

Yes. We have little reason to suspect that vaccines would not be effective against covid-19 and hospital admission or that we would observe different trends in vaccine effectiveness over time among New York residents who were not included in this study (or residents of other US states).

Generalisability of estimate

Additional data needed. The registry based study included a wide range of ages and the different vaccine types (Pfizer, Moderna, and Johnson & Johnson). The paper did not report the distribution of comorbid conditions such as asthma, which could be potential effect measure modifiers. With such information, researchers may be able to determine whether the estimate could be generalised to the broader New York population.

Here, generalising the interpretation and the estimate are both important. Generalising the estimate to the target population requires more effort both from the researchers designing the study and from those analysing the data but could be incredibly useful for informing covid-19 prevention efforts in New York. If we wished to generalise to target populations beyond New York (eg, the entire US), we would need to make assumptions about whether there are effect measure modifiers that differ between New York and the entire US and whether we have them measured.

Defining representativeness in descriptive studies

Researchers sought to capture the burden of covid-19 among people who inject drugs in the San Diego-Tijuana area. 10 Participants from both cities were recruited using street outreach and mobile vans. Blood samples and nasal swabs were collected to test for the presence of SARS-CoV-2 antibodies and RNA. None of the 386 participants had detectable SARS-CoV-2 RNA, but 140 (36.3%) were seropositive based on the presence of antibodies. This proportion was larger than the prevalence reported in the general population for either city. No trends were seen in prevalence of antibodies to SARS-CoV-2 over the study period (October 2020-June 2021).

Target population

People who use drugs by injection in the San Diego-Tijuana region.

Generalisability of interpretation

Yes. It is reasonable that the target population has a higher prevalence of SARS-CoV-2 than the general population, even beyond the time frame and sample studied.

Generalisability of estimate

Perhaps. Under an appropriate sampling and recruitment strategy, the 36.3% prevalence of SARS-CoV-2 could be generalised to the full sample of the target population, at least during the time frame examined. We would be unable to generalise the estimate to other points in the pandemic, with different SARS-CoV-2 strains and levels of community exchange.

Just as in the observational study, generalising both the estimate and the interpretation are important for assessing the relevance of this study. We note here that the target population selected was much narrower than those of the previous studies, but this reflects the research and public health goals of the study. The researchers likely could not make statements regarding representativeness to broader target populations (eg, all people who inject drugs in the US) without further evidence.

Generalisable in estimate

A sample is representative if its results are generalisable in estimate. For a given estimand (eg, risk difference, odds ratio, population mean), the estimate obtained in the study sample is the same within a margin of error as what would be estimated in the target population. In the molnupiravir randomised controlled trial, 7 we might hypothesise that the risk difference comparing molnupiravir with placebo estimated in the trial is the same risk difference as would be estimated in the target population of all adults with recent SARS-CoV-2 infection who were not in the hospital. Generalising the estimate obtained in a given study sample might be considered the primary goal when intending to quantitatively inform policy interventions or when obtaining effect estimates in the target population is impossible or infeasible. 11

Generalisability in estimate can be achieved if the distributions of key covariates are the same as in the target population, as would occur in expectation with random sampling. Thus, generalising the estimate aligns closely with the definition of representativeness based on representative sampling. These key covariates are those that affect the variable under study (eg, hospital admission or death) and thus are potential effect measure modifiers of the effect of a treatment on that variable. By effect measure modifiers, we mean variables where the effect of the treatment differs by levels of that variable on some scale (eg, risk difference, risk ratio, odds ratio). In our example, age might be an effect measure modifier because the effect of molnupiravir on hospital admission or death (as quantified by the risk difference) might differ across ages.

More generally, even if the distribution of the key covariates differs between the sample and target population, the sample might still be representative within stratums of the key covariates, such that the stratum specific estimates (eg, risk difference within age categories) can be generalised from the sample to the target population. While this generalisation requires that all the key covariates be measured, the proportion of the sample in the covariate stratums need not exactly match the proportion who fall into that subgroup in the target population.
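The distinction between stratum specific and overall estimates can be illustrated with a small numerical sketch. All numbers below are hypothetical, and the sketch assumes that age is the only effect measure modifier and that treatment is randomised, so that the overall risk difference is the share-weighted average of the stratum specific risk differences.

```python
# Hypothetical stratum specific risk differences (treated minus control),
# assumed to be identical in the sample and in the target population.
stratum_rd = {"age<50": -0.01, "age>=50": -0.05}

# Hypothetical age distributions: the sample under-represents older adults.
sample_share = {"age<50": 0.8, "age>=50": 0.2}
target_share = {"age<50": 0.5, "age>=50": 0.5}


def overall_rd(rd_by_stratum: dict, share_by_stratum: dict) -> float:
    """Average the stratum risk differences over a covariate distribution."""
    return sum(rd_by_stratum[s] * share_by_stratum[s] for s in rd_by_stratum)


rd_sample = overall_rd(stratum_rd, sample_share)
rd_target = overall_rd(stratum_rd, target_share)

print(f"overall risk difference in sample: {rd_sample:.3f}")
print(f"overall risk difference in target: {rd_target:.3f}")
```

In this sketch the stratum specific estimates generalise exactly, while the overall estimate in the sample differs from the one in the target population only because the age distributions differ.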

If we apply this definition of generalisability in estimate to the molnupiravir trial example, suppose our target population is all individuals recently infected with SARS-CoV-2 who were not in the hospital. In this case, we would not be able to generalise the trial’s estimate to the entire target population, even if we could control for post-randomisation factors such as non-adherence, because the trial sample did not include vaccinated individuals. However, the target population does include vaccinated individuals, and it is reasonable to assume that the effect of molnupiravir on disease progression to hospital admission or death would vary by vaccination status. On the other hand, we might be able to generalise our results within the stratum of unvaccinated individuals, provided all other effect measure modifiers were similar. In this case, we would say that the sample is representative within that stratum.

Generalisable in interpretation

A sample is representative if its results are generalisable in interpretation. While the estimates obtained in the sample are not quantitatively the same within a margin of error as those that would be estimated in the target population, we can hypothesise, based on background knowledge, that the interpretation (which could be the direction of effect, general inference from the results, or knowledge gained from an experiment) would remain the same. 12 For example, we might hypothesise that molnupiravir is generally protective against hospital admission or death from covid-19, even in samples other than the study sample. Generalising the interpretation aligns with the broad definition of representative, which states that a study sample resembles what would be expected in the target population.

We often generalise the interpretation of our own results to external populations, and any study that generalises in estimate will also generalise in interpretation. The primary goal in studies examining fundamental laws of nature or asking research questions that are relatively independent of historical and environmental context is to generalise the interpretation, rather than the estimate. Generalising the interpretation should be done cautiously, however, because it is based on hypotheses that the mechanisms or biological processes under investigation in the study sample are (at least approximately) identical to those that would be seen in the target population.

If we apply this definition of generalisability in interpretation to the molnupiravir trial example, 7 it might be reasonable to hypothesise that molnupiravir would have a beneficial impact if given to those individuals infected with SARS-CoV-2, even beyond the enrolled sample of participants with moderate illness who were not in the hospital. We might base this hypothesis on our understanding of the drug’s biological mechanism and the validity of a properly conducted, double blind, placebo controlled trial. In this example, we generalise the interpretation despite the fact that the risk difference comparing molnupiravir with placebo estimated in the trial would differ from the risk difference estimated in the target population (again assuming that the target population is all individuals with recent SARS-CoV-2 infection who were not in the hospital).

In summary, we consider a sample to be representative of a target population if its results can be generalised to that target population either in estimate or in interpretation. Any statements made regarding the representativeness of the study need to make this further qualification. Is it the estimate obtained or the interpretation of the results that are generalisable to the target population? Researchers should also do what they can to safeguard their results from being applied incorrectly. Even in studies with a strong scientific rationale for generalising the interpretation of results to the target population, researchers might need to mention that the estimate obtained in the sample should not be naively generalised to the target population.
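One way to picture the two forms of generalisability side by side is the toy classifier below. It is purely illustrative and rests on loud assumptions: the function name `classify_generalisability`, the use of the risk difference, the reading of "interpretation" as direction of effect only, and the choice of margin are all hypothetical. In practice the value in the target population is unknown, which is why both forms of generalisation are framed as hypotheses resting on stated assumptions.

```python
def classify_generalisability(sample_rd: float, target_rd: float, margin: float) -> str:
    """Compare a sample risk difference with a (hypothetically known) target value.

    Within the margin: the estimate generalises, and so does the interpretation.
    Outside the margin but same sign: only the direction of effect generalises.
    """
    if abs(sample_rd - target_rd) <= margin:
        return "estimate and interpretation"
    if sample_rd * target_rd > 0:
        return "interpretation only"
    return "neither"


# Hypothetical target values set against the trial's -0.029 risk difference.
print(classify_generalisability(-0.029, -0.030, margin=0.005))
print(classify_generalisability(-0.029, -0.010, margin=0.005))
print(classify_generalisability(-0.029, 0.010, margin=0.005))
```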

Stating which form of representativeness was the goal of the study might also be useful. In the example of the molnupiravir randomised controlled trial, 7 generalising the interpretation regarding drug efficacy to the target population might have been the primary goal. Many trials have this same goal, because the investigators often oversample individuals at high risk for the outcome in order to increase the power of the study. (Even so, clinical trials have received some criticism that they rarely represent a more general target population. 5 ) If it were possible, generalising the estimate to the target population would be useful for predicting how molnupiravir would perform in practice but might not be immediately required for the study results to be meaningful. Further studies would likely need to be conducted to generalise the interpretation to other target populations, such as children recently infected with SARS-CoV-2.

Several points relate to defining representativeness and are worth discussing. Firstly, irrespective of the way in which a sample is representative, the target population must be clearly defined. Stating that a sample is representative is meaningless unless researchers specify what population it represents or its results are being applied to. 5 As an example, we showed how specifying different target populations (all individuals v all unvaccinated individuals who were not in the hospital) for the molnupiravir randomised controlled trial had different implications for whether the results were generalisable in estimate.

Secondly, researchers must be clear about the assumptions required for generalising to the target population. When generalising the estimate, these assumptions might be made based on knowledge of whether the study was designed using a simple random sample or whether stratification by relevant key covariates is possible. When generalising the interpretation, the assumptions might be made based on a knowledge of basic scientific premises or the validity of a related animal model. If researchers attempted to generalise the interpretation but the scientific principles underlying that generalisation did not hold (eg, the validity of the animal model for describing human physiology), then the assumptions would be violated, and the inferences in the study would not be representative. In either case, the way to truly test whether the assumptions held would be to estimate the effect of interest in the target population. While we often generalise in estimate because designing a study in the target population would not be feasible, we generally consider such a study necessary to prove hypotheses regarding generalisation of interpretation, especially when the sample is highly removed from the target population (eg, cell line v human population).

Thirdly, a natural extension of generalising the (overall or stratum specific) estimate to a target population is the set of methods for estimating the overall mean of an outcome or the average effect of a treatment on an outcome (rather than a stratum specific estimate) in the target population. 5 13 14 While the study sample might not be representative of the target population as observed, it could be made representative by using methods for generalisability or transportability, such as weighting or standardisation to control for the key covariates or effect measure modifiers that differ between the samples. 15 These approaches require measuring and accounting for all relevant key covariates, meeting certain identifiability conditions, and often making model specification assumptions. 13 14 Even further, any study that is representative in interpretation could theoretically be made representative in estimate if all relevant effect measure modifiers were measured and accounted for; however, that is not always possible when the study sample is distant from the target population (eg, laboratory mice to humans).
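The weighting idea can be sketched in a few lines. This is a simplified illustration, not the method of any cited paper: the counts, the target age distribution, and the function names are hypothetical, and the sketch assumes that a single measured covariate (age group) captures all effect measure modification and that treatment was randomised. Each participant is weighted by the ratio of the stratum's share in the target population to its share in the sample.

```python
# Hypothetical trial sample: (events, participants) per age stratum and arm.
sample = {
    "age<50": {"treated": (16, 400), "control": (20, 400)},
    "age>=50": {"treated": (10, 100), "control": (15, 100)},
}
# Hypothetical age distribution in the target population.
target_share = {"age<50": 0.5, "age>=50": 0.5}


def crude_risk_difference(sample: dict) -> float:
    """Unweighted risk difference in the sample as observed."""
    def risk(arm: str) -> float:
        events = sum(sample[s][arm][0] for s in sample)
        people = sum(sample[s][arm][1] for s in sample)
        return events / people
    return risk("treated") - risk("control")


def weighted_risk_difference(sample: dict, target_share: dict) -> float:
    """Risk difference after reweighting strata to the target distribution."""
    total_n = sum(n for arms in sample.values() for _, n in arms.values())
    weights = {}
    for stratum, arms in sample.items():
        sample_share = sum(n for _, n in arms.values()) / total_n
        weights[stratum] = target_share[stratum] / sample_share

    def risk(arm: str) -> float:
        events = sum(weights[s] * sample[s][arm][0] for s in sample)
        people = sum(weights[s] * sample[s][arm][1] for s in sample)
        return events / people
    return risk("treated") - risk("control")


crude = crude_risk_difference(sample)
weighted = weighted_risk_difference(sample, target_share)
print(f"crude sample risk difference: {crude:.3f}")
print(f"risk difference weighted to target: {weighted:.3f}")
```

With a single categorical covariate, this weighting gives the same answer as standardising the stratum specific risks to the target distribution; with many covariates, the weights would typically come from a model, which is where the model specification assumptions mentioned above enter.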

Fourthly, the concepts of representativeness and generalisability discussed above also relate to the term “applicability” used in certain risk-of-bias tools, such as PROBAST and QUADAS. 16 17 All concepts centre on the idea that it is important to assess a study and its results in terms of how well they can be related to some target population. While we discussed causal and descriptive studies in this article, the two tools mentioned apply this concept to predictive and diagnostic studies.

Finally, one question that has been raised is whether generalising the interpretation or the estimate is intrinsically more important for health research and for science broadly. It could be argued that generalising the interpretation is the primary aim of scientific inference and thus should be our goal in most studies. 1 The underlying premise is that the goal of science is the discovery of universal knowledge about nature that will hold true in most instances. If we view health research from this viewpoint, then generalising the interpretation is what matters. By contrast, generalisation of the estimate can never be universal. The estimate obtained in a particular study sample will always be tied to a specific scientific or public health question and study design, and will vary based on the distribution of key covariates across time and populations. However, to inform policies and interventions in the real world, we must be able to predict health outcomes in human populations beyond those we studied. Therefore, generalisation of the estimate (whether obtained via study design or analytical methods) is an important goal. A further argument could be that these endeavours of statistical inference are just as informative for science as the inferences above. Science can be about discovering laws of nature; it can also seek to understand particular facets of nature. For some areas of health research, such as epidemiology and other population health sciences, the facet of nature under study is disease as it occurs in humans at a population level, and true understanding of the disease under study will be contextualised by time, place, history, and social environment. Consideration for how these factors have changed from the original setting to some new time or target population and how these changes might affect the estimate obtained is critical.

While such theoretical debates are important, our comprehensive definition of representativeness does not treat either generalisation of estimate or interpretation as inherently more relevant. That evaluation largely depends on the research question and study design at hand. Health researchers both develop the universal knowledge related to the health of populations and investigate how that knowledge can be applied to improve the health of populations, and the two ends of the research spectrum are fundamentally linked. What is important, then, is that researchers are clear on the manner in which their results can be applied to the target population when they say their study is representative and the assumptions underlying that statement.

Conclusions

We have established the idea that a study sample can be representative of a target population if one of the following is true: the estimate obtained in the study sample is generalisable to the target population or the interpretation of the study results is generalisable to the target population. Whether a study sample can be representative of a target population through the first definition depends on the study design or whether the variables affecting the outcome (which could be effect measure modifiers of the effect of interest) have been measured. On the other hand, even in the absence of simple random sampling or measurement of all key covariates, we can say that the study is representative in terms of its interpretation, direction of effect, or inference, because this requires less stringent assumptions than generalising the study estimate. 12 The example studies provided give guidance on how one might determine whether the study sample from different types of research is representative and whether, for the specific research question, generalising the estimate or the interpretation was the priority.

  • Rothman KJ ,
  • Gallacher JEJ ,
  • Richiardi L ,
  • Westreich D ,
  • Edwards JK ,
  • Lesko CR , et al
  • Jayk Bernal A ,
  • Gomes da Silva MM ,
  • Musungaie DB , et al
  • Gralinski LE ,
  • Johnson CE , et al
  • Rosenberg ES ,
  • Dorabawila V ,
  • Easton D , et al
  • Strathdee SA ,
  • Abramovitz D ,
  • Harvey-Vera A , et al
  • Rudolph JE ,
  • Eron JJ , et al
  • Buchanan AL ,
  • Westreich D , et al
  • Degtiar I ,
  • Moons KGM ,
  • Riley RD , et al
  • Bristol U of. QUADAS-2 [Internet]. University of Bristol. 2022. Available: https://www.bristol.ac.uk/population-health-sciences/projects/quadas/quadas-2/

Twitter @jerudolph13

Contributors All authors were responsible for the concept of the article. JER wrote the original draft. YZ, PD, SHM, and BL reviewed and edited the manuscript. JER is guarantor. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.

Funding This work was supported in part by National Institutes of Health grants R01-CA250851, R01-AI170240, U01-DA036297, and R01-DA057673. The funders had no role in the writing of the report or decision to submit the article for publication.

Competing interests All authors have completed the ICMJE uniform disclosure form at http://www.icmje.org/disclosure-of-interest/ and declare: support from the National Institutes of Health for the submitted work; all authors declare no financial relationships with any organisations that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work.

Provenance and peer review Not commissioned; externally peer reviewed.

Qualitative vs. quantitative data in research: what's the difference?

If you're reading this, you likely already know the importance of data analysis. And you already know it can be incredibly complex.

At its simplest, research and its data can be broken down into two different categories: quantitative and qualitative. But what's the difference between each? And when should you use them? And how can you use them together?

Understanding the differences between qualitative and quantitative data is key to any research project. Knowing both approaches can help you understand your data better and, ultimately, understand your customers better.

Quick takeaways:

Quantitative research uses objective, numerical data to answer questions like "what" and "how often." Conversely, qualitative research seeks to answer questions like "why" and "how," focusing on subjective experiences to understand motivations and reasons.

Quantitative data is collected through methods like surveys and experiments and analyzed statistically to identify patterns. Qualitative data is gathered through interviews or observations and analyzed by categorizing information to understand themes and insights.

Effective data analysis combines quantitative data for measurable insights with qualitative data for contextual depth.

What is quantitative data?

Qualitative and quantitative research differ in their approach and the type of data they collect.

Quantitative data refers to any information that can be quantified — that is, numbers. If it can be counted or measured, and given a numerical value, it's quantitative in nature. Think of it as a measuring stick.

Quantitative variables can tell you "how many," "how much," or "how often."

Some examples of quantitative data:

How many people attended last week's webinar? 

How much revenue did our company make last year? 

How often does a customer rage click on this app?

To analyze these research questions and make sense of this quantitative data, you’d normally use a form of statistical analysis: collecting, evaluating, and presenting large amounts of data to discover patterns and trends. Quantitative data is conducive to this type of analysis because it’s numeric and easier to analyze mathematically.
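As a rough illustration of what "statistical analysis" can mean at its simplest, the sketch below summarizes a small set of numbers with Python's standard `statistics` module. The attendance figures are made up for the example and do not come from any real webinar.

```python
import statistics

# Hypothetical weekly webinar attendance counts.
attendance = [120, 135, 128, 150, 142, 138]

mean_attendance = statistics.mean(attendance)      # typical week
median_attendance = statistics.median(attendance)  # middle value, robust to outliers
spread = statistics.stdev(attendance)              # sample standard deviation

print(f"mean: {mean_attendance:.1f}")
print(f"median: {median_attendance:.1f}")
print(f"standard deviation: {spread:.1f}")
```

Summaries like these answer "how many" and "how much" questions directly, which is what makes numeric data easy to analyze at scale.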

Computers now rule statistical analytics, even though traditional methods have been used for years. But today’s data volumes make statistics more valuable and useful than ever. When you think of statistical analysis now, you think of powerful computers and algorithms that fuel many of the software tools you use today.

Popular quantitative data collection methods are surveys, experiments, polls, and more.


What is qualitative data?

Unlike quantitative data, qualitative data is descriptive, expressed in terms of language rather than numerical values.

Qualitative data describes information that cannot be measured or counted. It refers to the words or labels used to describe certain characteristics or traits.

You would turn to qualitative data to answer the "why?" or "how?" questions. It is often used to investigate open-ended studies, allowing participants (or customers) to show their true feelings and actions without guidance.

Some examples of qualitative data:

Why do people prefer using one product over another?

How do customers feel about their customer service experience?

What do people think about a new feature in the app?

Think of qualitative data as the type of data you'd get if you were to ask someone why they did something. Popular data collection methods are in-depth interviews, focus groups, or observation.


What are the differences between qualitative vs. quantitative data?

When it comes to conducting data research, you’ll need different collection, hypotheses and analysis methods, so it’s important to understand the key differences between quantitative and qualitative data:

Quantitative data is numbers-based, countable, or measurable. Qualitative data is interpretation-based, descriptive, and relating to language.

Quantitative data tells us how many, how much, or how often in calculations. Qualitative data can help us to understand why, how, or what happened behind certain behaviors.

Quantitative data is fixed and universal. Qualitative data is subjective and unique.

Quantitative research methods are measuring and counting. Qualitative research methods are interviewing and observing.

Quantitative data is analyzed using statistical analysis. Qualitative data is analyzed by grouping the data into categories and themes.
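To make the idea of grouping qualitative data into categories and themes concrete, here is a rough sketch that tags open-ended responses with themes from a small keyword codebook. The codebook, the responses, and the `code_response` helper are all hypothetical. Real thematic coding is interpretive work done by people, so treat this only as an illustration of the grouping step.

```python
from collections import Counter

# Hypothetical codebook: theme -> keywords that signal it (assumed for illustration).
CODEBOOK = {
    "pricing": {"price", "expensive", "cost"},
    "usability": {"confusing", "easy", "navigate"},
    "support": {"support", "agent", "helpful"},
}

def code_response(text):
    """Return the set of themes whose keywords appear in a response."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    return {theme for theme, keywords in CODEBOOK.items() if words & keywords}

# Made-up open-ended feedback.
responses = [
    "The checkout was confusing and hard to navigate.",
    "Too expensive for what it does.",
    "The support agent was helpful, but the price is high.",
]

# Count how many responses touch each theme.
theme_counts = Counter(theme for r in responses for theme in code_response(r))
print(theme_counts)
```

Once responses are grouped this way, the theme counts themselves become quantitative data, which is one way the two types end up complementing each other.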

Qualitative vs. quantitative examples

As you can see, both provide immense value for any data collection and are key to truly finding answers and patterns. 

More examples of quantitative and qualitative data

You’ve most likely run into quantitative and qualitative data today alone. For the visual learner, here are some examples of both quantitative and qualitative data:

Quantitative data example

The customer has clicked on the button 13 times. 

The engineer has resolved 34 support tickets today. 

The team has completed 7 upgrades this month. 

14 cartons of eggs were purchased this month.

Qualitative data example

My manager has curly brown hair and blue eyes.

My coworker is funny, loud, and a good listener. 

The customer has a very friendly face and a contagious laugh.

The eggs were delicious.

The fundamental difference is that one type of data answers basic questions of quantity, and the other answers descriptively.

What does this mean for data quality and analysis? If you just analyzed quantitative data, you’d be missing core reasons behind what makes a data collection meaningful. You need both in order to truly learn from data—and truly learn from your customers. 

What are the advantages and disadvantages of each?

Each type of data has its own pros and cons.

Advantages of quantitative data

It’s relatively quick and easy to collect and it’s easier to draw conclusions from. 

When you collect quantitative data, the type of results will tell you which statistical tests are appropriate to use. 

As a result, interpreting your data and presenting those findings is straightforward and less open to error and subjectivity.

Another advantage is that you can replicate it. Replicating a study is possible because your data collection is measurable and tangible for further applications.

Disadvantages of quantitative data

Quantitative data doesn’t always tell you the full story (no matter what the perspective). 

With incomplete information, it can be inconclusive.

Quantitative research can be limited, which can lead to overlooking broader themes and relationships.

By focusing solely on numbers, you risk missing bigger-picture information that can be beneficial.

Advantages of qualitative data

Qualitative data offers rich, in-depth insights and allows you to explore context.

It’s great for exploratory purposes.

Qualitative research delivers a predictive element for continuous data.

Disadvantages of qualitative data

It’s not a statistically representative form of data collection because it relies upon the experience of the host (who can lose data).

It can also require multiple data sessions, which can lead to misleading conclusions.

The takeaway is that it’s tough to conduct a successful data analysis without both. They both have their advantages and disadvantages and, in a way, they complement each other. 

Now, of course, in order to analyze both types of data, information has to be collected first.

Let's get into the research.

Quantitative and qualitative research

The core difference between qualitative and quantitative research lies in their focus and methods of data collection and analysis. This distinction guides researchers in choosing an appropriate approach based on their specific research needs.

Using mixed methods that draw on both can also help provide insights from combined qualitative and quantitative data.

Best practices of each help to look at the information under a broader lens to get a unique perspective. Using both methods is helpful because they collect rich and reliable data, which can be further tested and replicated.

What is quantitative research?

Quantitative research is based on the collection and interpretation of numeric data. It's all about the numbers and focuses on measuring (using inferential statistics) and generalizing results. Quantitative research seeks to collect numerical data that can be transformed into usable statistics.

It relies on measurable data to formulate facts and uncover patterns in research. By employing statistical methods to analyze the data, it provides a broad overview that can be generalized to larger populations.

In terms of digital experience data, it puts everything in terms of numbers (or discrete data), like the number of users clicking a button, bounce rates, time on site, and more.

Some examples of quantitative research: 

What is the amount of money invested into this service?

What is the average number of times a button was dead clicked?

How many customers are actually clicking this button?

Essentially, quantitative research is an easy way to see what’s going on at a 20,000-foot view. 

Each data set (or customer action, if we’re still talking digital experience) has a numerical value associated with it and is quantifiable information that can be used for calculating statistical analysis so that decisions can be made. 

You can use statistical operations to discover feedback patterns (with any representative sample size) in the data under examination. The results can be used to make predictions, find averages, test causes and effects, and generalize results to larger measurable data pools. 

Unlike qualitative methodology, quantitative research offers more objective findings as they are based on more reliable numeric data.

Quantitative data collection methods

Surveys

A survey is one of the most common quantitative research methods and involves questioning a large group of people. Questions are usually closed-ended and are the same for all participants. An unclear questionnaire can lead to distorted research outcomes.

Polls

Similar to surveys, polls yield quantitative data. That is, you poll a number of people and apply a numeric value to how many people responded with each answer.
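Turning poll answers into numbers can be sketched in a few lines of Python. The answers below are invented for illustration; the snippet simply counts how many people gave each answer and converts the counts to shares.

```python
from collections import Counter

# Hypothetical poll answers (illustrative only).
answers = ["yes", "no", "yes", "unsure", "yes", "no", "yes", "yes"]

tally = Counter(answers)   # answer -> number of respondents
total = len(answers)
shares = {option: count / total for option, count in tally.items()}

# Report each answer with its count and share, most common first.
for option, count in tally.most_common():
    print(f"{option}: {count} responses ({shares[option]:.0%})")
```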

Experiments

An experiment is another common method that usually involves a control group and an experimental group. The experiment is controlled and the conditions can be manipulated accordingly. You can examine any type of records involved if they pertain to the experiment, so the data is extensive.
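One common way to compare a control group with an experimental group is to look at the difference in their means relative to the variation within each group. The sketch below computes Welch's t statistic by hand with the standard library. The timing data and the `welch_t` helper name are assumptions made for this example, and a full analysis would also derive a p-value.

```python
import math
import statistics

# Hypothetical task-completion times in seconds (illustrative values only).
control = [30, 32, 29, 35, 31, 33]
experimental = [27, 26, 30, 28, 25, 29]

def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error

difference = statistics.mean(control) - statistics.mean(experimental)
t_stat = welch_t(control, experimental)
print(f"Mean difference: {difference:.2f} s, Welch t = {t_stat:.2f}")
```

A large absolute t value suggests the gap between the groups is unlikely to be noise alone, though with samples this small the result should be read cautiously.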

What is qualitative research?

Qualitative research does not simply help to collect data. It gives a chance to understand the trends and meanings of natural actions. It’s flexible and iterative.

Qualitative research focuses on the qualities of users—the actions that drive the numbers. It's descriptive research. The qualitative approach is subjective, too. 

It focuses on describing an action, rather than measuring it.

Some examples of qualitative research: 

The sunflowers had a fresh smell that filled the office.

All the bagels with bites taken out of them had cream cheese.

The man had blonde hair with a blue hat.

Qualitative research utilizes interviews, focus groups, and observations to gather in-depth insights.

This approach shines when the research objective calls for exploring ideas or uncovering deep insights rather than quantifying elements.

Qualitative data collection methods

Interviews

An interview is the most common qualitative research method. This method involves personal interaction (either in real life or virtually) with a participant. It’s mostly used for exploring attitudes and opinions regarding certain issues.

Interviews are very popular methods for collecting data in product design.

Focus groups

A focus group is another method, in which a host guides participants through a discussion to collect data. Within a group (either in person or online), each member shares their opinion and experiences on a specific topic, allowing researchers to gather perspectives and deepen their understanding of the subject matter.


So which type of data is better for data analysis?

So how do you determine which type is better for data analysis?

Quantitative data is structured and accountable. This type of data is formatted in a way so it can be organized, arranged, and searchable. Think about this data as numbers and values found in spreadsheets—after all, you would trust an Excel formula.

Qualitative data is considered unstructured. This type of data is known for being subjective, individualized, and personalized. Anything goes. Because of this, qualitative data is inferior if it’s the only data in the study. However, it’s still valuable.

Because quantitative data is more concrete, it’s generally preferred for data analysis. Numbers don’t lie. But for complete statistical analysis, using both qualitative and quantitative yields the best results. 

At Fullstory, we understand the importance of data, which is why we created a behavioral data platform that analyzes customer data for better insights. Our platform delivers a complete, retroactive view of how people interact with your site or app—and analyzes every point of user interaction so you can scale.

Unlock business-critical data with Fullstory

A perfect digital customer experience is often the difference between company growth and failure. And the first step toward building that experience is quantifying who your customers are, what they want, and how to provide them what they need.

Access to product analytics is the most efficient and reliable way to collect valuable quantitative data about funnel analysis, customer journey maps, user segments, and more.

But creating a perfect digital experience means you need not only organized and digestible quantitative data but also access to qualitative data. Understanding the why is just as important as the what itself.

Fullstory's DXI platform combines the quantitative insights of product analytics with picture-perfect session replay for complete context that helps you answer questions, understand issues, and uncover customer opportunities.



Qualitative Research: Definition, Methodology, Limitation, Examples

These are tips and tricks on how to use qualitative research to better understand your audience and improve your ROI. Also learn the difference between qualitative and quantitative data.


There is a fundamental distinction between data types: qualitative and quantitative. Typically, we call data ‘quantitative’ if it is in numerical form, and ‘qualitative’ if it’s not.

Marketers love to get into customers’ minds. But for that, they need to do qualitative research. Face-to-face interviews, focus groups, or qualitative observations can provide valuable insights about your products, your market, and your customers’ opinions and motivations.

What is Qualitative Research

Qualitative research is a market research method that focuses on obtaining data through open-ended and conversational communication. This method focuses on the “why” rather than the “what” people think about you.

Let’s say you have an online shop that addresses a general audience. You do a demographic analysis and you find out that most of your customers are male. Naturally, you will want to find out why women are not buying from you. And that’s what qualitative research will help you find out.

Quantitative vs. Qualitative Research


Quantitative research is concerned with measurement and numbers, while qualitative research is concerned with understanding and words.

Quantitative research is used to quantify the problem. Its main goal is to generate numerical data or data that can be turned into statistics. It uses measurable data to formulate facts and uncover patterns in research.

Quantitative data collection methods include various forms of surveys (online surveys, paper surveys, mobile surveys, kiosk surveys, etc.), face-to-face interviews, telephone interviews, longitudinal studies, website interceptors, online polls, and systematic observations.

On the other hand, qualitative research is used to gain an understanding of underlying reasons, opinions, and motivations. It provides insights into the problem or helps to develop ideas or hypotheses for potential quantitative research.

Qualitative data collection methods include focus groups (group discussions), individual interviews, and participation/observation.

The statistical data of quantitative methods obtained from many people reveal a broad, generalizable set of findings. In contrast, qualitative methods produce a large amount of detailed information about a smaller number of people that results in rich understanding but reduces generalizability.

Qualitative Research Methodology

Once the marketer has decided that their research questions will provide data that is qualitative in nature, the next step is to choose the appropriate qualitative approach.

The approach chosen will take into account the purpose of the research, the role of the researcher, the data collected, the method of data analysis, and how the results will be presented. The most common approaches include:

  • Narrative: explores the life of an individual, tells their story;
  • Phenomenology: attempts to understand or explain life experiences or phenomena;
  • Grounded theory: investigates the process, action, or interaction with the goal of developing a theory “grounded” in observations;
  • Ethnography: describes and interprets an ethnic, cultural, or social group;
  • Case study: examines episodic events in a definable framework, develops in-depth analyses of single or multiple cases, generally explains “how”.

Types of Qualitative Research Methods

Qualitative research methods are designed in a manner that they help reveal the behavior and perception of a target audience regarding a particular topic.

The most frequently used qualitative research methods are one-on-one interviews, focus groups, ethnographic research, case study research, record keeping, and qualitative observation.

1. One-on-one interviews

Conducting one-on-one interviews is one of the most common qualitative research methods. One of the advantages of this method is that it provides a great opportunity to gather precise data about what people think and their motivations.

Spending time talking to customers not only helps marketers understand who their clients are, but it also helps with customer care: clients love hearing from brands. This strengthens the relationship between a brand and its clients and paves the way for customer testimonials.

These interviews can be performed face-to-face or on the phone and usually last between half an hour and two hours or more.

When a one-on-one interview is conducted face-to-face, it also gives the marketer the opportunity to read the body language of the respondent and match the responses.

2. Focus groups

Focus groups are another commonly used qualitative research method. The ideal size of a focus group is usually between five and eight participants.

If the topic is of minor concern to participants, and if they have little experience with the topic, then a group size of 10 could be productive.

As the topic becomes more important, if people have more expertise on the topic, or if they are likely to have strong feelings about the topic, then the group size should be restricted to five or six people.

The main goal of a focus group is to find answers to the “why”, “what”, and “how” questions.

One advantage that focus groups have is that the marketer doesn’t necessarily have to interact with the group in person. Nowadays focus groups can be sent as online surveys on various devices.

Focus groups are an expensive option compared to the other qualitative research methods, which is why they are typically used to explain complex processes. Focus groups are especially useful when it comes to market research on new products and testing new concepts.

3. Ethnographic research

Ethnographic research is the most in-depth observational method that studies individuals in their naturally occurring environment.

This method aims at understanding the cultures, challenges, motivations, and settings that occur.

Ethnographic research requires the marketer to adapt to the target audiences’ environments (a different organization, a different city, or even a remote location), which is why geographical constraints can be an issue while collecting data.

This type of research can last from a few days to a few years. It’s challenging and time-consuming and solely depends on the expertise of the marketer to be able to analyze, observe, and infer the data.

4. Case study research

The case study method has grown into a valuable qualitative research method. This type of research method is usually used in education or social sciences.

Case study research may seem difficult to operate, but it’s actually one of the simplest ways of conducting research as it involves a deep dive and thorough understanding of the data collection methods and inferring the data.

5. Record keeping

Record keeping is similar to going to the library: you go over books or any other reference material to collect relevant data. This method uses already existing reliable documents and similar sources of information as a data source.

6. Qualitative observation

Qualitative observation is a method that uses subjective methodologies to gather systematic information or data. This method deals with the five major sensory organs and their functioning, sight, smell, touch, taste, and hearing.

Qualitative observation doesn’t involve measurements or numbers but instead characteristics.

Examples of Qualitative Research

1. Online grocery shop with a predominantly male audience

Let’s go back to the previous example. You have an online grocery shop. By nature, it addresses a general audience, but after you do a demographic analysis you find out that most of your customers are male.

One good method to determine why women are not buying from you is to hold one-on-one interviews with potential customers in the category.

Interviewing a sample of potential female customers should reveal why they don’t find your store appealing. The reasons could range from not stocking enough products for women to the fact that you also sell sex toys for example.

Tapping into different market segments will have a positive impact on your revenue.

2. Software company launching a new product

Focus groups are great for establishing product-market fit.

Let’s assume you are a software company who wants to launch a new product and you hold a focus group with 12 people. Although getting their feedback regarding users’ experience with the product is a good thing, this sample is too small to define how the entire market will react to your product.

So what you can do instead is hold multiple focus groups in 20 different geographic regions. Each region should host a group of 12 for each market segment; you can even segment your audience based on age. This would be a better way to establish credibility in the feedback you receive.
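To get a feel for the scale of such a study, you can work out the number of groups and participants. The short sketch below uses the 20 regions and groups of 12 from the example; the three age segments are a hypothetical choice made only for this calculation.

```python
REGIONS = 20                           # from the example in the text
GROUP_SIZE = 12                        # participants per focus group, from the text
SEGMENTS = ["18-29", "30-44", "45+"]   # hypothetical age segments (assumption)

# One focus group per market segment in every region.
groups = REGIONS * len(SEGMENTS)
participants = groups * GROUP_SIZE
print(f"{groups} focus groups, {participants} participants in total")
```

Even a modest number of segments multiplies the recruiting and moderation effort quickly, which is worth weighing against the budget before committing to this design.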

3. Alan Peshkin’s “God’s Choice: The Total World of a Fundamentalist Christian School”

Moving from a fictional example to a real-life one, let’s analyze Alan Peshkin’s 1986 book “God’s Choice: The Total World of a Fundamentalist Christian School”.

Peshkin studied the culture of Bethany Baptist Academy by interviewing the students, parents, teachers, and members of the community alike, and spending eighteen months observing them to provide a comprehensive and in-depth analysis of Christian schooling as an alternative to public education.

Peshkin described Bethany Baptist Academy as having institutional unity of purpose, a dedicated faculty, an administration that backs teachers in enforcing classroom disciplines, cheerful students, rigorous homework, committed parents, and above all grounded in positive moral values and a character building environment.

However, it lacked cultural diversity, which meant that students were trained in one-dimensional thought and entirely cut off from viewpoints that differ from their teachers’ biblical interpretations, and the library was heavily censored.

Even after discovering all this, Peshkin still presented the school in a positive light and stated that public schools have much to learn from such schools.

Peshkin’s in-depth study is an example of qualitative research that uses observations and unstructured interviews, without any assumptions or hypotheses. He utilizes descriptive or non-quantifiable data on Bethany Baptist Academy specifically, without attempting to generalize the findings to other Christian schools.

4. Understanding buyers’ trends

Another way marketers can use qualitative research is to understand buyers’ trends. To do this, marketers need to look at historical data for both their company and their industry and identify where buyers are purchasing items in higher volumes.

For example, electronics distributors know that the holiday season is a peak market for sales while life insurance agents find that spring and summer wedding months are good seasons for targeting new clients.

5. Determining products/services missing from the market

Conducting your own research isn’t always necessary. If there are significant breakthroughs in your industry, you can use industry data and adapt it to your marketing needs.

The influx of hacking and hijacking of cloud-based information has made Internet security a topic of many industry reports lately. A software company could use these reports to better understand the problems its clients are facing.

As a result, the company can provide solutions prospects already know they need.


Limitations of qualitative research

The disadvantages of qualitative research are quite unique. The techniques of the data collector and their own unique observations can alter the information in subtle ways. That being said, these are the limitations of qualitative research:

1. It’s a time-consuming process

The main drawback of qualitative research is that the process is time-consuming. Another problem is that the interpretations are limited. Personal experience and knowledge influence observations and conclusions.

Thus, a qualitative study might take several weeks or months. Also, since this process relies on personal interaction for data collection, discussions often tend to deviate from the main issue to be studied.

2. You can’t verify the results of qualitative research

Because qualitative research is open-ended, participants have more control over the content of the data collected. So the marketer is not able to verify the results objectively against the scenarios stated by the respondents.

3. It’s a labor-intensive approach

Qualitative research requires a labor-intensive analysis process such as categorization, recoding, etc. Similarly, qualitative research requires well-experienced marketers to obtain the needed data from a group of respondents.

4. It’s difficult to investigate causality

Qualitative research requires thoughtful planning to ensure the obtained results are accurate. There is no way to analyze qualitative data mathematically. This type of research is based more on opinion and judgment rather than results. Because all qualitative studies are unique they are difficult to replicate.

5. Qualitative research is not statistically representative

Because qualitative research is a perspective-based method of research, the responses given are not measured.

Comparisons can be made and this can lead toward duplication, but for the most part, quantitative data is required for circumstances which need statistical representation and that is not part of the qualitative research process.

When doing qualitative research, it’s important to cross-reference the data obtained with quantitative data. By continuously surveying prospects and customers, marketers can build a stronger database of useful information.


  • Last modified: January 3, 2023
  • Conversion Rate Optimization, User Research

Valentin Radu


Qualitative Analysis with Representative Samples

Julian Ashwin, Vijayendra Rao

Economists almost never analyze qualitative data. We typically analyze quantitative data from structured survey questions because they are easier to administer to large representative samples of respondents, and easier to analyze using standard econometric methods. However, many questions of interest to economists may be better captured with open-ended qualitative interviews rather than structured questionnaires. These include important concepts like well-being, social norms, cultural change, vulnerability, resilience, decision-making, processes of change in interventions and experiments, and aspirations.

Structured questions work best on concepts where the possible range of responses can be predicted in advance by the researcher and, perhaps more importantly, where respondents have the same understanding of the latent construct underlying the question as the researcher. For concepts where respondents have heterogeneous understandings of the concept and where unpredictable probes and follow-ups may be necessary, it may be preferable to allow the respondent freedom to respond in an open-ended conversational style and in the manner of their choosing. Additionally, a trained interviewer can then probe an issue in a relatively unstructured manner by iteratively asking follow-up questions in a more conversational style. This process also has the advantage of eliciting information that is more “reflexive” and “bottom-up”, i.e. driven more by the respondent rather than primarily designed ex-ante by the researcher.

Open-ended approaches to interviews have not been employed much by economists because analyzing them is hard and almost impossible to do at scale with statistically representative samples. They are primarily the domain of qualitative researchers in anthropology, sociology and related fields who mull over recordings or transcripts of interviews for considerable periods of time, listening, reading, interpreting, and carefully coding them within the context of a theory or conceptual framework. Coding remains a labor-intensive process typically done by trained social scientists and is an essential step in conducting nuanced analysis of qualitative data that is based on human interpretation. Interpretative qualitative analysis is consequently associated with small sample studies. Typically, a dataset of, say, 100 interviews is considered a large-N qualitative dataset.   This small sample challenge that has been intrinsic to qualitative methods has resulted in a large methodological literature on qualitative and case-study methods focusing on justifying and interpreting data from interviews gathered from samples that are not designed to be statistically representative of larger populations. Their general approach has been to inductively draw out inferences that reflexively expand our understanding of an issue, or to inform theory, rather than claim statistical representativeness.

The advent of Natural Language Processing (NLP) tools that treat text as data has led some economists to begin to make the case for the analysis of open-ended interviews, which has coincided with growing interest in “narrative economics.” NLP methods broadly fall into two categories: unsupervised methods that extract measures from the text that explain the variation across documents, for example grouping documents into different topics; and supervised methods that extract measures from the text that explain some contextual information about the document. Social science applications of NLP to open-ended interviews have generally focused on unsupervised methods. These methods essentially reduce the dimensionality of the text to make it more analyzable. Researchers then interpret the data by analyzing these computer-generated representations of text. A potential drawback here is that rather than basing interpretation on a reading of the documents, researchers interpret the simplified representations of the text. Furthermore, as unsupervised models are typically “unguided”, i.e., they do not seek to extract a particular signal from text, the resulting measures are often not suited to the research question of interest. Therefore, unsupervised NLP methods generally do not permit the kind of interpretative qualitative analysis done by anthropologists and sociologists where humans rather than machines code and classify the data.

In a recent paper we develop a “supervised” NLP method that allows open-ended interviews, and other forms of text, to be analyzed using interpretative human coding. As supervised methods require documents to be “labelled”, we use interpretative human coding to generate these labels, thus following the logic of traditional qualitative analysis as closely as possible. Briefly, a sub-sample of the transcripts of open-ended interviews are coded by a small team of trained coders who read the transcripts, decide on a “coding-tree,” and then code the transcripts using qualitative analysis software which is designed for this purpose. This human coded sub-sample is then used as a training set to predict the codes on the full, statistically representative sample. The annotated data on the “enhanced” sample is then analyzed using standard statistical analysis, correcting for the additional noise introduced by the predictions. Our method allows social scientists to analyze representative samples of open-ended qualitative interviews, and to do so by inductively creating a coding structure that emerges from a close, human reading of a sub-sample of interviews that are then used to predict codes on the larger sample. We see this as an organic extension of traditional, interpretative, human-coded qualitative analysis, but done at scale.
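The workflow described above (hand-code a sub-sample, train on it, then predict codes for the remaining transcripts) can be sketched in a few lines. This is only an illustration and not the model used in the paper: it swaps in a tiny multinomial naive Bayes classifier written with the Python standard library, and the transcripts, the "religious aspiration" code and all function names are invented for the example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case a transcript and split it on whitespace."""
    return text.lower().split()


def train(labelled):
    """Fit a multinomial naive Bayes model on (text, code) pairs."""
    word_counts = defaultdict(Counter)  # code -> word frequencies
    doc_counts = Counter()              # code -> number of transcripts
    vocab = set()
    for text, code in labelled:
        doc_counts[code] += 1
        for word in tokenize(text):
            word_counts[code][word] += 1
            vocab.add(word)
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the most probable code for an uncoded transcript."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_code, best_score = None, -math.inf
    for code in doc_counts:
        # Log prior plus log likelihood with add-one (Laplace) smoothing.
        score = math.log(doc_counts[code] / total_docs)
        denom = sum(word_counts[code].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[code][word] + 1) / denom)
        if score > best_score:
            best_code, best_score = code, score
    return best_code


# Hypothetical human-coded sub-sample: 1 = expresses a religious aspiration.
human_coded = [
    ("i want my son to be a pious man and study religion", 1),
    ("she should pray and follow the religious path", 1),
    ("i hope he becomes a doctor and earns well", 0),
    ("my daughter should finish school and get a job", 0),
]
model = train(human_coded)

# Uncoded transcripts standing in for the rest of the sample.
uncoded = ["i want her to pray and be pious", "he should get a job as a doctor"]
predicted = [predict(model, t) for t in uncoded]
print(predicted)
```

In a real application the predicted codes are noisy, so any downstream regression would need to account for prediction error, as the paragraph above notes; that correction is not shown here.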

This method has interpretative advantages over “unsupervised” NLP methods, but it also has an advantage over methods which map text against pre-defined dictionaries of, for example, positive and negative words. While these dictionaries are very well developed in some areas, such as sentiment analysis for economic and financial news, in many cases a relevant dictionary may not be available for a particular research question, language or cultural context. Furthermore, before an in-depth reading of the documents it might not be clear what sort of dictionary might be relevant. Working with human codes in a sub-set of the data falls in the category of “supervised” NLP methods, but gives us a training set that is specific to the sample being analyzed, and thus has the potential for nuanced, context-specific analysis. It is thus analogous to a dictionary created specifically for the analytic sample. Furthermore, we do not need to guess at the interpretation of our text-based measures, as they are a product of our interpretative coding process. We believe the method has wide applicability for a variety of questions.

In our paper we apply this method to study parents’ aspirations for their children by analyzing data from open-ended interviews conducted on a sample of approximately 2,200 Rohingya refugees and their Bangladeshi hosts in Cox’s Bazaar, Bangladesh. Aspirations are an interesting subject to which to apply this method, because an open-ended approach allows us to study dimensions of aspirations that are difficult to capture in structured questionnaires. The literature on aspirations in development economics focuses on what the philosopher Agnes Callard has called “ambition”: specific goals that parents may have for their children, such as a level of education or a profession. Open-ended interviews allow us to expand this to explore its moral and spiritual dimensions (what Callard calls “aspiration” to distinguish it from “ambition”), such as being a “good person” or being religiously inclined. They also allow us to study what the anthropologist Arjun Appadurai has called the “capacity to aspire”, or the capacity to navigate your way to achieving a given goal.

The respondents all participated in an extensive household survey in 2019 covering questions related to demographics, assets, living standards, migration history and trauma.   In two subsequent survey rounds in 2020 and 2021 an adult member of the household was asked the following open-ended questions on aspirations (along with other topics):

1)       Can you tell me about the hopes and dreams you have for your children?

2)       What have you done to help them achieve these goals?

Their responses to these questions, with some conversational interaction and probing by the interviewers, averaged about 10 minutes. The interviews were recorded, transcribed, and translated into English. We randomly selected 300 transcripts in the first round, and 400 in the next round, stratified by refugee status and gender, to be carefully read and coded by researchers led by a highly qualified qualitative sociologist. Based on a close reading of a subset of our interviews, we developed a coding tree that categorizes aspirations, ambitions, and navigational capacity along a range of dimensions, as shown in Figure 1.
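Drawing a random sub-sample stratified by refugee status and gender, as described above, could look something like the sketch below. The text does not say how the draws were allocated across strata, so proportional allocation is an assumption here, and the records, field names and `stratified_sample` helper are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(records, strata_keys, n_total, seed=0):
    """Draw about n_total records, allocated proportionally across strata."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    strata = defaultdict(list)
    for rec in records:
        strata[tuple(rec[k] for k in strata_keys)].append(rec)
    sample = []
    for key in sorted(strata):
        members = strata[key]
        # Proportional allocation, with at least one record per stratum.
        k = max(1, round(n_total * len(members) / len(records)))
        sample.extend(rng.sample(members, min(k, len(members))))
    return sample


# Synthetic stand-in for the interview list: 2,200 records, four equal strata.
records = [
    {
        "id": i,
        "refugee": i % 2 == 0,
        "gender": "female" if (i // 2) % 2 == 0 else "male",
    }
    for i in range(2200)
]

to_code = stratified_sample(records, ["refugee", "gender"], n_total=300)
per_stratum = Counter((r["refugee"], r["gender"]) for r in to_code)
print(len(to_code), dict(per_stratum))
```

With unequal strata the rounded allocations may not add up to exactly `n_total`, which is one reason a real design would fix the per-stratum counts explicitly.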

Figure 1

We apply the method we develop to scale up this interpretative qualitative coding to our full sample of 2,200 interviews. This allows us to differentiate between, and analyze the correlates of, ambition, navigational capacity, and aspirations among Rohingya refugees and their Bangladeshi hosts.   We demonstrate in the paper that they are independent concepts that have distinctly different determinants which suggest different policy responses. For example, we find that while refugees generally have lower ambitions for their children, they generally display a higher capacity to achieve those ambitions. We also find that subjects are more likely to express secular aspirations for male children than for their female children. Table 1 below shows the estimated coefficients on indicators for refugee status and female eldest child in regressions for secular aspirations and low ability. With the smaller sample size of the human annotated interviews (the first and third columns) these coefficients are not significant at 5%, but with the larger enhanced sample (the second and fourth columns) the coefficients are highly significant. This illustrates the key advantage of our method – we are able to use the nuanced and detailed codes that emerge from interpretative qualitative analysis, but at a sample size that allows for statistical inference.

Table 1

The application to aspirations illustrates some of the advantages of our combination of interpretative human coding with supervised NLP methods. Our metrics can capture nuance that traditional quantitative survey questions might miss. Some concepts, like navigational capacity, would be inherently difficult to measure with a structured survey question. Other variables like educational ambition are easier to quantify, but we show that our qualitative variables add important color and context to more quantitative measures of ambition.

Unsupervised NLP analysis provides too coarse a decomposition of the text, which may not be suited to many research questions, as we show by comparing our results to those using a Structural Topic Model. This topic model shows that there are clearly differences in the language used by, for instance, hosts and refugees. However, interpreting these differences in terms of aspirations, ambition and navigational capacity is difficult. Unsupervised methods can thus uncover interesting dimensions of variation in text data, but they will often not give interpretable answers to specific research questions.

An alternative to our combination of interpretative human coding and NLP would be to simply code the entire sample manually. However, this is not only expensive and impractical; it also loses the advantage of annotation by a small, high-quality team of trained social scientists who can discuss and agree on an interpretation.

In a series of simulations, illustrated in Figure 2, we show that for most researchers enhancing their human-coded sample with machine annotation is likely to be optimal. Figure 2 shows the evolution of two of the regression coefficients reported in Table 1 as we increase the number of human-annotated interviews. In the left-hand panel we show the distribution of the coefficient on female child in a regression for secular aspirations, and in the right-hand panel the distribution of the coefficient on refugee status in a regression for low ability. As we move from left to right, the number of human-annotated interviews increases from 200 to 700. The distribution of the estimated coefficients on the human-annotated sample is shown in blue, with the enhanced sample coefficient distribution shown in red. As the size of the human-annotated sample increases, the distribution of the estimated coefficients gets tighter as the estimate gets more precise. In all cases the coefficients are more precise in the enhanced sample than in the human-annotated sample, thanks to the larger sample size. Crucially, the benefits of enhancing the sample are seen even when a relatively small number of interviews are human annotated. Given the expense associated with expert human annotation, we therefore find that for most (budget-constrained) researchers machine annotating part of their sample is likely to be optimal.

Figure 2

In future work, we intend to apply our methodology to a range of further questions including wellbeing and discrimination, both in Cox’s Bazaar and elsewhere. We are also exploring the potential benefits of applying pre-trained large language models, such as the well-known ChatGPT, in combination with qualitative analysis.

Julian Ashwin, Postdoctoral Researcher, London Business School

Vijayendra Rao, Lead Economist, Development Research Group, World Bank

Sampling considerations in qualitative research

Daniel Turner

Two weeks ago I talked about the importance of developing a recruitment strategy when designing a research project. This week we will do a brief overview of sampling for qualitative research, but it is a huge and complicated issue. There’s a great chapter ‘Designing and Selecting Samples’ in the book Qualitative Research Practice ( Ritchie et al 2013 ) which goes over many of these methods in detail.

Your research questions and methodological approach (e.g. grounded theory) will guide you to the right sampling methods for your study: there is never a one-size-fits-all approach in qualitative research! For more detail on this, especially on the importance of culturally embedded sampling, there is a well-cited article by Luborsky and Rubinstein (1995). But it’s also worth talking to colleagues, supervisors and peers to get advice and feedback on your proposals.

Marshall (1996) briefly describes three different approaches to qualitative sampling: judgement/purposeful sampling, theoretical sampling and convenience sampling.

But before you choose any approach, you need to decide what you are trying to achieve with your sampling. Do you have a specific group of people that you need to have in your study, or should it be representative of the general population? Are you trying to discover something about a niche, or something that is generalizable to everyone? A lot of qualitative research is about a specific group of people, and Marshall notes: “This is a more intellectual strategy than the simple demographic stratification of epidemiological studies, though age, gender and social class might be important variables. If the subjects are known to the research, they may be stratified according to known public attitudes or beliefs.”

Broadly speaking, convenience, judgement and theoretical sampling can be seen as purposeful – deliberately selecting people of interest in some way. However, randomly selecting people from a large population is still a desirable approach in some qualitative research. Because qualitative studies tend to have a small sample size due to the in-depth nature of engagement with each participant, this can have an impact if you want a representative sample. If you randomly select 15 people, you might by chance end up with more women than men, or a younger than desired sample. That is why qualitative studies may use a little bit of purposeful sampling, finding people to make sure the final profile matches the desired sampling frame. For much more on this, check out the last blog post article on recruitment .
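To put a rough number on the point about small random samples, the chance of a lopsided gender split among 15 randomly chosen people can be worked out from the binomial distribution. The sketch below assumes a population that is exactly half women and half men and treats each draw as independent; the 10-to-5 threshold for "lopsided" is an arbitrary choice for illustration.

```python
from math import comb


def prob_at_most(k, n, p=0.5):
    """Probability of k or fewer successes in n independent draws."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


n = 15
# Chance that one gender makes up 5 or fewer of the 15 people,
# i.e. a split of 10 to 5 or more extreme in either direction.
p_lopsided = 2 * prob_at_most(5, n)
print(f"P(split of 10-5 or worse) = {p_lopsided:.3f}")  # roughly 0.30
```

Under these assumptions, roughly three in ten purely random samples of 15 would be at least that unbalanced, which is why some purposeful topping-up is common.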

Sample size will often also depend on conceptual approach: if you are testing a prior hypothesis, you may be able to get away with a smaller sample size, while a grounded theory approach to develop new insights might need a larger group of respondents to test that the findings are applicable. Here, you are likely to take a ‘theoretical sampling’ approach (Glaser and Strauss 1967) where you specifically choose people who have experiences that would contribute to a theoretical construct. This is often iterative, in that after reviewing the data (for theoretical insights) the researcher goes out again to find other participants the model suggests might be of interest.

The convenience sampling approach, which Marshall describes as the ‘least rigorous technique’, is where researchers target the most ‘easily accessible’ respondents. This could even be friends, family or faculty. This approach can rarely be methodologically justified, and is unlikely to provide a representative sample. However, it is endemic in many fields, especially psychology, where researchers tend to turn to easily accessible psychology students for experiments, skewing the results towards white, rich, well-educated Western students.

Now we turn to snowball sampling (Goodman 1961). This is different from purposeful sampling in that new respondents are suggested by others. In general, this is most suited to work with ‘marginalised or hard-to-reach’ populations, where respondents are not often forthcoming (Sadler et al 2010). For example, people may not be open about their drug use, political views or living with stigmatising conditions, yet often form closely connected networks. Thus, by gaining trust with one person in the group, others can be recommended to the researcher. However, it is important to note the limitations of this approach. There is a risk of systemic bias: if the first person you recruit is not representative in some way, their referrals may not be either. So you may be looking at people living with HIV/AIDS, and recruit through a support group that is formed entirely of men: they are unlikely to suggest women for the study.

For these reasons there are limits to the generalisability and appropriateness of snowball sampling for most subjects of inquiry, and it should not be taken as an easy fix. Yet while many practitioners explain the limitations of snowball research, it can be very well suited to certain kinds of social and action research. This article by Noy (2008) outlines some of the potential benefits for power relations and studying social networks.

Finally, there is the issue of sample size and ‘saturation’. This is when there is enough data collected to confidently answer the research questions. For a lot of qualitative research this means collected and coded data as well, especially if using some variant of grounded theory. However, saturation is often a source of anxiety for researchers: see for example the amusingly titled article “Are We There Yet?” by Fusch and Ness (2015) . Unlike quantitative studies where a sample size can be determined by the desired effect size and confidence interval in a chosen statistical test, it is more difficult to put an exact number on the right number of participant responses. This is especially because responses are themselves qualitative, not just numbers in a list: so one response may be more data rich than another.

While a general rule of thumb would indicate there is no harm in collecting more data than is strictly necessary, there is always a practical limitation, especially in resource and time constrained post-graduate studies. It can also be more difficult to recruit than anticipated, and many projects working with very specific or hard-to-reach groups can struggle to find a large enough sample size. This is not always a disaster, but may require a re-examination of the research questions, to see what insights and conclusions are still obtainable.

Generally, researchers should have a target sample size and definition of what data saturation will look like for their project before they begin sampling and recruitment. Don’t forget that qualitative case studies may only include one respondent or data point, and in some situations that can be appropriate. However, getting the sampling approach and sample size right is something that comes with experience, advice and practice.

As I always seem to be saying in this blog, it’s also worth considering the intended audience for your research outputs. If you want to publish in a certain journal or academic discipline, it may not be responsive to research based on qualitative methods with small or ‘non-representative’ samples. Silverman (2013 p424) mentions this explicitly with examples of students who had publications rejected for these reasons.

So as ever, plan ahead for what you want to achieve with your research project and the questions you want to answer, and work backwards to choose the appropriate methodology, methods and sample for your work. Also, check the companion article about recruitment, as most of these issues need to be considered in tandem.

Once you have your data, Quirkos can be a great way to analyse it , whether your sample size has one or dozens of respondents! There is a free trial and example data sets to see for yourself if it suits your way of working, and much more information in these pages. We also have a newly relaunched forum , with specific sections on qualitative methodology if you wanted to ask questions, or comment on anything raised in this blog series.

Statistically nonrepresentative stratified sampling: A sampling technique for qualitative studies

  • Research Note
  • Published: March 1986
  • Volume 9 , pages 54–57, ( 1986 )

  • Jan E. Trost

When doing quantitative studies, we usually need a statistically representative sample. The same is often not the case with qualitative studies where we need a sample with variations along the independent variables. The technique described here is simple to use and guarantees variations. A statistically representative sample, on the other hand, if small, often gives few variations. Thus, one could say that the technique presented is a kind of statistically nonrepresentative stratified sampling useful in qualitative studies.

Bott, Elisabeth 1957 Family and Social Network. London: Tavistock.

Glaser, Barney G. and Anselm L. Strauss 1967 The Discovery of Grounded Theory. Chicago: Aldine.

Komarowsky, Mirra 1967 Blue Collar Marriage. New York: Vintage.

Rubin, Lillian Breslow 1976 Worlds of Pain. New York: Basic Books.

Author information

Authors and affiliations.

The Kinsey Institute for Research in Sex, Gender, and Reproduction, Indiana University, 47405, Bloomington, Indiana

Jan E. Trost

Additional information

The author wishes to thank the editors of Qualitative Sociology for valuable comments and suggestions made on an earlier version of this note.

About this article

Trost, J.E. Statistically nonrepresentative stratified sampling: A sampling technique for qualitative studies. Qual Sociol 9 , 54–57 (1986). https://doi.org/10.1007/BF00988249

Indian J Psychol Med, v.42(1); Jan-Feb 2020

Sample Size and its Importance in Research

Chittaranjan Andrade

Clinical Psychopharmacology Unit, Department of Clinical Psychopharmacology and Neurotoxicology, National Institute of Mental Health and Neurosciences, Bengaluru, Karnataka, India

The sample size for a study needs to be estimated at the time the study is proposed; too large a sample is unnecessary and unethical, and too small a sample is unscientific and also unethical. The necessary sample size can be calculated, using statistical software, based on certain assumptions. If no assumptions can be made, then an arbitrary sample size is set for a pilot study. This article discusses sample size and how it relates to matters such as ethics, statistical power, the primary and secondary hypotheses in a study, and findings from larger vs. smaller samples.

Studies are conducted on samples because it is usually impossible to study the entire population. Conclusions drawn from samples are intended to be generalized to the population, and sometimes to the future as well. The sample must therefore be representative of the population. This is best ensured by the use of proper methods of sampling. The sample must also be adequate in size – in fact, no more and no less.

SAMPLE SIZE AND ETHICS

A sample that is larger than necessary will be more representative of the population and will hence provide more accurate results. However, beyond a certain point, the increase in accuracy will be small and hence not worth the effort and expense involved in recruiting the extra patients. Furthermore, an overly large sample would inconvenience more patients than might be necessary for the study objectives; this is unethical. In contrast, a sample that is smaller than necessary would have insufficient statistical power to answer the primary research question, and a statistically nonsignificant result could merely be because of inadequate sample size (Type 2 or false negative error). Thus, a small sample could result in the patients in the study being inconvenienced with no benefit to future patients or to science. This is also unethical.

In this regard, inconvenience to patients refers to the time that they spend in clinical assessments and to the psychological and physical discomfort that they experience in assessments such as interviews, blood sampling, and other procedures.

ESTIMATING SAMPLE SIZE

So how large should a sample be? In hypothesis testing studies, this is mathematically calculated, conventionally, as the sample size necessary to be 80% certain of identifying a statistically significant outcome should the hypothesis be true for the population, with P for statistical significance set at 0.05. Some investigators power their studies for 90% instead of 80%, and some set the threshold for significance at 0.01 rather than 0.05. Both choices are uncommon because the necessary sample size becomes large, and the study becomes more expensive and more difficult to conduct. Many investigators increase the sample size by 10%, or by whatever proportion they can justify, to compensate for expected dropout, incomplete records, biological specimens that do not meet laboratory requirements for testing, and other study-related problems.

Sample size calculations require assumptions about expected means and standard deviations, or event risks, in different groups; or about expected effect sizes. For example, a study may be powered to detect an effect size of 0.5; or a response rate of 60% with drug vs. 40% with placebo.[ 1 ] When no guesstimates or expectations are possible, pilot studies are conducted on a sample that is arbitrary in size but that might be considered reasonable for the field.
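As a rough illustration of such a calculation, the two examples above (60% vs. 40% response, and an effect size of 0.5) can be run through the common normal-approximation formulas for two equal groups, with 80% power and a two-sided significance level of 0.05. This is a sketch under those assumptions rather than a substitute for dedicated software, which may use exact or t-based methods and give slightly different numbers; the function names are made up for the example.

```python
import math
from statistics import NormalDist


def _z_sum(alpha, power):
    """Sum of the normal quantiles for two-sided alpha and the desired power."""
    z = NormalDist()
    return z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)


def n_per_group_proportions(p1, p2, alpha=0.05, power=0.80):
    """Patients per group needed to compare two response rates."""
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil(_z_sum(alpha, power) ** 2 * variance / (p1 - p2) ** 2)


def n_per_group_effect_size(d, alpha=0.05, power=0.80):
    """Patients per group needed to detect a standardized mean difference d."""
    return math.ceil(2 * _z_sum(alpha, power) ** 2 / d**2)


n_prop = n_per_group_proportions(0.60, 0.40)  # drug vs. placebo response
n_mean = n_per_group_effect_size(0.5)         # effect size of 0.5
n_with_dropout = math.ceil(n_prop * 1.10)     # inflate by 10% for dropout

print(n_prop, n_mean, n_with_dropout)
```

Setting `power=0.90` or `alpha=0.01` in the same functions shows how quickly the required sample grows under the stricter choices mentioned earlier.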

The sample size may need to be larger in multicenter studies because of statistical noise (due to variations in patient characteristics, nonspecific treatment characteristics, rating practices, environments, etc. between study centers).[ 2 ] Sample size calculations can be performed manually or using statistical software; online calculators that provide free service can easily be identified by search engines. G*Power is an example of a free, downloadable program for sample size estimation. The manual and tutorial for G*Power can also be downloaded.

PRIMARY AND SECONDARY ANALYSES

The sample size is calculated for the primary hypothesis of the study. What is the difference between the primary hypothesis, primary outcome and primary outcome measure? As an example, the primary outcome may be a reduction in the severity of depression, the primary outcome measure may be the Montgomery-Asberg Depression Rating Scale (MADRS) and the primary hypothesis may be that reduction in MADRS scores is greater with the drug than with placebo. The primary hypothesis is tested in the primary analysis.

Studies almost always have many hypotheses; for example, that the study drug will outperform placebo on measures of depression, suicidality, anxiety, disability and quality of life. The sample size necessary for adequate statistical power to test each of these hypotheses will be different. Because a study can have only one sample size, it can be powered for only one outcome, the primary outcome. Therefore, the study would be either overpowered or underpowered for the other outcomes. These outcomes are therefore called secondary outcomes, and are associated with secondary hypotheses, and are tested in secondary analyses. Secondary analyses are generally considered exploratory because when many hypotheses in a study are each tested at a P < 0.05 level for significance, some may emerge statistically significant by chance (Type 1 or false positive errors).[ 3 ]
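The risk of chance findings across several hypotheses can be made concrete with a short calculation. The sketch below assumes the tests are independent, which is a simplification (outcomes such as depression and anxiety are usually correlated), and uses five tests only because five outcome measures are listed above.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests


fwer_five = familywise_error_rate(5)
print(f"{fwer_five:.3f}")  # about 0.226 when all five null hypotheses are true
```

So even if the drug did nothing at all, there would be roughly a one-in-four or one-in-five chance that at least one of five such tests came out "significant" at P < 0.05.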

INTERPRETING RESULTS

Here is an interesting question. A test of the primary hypothesis yielded a P value of 0.07. Might we conclude that our sample was underpowered for the study and that, had our sample been larger, we would have identified a significant result? No! The reason is that larger samples will more accurately represent the population value, whereas smaller samples could be off the mark in either direction – towards or away from the population value. In this context, readers should also note that no matter how small the P value for an estimate is, the population value of that estimate remains the same.[ 4 ]

On a parting note, it is unlikely that population values will be null. That is, for example, that the response rate to the drug will be exactly the same as that to placebo, or that the correlation between height and age at onset of schizophrenia will be zero. If the sample size is large enough, even such small differences between groups, or trivial correlations, would be detected as being statistically significant. This does not mean that the findings are clinically significant.
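The effect of sample size on statistical significance can be illustrated with a two-proportion z-test. The numbers are hypothetical: the sketch supposes the observed response rates happen to equal 41% with drug and 40% with placebo, and asks what P value that one-point difference would produce at different sample sizes.

```python
import math
from statistics import NormalDist


def two_sided_p(p1, p2, n_per_group):
    """Two-sided P value of a pooled two-proportion z-test, equal group sizes."""
    pooled = (p1 + p2) / 2
    se = math.sqrt(pooled * (1 - pooled) * 2 / n_per_group)
    z = abs(p1 - p2) / se
    return 2 * (1 - NormalDist().cdf(z))


p_small = two_sided_p(0.41, 0.40, 1_000)    # 1,000 patients per group
p_large = two_sided_p(0.41, 0.40, 100_000)  # 100,000 patients per group
print(f"{p_small:.3f} {p_large:.2e}")
```

The same trivial difference is far from significant in the smaller study and highly significant in the larger one, while its clinical importance is unchanged.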

Financial support and sponsorship

Conflicts of interest.

There are no conflicts of interest.

COMMENTS

  1. Big enough? Sampling in qualitative inquiry

    Any senior researcher, or seasoned mentor, has a practiced response to the 'how many' question. Mine tends to start with a reminder about the different philosophical assumptions undergirding qualitative and quantitative research projects (Staller, 2013). As Abrams (2010) points out, this difference leads to "major differences in sampling goals and strategies" (p. 537).

  2. 23 Advantages and Disadvantages of Qualitative Research

    A small sample is not always representative of a larger population demographic, even if there are deep similarities with the individuals involve. ... 11. Qualitative research is not statistically representative. The one disadvantage of qualitative research which is always present is its lack of statistical representation. It is a perspective ...

  3. How representative is that qualitative data anyway?

    While qualitative data is not statistically representative of a population, we still have guidelines that we follow to make sure we are capturing reliable data. For example, we suggest conducting at least three focus groups per unique segment. Qualitative research is fluid by nature, so data gathered from across three groups allows us to see ...

  4. Sample sizes for saturation in qualitative research: A systematic

    Sample sizes in qualitative research are guided by data adequacy, so an effective sample size is less about numbers (n's) and more about the ability of data to provide a rich and nuanced account of the phenomenon studied. Ultimately, determining and justifying sample sizes for qualitative research cannot be detached from the study ...

  5. Chapter 5. Sampling

    The process of selecting people or other units of analysis to represent a larger population. In quantitative research, this representation is taken quite literally, as statistically representative. In qualitative research, in contrast, sample selection is often made based on potential to generate insight about a particular topic or phenomenon.

  6. PDF Qualitative research: its value and applicability

    Qualitative research has a rich tradition in the social sciences. Since the late 19th century, researchers interested ... that is representative of the general population, with the ... The results are not intended to be statistically generalisable, although any theory they generate might well be. 'Qualitative research cannot really

  7. Defining representativeness of study samples in medical and population

    Medical and population health science researchers frequently make ambiguous statements about whether they believe their study sample or results are representative of some (implicit or explicit) target population. This article provides a comprehensive definition of representativeness, with the goal of capturing the different ways in which a study can be representative of a target population.

  8. Planning Qualitative Research: Design and Decision Making for New

    While many books and articles guide various qualitative research methods and analyses, there is currently no concise resource that explains and differentiates among the most common qualitative approaches. We believe novice qualitative researchers, students planning the design of a qualitative study or taking an introductory qualitative research course, and faculty teaching such courses can ...

  9. PDF Sampling Strategies in Qualitative Research

    representative sample does not automatically lead to generalizable findings; between these two issues are potential 'measurement errors', connected to a wide array of practical problems. Relatedly, working with a non-representative sample does not mean you can automatically assume that generalizability is not possible.

  10. Qualitative Study

    Qualitative research is a type of research that explores and provides deeper insights into real-world problems.[1] Instead of collecting numerical data points or intervene or introduce treatments just like in quantitative research, qualitative research helps generate hypotheses as well as further investigate and understand quantitative data. Qualitative research gathers participants ...

  11. How to use and assess qualitative research methods

    Abstract. This paper aims to provide an overview of the use and assessment of qualitative research methods in the health sciences. Qualitative research can be defined as the study of the nature of phenomena and is especially appropriate for answering questions of why something is (not) observed, assessing complex multi-component interventions ...

  12. Qualitative vs. Quantitative Data in Research: The Difference

    Qualitative research delivers a predictive element for continuous data. Disadvantages of qualitative data. It's not a statistically representative form of data collection because it relies upon the experience of the host (who can lose data). It can also require multiple data sessions, which can lead to misleading conclusions.

  13. Qualitative Research: Definition, Methodology, Limitation, Examples

    5. Qualitative research is not statistically representative. Because qualitative research is a perspective-based method of research, the responses given are not measured. Comparisons can be made and this can lead toward duplication, but for the most part, quantitative data is required for circumstances which need statistical representation and ...

  14. Qualitative Analysis with Representative Samples

    Our method allows social scientists to analyze representative samples of open-ended qualitative interviews, and to do so by inductively creating a coding structure that emerges from a close, human reading of a sub-sample of interviews that are then used to predict codes on the larger sample. We see this as an organic extension of traditional ...

  15. PDF PowerPoint Presentation

    2. Generating research hypotheses that can be tested using more quant.tat.ve approaches. 3. Stimulating new .deas and creative concepts. 4. Diagnosing the potential for prob ems with a new program, service, or product. 5. Generating impressions of products, programs, services, institutions, or other objects of interest.

  16. Validity, reliability, and generalizability in qualitative research

    Most qualitative research studies, if not all, are meant to study a specific issue or phenomenon in a certain population or ethnic group, of a focused locality in a particular context, hence generalizability of qualitative research findings is usually not an expected attribute. However, with rising trend of knowledge synthesis from qualitative ...

  17. PDF Sampling,representativeness and generalizability

    6 if research is not carried out on a representative sample, its findings are not generalizable; 7 findings of qualitative researchers are not generalizable. These sentences have become such common-places that they form an undisputed part of most researchers' background assumptions. However, survey researchers do not realize that in social

  18. Sampling considerations in qualitative research

    Because qualitative studies tend to have a small sample size due to the in-depth nature of engagement with each participant, this can have an impact if you want a representative sample. If you randomly select 15 people, you might by chance end up with more women than men, or a younger than desired sample. That is why qualitative studies may use ...

  19. PDF Qualitative Research with Older Diverse Populations

    The appropriate sample size for a study will depend on the research questions being asked and the methods being used Qualitative study samples do not aim to be statistically representative, but rather to generate data to answer a particular research question (e.g., what is the range of experiences of older

  20. (PDF) Generalising from qualitative research: case studies from VET in

    In this context, the value of qualitative research is often questioned because 'you cannot make generalisations from results when the sample is not statistically representative of the whole ...

  21. Statistically nonrepresentative stratified sampling: A sampling

    When doing quantitative studies, we usually need a statistically representative sample. The same is often not the case with qualitative studies where we need a sample with variations along the independent variables. The technique described here is simple to use and guarantees variations. A statistically representative sample, on the other hand, if small, often gives few variations. Thus, one ...

  22. Defining representativeness of study samples in medical and population

    We define a study sample to be representative of a well defined target population if the results estimated in that sample are generalisable to the target population. We consider two ways in which study results can generalise to the target population: in estimate and in interpretation. Box 1 lists a summary of key terms used in this article and ...

  23. Sample Size and its Importance in Research

    The sample size for a study needs to be estimated at the time the study is proposed; too large a sample is unnecessary and unethical, and too small a sample is unscientific and also unethical. The necessary sample size can be calculated, using statistical software, based on certain assumptions. If no assumptions can be made, then an arbitrary ...