Content Analysis

Content analysis is a research tool used to determine the presence of certain words, themes, or concepts within some given qualitative data (i.e., text). Using content analysis, researchers can quantify and analyze the presence, meanings, and relationships of such words, themes, or concepts. As an example, researchers can evaluate language used within a news article to search for bias or partiality. Researchers can then make inferences about the messages within the texts, the writer(s), the audience, and even the culture and time surrounding the text.

Description

Data can come from interviews, open-ended questions, field research notes, conversations, or virtually any occurrence of communicative language (such as books, essays, discussions, newspaper headlines, speeches, media, historical documents). A single study may analyze various forms of text. To analyze the text using content analysis, the text must be coded, or broken down, into manageable units for analysis (i.e., “codes”). Once the text is coded, the codes can then be grouped into “code categories” to summarize the data even further.

Three different definitions of content analysis are provided below.

Definition 1: “Any technique for making inferences by systematically and objectively identifying special characteristics of messages.” (from Holsti, 1968)

Definition 2: “An interpretive and naturalistic approach. It is both observational and narrative in nature and relies less on the experimental elements normally associated with scientific research (reliability, validity, and generalizability).” (from Ethnography, Observational Research, and Narrative Inquiry, 1994-2012)

Definition 3: “A research technique for the objective, systematic and quantitative description of the manifest content of communication.” (from Berelson, 1952)

Uses of Content Analysis

Identify the intentions, focus or communication trends of an individual, group or institution

Describe attitudinal and behavioral responses to communications

Determine the psychological or emotional state of persons or groups

Reveal international differences in communication content

Reveal patterns in communication content

Pre-test and improve an intervention or survey prior to launch

Analyze focus group interviews and open-ended questions to complement quantitative data

Types of Content Analysis

There are two general types of content analysis: conceptual analysis and relational analysis. Conceptual analysis determines the existence and frequency of concepts in a text. Relational analysis develops the conceptual analysis further by examining the relationships among concepts in a text. Each type of analysis may lead to different results, conclusions, interpretations and meanings.

Conceptual Analysis

Typically people think of conceptual analysis when they think of content analysis. In conceptual analysis, a concept is chosen for examination and the analysis involves quantifying and counting its presence. The main goal is to examine the occurrence of selected terms in the data. Terms may be explicit or implicit. Explicit terms are easy to identify. Coding of implicit terms is more complicated: you need to decide the level of implication and base judgments on subjectivity (an issue for reliability and validity). Therefore, coding of implicit terms involves using a dictionary or contextual translation rules or both.

To begin a conceptual content analysis, first identify the research question and choose a sample or samples for analysis. Next, the text must be coded into manageable content categories. This is basically a process of selective reduction. By reducing the text to categories, the researcher can focus on and code for specific words or patterns that inform the research question.

General steps for conducting a conceptual content analysis:

1. Decide the level of analysis: word, word sense, phrase, sentence, themes

2. Decide how many concepts to code for: develop a pre-defined or interactive set of categories or concepts. Decide either: A. to allow flexibility to add categories through the coding process, or B. to stick with the pre-defined set of categories.

Option A allows for the introduction and analysis of new and important material that could have significant implications to one’s research question.

Option B allows the researcher to stay focused and examine the data for specific concepts.

3. Decide whether to code for existence or frequency of a concept. The decision changes the coding process.

When coding for the existence of a concept, the researcher would count a concept only once, no matter how many times it appeared in the data.

When coding for the frequency of a concept, the researcher would count the number of times a concept appears in a text.

4. Decide on how you will distinguish among concepts:

Should words be coded exactly as they appear, or coded as the same concept when they appear in different forms? For example, “dangerous” vs. “dangerousness”. The point here is to create coding rules so that these word segments are transparently categorized in a logical fashion. The rules could make all of these word segments fall into the same category, or the rules can be formulated so that the researcher can distinguish these word segments into separate codes.

What level of implication is to be allowed? Words that imply the concept or words that explicitly state the concept? For example, “dangerous” vs. “the person is scary” vs. “that person could cause harm to me”. These word segments may not merit separate categories, due to the implicit meaning of “dangerous”.

5. Develop rules for coding your texts. After the decisions in steps 1-4 are made, a researcher can begin developing rules for translating text into codes. This will keep the coding process organized and consistent. The researcher can code for exactly what they want to code. Validity of the coding process is ensured when the researcher is consistent and coherent in their codes, meaning that they follow their translation rules. In content analysis, obeying the translation rules is equivalent to validity.

6. Decide what to do with irrelevant information: should this be ignored (e.g. common English words like “the” and “and”), or used to reexamine the coding scheme in the case that it would add to the outcome of coding?

7. Code the text: This can be done by hand or by using software. By using software, researchers can input categories and have coding done automatically, quickly and efficiently, by the software program. When coding is done by hand, a researcher can recognize errors far more easily (e.g. typos, misspelling). If using computer coding, text could be cleaned of errors to include all available data. This decision of hand vs. computer coding is most relevant for implicit information where category preparation is essential for accurate coding.

8. Analyze your results: Draw conclusions and generalizations where possible. Determine what to do with irrelevant, unwanted, or unused text: reexamine, ignore, or reassess the coding scheme. Interpret results carefully as conceptual content analysis can only quantify the information. Typically, general trends and patterns can be identified.
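The difference between coding for existence and coding for frequency (step 3), together with a small set of translation rules (steps 4 and 5), can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `CODING_RULES` dictionary, the concept codes `DANGER` and `SAFETY`, and the sample sentence are all hypothetical, and a real study would use a far richer codebook. Words that are not in the rules, including common words such as “the”, are simply ignored (step 6).

```python
import re
from collections import Counter

# Hypothetical translation rules: each word form maps to one concept code.
# "dangerous" and "dangerousness" are deliberately merged into one code.
CODING_RULES = {
    "dangerous": "DANGER",
    "dangerousness": "DANGER",
    "scary": "DANGER",
    "harm": "DANGER",
    "safe": "SAFETY",
    "safety": "SAFETY",
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def code_frequency(text):
    """Frequency coding: count every occurrence of each concept."""
    return Counter(CODING_RULES[t] for t in tokenize(text) if t in CODING_RULES)


def code_existence(text):
    """Existence coding: record each concept at most once per text."""
    return set(code_frequency(text))


sample = "The person is scary. A dangerous person could cause harm, so safety matters."
print(code_frequency(sample))  # DANGER is counted three times, SAFETY once
print(code_existence(sample))  # each concept appears only once
```

Merging or separating word forms is a decision made entirely in the rules table, which is one way to keep the coding transparent and repeatable.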

Relational Analysis

Relational analysis begins like conceptual analysis, where a concept is chosen for examination. However, the analysis involves exploring the relationships between concepts. Individual concepts are viewed as having no inherent meaning and rather the meaning is a product of the relationships among concepts.

To begin a relational content analysis, first identify a research question and choose a sample or samples for analysis. The research question must be focused so the concept types are not open to interpretation and can be summarized. Next, select text for analysis. Select it carefully, balancing two needs: enough information for a thorough analysis, so that results are not limited, but not so much that the coding process becomes too arduous to yield meaningful and worthwhile results.

There are three subcategories of relational analysis to choose from prior to going on to the general steps.

Affect extraction: an emotional evaluation of concepts explicit in a text. A challenge to this method is that emotions can vary across time, populations, and space. However, it could be effective at capturing the emotional and psychological state of the speaker or writer of the text.

Proximity analysis: an evaluation of the co-occurrence of explicit concepts in the text. Text is defined as a string of words called a “window” that is scanned for the co-occurrence of concepts. The result is the creation of a “concept matrix”, or a group of interrelated co-occurring concepts that would suggest an overall meaning.

Cognitive mapping: a visualization technique for either affect extraction or proximity analysis. Cognitive mapping attempts to create a model of the overall meaning of the text such as a graphic map that represents the relationships between concepts.
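Proximity analysis can be sketched as a sliding window over already-coded text. The example below is a minimal illustration, not a standard implementation: the function name `concept_matrix`, the window size, and the concept codes are assumptions made for this sketch. The input is the text in reading order, with each word replaced by its concept code or by `None` if it was not coded.

```python
from collections import Counter
from itertools import combinations


def concept_matrix(codes, window=4):
    """Count how often pairs of distinct concepts fall inside the same window.

    `codes` holds the coded text in order; uncoded words are None.
    Windows overlap, so two adjacent concepts can be counted more than once.
    """
    pairs = Counter()
    for start in range(max(1, len(codes) - window + 1)):
        present = sorted({c for c in codes[start:start + window] if c is not None})
        for a, b in combinations(present, 2):
            pairs[(a, b)] += 1
    return pairs


# Hypothetical coded text: three concepts scattered among uncoded words.
coded = ["RISK", None, "FEAR", None, None, "RISK", "TRUST", None]
matrix = concept_matrix(coded, window=3)
for pair, count in sorted(matrix.items()):
    print(pair, count)
```

The resulting pair counts are one simple form of the “concept matrix”; pairs that never share a window, such as FEAR and TRUST here, do not appear at all.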

General steps for conducting a relational content analysis:

1. Determine the type of analysis: Once the sample has been selected, the researcher needs to determine what types of relationships to examine and the level of analysis: word, word sense, phrase, sentence, themes.

2. Reduce the text to categories and code for words or patterns. A researcher can code for existence of meanings or words.

3. Explore the relationship between concepts: once the words are coded, the text can be analyzed for the following:

Strength of relationship: degree to which two or more concepts are related.

Sign of relationship: are concepts positively or negatively related to each other?

Direction of relationship: the types of relationship that categories exhibit. For example, “X implies Y” or “X occurs before Y” or “if X then Y” or if X is the primary motivator of Y.

4. Code the relationships: a difference between conceptual and relational analysis is that the statements or relationships between concepts are coded.

5. Perform statistical analyses: explore differences or look for relationships among the identified variables during coding.

6. Map out representations: such as decision mapping and mental models.

Reliability and Validity

Reliability: Because of the human nature of researchers, coding errors can never be eliminated but only minimized. Generally, 80% agreement is an acceptable margin for reliability. Three criteria comprise the reliability of a content analysis:

Stability: the tendency for coders to consistently re-code the same data in the same way over a period of time.

Reproducibility: the tendency for a group of coders to classify category membership in the same way.

Accuracy: extent to which the classification of text corresponds to a standard or norm statistically.
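Reproducibility is often checked by having two coders code the same units and comparing their decisions. The sketch below computes simple percent agreement, which is what an 80% threshold would be compared against. It also computes Cohen's kappa, a chance-corrected statistic that the text above does not mention and that is added here only as a common companion measure. The coder lists are invented data for illustration.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of units that both coders placed in the same category."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must code the same non-empty set of units")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Invented example: two coders, ten text segments, two categories.
coder_a = ["DANGER", "DANGER", "SAFETY", "SAFETY", "DANGER",
           "SAFETY", "DANGER", "SAFETY", "DANGER", "SAFETY"]
coder_b = ["DANGER", "DANGER", "SAFETY", "DANGER", "DANGER",
           "SAFETY", "DANGER", "SAFETY", "SAFETY", "SAFETY"]

print(percent_agreement(coder_a, coder_b))
print(round(cohens_kappa(coder_a, coder_b), 2))
```

In this invented example the coders agree on 8 of 10 segments, but kappa is noticeably lower because half of that agreement would be expected by chance with two equally used categories.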

Validity: Three criteria comprise the validity of a content analysis:

Closeness of categories: this can be achieved by utilizing multiple classifiers to arrive at an agreed upon definition of each specific category. Using multiple classifiers, a concept category that may be an explicit variable can be broadened to include synonyms or implicit variables.

Conclusions: What level of implication is allowable? Do conclusions correctly follow the data? Are results explainable by other phenomena? This becomes especially problematic when using computer software for analysis and distinguishing between synonyms. For example, the word “mine” variously denotes a personal pronoun, an explosive device, and a deep hole in the ground from which ore is extracted. Software can obtain an accurate count of that word’s occurrence and frequency, but may not be able to produce an accurate accounting of the meaning inherent in each particular usage. This problem could throw off one’s results and make any conclusion invalid.

Generalizability of the results to a theory: dependent on the clear definitions of concept categories, how they are determined and how reliable they are at measuring the idea one is seeking to measure. Generalizability parallels reliability as much of it depends on the three criteria for reliability.
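The “mine” problem described above is easy to reproduce. The following sketch uses three invented sentences and a plain word count; the `senses` list stands in for labels a human coder might assign and is purely hypothetical. It illustrates that the count is accurate while the meaning is lost.

```python
import re
from collections import Counter

# Invented sentences, each using "mine" in a different sense.
texts = [
    "That book is mine.",
    "The coal mine closed last year.",
    "The soldiers cleared a mine from the road.",
]


def count_word(word, documents):
    """Count whole-word, case-insensitive occurrences across documents."""
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    return sum(len(pattern.findall(doc)) for doc in documents)


total = count_word("mine", texts)
print(total)  # one number, although three different meanings are involved

# Hypothetical labels from a human coder, one per sentence above.
senses = ["pronoun", "excavation", "explosive"]
print(Counter(senses))
```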

Advantages of Content Analysis

Directly examines communication using text

Allows for both qualitative and quantitative analysis

Provides valuable historical and cultural insights over time

Allows a closeness to data

Coded form of the text can be statistically analyzed

Unobtrusive means of analyzing interactions

Provides insight into complex models of human thought and language use

When done well, is considered a relatively “exact” research method

Content analysis is a readily understood and inexpensive research method

A more powerful tool when combined with other research methods such as interviews, observation, and use of archival records. It is very useful for analyzing historical material, especially for documenting trends over time.

Disadvantages of Content Analysis

Can be extremely time consuming

Is subject to increased error, particularly when relational analysis is used to attain a higher level of interpretation

Is often devoid of theoretical base, or attempts too liberally to draw meaningful inferences about the relationships and impacts implied in a study

Is inherently reductive, particularly when dealing with complex texts

Tends too often to simply consist of word counts

Often disregards the context that produced the text, as well as the state of things after the text is produced

Can be difficult to automate or computerize

Textbooks & Chapters  

Berelson, Bernard. Content Analysis in Communication Research. New York: Free Press, 1952.

Busha, Charles H. and Stephen P. Harter. Research Methods in Librarianship: Techniques and Interpretation. New York: Academic Press, 1980.

de Sola Pool, Ithiel. Trends in Content Analysis. Urbana: University of Illinois Press, 1959.

Krippendorff, Klaus. Content Analysis: An Introduction to its Methodology. Beverly Hills: Sage Publications, 1980.

Fielding, NG & Lee, RM. Using Computers in Qualitative Research. SAGE Publications, 1991. (Refer to Chapter by Seidel, J. ‘Method and Madness in the Application of Computer Technology to Qualitative Data Analysis’.)

Methodological Articles  

Hsieh HF & Shannon SE. (2005). Three Approaches to Qualitative Content Analysis. Qualitative Health Research. 15(9): 1277-1288.

Elo S, Kaarianinen M, Kanste O, Polkki R, Utriainen K, & Kyngas H. (2014). Qualitative Content Analysis: A focus on trustworthiness. Sage Open. 4:1-10.

Application Articles  

Abroms LC, Padmanabhan N, Thaweethai L, & Phillips T. (2011). iPhone Apps for Smoking Cessation: A content analysis. American Journal of Preventive Medicine. 40(3):279-285.

Ullstrom S, Sachs MA, Hansson J, Ovretveit J, & Brommels M. (2014). Suffering in Silence: a qualitative study of second victims of adverse events. British Medical Journal, Quality & Safety Issue. 23:325-331.

Owen P. (2012). Portrayals of Schizophrenia by Entertainment Media: A Content Analysis of Contemporary Movies. Psychiatric Services. 63:655-659.

Choosing whether to conduct a content analysis by hand or by using computer software can be difficult. Refer to ‘Method and Madness in the Application of Computer Technology to Qualitative Data Analysis’ listed above in “Textbooks and Chapters” for a discussion of the issue.

QSR NVivo:  http://www.qsrinternational.com/products.aspx

Atlas.ti:  http://www.atlasti.com/webinars.html

R- RQDA package:  http://rqda.r-forge.r-project.org/

Rolly Constable, Marla Cowell, Sarita Zornek Crawford, David Golden, Jake Hartvigsen, Kathryn Morgan, Anne Mudgett, Kris Parrish, Laura Thomas, Erika Yolanda Thompson, Rosie Turner, and Mike Palmquist. (1994-2012). Ethnography, Observational Research, and Narrative Inquiry. Writing@CSU. Colorado State University. Available at: https://writing.colostate.edu/guides/guide.cfm?guideid=63 .

This introduction to content analysis by Michael Palmquist is the main resource on content analysis on the Web. It is comprehensive, yet succinct. It includes examples and an annotated bibliography. The information contained in the narrative above draws heavily from and summarizes Michael Palmquist’s excellent resource on content analysis but was streamlined for doctoral students and junior researchers in epidemiology.

At Columbia University Mailman School of Public Health, more detailed training is available through the Department of Sociomedical Sciences: P8785 Qualitative Research Methods.

Content Analysis – Methods, Types and Examples

Definition:

Content analysis is a research method used to analyze and interpret the characteristics of various forms of communication, such as text, images, or audio. It involves systematically analyzing the content of these materials, identifying patterns, themes, and other relevant features, and drawing inferences or conclusions based on the findings.

Content analysis can be used to study a wide range of topics, including media coverage of social issues, political speeches, advertising messages, and online discussions, among others. It is often used in qualitative research and can be combined with other methods to provide a more comprehensive understanding of a particular phenomenon.

Types of Content Analysis

There are generally two types of content analysis:

Quantitative Content Analysis

This type of content analysis involves the systematic and objective counting and categorization of the content of a particular form of communication, such as text or video. The data obtained is then subjected to statistical analysis to identify patterns, trends, and relationships between different variables. Quantitative content analysis is often used to study media content, advertising, and political speeches.

Qualitative Content Analysis

This type of content analysis is concerned with the interpretation and understanding of the meaning and context of the content. It involves the systematic analysis of the content to identify themes, patterns, and other relevant features, and to interpret the underlying meanings and implications of these features. Qualitative content analysis is often used to study interviews, focus groups, and other forms of qualitative data, where the researcher is interested in understanding the subjective experiences and perceptions of the participants.

Methods of Content Analysis

There are several methods of content analysis, including:

Conceptual Analysis

This method involves analyzing the meanings of key concepts used in the content being analyzed. The researcher identifies key concepts and analyzes how they are used, defining them and categorizing them into broader themes.

Content Analysis by Frequency

This method involves counting and categorizing the frequency of specific words, phrases, or themes that appear in the content being analyzed. The researcher identifies relevant keywords or phrases and systematically counts their frequency.

Comparative Analysis

This method involves comparing the content of two or more sources to identify similarities, differences, and patterns. The researcher selects relevant sources, identifies key themes or concepts, and compares how they are represented in each source.
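One simple way to compare how a theme is represented in two sources of different lengths is to normalize keyword counts, for example per 1,000 words. The sketch below assumes this normalization choice; the function name `rate_per_1000`, the keyword set, and both sample texts are hypothetical.

```python
import re


def rate_per_1000(text, keywords):
    """Occurrences of any keyword per 1,000 word tokens in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    hits = sum(1 for t in tokens if t in keywords)
    return 1000 * hits / len(tokens) if tokens else 0.0


# Hypothetical keyword set and two invented source texts.
keywords = {"vaccine", "vaccines"}
source_a = "Vaccines are safe and vaccines save lives"
source_b = "The new policy on vaccines was debated at length today"

rate_a = rate_per_1000(source_a, keywords)
rate_b = rate_per_1000(source_b, keywords)
print(round(rate_a, 1), round(rate_b, 1))
```

Raw counts alone would favor longer sources, which is why a rate is used here; other normalizations, such as per document or per paragraph, may suit a given study better.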

Discourse Analysis

This method involves analyzing the structure and language of the content being analyzed to identify how the content constructs and represents social reality. The researcher analyzes the language used and the underlying assumptions, beliefs, and values reflected in the content.

Narrative Analysis

This method involves analyzing the content as a narrative, identifying the plot, characters, and themes, and analyzing how they relate to the broader social context. The researcher identifies the underlying messages conveyed by the narrative and their implications for the broader social context.

Content Analysis Conducting Guide

Here is a basic guide to conducting a content analysis:

  • Define your research question or objective: Before starting your content analysis, you need to define your research question or objective clearly. This will help you to identify the content you need to analyze and the type of analysis you need to conduct.
  • Select your sample: Select a representative sample of the content you want to analyze. This may involve selecting a random sample, a purposive sample, or a convenience sample, depending on the research question and the availability of the content.
  • Develop a coding scheme: Develop a coding scheme or a set of categories to use for coding the content. The coding scheme should be based on your research question or objective and should be reliable, valid, and comprehensive.
  • Train coders: Train coders to use the coding scheme and ensure that they have a clear understanding of the coding categories and procedures. You may also need to establish inter-coder reliability to ensure that different coders are coding the content consistently.
  • Code the content: Code the content using the coding scheme. This may involve manually coding the content, using software, or a combination of both.
  • Analyze the data: Once the content is coded, analyze the data using appropriate statistical or qualitative methods, depending on the research question and the type of data.
  • Interpret the results: Interpret the results of the analysis in the context of your research question or objective. Draw conclusions based on the findings and relate them to the broader literature on the topic.
  • Report your findings: Report your findings in a clear and concise manner, including the research question, methodology, results, and conclusions. Provide details about the coding scheme, inter-coder reliability, and any limitations of the study.
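For the sampling step in the guide above, a reproducible simple random sample can be drawn with Python's standard library. This is a sketch under assumptions: the article identifiers, the sample size of 20, and the seed are invented, and purposive or convenience sampling would not be done this way.

```python
import random


def draw_sample(population, k, seed=0):
    """Draw k items without replacement, reproducibly for a given seed."""
    rng = random.Random(seed)
    return rng.sample(population, k)


# Hypothetical sampling frame of 200 articles.
article_ids = [f"article_{i:03d}" for i in range(1, 201)]
sample_ids = draw_sample(article_ids, k=20, seed=42)
print(len(sample_ids))
print(sample_ids[:3])
```

Recording the seed alongside the coding scheme lets another researcher reconstruct exactly which items were analyzed.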

Applications of Content Analysis

Content analysis has numerous applications across different fields, including:

  • Media Research: Content analysis is commonly used in media research to examine the representation of different groups, such as race, gender, and sexual orientation, in media content. It can also be used to study media framing, media bias, and media effects.
  • Political Communication: Content analysis can be used to study political communication, including political speeches, debates, and news coverage of political events. It can also be used to study political advertising and the impact of political communication on public opinion and voting behavior.
  • Marketing Research: Content analysis can be used to study advertising messages, consumer reviews, and social media posts related to products or services. It can provide insights into consumer preferences, attitudes, and behaviors.
  • Health Communication: Content analysis can be used to study health communication, including the representation of health issues in the media, the effectiveness of health campaigns, and the impact of health messages on behavior.
  • Education Research: Content analysis can be used to study educational materials, including textbooks, curricula, and instructional materials. It can provide insights into the representation of different topics, perspectives, and values.
  • Social Science Research: Content analysis can be used in a wide range of social science research, including studies of social media, online communities, and other forms of digital communication. It can also be used to study interviews, focus groups, and other qualitative data sources.

Examples of Content Analysis

Here are some examples of content analysis:

  • Media Representation of Race and Gender: A content analysis could be conducted to examine the representation of different races and genders in popular media, such as movies, TV shows, and news coverage.
  • Political Campaign Ads: A content analysis could be conducted to study political campaign ads and the themes and messages used by candidates.
  • Social Media Posts: A content analysis could be conducted to study social media posts related to a particular topic, such as the COVID-19 pandemic, to examine the attitudes and beliefs of social media users.
  • Instructional Materials: A content analysis could be conducted to study the representation of different topics and perspectives in educational materials, such as textbooks and curricula.
  • Product Reviews: A content analysis could be conducted to study product reviews on e-commerce websites, such as Amazon, to identify common themes and issues mentioned by consumers.
  • News Coverage of Health Issues: A content analysis could be conducted to study news coverage of health issues, such as vaccine hesitancy, to identify common themes and perspectives.
  • Online Communities: A content analysis could be conducted to study online communities, such as discussion forums or social media groups, to understand the language, attitudes, and beliefs of the community members.

Purpose of Content Analysis

The purpose of content analysis is to systematically analyze and interpret the content of various forms of communication, such as written, oral, or visual, to identify patterns, themes, and meanings. Content analysis is used to study communication in a wide range of fields, including media studies, political science, psychology, education, sociology, and marketing research. The primary goals of content analysis include:

  • Describing and summarizing communication: Content analysis can be used to describe and summarize the content of communication, such as the themes, topics, and messages conveyed in media content, political speeches, or social media posts.
  • Identifying patterns and trends: Content analysis can be used to identify patterns and trends in communication, such as changes over time, differences between groups, or common themes or motifs.
  • Exploring meanings and interpretations: Content analysis can be used to explore the meanings and interpretations of communication, such as the underlying values, beliefs, and assumptions that shape the content.
  • Testing hypotheses and theories: Content analysis can be used to test hypotheses and theories about communication, such as the effects of media on attitudes and behaviors or the framing of political issues in the media.

When to use Content Analysis

Content analysis is a useful method when you want to analyze and interpret the content of various forms of communication, such as written, oral, or visual. Here are some specific situations where content analysis might be appropriate:

  • When you want to study media content: Content analysis is commonly used in media studies to analyze the content of TV shows, movies, news coverage, and other forms of media.
  • When you want to study political communication: Content analysis can be used to study political speeches, debates, news coverage, and advertising.
  • When you want to study consumer attitudes and behaviors: Content analysis can be used to analyze product reviews, social media posts, and other forms of consumer feedback.
  • When you want to study educational materials: Content analysis can be used to analyze textbooks, instructional materials, and curricula.
  • When you want to study online communities: Content analysis can be used to analyze discussion forums, social media groups, and other forms of online communication.
  • When you want to test hypotheses and theories: Content analysis can be used to test hypotheses and theories about communication, such as the framing of political issues in the media or the effects of media on attitudes and behaviors.

Characteristics of Content Analysis

Content analysis has several key characteristics that make it a useful research method. These include:

  • Objectivity: Content analysis aims to be an objective method of research, meaning that the researcher does not introduce their own biases or interpretations into the analysis. This is achieved by using standardized and systematic coding procedures.
  • Systematic: Content analysis involves the use of a systematic approach to analyze and interpret the content of communication. This involves defining the research question, selecting the sample of content to analyze, developing a coding scheme, and analyzing the data.
  • Quantitative: Content analysis often involves counting and measuring the occurrence of specific themes or topics in the content, making it a quantitative research method. This allows for statistical analysis and generalization of findings.
  • Contextual: Content analysis considers the context in which the communication takes place, such as the time period, the audience, and the purpose of the communication.
  • Iterative: Content analysis is an iterative process, meaning that the researcher may refine the coding scheme and analysis as they analyze the data, to ensure that the findings are valid and reliable.
  • Reliability and validity: Content analysis aims to be a reliable and valid method of research, meaning that the findings are consistent and accurate. This is achieved through inter-coder reliability tests and other measures to ensure the quality of the data and analysis.

Advantages of Content Analysis

There are several advantages to using content analysis as a research method, including:

  • Objective and systematic: Content analysis aims to be an objective and systematic method of research, which reduces the likelihood of bias and subjectivity in the analysis.
  • Large sample size: Content analysis allows for the analysis of a large sample of data, which increases the statistical power of the analysis and the generalizability of the findings.
  • Non-intrusive: Content analysis does not require the researcher to interact with the participants or disrupt their natural behavior, making it a non-intrusive research method.
  • Accessible data: Content analysis can be used to analyze a wide range of data types, including written, oral, and visual communication, making it accessible to researchers across different fields.
  • Versatile: Content analysis can be used to study communication in a wide range of contexts and fields, including media studies, political science, psychology, education, sociology, and marketing research.
  • Cost-effective: Content analysis is a cost-effective research method, as it does not require expensive equipment or participant incentives.

Limitations of Content Analysis

While content analysis has many advantages, there are also some limitations to consider, including:

  • Limited contextual information: Content analysis is focused on the content of communication, which means that contextual information may be limited. This can make it difficult to fully understand the meaning behind the communication.
  • Limited ability to capture nonverbal communication: Content analysis is limited to analyzing the content of communication that can be captured in written or recorded form. It may miss out on nonverbal communication, such as body language or tone of voice.
  • Subjectivity in coding: While content analysis aims to be objective, there may be subjectivity in the coding process. Different coders may interpret the content differently, which can lead to inconsistent results.
  • Limited ability to establish causality: Content analysis describes what is present in the content. It can identify associations between variables, but on its own it cannot establish that one variable causes another.
  • Limited generalizability: Content analysis is limited to the data that is analyzed, which means that the findings may not be generalizable to other contexts or populations.
  • Time-consuming: Content analysis can be a time-consuming research method, especially when analyzing a large sample of data. This can be a disadvantage for researchers who need to complete their research in a short amount of time.

About the author

Muhammad Hassan

Researcher, Academic Writer, Web developer

Content Analysis | A Step-by-Step Guide with Examples

Published on 5 May 2022 by Amy Luo. Revised on 5 December 2022.

Content analysis is a research method used to identify patterns in recorded communication. To conduct content analysis, you systematically collect data from a set of texts, which can be written, oral, or visual:

  • Books, newspapers, and magazines
  • Speeches and interviews
  • Web content and social media posts
  • Photographs and films

Content analysis can be both quantitative (focused on counting and measuring) and qualitative (focused on interpreting and understanding). In both types, you categorise or ‘code’ words, themes, and concepts within the texts and then analyse the results.

Table of contents

  • What is content analysis used for?
  • Advantages of content analysis
  • Disadvantages of content analysis
  • How to conduct content analysis

What is content analysis used for?

Researchers use content analysis to find out about the purposes, messages, and effects of communication content. They can also make inferences about the producers and audience of the texts they analyse.

Content analysis can be used to quantify the occurrence of certain words, phrases, subjects, or concepts in a set of historical or contemporary texts.
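As a rough illustration of this kind of quantification, the sketch below counts how often a few chosen terms occur across a set of texts. The function name `term_frequencies`, the mini-corpus, and the term list are all invented for this example. Matching is case-insensitive and limited to whole words, which is only one of several reasonable choices.

```python
import re
from collections import Counter


def term_frequencies(texts, terms):
    """Count case-insensitive whole-word occurrences of each term across texts."""
    counts = Counter({term: 0 for term in terms})
    for text in texts:
        lowered = text.lower()
        for term in terms:
            # \b anchors the match to word boundaries, so "strike" does not match "strikes".
            pattern = r"\b" + re.escape(term.lower()) + r"\b"
            counts[term] += len(re.findall(pattern, lowered))
    return counts


# Hypothetical mini-corpus, invented purely for illustration.
texts = [
    "The strike caused chaos. Union leaders defended the strike.",
    "Workers say the walkout was a last resort after talks failed.",
]
freqs = term_frequencies(texts, ["strike", "walkout", "chaos"])
for term, count in freqs.items():
    print(term, count)
```

Counts like these only describe frequency. Whether "strike" and "walkout" should be treated as the same concept is a coding decision that still has to be made by the researcher.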

In addition, content analysis can be used to make qualitative inferences by analysing the meaning and semantic relationship of words and concepts.

Because content analysis can be applied to a broad range of texts, it is used in a variety of fields, including marketing, media studies, anthropology, cognitive science, psychology, and many social science disciplines. It has various possible goals:

  • Finding correlations and patterns in how concepts are communicated
  • Understanding the intentions of an individual, group, or institution
  • Identifying propaganda and bias in communication
  • Revealing differences in communication in different contexts
  • Analysing the consequences of communication content, such as the flow of information or audience responses

Advantages of content analysis

  • Unobtrusive data collection

You can analyse communication and social interaction without the direct involvement of participants, so your presence as a researcher doesn’t influence the results.

  • Transparent and replicable

When done well, content analysis follows a systematic procedure that can easily be replicated by other researchers, yielding results with high reliability.

  • Highly flexible

You can conduct content analysis at any time, in any location, and at low cost. All you need is access to the appropriate sources.

Disadvantages of content analysis

  • Reductive

Focusing on words or phrases in isolation can sometimes be overly reductive, disregarding context, nuance, and ambiguous meanings.

  • Subjective

Content analysis almost always involves some level of subjective interpretation, which can affect the reliability and validity of the results and conclusions.

  • Time intensive

Manually coding large volumes of text is extremely time-consuming, and it can be difficult to automate effectively.

How to conduct content analysis

If you want to use content analysis in your research, you need to start with a clear, direct research question.

Next, you follow these five steps.

Step 1: Select the content you will analyse

Based on your research question, choose the texts that you will analyse. You need to decide:

  • The medium (e.g., newspapers, speeches, or websites) and genre (e.g., opinion pieces, political campaign speeches, or marketing copy)
  • The criteria for inclusion (e.g., newspaper articles that mention a particular event, speeches by a certain politician, or websites selling a specific type of product)
  • The parameters in terms of date range, location, etc.

If there are only a small number of texts that meet your criteria, you might analyse all of them. If there is a large volume of texts, you can select a sample.
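One simple way to select a sample is simple random sampling with a fixed seed, so that the selection can be reproduced later. The sketch below assumes the texts are identified by IDs. The names `draw_sample` and `article_ids`, the seed value, and the sample size are hypothetical, and other sampling strategies (for example stratified by year or outlet) may suit a given research question better.

```python
import random


def draw_sample(texts, k, seed=42):
    """Return k texts chosen at random without replacement.

    A dedicated Random instance with a fixed seed keeps the draw reproducible.
    If k is at least the number of texts, every text is returned.
    """
    if k >= len(texts):
        return list(texts)
    rng = random.Random(seed)
    return rng.sample(texts, k)


# Hypothetical population of 200 article IDs that met the inclusion criteria.
article_ids = [f"article_{i:03d}" for i in range(1, 201)]
sample = draw_sample(article_ids, 20)
print(len(sample), "articles selected")
```

Recording the seed and the sampling rule alongside the inclusion criteria makes the selection step as transparent as the coding itself.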

Step 2: Define the units and categories of analysis

Next, you need to determine the level at which you will analyse your chosen texts. This means defining:

  • The unit(s) of meaning that will be coded. For example, are you going to record the frequency of individual words and phrases, the characteristics of people who produced or appear in the texts, the presence and positioning of images, or the treatment of themes and concepts?
  • The set of categories that you will use for coding. Categories can be objective characteristics (e.g., aged 30–40, lawyer, parent) or more conceptual (e.g., trustworthy, corrupt, conservative, family-oriented).

Step 3: Develop a set of rules for coding

Coding involves organising the units of meaning into the previously defined categories. Especially with more conceptual categories, it’s important to clearly define the rules for what will and won’t be included to ensure that all texts are coded consistently.

Coding rules are especially important if multiple researchers are involved, but even if you’re coding all of the text by yourself, recording the rules makes your method more transparent and reliable.
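When two researchers code the same units, one common way to check how consistently the rules are being applied is Cohen's kappa, which compares observed agreement with the agreement expected by chance. The article does not prescribe this statistic, so it is offered here only as one possible check. The function name and the example codings are invented, using the "trustworthy" and "corrupt" categories mentioned in Step 2.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who assigned one category to each unit."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same non-empty set of units.")
    n = len(coder_a)
    # Proportion of units on which the two coders chose the same category.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement from each coder's marginal category frequencies.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codings of six units by two researchers.
coder_a = ["trustworthy", "trustworthy", "corrupt", "trustworthy", "corrupt", "corrupt"]
coder_b = ["trustworthy", "corrupt", "corrupt", "trustworthy", "corrupt", "trustworthy"]
kappa = cohens_kappa(coder_a, coder_b)
print(round(kappa, 3))
```

A low value suggests that the coding rules need clearer inclusion and exclusion criteria before the full set of texts is coded.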

Step 4: Code the text according to the rules

You go through each text and record all relevant data in the appropriate categories. This can be done manually or aided with computer programs, such as QSR NVivo, Atlas.ti, and Diction, which can help speed up the process of counting and categorising words and phrases.
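To make the idea of rule-based coding concrete, here is a minimal sketch in which each category is defined by a list of indicator words or phrases, and a unit of text receives every category whose indicators it contains. This is not how the programs named above work internally. The codebook contents, the name `code_unit`, and the example sentences are all assumptions made for illustration.

```python
import re

# Hypothetical codebook: category -> indicator words or phrases (the "rules").
CODEBOOK = {
    "trustworthy": ["honest", "reliable", "kept his promise"],
    "corrupt": ["bribe", "fraud", "kickback"],
}


def code_unit(unit, codebook):
    """Return the sorted list of categories whose indicators appear in the unit."""
    lowered = unit.lower()
    matched = []
    for category, indicators in codebook.items():
        # Whole-word, case-insensitive match; inflected forms need their own entry.
        if any(re.search(r"\b" + re.escape(ind) + r"\b", lowered) for ind in indicators):
            matched.append(category)
    return sorted(matched)


units = [
    "The senator was praised as honest and reliable.",
    "Prosecutors allege he accepted a bribe.",
    "The weather was mild.",
]
for unit in units:
    print(code_unit(unit, CODEBOOK), "<-", unit)
```

Keyword rules like these are fast but blunt: they miss irony, negation ("not honest"), and anything that depends on context, so manual review of the coded units is still advisable.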

Step 5: Analyse the results and draw conclusions

Once coding is complete, the collected data is examined to find patterns and draw conclusions in response to your research question. You might use statistical analysis to find correlations or trends, discuss your interpretations of what the results mean, and make inferences about the creators, context, and audience of the texts.
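A first pass at finding patterns can be as simple as comparing how often each category occurs in different groups of texts. The sketch below assumes the coded data are stored as (source, category) pairs. The function name `category_shares`, the source names, and the data are hypothetical.

```python
from collections import Counter, defaultdict


def category_shares(coded_units):
    """Return, for each source, the proportion of its coded units in each category."""
    counts = defaultdict(Counter)
    for source, category in coded_units:
        counts[source][category] += 1
    shares = {}
    for source, counter in counts.items():
        total = sum(counter.values())
        shares[source] = {cat: n / total for cat, n in counter.items()}
    return shares


# Hypothetical coded units from two newspapers.
coded = [
    ("Paper A", "trustworthy"), ("Paper A", "trustworthy"), ("Paper A", "corrupt"),
    ("Paper B", "corrupt"), ("Paper B", "corrupt"),
    ("Paper B", "trustworthy"), ("Paper B", "corrupt"),
]
shares = category_shares(coded)
for source, by_category in shares.items():
    print(source, by_category)
```

Proportions are used rather than raw counts so that sources with different numbers of coded units can be compared. Whether a difference is meaningful still depends on sample size and on how the texts were selected.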

Cite this Scribbr article

Luo, A. (2022, December 05). Content Analysis | A Step-by-Step Guide with Examples. Scribbr. Retrieved 12 March 2024, from https://www.scribbr.co.uk/research-methods/content-analysis-explained/

Afr J Emerg Med, v.7(3); 2017 Sep

A hands-on guide to doing content analysis

Christen Erlingsson

a Department of Health and Caring Sciences, Linnaeus University, Kalmar 391 82, Sweden

Petra Brysiewicz

b School of Nursing & Public Health, University of KwaZulu-Natal, Durban 4041, South Africa

There is a growing recognition for the important role played by qualitative research and its usefulness in many fields, including the emergency care context in Africa. Novice qualitative researchers are often daunted by the prospect of qualitative data analysis and thus may experience much difficulty in the data analysis process. Our objective with this manuscript is to provide a practical hands-on example of qualitative content analysis to aid novice qualitative researchers in their task.

African relevance

  • Qualitative research is useful to deepen the understanding of the human experience.
  • Novice qualitative researchers may benefit from this hands-on guide to content analysis.
  • Practical tips and data analysis templates are provided to assist in the analysis process.

Introduction

There is a growing recognition of the important role played by qualitative research and its usefulness in many fields, including emergency care research. An increasing number of health researchers are currently opting to use various qualitative research approaches in exploring and describing complex phenomena, providing textual accounts of individuals’ “life worlds”, and giving voice to the vulnerable populations our patients so often represent. Many articles and books are available that describe qualitative research methods and provide overviews of content analysis procedures [1], [2], [3], [4], [5], [6], [7], [8], [9], [10]. Some articles include step-by-step directions intended to clarify content analysis methodology. What we have found in our teaching experience is that these directions are indeed very useful. However, qualitative researchers, especially novice researchers, often struggle to understand what is happening on and between steps, i.e., how the steps are taken.

As research supervisors of postgraduate health professionals, we often meet students who present brilliant ideas for qualitative studies that have potential to fill current gaps in the literature. Typically, the suggested studies aim to explore human experience. Research questions exploring human experience are expediently studied through analysing textual data e.g., collected in individual interviews, focus groups, documents, or documented participant observation. When reflecting on the proposed study aim together with the student, we often suggest content analysis methodology as the best fit for the study and the student, especially the novice researcher. The interview data are collected and the content analysis adventure begins. Students soon realise that data based on human experiences are complex, multifaceted and often carry meaning on multiple levels.

For many novice researchers, analysing qualitative data is found to be unexpectedly challenging and time-consuming. As they soon discover, there is no step-wise analysis process that can be applied to the data like a pattern cutter at a textile factory. They may become extremely annoyed and frustrated during the hands-on enterprise of qualitative content analysis.

The novice researcher may lament, “I’ve read all the methodology but don’t really know how to start and exactly what to do with my data!” They grapple with qualitative research terms and concepts, for example, the differences between meaning units, codes, categories and themes, and the increasing levels of abstraction from raw data to categories or themes. The content analysis adventure may now seem to be a chaotic undertaking. But life is messy, complex and utterly fascinating. Experiencing chaos during analysis is normal. Good advice for the qualitative researcher is to be open to the complexity in the data and utilise one’s flow of creativity.

Inspired primarily by descriptions of “conventional content analysis” in Hsieh and Shannon [3], “inductive content analysis” in Elo and Kyngäs [5] and “qualitative content analysis of an interview text” in Graneheim and Lundman [1], we have written this paper to help the novice qualitative researcher navigate the uncertainty in-between the steps of qualitative content analysis. We will provide advice and practical tips, as well as data analysis templates, to attempt to ease frustration and hopefully, inspire readers to discover how this exciting methodology contributes to developing a deeper understanding of human experience and our professional contexts.

Overview of qualitative content analysis

Synopsis of content analysis

A common starting point for qualitative content analysis is often transcribed interview texts. The objective in qualitative content analysis is to systematically transform a large amount of text into a highly organised and concise summary of key results. Analysis of the raw data from verbatim transcribed interviews to form categories or themes is a process of further abstraction of data at each step of the analysis; from the manifest and literal content to latent meanings (Fig. 1 and Table 1).

Fig. 1. Example of analysis leading to higher levels of abstraction; from manifest to latent content.

Table 1. Glossary of terms as used in this hands-on guide to doing content analysis.

The initial step is to read and re-read the interviews to get a sense of the whole, i.e., to gain a general understanding of what your participants are talking about. At this point you may already start to get ideas of what the main points or ideas are that your participants are expressing. Then one needs to start dividing up the text into smaller parts, namely, into meaning units. One then condenses these meaning units further. While doing this, you need to ensure that the core meaning is still retained. The next step is to label condensed meaning units by formulating codes and then grouping these codes into categories. Depending on the study’s aim and quality of the collected data, one may choose categories as the highest level of abstraction for reporting results or go further and create themes [1], [2], [3], [5], [8].

Content analysis as a reflective process

You must mould the clay of the data, tapping into your intuition while maintaining a reflective understanding of how your own previous knowledge is influencing your analysis, i.e., your pre-understanding. In qualitative methodology, it is imperative to vigilantly maintain an awareness of one’s pre-understanding so that this does not influence analysis and/or results. This is the difficult balancing task of keeping a firm grip on one’s assumptions, opinions, and personal beliefs, and not letting them unconsciously steer your analysis process while simultaneously, and knowingly, utilising one’s pre-understanding to facilitate a deeper understanding of the data.

Content analysis, as in all qualitative analysis, is a reflective process. There is no “step 1, 2, 3, done!” linear progression in the analysis. This means that identifying and condensing meaning units, coding, and categorising are not one-time events. It is a continuous process of coding and categorising then returning to the raw data to reflect on your initial analysis. Are you still satisfied with the length of meaning units? Do the condensed meaning units and codes still “fit” with each other? Do the codes still fit into this particular category? Typically, a fair amount of adjusting is needed after the first analysis endeavour. For example: a meaning unit might need to be split into two meaning units in order to capture an additional core meaning; a code modified to more closely match the core meaning of the condensed meaning unit; or a category name tweaked to most accurately describe the included codes. In other words, analysis is a flexible reflective process of working and re-working your data that reveals connections and relationships. Once condensed meaning units are coded it is easier to get a bigger picture and see patterns in your codes and organise codes in categories.

Content analysis exercise

The synopsis above is representative of analysis descriptions in many content analysis articles. Although correct, such method descriptions still do not provide much support for the novice researcher during the actual analysis process. Aspiring to provide guidance and direction to support the novice, a practical example of doing the actual work of content analysis is provided in the following sections. This practical example is based on a transcribed interview excerpt that was part of a study that aimed to explore patients’ experiences of being admitted into the emergency centre (Fig. 2).

Fig. 2. Excerpt from interview text exploring “Patient’s experience of being admitted into the emergency centre”.

This content analysis exercise provides instructions, tips, and advice to support the content analysis novice in a) familiarising oneself with the data and the hermeneutic spiral, b) dividing up the text into meaning units and subsequently condensing these meaning units, c) formulating codes, and d) developing categories and themes.

Familiarising oneself with the data and the hermeneutic spiral

An important initial phase in the data analysis process is to read and re-read the transcribed interview while keeping your aim in focus. Write down your initial impressions. Embrace your intuition. What is the text talking about? What stands out? How did you react while reading the text? What message did the text leave you with? In this analysis phase, you are gaining a sense of the text as a whole.

You may ask why this is important. During analysis, you will be breaking down the whole text into smaller parts. Returning to your notes with your initial impressions will help you see if your “parts” analysis is matching up with your first impressions of the “whole” text. Are your initial impressions visible in your analysis of the parts? Perhaps you need to go back and check for different perspectives. This is what is referred to as the hermeneutic spiral or hermeneutic circle. It is the process of comparing the parts to the whole to determine whether impressions of the whole verify the analysis of the parts in all phases of analysis. Each part should reflect the whole and the whole should be reflected in each part. This concept will become clearer as you start working with your data.

Dividing up the text into meaning units and condensing meaning units

You have now read the interview a number of times. Keeping your research aim and question clearly in focus, divide up the text into meaning units. Located meaning units are then condensed further while keeping the central meaning intact (Table 2). The condensation should be a shortened version of the same text that still conveys the essential message of the meaning unit. Sometimes the meaning unit is already so compact that no further condensation is required. Some content analysis sources warn researchers against short meaning units, claiming that this can lead to fragmentation [1]. However, our personal experience as research supervisors has shown us that a greater problem for the novice is basing analysis on meaning units that are too large and include many meanings which are then lost in the condensation process.

Table 2. Suggestion for how the exemplar interview text can be divided into meaning units and condensed meaning units (condensations are in parentheses).

Formulating codes

The next step is to develop codes that are descriptive labels for the condensed meaning units (Table 3). Codes concisely describe the condensed meaning unit and are tools to help researchers reflect on the data in new ways. Codes make it easier to identify connections between meaning units. At this stage of analysis you are still keeping very close to your data with very limited interpretation of content. You may adjust, re-do, re-think, and re-code until you get to the point where you are satisfied that your choices are reasonable. Just as in the initial phase of getting to know your data as a whole, it is also good to write notes during coding on your impressions and reactions to the text.

Table 3. Suggestions for coding of condensed meaning units.

Developing categories and themes

The next step is to sort codes into categories that answer the questions who, what, when or where? One does this by comparing codes and appraising them to determine which codes seem to belong together, thereby forming a category. In other words, a category consists of codes that appear to deal with the same issue, i.e., manifest content visible in the data with limited interpretation on the part of the researcher. Category names are most often short and factual sounding.

In data that is rich with latent meaning, analysis can be carried on to create themes. In our practical example, we have continued the process of abstracting data to a higher level, from category to theme level, and developed three themes as well as an overarching theme (Table 4). Themes express underlying meaning, i.e., latent content, and are formed by grouping two or more categories together. Themes are answering questions such as why, how, in what way or by what means? Therefore, theme names include verbs, adverbs and adjectives and are very descriptive or even poetic.

Table 4. Suggestion for organisation of coded meaning units into categories and themes.
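For readers who keep their analysis in a spreadsheet or script rather than on paper, the levels of abstraction described above can be held in a simple nested structure. The sketch below is only one possible way to organise coded rows. The first row uses the ambulance example discussed later in this guide, while the second row, the theme name, and the function name `build_hierarchy` are hypothetical placeholders and not taken from the original tables.

```python
from collections import defaultdict


def build_hierarchy(rows):
    """Nest coded rows as theme -> category -> code -> [condensed meaning units]."""
    tree = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for row in rows:
        tree[row["theme"]][row["category"]][row["code"]].append(row["condensed"])
    return tree


rows = [
    {
        # Condensed unit, code and category as given in the guide's example;
        # the theme name is a hypothetical placeholder.
        "condensed": "Ambulance staff looked worried about all the blood",
        "code": "In the ambulance",
        "category": "Reliving the rescue",
        "theme": "Hypothetical theme A",
    },
    {
        # Entirely hypothetical row, added only to show grouping.
        "condensed": "Waited a long time before anyone came",
        "code": "Waiting for help",
        "category": "Reliving the rescue",
        "theme": "Hypothetical theme A",
    },
]

tree = build_hierarchy(rows)
for theme, categories in tree.items():
    print(theme)
    for category, codes in categories.items():
        print("  " + category)
        for code, units in codes.items():
            print("    " + code + ": " + "; ".join(units))
```

Because analysis is iterative, a structure like this is expected to be rebuilt many times as codes are renamed and categories are split or merged.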

Some reflections and helpful tips

Understand your pre-understandings

While conducting qualitative research, it is paramount that the researcher remains vigilant against bias during analysis. In other words, did you remain aware of your pre-understandings, i.e., your own personal assumptions, professional background, and previous experiences and knowledge? For example, did you zero in on particular aspects of the interview on account of your profession (as an emergency doctor, emergency nurse, pre-hospital professional, etc.)? Did you assume the patient’s gender? Did your assumptions affect your analysis? How about aspects of culpability; did you assume that this patient was at fault or that this patient was a victim in the crash? Did this affect how you analysed the text?

Staying aware of one’s pre-understandings is exactly as difficult as it sounds. But, it is possible and it is requisite. Focus on putting yourself and your pre-understandings in a holding pattern while you approach your data with an openness and expectation of finding new perspectives. That is the key: expect the new and be prepared to be surprised. If something in your data feels unusual, is different from what you know, atypical, or even odd – don’t by-pass it as “wrong”. Your reactions and intuitive responses are letting you know that here is something to pay extra attention to, besides the more comfortable condensing and coding of more easily recognisable meaning units.

Use your intuition

Intuition is a great asset in qualitative analysis and not to be dismissed as “unscientific”. Intuition results from tacit knowledge. Just as tacit knowledge is a hallmark of great clinicians [11], [12], it is also an invaluable tool in analysis work [13]. Literally, take note of your gut reactions and intuitive guidance and remember to write these down! These notes often form a framework of possible avenues for further analysis and are especially helpful as you lift the analysis to higher levels of abstraction; from meaning units to condensed meaning units, to codes, to categories and then to the highest level of abstraction in content analysis, themes.

Aspects of coding and categorising hard to place data

All too often, the novice gets overwhelmed by interview material that deals with the general subject matter of the interview, but doesn’t seem to answer the research question. Don’t be too quick to consider such text as off topic or dross [6]. There is often data that, although not seeming to match the study aim precisely, is still important for illuminating the problem area. This can be seen in our practical example about exploring patients’ experiences of being admitted into the emergency centre. Initially the participant is describing the accident itself. While not directly answering the research question, the description is important for understanding the context of the experience of being admitted into the emergency centre. It is very common that participants will “begin at the beginning” and prologue their narratives in order to create a context that sets the scene. This type of contextual data is vital for gaining a deepened understanding of participants’ experiences.

In our practical example, the participant begins by describing the crash and the rescue, i.e., experiences leading up to and prior to admission to the emergency centre. That is why we have chosen in our analysis to code the condensed meaning unit “Ambulance staff looked worried about all the blood” as “In the ambulance” and place it in the category “Reliving the rescue”. We did not choose to include this meaning unit in the categories specifically about admission to the emergency centre itself. Do you agree with our coding choice? Would you have chosen differently?

Another common problem for the novice is deciding how to code condensed meaning units when the unit can be labelled in several different ways. At this point researchers usually groan and wish they had thought to ask one of those classic follow-up questions like “Can you tell me a little bit more about that?” We have examples of two such coding conundrums in the exemplar, as can be seen in Table 3 (codes we conferred on) and Table 4 (codes we reached consensus on). Do you agree with our choices or would you have chosen different codes? Our best advice is to go back to your impressions of the whole and lean into your intuition when choosing codes that are most reasonable and best fit your data.

A typical problem area during categorisation, especially for the novice researcher, is overlap between content in more than one initial category, i.e., codes included in one category also seem to be a fit for another category. Overlap between initial categories is very likely an indication that the jump from code to category was too big, a problem not uncommon when the data is voluminous and/or very complex. In such cases, it can be helpful to first sort codes into narrower categories, so-called subcategories. Subcategories can then be reviewed for possibilities of further aggregation into categories. In the case of a problematic coding, it is advantageous to return to the meaning unit and check if the meaning unit itself fits the category or if you need to reconsider your preliminary coding.
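If the codes and their provisional categories are kept in a simple mapping, overlap of this kind can be listed mechanically before deciding whether subcategories are needed. The sketch below is an illustration only. The function name `find_overlapping_codes` and the second category and its codes are hypothetical.

```python
def find_overlapping_codes(categories):
    """Map each code that appears in more than one category to those categories."""
    seen = {}
    for category, codes in categories.items():
        for code in set(codes):  # ignore duplicates within a single category
            seen.setdefault(code, []).append(category)
    return {code: sorted(cats) for code, cats in seen.items() if len(cats) > 1}


# Hypothetical initial categorisation in which one code was placed twice.
initial_categories = {
    "Reliving the rescue": ["In the ambulance", "Waiting for help"],
    "Arriving at the emergency centre": ["Waiting for help", "Meeting the staff"],
}
overlaps = find_overlapping_codes(initial_categories)
print(overlaps)
```

A list like this does not resolve the overlap. It only points to the codes whose meaning units are worth re-reading before the categories are revised.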

It is not uncommon to be faced by thorny problems such as these during coding and categorisation. Here we would like to reiterate how valuable it is to have fellow researchers with whom you can discuss and reflect, in order to reach consensus on the best way forward in your data analysis. It is really advantageous to compare your analysis with meaning units, condensations, coding and categorisations done by another researcher on the same text. Have you identified the same meaning units? Do you agree on coding? See similar patterns in the data? Concur on categories? Sometimes referred to as “researcher triangulation,” this is actually a key element in qualitative analysis and an important component when striving to ensure trustworthiness in your study [14]. Qualitative research is about seeking out variations and not controlling variables, as in quantitative research. Collaborating with others during analysis lets you tap into multiple perspectives and often makes it easier to see variations in the data, thereby enhancing the quality of your results as well as contributing to the rigor of your study. It is important to note that it is not necessary to force consensus in the findings; one can instead embrace these variations in interpretation and use them to capture the richness in the data.

Yet there are times when neither openness, pre-understanding, intuition, nor researcher triangulation does the job; for example, when analysing an interview and one is simply confused on how to code certain meaning units. At such times, there are a variety of options. A good starting place is to re-read all the interviews through the lens of this specific issue and actively search for other similar types of meaning units you might have missed. Another way to handle this is to conduct further interviews with specific queries that hopefully shed light on the issue. A third option is to have a follow-up interview with the same person and ask them to explain.

Additional tips

It is important to remember that in a typical project there are several interviews to analyse. Codes found in a single interview serve as a starting point as you then work through the remaining interviews coding all material. Form your categories and themes when all project interviews have been coded.

When submitting an article with your study results, it is a good idea to create a table or figure providing a few key examples of how you progressed from the raw data of meaning units, to condensed meaning units, coding, categorisation, and, if included, themes. Providing such a table or figure supports the rigor of your study [1] and is an element greatly appreciated by reviewers and research consumers.

During the analysis process, it can be advantageous to write down your research aim and questions on a sheet of paper that you keep nearby as you work. Frequently referring to your aim can help you keep focused and on track during analysis. Many find it helpful to colour code their transcriptions and write notes in the margins.

Having access to qualitative analysis software can be greatly helpful in organising and retrieving analysed data. Just remember, a computer does not analyse the data. As Jennings [15] has stated, “… it is ‘peopleware,’ not software, that analyses.” A major drawback is that qualitative analysis software can be prohibitively expensive. One way forward is to use table templates such as we have used in this article. (Three analysis templates, Templates A, B, and C, are provided as supplementary online material.) Additionally, the “find” function in word processing programmes such as Microsoft Word (Redmond, WA USA) facilitates locating key words, e.g., in transcribed interviews, meaning units, and codes.
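For those who prefer a free, scriptable alternative to the “find” function, a few lines of code can list every occurrence of a key word together with some surrounding text. The sketch below is a minimal illustration under our own assumptions. The function name `keyword_in_context`, the window width, and the transcript snippet are invented for the example.

```python
import re


def keyword_in_context(text, keyword, width=30):
    """Return each case-insensitive occurrence of keyword with surrounding text."""
    hits = []
    for match in re.finditer(re.escape(keyword), text, flags=re.IGNORECASE):
        left = max(0, match.start() - width)
        right = min(len(text), match.end() + width)
        hits.append(text[left:right])
    return hits


# Hypothetical transcript snippet, invented for illustration.
transcript = (
    "I remember the blood on my shirt. "
    "Later the nurse said the blood loss was not serious."
)
hits = keyword_in_context(transcript, "blood", width=20)
for hit in hits:
    print("..." + hit + "...")
```

As with any software aid, this only retrieves text. Deciding what the retrieved passages mean remains the researcher's task.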

Lessons learnt/key points

From our experience with content analysis we have learnt a number of important lessons that may be useful for the novice researcher. They are:

  • A method description is a guideline supporting analysis and trustworthiness. Don’t get caught up too rigidly following steps. Reflexivity and flexibility are just as important. Remember that a method description is a tool helping you in the process of making sense of your data by reducing a large amount of text to distil key results.
  • It is important to maintain a vigilant awareness of one’s own pre-understandings in order to avoid bias during analysis and in results.
  • Use and trust your own intuition during the analysis process.
  • If possible, discuss and reflect together with other researchers who have analysed the same data. Be open and receptive to new perspectives.
  • Understand that it is going to take time. Even if you are quite experienced, each set of data is different and all require time to analyse. Don’t expect to have all the data analysis done over a weekend. It may take weeks. You need time to think, reflect and then review your analysis.
  • Keep reminding yourself how excited you have felt about this area of research and how interesting it is. Embrace it with enthusiasm!
  • Let it be chaotic – have faith that some sense will start to surface. Don’t be afraid and think you will never get to the end – you will… eventually!

Peer review under responsibility of African Federation for Emergency Medicine.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.afjem.2017.08.001.

Grad Coach

What Is Qualitative Content Analysis?

QCA explained simply (with examples)

By: Jenna Crosley (PhD). Reviewed by: Dr Eunice Rautenbach (DTech) | February 2021

If you’re in the process of preparing for your dissertation, thesis or research project, you’ve probably encountered the term “qualitative content analysis” – it’s quite a mouthful. If you’ve landed on this post, you’re probably a bit confused about it. Well, the good news is that you’ve come to the right place…

Overview: Qualitative Content Analysis

  • What (exactly) is qualitative content analysis
  • The two main types of content analysis
  • When to use content analysis
  • How to conduct content analysis (the process)
  • The advantages and disadvantages of content analysis

1. What is content analysis?

Content analysis is a qualitative analysis method that focuses on recorded human artefacts such as manuscripts, voice recordings and journals. Content analysis investigates these written, spoken and visual artefacts without explicitly extracting data from participants – this is called unobtrusive research.

In other words, with content analysis, you don’t necessarily need to interact with participants (although you can if necessary); you can simply analyse the data that they have already produced. With this type of analysis, you can analyse data such as text messages, books, Facebook posts, videos, and audio (just to mention a few).

The basics – explicit and implicit content

When working with content analysis, explicit and implicit content will play a role. Explicit data is transparent and easy to identify, while implicit data is that which requires some form of interpretation and is often of a subjective nature. Sounds a bit fluffy? Here’s an example:

Joe: Hi there, what can I help you with? 

Lauren: I recently adopted a puppy and I’m worried that I’m not feeding him the right food. Could you please advise me on what I should be feeding? 

Joe: Sure, just follow me and I’ll show you. Do you have any other pets?

Lauren: Only one, and it tweets a lot!

In this exchange, the explicit data indicates that Joe is helping Lauren to find the right puppy food. Joe asks Lauren whether she has any pets aside from her puppy. This data is explicit because it requires no interpretation.

On the other hand, implicit data, in this case, includes the fact that the speakers are in a pet store. This information is not clearly stated but can be inferred from the conversation, where Joe is helping Lauren to choose pet food. An additional piece of implicit data is that Lauren likely has some type of bird as a pet. This can be inferred from the way that Lauren states that her pet “tweets”.

As you can see, explicit and implicit data both play a role in human interaction and are an important part of your analysis. However, it’s important to differentiate between these two types of data when you’re undertaking content analysis. Interpreting implicit data can be rather subjective as conclusions are based on the researcher’s interpretation. This can introduce an element of bias, which risks skewing your results.

2. The two types of content analysis

Now that you understand the difference between implicit and explicit data, let’s move on to the two general types of content analysis: conceptual and relational content analysis. Importantly, while conceptual and relational content analysis both follow similar steps initially, the aims and outcomes of each are different.

Conceptual analysis focuses on the number of times a concept occurs in a set of data and is generally focused on explicit data. For example, if you were to have the following conversation:

Marie: She told me that she has three cats.

Jean: What are her cats’ names?

Marie: I think the first one is Bella, the second one is Mia, and… I can’t remember the third cat’s name.

In this data, you can see that the word “cat” has been used three times (counting “cats”, “cats’” and “cat’s”). Through conceptual content analysis, you can deduce that cats are the central topic of the conversation. You can also perform a frequency analysis, where you assess the term’s frequency in the data. For example, in the exchange above, the word “cat” makes up 9% of the data (3 of the 33 words spoken). In other words, conceptual analysis brings a little bit of quantitative analysis into your qualitative analysis.
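If you’d like to check a count like this yourself, here’s a minimal Python sketch of a frequency analysis on the exchange above. The helper names (tokenize, concept_frequency) are our own for illustration, and the sketch assumes that speaker labels are left out and that any word starting with “cat” counts as the concept.

```python
import re

# The Marie/Jean exchange from above, without the speaker labels.
utterances = [
    "She told me that she has three cats.",
    "What are her cats' names?",
    "I think the first one is Bella, the second one is Mia, and... "
    "I can't remember the third cat's name.",
]

def tokenize(text):
    """Lower-case the text and split it into words, keeping inner apostrophes."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def concept_frequency(texts, stem):
    """Return (hits, total_words, share) for words that start with `stem`."""
    words = [w for t in texts for w in tokenize(t)]
    hits = sum(1 for w in words if w.startswith(stem))
    return hits, len(words), hits / len(words)

hits, total, share = concept_frequency(utterances, "cat")
print(f"{hits} of {total} words ({share:.0%})")  # 3 of 33 words (9%)
```

Matching on a word stem is a blunt instrument (it would also count “catalogue”), so for real data you’d want to define what counts as the concept in your coding scheme first.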

As you can see, the analysis above involves no interpretation and focuses on explicit data. Relational content analysis, on the other hand, takes a more holistic view by focusing more on implicit data in terms of context, surrounding words and relationships.

There are three types of relational analysis:

  • Affect extraction
  • Proximity analysis
  • Cognitive mapping

Affect extraction is when you assess concepts according to emotional attributes. These emotions are typically mapped on scales, such as a Likert scale or a rating scale ranging from 1 to 5, where 1 is “very sad” and 5 is “very happy”.

If participants are talking about their achievements, they are likely to be given a score of 4 or 5, depending on how good they feel about it. If a participant is describing a traumatic event, they are likely to have a much lower score, either 1 or 2.

Proximity analysis identifies explicit terms (such as those found in a conceptual analysis) and the patterns in terms of how they co-occur in a text. In other words, proximity analysis investigates the relationship between terms and aims to group these to extract themes and develop meaning.

Proximity analysis is typically utilised when you’re looking for hard facts rather than emotional, cultural, or contextual factors. For example, if you were to analyse a political speech, you may want to focus only on what has been said, rather than implications or hidden meanings. To do this, you would make use of explicit data, discounting any underlying meanings and implications of the speech.
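To make the idea of co-occurrence a little more concrete, here’s a rough Python sketch that counts how often pairs of explicit terms appear close to each other. The sentence, the list of terms and the window of five words are all made up for illustration; this is one simple way to approximate proximity, not a standard procedure.

```python
from collections import Counter
from itertools import combinations

def cooccurrences(tokens, terms, window=5):
    """Count pairs of different `terms` that occur within `window` tokens."""
    positions = [(i, t) for i, t in enumerate(tokens) if t in terms]
    counts = Counter()
    for (i, a), (j, b) in combinations(positions, 2):
        if a != b and j - i <= window:
            counts[tuple(sorted((a, b)))] += 1
    return counts

# A made-up line from a political speech, already split into words.
speech = ("we will cut taxes and create jobs because lower taxes "
          "mean more jobs and more growth").split()

pairs = cooccurrences(speech, {"taxes", "jobs", "growth"}, window=5)
print(pairs.most_common())
```

In this toy example “taxes” and “jobs” turn up near each other more often than any other pair, which is the kind of pattern you’d then group into a theme.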

Lastly, there’s cognitive mapping, which can be used in addition to, or along with, proximity analysis. Cognitive mapping involves taking different texts and comparing them in a visual format – i.e. a cognitive map. Typically, you’d use cognitive mapping in studies that assess changes in terms, definitions, and meanings over time. It can also serve as a way to visualise affect extraction or proximity analysis and is often presented in a form such as a graphic map.

Example of a cognitive map

To recap on the essentials, content analysis is a qualitative analysis method that focuses on recorded human artefacts. It involves both conceptual analysis (which is more numbers-based) and relational analysis (which focuses on the relationships between concepts and how they’re connected).

3. When should you use content analysis?

Content analysis is a useful tool that provides insight into trends of communication. For example, you could use a discussion forum as the basis of your analysis and look at the types of things the members talk about as well as how they use language to express themselves. Content analysis is flexible in that it can be applied to the individual, group, and institutional level.

Content analysis is typically used in studies where the aim is to better understand factors such as behaviours, attitudes, values, emotions, and opinions. For example, you could use content analysis to investigate an issue in society, such as miscommunication between cultures. In this example, you could compare patterns of communication in participants from different cultures, which will allow you to create strategies for avoiding misunderstandings in intercultural interactions.

Another example could include conducting content analysis on a publication such as a book. Here you could gather data on the themes, topics, language use and opinions reflected in the text to draw conclusions regarding the political (such as conservative or liberal) leanings of the publication.

4. How to conduct a qualitative content analysis

Conceptual and relational content analysis differ in terms of their exact process; however, there are some similarities. Let’s have a look at these first – i.e., the generic process:

  • Recap on your research questions
  • Undertake bracketing to identify biases
  • Operationalise your variables and develop a coding scheme
  • Code the data and undertake your analysis

Step 1 – Recap on your research questions

It’s always useful to begin a project with research questions, or at least with an idea of what you are looking for. In fact, if you’ve spent time reading this blog, you’ll know that it’s useful to recap on your research questions, aims and objectives when undertaking pretty much any research activity. In the context of content analysis, it’s difficult to know what needs to be coded and what doesn’t, without a clear view of the research questions.

For example, if you were to code a conversation focused on basic issues of social justice, you may be met with a wide range of topics that may be irrelevant to your research. However, if you approach this data set with the specific intent of investigating opinions on gender issues, you will be able to focus on this topic alone, which would allow you to code only what you need to investigate.

Step 2 – Reflect on your personal perspectives and biases

It’s vital that you reflect on your own preconceptions of the topic at hand and identify the biases that you might drag into your content analysis – this is called “bracketing”. By identifying these upfront, you’ll be more aware of them and less likely to have them subconsciously influence your analysis.

For example, if you were to investigate how a community converses about unequal access to healthcare, it is important to assess your views to ensure that you don’t project these onto your understanding of the opinions put forth by the community. If you have access to medical aid, for instance, you should not allow this to interfere with your examination of unequal access.

Step 3 – Operationalise your variables and develop a coding scheme

Next, you need to operationalise your variables. But what does that mean? Simply put, it means that you have to define each variable or construct. Give every item a clear definition – what does it mean (include) and what does it not mean (exclude). For example, if you were to investigate children’s views on healthy foods, you would first need to define what age group/range you’re looking at, and then also define what you mean by “healthy foods”.

In combination with the above, it is important to create a coding scheme, which will consist of information about your variables (how you defined each variable), as well as a process for analysing the data. For this, you would refer back to how you operationalised/defined your variables so that you know how to code your data.

For example, when coding, when should you code a food as “healthy”? What makes a food choice healthy? Is it the absence of sugar or saturated fat? Is it the presence of fibre and protein? It’s very important to have clearly defined variables to achieve consistent coding – without this, your analysis will get very muddy, very quickly.
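One way to keep these definitions consistent is to write the coding scheme down as data. Below is a minimal Python sketch of that idea. The Code class, the code_terms helper, the “no added sugar” rule and the excluded items are all hypothetical; only the example foods (broccoli, peaches, bananas) come from this post.

```python
from dataclasses import dataclass, field

@dataclass
class Code:
    """One entry in a coding scheme: a label, its definition and its boundaries."""
    label: str
    definition: str
    include: set = field(default_factory=set)
    exclude: set = field(default_factory=set)

    def applies_to(self, term):
        term = term.lower()
        return term in self.include and term not in self.exclude

# Hypothetical operationalisation of "healthy food".
healthy = Code(
    label="healthy food",
    definition="Whole foods with no added sugar (illustrative rule only)",
    include={"broccoli", "peaches", "bananas"},
    exclude={"fruit juice", "banana bread"},
)

def code_terms(terms, scheme):
    """Return {term: [labels]} for every term that at least one code applies to."""
    coded = {}
    for term in terms:
        labels = [c.label for c in scheme if c.applies_to(term)]
        if labels:
            coded[term] = labels
    return coded

result = code_terms(["Broccoli", "banana bread", "chips"], [healthy])
print(result)  # {'Broccoli': ['healthy food']}
```

The point is not the code itself but the discipline: every include and exclude decision is recorded in one place, so you (or a second coder) apply the same rule every time.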

Step 4 – Code and analyse the data

The next step is to code the data. At this stage, there are some differences between conceptual and relational analysis.

As described earlier in this post, conceptual analysis looks at the existence and frequency of concepts, whereas a relational analysis looks at the relationships between concepts. For both types of analyses, it is important to pre-select a concept that you wish to assess in your data. Using the example of studying children’s views on healthy food, you could pre-select the concept of “healthy food” and assess the number of times the concept pops up in your data.

Here is where conceptual and relational analysis start to differ.

At this stage of conceptual analysis, it is necessary to decide on the level of analysis you’ll perform on your data, and whether this will exist on the word, phrase, sentence, or thematic level. For example, will you code the phrase “healthy food” on its own? Will you code each term relating to healthy food (e.g., broccoli, peaches, bananas, etc.) with the code “healthy food” or will these be coded individually? It is very important to establish this from the get-go to avoid inconsistencies that could result in you having to code your data all over again.

With relational analysis, on the other hand, you need to decide on the type of analysis. So, will you use affect extraction? Proximity analysis? Cognitive mapping? A mix? It’s vital to determine the type of analysis before you begin to code your data so that you can maintain the reliability and validity of your research.

How to conduct conceptual analysis

First, let’s have a look at the process for conceptual analysis.

Once you’ve decided on your level of analysis, you need to establish how you will code your concepts, and how many of these you want to code. Here you can choose whether you want to code in a deductive or inductive manner. Just to recap, deductive coding is when you begin the coding process with a set of pre-determined codes, whereas inductive coding entails the codes emerging as you progress with the coding process. Here it is also important to decide what should be included and excluded from your analysis, and also what levels of implication you wish to include in your codes.

For example, if you have the concept of “tall”, can you include “up in the clouds”, derived from the sentence, “the giraffe’s head is up in the clouds” in the code, or should it be a separate code? In addition to this, you need to know what levels of words may be included in your codes or not. For example, if you say, “the panda is cute” and “look at the panda’s cuteness”, can “cute” and “cuteness” be included under the same code?
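Whatever you decide, it helps to record those decisions once and apply them mechanically. Here’s a small Python sketch of that, assuming (purely for illustration) that you’ve decided “cuteness” belongs under “cute” and that “up in the clouds” belongs under “tall”. The VARIANTS table and the apply_codes function are our own names, not part of any package.

```python
import re

# Hypothetical decisions, written down once so every text is coded the same way.
VARIANTS = {
    "cute": "cute",
    "cuteness": "cute",          # decision: the noun form shares the code
    "tall": "tall",
    "up in the clouds": "tall",  # decision: the implied meaning is coded too
}

def apply_codes(sentence, variants=VARIANTS):
    """Return the sorted codes whose variants occur as whole words or phrases."""
    text = sentence.lower()
    found = set()
    for phrase, code in variants.items():
        if re.search(r"\b" + re.escape(phrase) + r"\b", text):
            found.add(code)
    return sorted(found)

print(apply_codes("Look at the panda's cuteness"))            # ['cute']
print(apply_codes("The giraffe's head is up in the clouds"))  # ['tall']
```

The whole-word matching means that a word like “installed” won’t accidentally be coded as “tall”.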

Once you’ve considered the above, it’s time to code the text. We’ve already published a detailed post about coding, so we won’t go into that process here. Once you’re done coding, you can move on to analysing your results. This is where you will aim to find generalisations in your data, and thus draw your conclusions.

How to conduct relational analysis

Now let’s return to relational analysis.

As mentioned, you want to look at the relationships between concepts. To do this, you’ll need to create categories by reducing your data (in other words, grouping similar concepts together) and then also code for words and/or patterns. These are both done with the aim of discovering whether these words exist, and if they do, what they mean.

Your next step is to assess your data and to code the relationships between your terms and meanings, so that you can move on to your final step, which is to sum up and analyse the data.

To recap, it’s important to start your analysis process by reviewing your research questions and identifying your biases. From there, you need to operationalise your variables, code your data and then analyse it.

5. What are the pros & cons of content analysis?

One of the main advantages of content analysis is that it allows you to use a mix of quantitative and qualitative research methods, which results in a more scientifically rigorous analysis.

For example, with conceptual analysis, you can count the number of times that a term or a code appears in a dataset, which can be assessed from a quantitative standpoint. In addition to this, you can then use a qualitative approach to investigate the underlying meanings of these and relationships between them.

Content analysis is also unobtrusive and therefore poses fewer ethical issues than some other analysis methods. As the content you’ll analyse oftentimes already exists, you’ll analyse what has been produced previously, and so you won’t have to collect data directly from participants. When coded correctly, data is analysed in a very systematic and transparent manner, which means that issues of replicability (how possible it is to recreate research under the same conditions) are reduced greatly.

On the downside, qualitative research (in general, not just content analysis) is often critiqued for being too subjective and for not being scientifically rigorous enough. This is where reliability (how replicable a study is by other researchers) and validity (how suitable the research design is for the topic being investigated) come into play – if you take these into account, you’ll be on your way to achieving sound research results.

Chapter 17. Content Analysis

Introduction

Content analysis is a term that is used to mean both a method of data collection and a method of data analysis. Archival and historical works can be the source of content analysis, but so too can the contemporary media coverage of a story, blogs, comment posts, films, cartoons, advertisements, brand packaging, and photographs posted on Instagram or Facebook. Really, almost anything can be the “content” to be analyzed. This is a qualitative research method because the focus is on the meanings and interpretations of that content rather than strictly numerical counts or variables-based causal modeling. [1] Qualitative content analysis (sometimes referred to as QCA) is particularly useful when attempting to define and understand prevalent stories or communication about a topic of interest—in other words, when we are less interested in what particular people (our defined sample) are doing or believing and more interested in what general narratives exist about a particular topic or issue. This chapter will explore different approaches to content analysis and provide helpful tips on how to collect data, how to turn that data into codes for analysis, and how to go about presenting what is found through analysis. It is also a nice segue between our data collection methods (e.g., interviewing, observation) chapters and chapters 18 and 19, whose focus is on coding, the primary means of data analysis for most qualitative data. In many ways, the methods of content analysis are quite similar to the method of coding.

Although the body of material (“content”) to be collected and analyzed can be nearly anything, most qualitative content analysis is applied to forms of human communication (e.g., media posts, news stories, campaign speeches, advertising jingles). The point of the analysis is to understand this communication, to systematically and rigorously explore its meanings, assumptions, themes, and patterns. Historical and archival sources may be the subject of content analysis, but there are other ways to analyze (“code”) this data when not overly concerned with the communicative aspect (see chapters 18 and 19). This is why we tend to consider content analysis its own method of data collection as well as a method of data analysis. Still, many of the techniques you learn in this chapter will be helpful to any “coding” scheme you develop for other kinds of qualitative data. Just remember that content analysis is a particular form with distinct aims and goals and traditions.

An Overview of the Content Analysis Process

The First Step: Selecting Content

Figure 17.1 is a display of possible content for content analysis. The first step in content analysis is making smart decisions about what content you will want to analyze and to clearly connect this content to your research question or general focus of research. Why are you interested in the messages conveyed in this particular content? What will the identification of patterns here help you understand? Content analysis can be fun to do, but in order to make it research, you need to fit it into a research plan.

Figure 17.1. A Non-exhaustive List of "Content" for Content Analysis

To take one example, let us imagine you are interested in gender presentations in society and how presentations of gender have changed over time. There are various forms of content out there that might help you document changes. You could, for example, begin by creating a list of magazines that are coded as being for “women” (e.g., Women’s Daily Journal) and magazines that are coded as being for “men” (e.g., Men’s Health). You could then select a date range that is relevant to your research question (e.g., 1950s–1970s) and collect magazines from that era. You might create a “sample” by deciding to look at three issues for each year in the date range and a systematic plan for what to look at in those issues (e.g., advertisements? Cartoons? Titles of articles? Whole articles?). You are not just going to look at some magazines willy-nilly. That would not be systematic enough to allow anyone to replicate or check your findings later on. Once you have a clear plan of what content is of interest to you and what you will be looking at, you can begin, creating a record of everything you are including as your content. This might mean a list of each advertisement you look at or each title of stories in those magazines along with its publication date. You may decide to have multiple “content” in your research plan. For each content, you want a clear plan for collecting, sampling, and documenting.
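A sampling plan like this can be written out explicitly so that someone else can check or replicate it. The following is only a sketch in Python: the build_sample function is a name I have made up, and the choice of the January, May, and September issues is an assumption standing in for whatever systematic rule you adopt. I have read “1950s–1970s” as the years 1950 through 1979.

```python
from itertools import product

def build_sample(magazines, first_year, last_year, months):
    """List every (magazine, year, month) issue covered by the sampling plan."""
    years = range(first_year, last_year + 1)
    return [
        {"magazine": m, "year": y, "month": mo}
        for m, y, mo in product(magazines, years, months)
    ]

sample = build_sample(
    ["Women's Daily Journal", "Men's Health"],
    first_year=1950,
    last_year=1979,
    months=(1, 5, 9),  # assumption: three fixed issues per year
)
print(len(sample))  # 180 issues: 2 magazines x 30 years x 3 issues
```

The resulting list can double as the record of everything included as content: as you collect each issue, you can add the advertisements or article titles you looked at to its entry.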

The Second Step: Collecting and Storing

Once you have a plan, you are ready to collect your data. This may entail downloading from the internet, creating a Word document or PDF of each article or picture, and storing these in a folder designated by the source and date (e.g., “Men’s Health advertisements, 1950s”). Sølvberg (2021), for example, collected posted job advertisements for three kinds of elite jobs (economic, cultural, professional) in Sweden. But collecting might also mean going out and taking photographs yourself, as in the case of graffiti, street signs, or even what people are wearing. Chaise LaDousa, an anthropologist and linguist, took photos of “house signs,” which are signs, often creative and sometimes offensive, hung by college students living in communal off-campus houses. These signs were a focal point of college culture, sending messages about the values of the students living in them. Some of the names will give you an idea: “Boot ’n Rally,” “The Plantation,” “Crib of the Rib.” The students might find these signs funny and benign, but LaDousa (2011) argued convincingly that they also reproduced racial and gender inequalities. The data here already existed (they were big signs on houses), but the researcher had to collect the data by taking photographs.

In some cases, your content will be in physical form but not amenable to photographing, as in the case of films or unwieldy physical artifacts you find in the archives (e.g., undigitized meeting minutes or scrapbooks). In this case, you need to create some kind of detailed log (fieldnotes even) of the content that you can reference. In the case of films, this might mean watching the film and writing down details for key scenes that become your data. [2] For scrapbooks, it might mean taking notes on what you are seeing, quoting key passages, describing colors or presentation style. As you might imagine, this can take a lot of time. Be sure you budget this time into your research plan.

Researcher Note

A note on data scraping: Data scraping, sometimes known as screen scraping or frame grabbing, is a way of extracting data generated by another program, as when a scraping tool grabs information from a website. This may help you collect data that is on the internet, but you need to be ethical in how you employ the scraper. A student once helped me scrape thousands of stories from the Time magazine archives at once (although it took several hours for the scraping process to complete). These stories were freely available, so the scraping process simply sped up the laborious process of copying each article of interest and saving it to my research folder. Scraping tools can sometimes be used to circumvent paywalls. Be careful here!
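One common courtesy, offered here as a suggestion rather than a complete ethics check, is to consult a site’s robots.txt file before scraping it. The sketch below uses Python’s standard urllib.robotparser module. The robots.txt content, the example.org addresses, and the “research-scraper” agent name are all hypothetical; a real project would read the file published by the site itself and would still need to respect its terms of use.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used here so the example runs offline.
ROBOTS_TXT = """\
User-agent: *
Disallow: /subscribers/
"""

def allowed(url, robots_txt=ROBOTS_TXT, agent="research-scraper"):
    """Return True if the robots.txt rules permit `agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

print(allowed("https://example.org/archive/1955/story.html"))
print(allowed("https://example.org/subscribers/story.html"))
```

Under these made-up rules the archive page may be fetched and the subscriber page may not, which mirrors the distinction between freely available stories and paywalled ones.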

The Third Step: Analysis

There is often an assumption among novice researchers that once you have collected your data, you are ready to write about what you have found. Actually, you haven’t yet found anything, and if you try to write up your results, you will probably be staring sadly at a blank page. Between the collection and the writing comes the difficult task of systematically and repeatedly reviewing the data in search of patterns and themes that will help you interpret the data, particularly its communicative aspect (e.g., What is it that is being communicated here, with these “house signs” or in the pages of Men’s Health?).

The first time you go through the data, keep an open mind on what you are seeing (or hearing), and take notes about your observations that link up to your research question. In the beginning, it can be difficult to know what is relevant and what is extraneous. Sometimes, your research question changes based on what emerges from the data. Use the first round of review to consider this possibility, but then commit yourself to following a particular focus or path. If you are looking at how gender gets made or re-created, don’t follow the white rabbit down a hole about environmental injustice unless you decide that this really should be the focus of your study or that issues of environmental injustice are linked to gender presentation. In the second round of review, be very clear about emerging themes and patterns. Create codes (more on these in chapters 18 and 19) that will help you simplify what you are noticing. For example, “men as outdoorsy” might be a common trope you see in advertisements. Whenever you see this, mark the passage or picture. In your third (or fourth or fifth) round of review, begin to link up the tropes you’ve identified, looking for particular patterns and assumptions. You’ve drilled down to the details, and now you are building back up to figure out what they all mean. Start thinking about theory—either theories you have read about and are using as a frame of your study (e.g., gender as performance theory) or theories you are building yourself, as in the Grounded Theory tradition. Once you have a good idea of what is being communicated and how, go back to the data at least one more time to look for disconfirming evidence. Maybe you thought “men as outdoorsy” was of importance, but when you look hard, you note that women are presented as outdoorsy just as often. You just hadn’t paid attention. 
It is very important, as any kind of researcher but particularly as a qualitative researcher, to test yourself and your emerging interpretations in this way.
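If your codes are recorded in a simple list, a quick tally can support this check for disconfirming evidence. The Python sketch below uses made-up coded advertisements and a hypothetical code_share helper. It only shows how to compare how often a code appears for each group; it does not replace going back to the data itself.

```python
from collections import Counter

# Made-up coded advertisements: (gender presented, code applied).
coded_ads = [
    ("men", "outdoorsy"), ("men", "outdoorsy"), ("men", "breadwinner"),
    ("women", "outdoorsy"), ("women", "outdoorsy"), ("women", "homemaker"),
]

def code_share(items, code):
    """For each group, the share of its coded items that carry `code`."""
    totals = Counter(group for group, _ in items)
    hits = Counter(group for group, c in items if c == code)
    return {group: hits[group] / totals[group] for group in totals}

shares = code_share(coded_ads, "outdoorsy")
print(shares)
```

In this invented data the “outdoorsy” code appears in the same share of advertisements for both groups, which is exactly the kind of result that should make you question whether “men as outdoorsy” is really a pattern.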

The Fourth and Final Step: The Write-Up

Only after you have fully completed analysis, with its many rounds of review and analysis, will you be able to write about what you found. The interpretation exists not in the data but in your analysis of the data. Before writing your results, you will want to very clearly describe how you chose the data here and all the possible limitations of this data (e.g., historical-trace problem or power problem; see chapter 16). Acknowledge any limitations of your sample. Describe the audience for the content, and discuss the implications of this. Once you have done all of this, you can put forth your interpretation of the communication of the content, linking to theory where doing so would help your readers understand your findings and what they mean more generally for our understanding of how the social world works. [3]

Analyzing Content: Helpful Hints and Pointers

Although every data set is unique and each researcher will have a different and unique research question to address with that data set, there are some common practices and conventions. When reviewing your data, what do you look at exactly? How will you know if you have seen a pattern? How do you note or mark your data?

Let’s start with the last question first. If your data is stored digitally, there are various ways you can highlight or mark up passages. You can, of course, do this with literal highlighters, pens, and pencils if you have print copies. But there are also qualitative software programs to help you store the data, retrieve the data, and mark the data. This can simplify the process, although it cannot do the work of analysis for you.

Qualitative software can be very expensive, so the first thing to do is to find out if your institution (or program) has a universal license its students can use. If it does not, most software packages have special student licenses that are less expensive. The two most used programs at this moment are probably ATLAS.ti and NVivo. Both can cost more than $500 [4] but provide everything you could possibly need for storing data, content analysis, and coding. They also have a lot of customer support, and you can find many official and unofficial tutorials on how to use the programs’ features on the web. Dedoose, created by academic researchers at UCLA, is a decent program that lacks many of the bells and whistles of the two big programs. Instead of paying all at once, you pay monthly, as you use the program. The monthly fee is relatively affordable (less than $15), so this might be a good option for a small project. HyperRESEARCH is another basic program created by academic researchers, and it is free for small projects (those that have limited cases and material to import). You can pay a monthly fee if your project expands past the free limits. I have personally used all four of these programs, and they each have their pluses and minuses.

Regardless of which program you choose, you should know that none of them will actually do the hard work of analysis for you. They are incredibly useful for helping you store and organize your data, and they provide abundant tools for marking, comparing, and coding your data so you can make sense of it. But making sense of it will always be your job alone.

So let’s say you have some software, and you have uploaded all of your content into the program: video clips, photographs, transcripts of news stories, articles from magazines, even digital copies of college scrapbooks. Now what do you do? What are you looking for? How do you see a pattern? The answers to these questions will depend partially on the particular research question you have, or at least the motivation behind your research. Let’s go back to the idea of looking at gender presentations in magazines from the 1950s to the 1970s. Here are some things you can look at and code in the content: (1) actions and behaviors, (2) events or conditions, (3) activities, (4) strategies and tactics, (5) states or general conditions, (6) meanings or symbols, (7) relationships/interactions, (8) consequences, and (9) settings. Table 17.1 lists these with examples from our gender presentation study.

Table 17.1. Examples of What to Note During Content Analysis

One thing to note about the examples in table 17.1: sometimes we note (mark, record, code) a single example, while other times, as in “settings,” we are recording a recurrent pattern. To help you spot patterns, it is useful to mark every setting, including a notation on gender. Using software can help you do this efficiently. You can then call up “setting by gender” and note this emerging pattern. There’s an element of counting here, which we normally think of as quantitative data analysis, but we are using the count to identify a pattern that will be used to help us interpret the communication. Content analyses often include counting as part of the interpretive (qualitative) process.
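If you are not using dedicated software, the “setting by gender” tally described above can be sketched in a few lines of Python. This is only an illustration: the `coded_segments` list and its setting and gender labels are made up for the example, not data from the magazine study.

```python
from collections import Counter

# Hypothetical coded segments: each entry is one marked passage or picture,
# recorded as (setting, gender shown in that setting).
coded_segments = [
    ("outdoors", "man"),
    ("kitchen", "woman"),
    ("outdoors", "man"),
    ("office", "man"),
    ("kitchen", "woman"),
    ("outdoors", "woman"),
]

def setting_by_gender(segments):
    """Count how often each (setting, gender) pair was marked."""
    return Counter(segments)

counts = setting_by_gender(coded_segments)
for (setting, gender), n in sorted(counts.items()):
    print(f"{setting:10s} {gender:6s} {n}")
```

The counts only point you toward a possible pattern; deciding what the pattern means is still interpretive work.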

In your own study, you may not need or want to look at all of the elements listed in table 17.1. Even in our imagined example, some are more useful than others. For example, “strategies and tactics” is a bit of a stretch here. In studies that are looking specifically at, say, policy implementation or social movements, this category will prove much more salient.

Another way to think about “what to look at” is to consider aspects of your content in terms of units of analysis. You can drill down to the specific words used (e.g., the adjectives commonly used to describe “men” and “women” in your magazine sample) or move up to the more abstract level of concepts used (e.g., the idea that men are more rational than women). Counting for the purpose of identifying patterns is particularly useful here. How many times is that idea of women’s irrationality communicated? How is it communicated (in comic strips, fictional stories, editorials, etc.)? Does the incidence of the concept change over time? Perhaps the “irrational woman” was everywhere in the 1950s, but by the 1970s, it is no longer showing up in stories and comics. By tracing its usage and prevalence over time, you might come up with a theory or story about gender presentation during the period. Table 17.2 provides more examples of using different units of analysis for this work along with suggestions for effective use.
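Tracing a concept’s incidence over time, as described above, amounts to grouping coded items by period. The sketch below assumes each coded item has been recorded with a publication year, a format, and a concept code; the `coded_items` records are invented for illustration.

```python
from collections import defaultdict

# Hypothetical coded items: (publication year, format, concept code).
coded_items = [
    (1952, "comic strip", "irrational woman"),
    (1955, "editorial", "irrational woman"),
    (1958, "fictional story", "irrational woman"),
    (1964, "comic strip", "irrational woman"),
    (1967, "editorial", "rational man"),
    (1973, "fictional story", "rational man"),
]

def incidence_by_decade(items, concept):
    """Return {decade: number of items coded with the given concept}."""
    tally = defaultdict(int)
    for year, _format, code in items:
        if code == concept:
            decade = (year // 10) * 10  # e.g., 1958 -> 1950
            tally[decade] += 1
    return dict(sorted(tally.items()))

trend = incidence_by_decade(coded_items, "irrational woman")
print(trend)
```

A decade that never appears in the result simply had no items coded with that concept.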

Table 17.2. Examples of Unit of Analysis in Content Analysis

Every qualitative content analysis is unique in its particular focus and particular data used, so there is no single correct way to approach analysis. You should have a better idea, however, of what kinds of things to look at and what to look for. The next two chapters will take you further into the coding process, the primary analytical tool for qualitative research in general.

Further Readings

Cidell, Julie. 2010. “Content Clouds as Exploratory Qualitative Data Analysis.” Area 42(4):514–523. A demonstration of using visual “content clouds” as a form of exploratory qualitative data analysis using transcripts of public meetings and content of newspaper articles.

Hsieh, Hsiu-Fang, and Sarah E. Shannon. 2005. “Three Approaches to Qualitative Content Analysis.” Qualitative Health Research 15(9):1277–1288. Distinguishes three distinct approaches to QCA: conventional, directed, and summative. Uses hypothetical examples from end-of-life care research.

Jackson, Romeo, Alex C. Lange, and Antonio Duran. 2021. “A Whitened Rainbow: The In/Visibility of Race and Racism in LGBTQ Higher Education Scholarship.” Journal Committed to Social Change on Race and Ethnicity (JCSCORE) 7(2):174–206.* Using a “critical summative content analysis” approach, examines research published on LGBTQ people between 2009 and 2019.

Krippendorff, Klaus. 2018. Content Analysis: An Introduction to Its Methodology. 4th ed. Thousand Oaks, CA: SAGE. A very comprehensive textbook on both quantitative and qualitative forms of content analysis.

Mayring, Philipp. 2022. Qualitative Content Analysis: A Step-by-Step Guide. Thousand Oaks, CA: SAGE. Formulates an eight-step approach to QCA.

Messinger, Adam M. 2012. “Teaching Content Analysis through ‘Harry Potter.’” Teaching Sociology 40(4):360–367. This is a fun example of a relatively brief foray into content analysis using the music found in Harry Potter films.

Neuendorf, Kimberly A. 2002. The Content Analysis Guidebook. Thousand Oaks, CA: SAGE. Although a helpful guide to content analysis in general, be warned that this textbook definitely favors quantitative over qualitative approaches to content analysis.

Schreier, Margrit. 2012. Qualitative Content Analysis in Practice. Thousand Oaks, CA: SAGE. Arguably the most accessible guidebook for QCA, written by a professor based in Germany.

Weber, Matthew A., Shannon Caplan, Paul Ringold, and Karen Blocksom. 2017. “Rivers and Streams in the Media: A Content Analysis of Ecosystem Services.” Ecology and Society 22(3).* Examines the content of a blog hosted by National Geographic and articles published in The New York Times and the Wall Street Journal for stories on rivers and streams (e.g., water quality, flooding).

  • There are ways of handling content analysis quantitatively, however. Some practitioners therefore specify qualitative content analysis (QCA). In this chapter, all content analysis is QCA unless otherwise noted. ↵
  • Note that some qualitative software allows you to upload whole films or film clips for coding. You will still have to get access to the film, of course. ↵
  • See chapter 20 for more on the final presentation of research. ↵
  • Actually, ATLAS.ti is an annual license, while NVivo is a perpetual license, but both are going to cost you at least $500 to use. Student rates may be lower. And don’t forget to ask your institution or program if they already have a software license you can use. ↵

A method of both data collection and data analysis in which a given content (textual, visual, graphic) is examined systematically and rigorously to identify meanings, themes, patterns and assumptions.  Qualitative content analysis (QCA) is concerned with gathering and interpreting an existing body of material.    

Introduction to Qualitative Research Methods Copyright © 2023 by Allison Hurst is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License , except where otherwise noted.

How to do a content analysis

What is content analysis?

In research, content analysis is the process of analyzing content and its features with the aim of identifying patterns and the presence of words, themes, and concepts within the content. Simply put, content analysis is a research method that aims to present the trends, patterns, concepts, and ideas in content as objective, quantitative or qualitative data, depending on the specific use case.

As such, some of the objectives of content analysis include:

  • Simplifying complex, unstructured content.
  • Identifying trends, patterns, and relationships in the content.
  • Determining the characteristics of the content.
  • Identifying the intentions of individuals through the analysis of the content.
  • Identifying the implied aspects in the content.

Typically, when doing a content analysis, you’ll gather data not only from written text sources like newspapers, books, journals, and magazines but also from a variety of other oral and visual sources of content like:

  • Voice recordings, speeches, and interviews.
  • Web content, blogs, and social media content.
  • Films, videos, and photographs.

One of content analysis’s distinguishing features is that you'll be able to gather data for research without physically gathering data from participants. In other words, when doing a content analysis, you don't need to interact with people directly.

The process of doing a content analysis usually involves categorizing or coding concepts, words, and themes within the content and analyzing the results. We’ll look at the process in more detail below.

Typically, you’ll use content analysis when you want to:

  • Identify the intentions, communication trends, or communication patterns of an individual, a group of people, or even an institution.
  • Analyze and describe the behavioral and attitudinal responses of individuals to communications.
  • Determine the emotional or psychological state of an individual or a group of people.
  • Analyze the international differences in communication content.
  • Analyze audience responses to content.

Keep in mind, though, that these are just some examples of use cases where a content analysis might be appropriate and there are many others.

The key thing to remember is that content analysis will help you quantify the occurrence of specific words, phrases, themes, and concepts in content. Moreover, it can also be used when you want to make qualitative inferences out of the data by analyzing the semantic meanings and interrelationships between words, themes, and concepts.

In general, there are two types of content analysis: conceptual and relational analysis. Although these two types follow largely similar processes, their outcomes differ. As such, each of these types can provide different results, interpretations, and conclusions. With that in mind, let’s now look at these two types of content analysis in more detail.

With conceptual analysis, you’ll determine the existence of certain concepts within the content and identify their frequency. In other words, conceptual analysis involves counting the number of times a specific concept appears in the content.

Conceptual analysis is typically focused on explicit data, which means you’ll focus your analysis on a specific concept to identify its presence in the content and determine its frequency.
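A minimal sketch of this kind of explicit frequency count follows. It assumes the concepts can be matched as literal, whole-word terms; the `concept_frequency` helper and the sample text are hypothetical, written only to illustrate the idea.

```python
import re
from collections import Counter

def concept_frequency(text, terms):
    """Count explicit, whole-word occurrences of each term (case-insensitive)."""
    counts = Counter()
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        counts[term] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts

sample_text = (
    "The economy slowed this quarter. Critics blame the economy on policy, "
    "while supporters say growth will return and growth is already visible."
)

freq = concept_frequency(sample_text, ["economy", "growth", "inflation"])
print(freq)
```

Whole-word matching is a deliberate simplification here: it will not count variants such as “economic” unless you list them as separate terms.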

However, when conducting a content analysis, you can also use implicit data. This approach is more involved and complicated, and it requires the use of a dictionary, contextual translation rules, or a combination of both.

No matter what type you use, conceptual analysis brings an element of quantitative analysis into a qualitative approach to research.
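One way to picture the dictionary approach is as a lookup from indicator phrases to the concept they are taken to imply. The sketch below is an assumption-laden illustration: the `concept_dictionary` entries are invented, and a real study would build and justify its own dictionary and translation rules.

```python
# Hypothetical dictionary: each concept lists phrases treated as implying it.
concept_dictionary = {
    "financial hardship": ["behind on rent", "can't afford", "laid off"],
    "optimism": ["looking forward", "things will improve"],
}

def code_implicit(text, dictionary):
    """Return {concept: [indicator phrases found in the text]}."""
    lowered = text.lower()
    found = {}
    for concept, phrases in dictionary.items():
        hits = [phrase for phrase in phrases if phrase in lowered]
        if hits:
            found[concept] = hits
    return found

passage = (
    "She was laid off in March and is behind on rent, "
    "but she is looking forward to spring."
)

codes = code_implicit(passage, concept_dictionary)
print(codes)
```

Note that neither concept name appears literally in the passage; the dictionary is what links the wording to the concept.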

Relational content analysis takes conceptual analysis a step further. So, while the process starts in the same way by identifying concepts in content, it doesn’t focus on finding the frequency of these concepts, but rather on the relationships between the concepts, the context in which they appear in the content, and their interrelationships.

Before starting with a relational analysis, you’ll first need to decide on which subcategory of relational analysis you’ll use:

  • Affect extraction: With this relational content analysis approach, you’ll evaluate concepts based on their emotional attributes. You’ll typically assess these emotions on a rating scale with higher values assigned to positive emotions and lower values to negative ones. In turn, this allows you to capture the emotions of the writer or speaker at the time the content is created. The main difficulty with this approach is that emotions can differ over time and across populations.
  • Proximity analysis: With this approach, you’ll identify concepts as in conceptual analysis, but you’ll evaluate the way in which they occur together in the content. In other words, proximity analysis allows you to analyze the relationship between concepts and derive a concept matrix from which you’ll be able to develop meaning. Proximity analysis is typically used when you want to extract facts from the content rather than contextual, emotional, or cultural factors.
  • Cognitive mapping: Finally, cognitive mapping can be used with affect extraction or proximity analysis. It’s a visualization technique that allows you to create a model that represents the overall meaning of content and presents it as a graphic map of the relationships between concepts. As such, it’s also commonly used when analyzing the changes in meanings, definitions, and terms over time.
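To make proximity analysis more concrete, the sketch below builds a simple concept matrix by counting how often pairs of concepts occur in the same sentence. Treating the sentence as the proximity window and matching concepts as lowercase substrings are simplifying assumptions for this example; the `cooccurrence` helper and sample text are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

def cooccurrence(text, concepts):
    """Count how often each pair of concepts appears in the same sentence."""
    pairs = Counter()
    # Naive sentence split on ., !, or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    for sentence in sentences:
        lowered = sentence.lower()
        present = sorted(c for c in concepts if c in lowered)
        for a, b in combinations(present, 2):
            pairs[(a, b)] += 1
    return pairs

sample = (
    "The strike disrupted transport. Union leaders defended the strike. "
    "Transport officials met union leaders. The union called a second strike."
)

matrix = cooccurrence(sample, ["strike", "transport", "union"])
for pair, n in sorted(matrix.items()):
    print(pair, n)
```

The resulting pair counts are the kind of input a cognitive map could then visualize as links between concepts.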

Now that we’ve seen what content analysis is and looked at the different types of content analysis, it’s important to understand how reliable it is as a research method. We’ll also look at what criteria impact the validity of a content analysis.

There are three criteria that determine the reliability of a content analysis:

  • Stability. Stability refers to the tendency of coders to consistently categorize or code the same data in the same way over time.
  • Reproducibility. This criterion refers to the tendency of different coders to classify category membership in the same way.
  • Accuracy. Accuracy refers to the extent to which the classification of content corresponds to a specific standard.
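Reproducibility is often checked by having two coders code the same units and comparing their decisions. The text above does not name a statistic, so the sketch below uses two common choices as an assumption: simple percent agreement and Cohen's kappa, which corrects for agreement expected by chance. The coder data is invented.

```python
from collections import Counter

def percent_agreement(codes_a, codes_b):
    """Share of units that both coders placed in the same category."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must code the same non-empty set of units.")
    return sum(a == b for a, b in zip(codes_a, codes_b)) / len(codes_a)

def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    categories = set(codes_a) | set(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    if expected == 1:
        return 1.0  # both coders used one identical category throughout
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned by two coders to the same ten units.
coder_1 = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg", "pos", "neg"]
coder_2 = ["pos", "pos", "neg", "pos", "pos", "neg", "neg", "neg", "pos", "neg"]

print(percent_agreement(coder_1, coder_2))
print(round(cohens_kappa(coder_1, coder_2), 3))
```

The same functions could be applied to one coder's codes at two points in time as a rough check on stability.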

Keep in mind, though, that because you’ll need to code or categorize the concepts you’ll aim to identify and analyze manually, you’ll never be able to eliminate human error. However, you’ll be able to minimize it.

In turn, three criteria determine the validity of a content analysis:

  • Closeness of categories. This is achieved by using multiple classifiers to get an agreed-upon definition for a specific category by using either implicit variables or synonyms. In this way, the category can be broadened to include more relevant data.
  • Conclusions. Here, it’s crucial to decide what level of implication will be allowable. In other words, it’s important to consider whether the conclusions are valid based on the data or whether they can be explained using some other phenomena.
  • Generalizability of the results of the analysis to a theory. Generalizability comes down to how you determine your categories as mentioned above and how reliable those categories are. In turn, this relies on how accurate the categories are at measuring the concepts or ideas that you’re looking to measure.

Considering everything mentioned above, there are definite advantages and disadvantages when it comes to content analysis:

Let’s now look at the steps you’ll need to follow when doing a content analysis.

The first step will always be to formulate your research questions. This is simply because, without clear and defined research questions, you won’t know what question to answer and, by implication, won’t be able to code your concepts.

Based on your research questions, you’ll then need to decide what content you’ll analyze. Here, you’ll use three factors to find the right content:

  • The type of content. Here you’ll need to consider the various types of content you’ll use and their medium like, for example, blog posts, social media, newspapers, or online articles.
  • What criteria you’ll use for inclusion. Here you’ll decide what criteria you’ll use to include content. This can, for instance, be the mentioning of a certain event or advertising a specific product.
  • Your parameters. Here, you’ll decide what content you’ll include based on specified parameters in terms of date and location.
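The three factors above can be written down as an explicit selection rule, which also makes your sampling decisions easy to report later. The sketch below is illustrative only: the `ContentItem` fields, the sample corpus, and the chosen keyword, dates, and location are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class ContentItem:
    title: str
    medium: str
    published: date
    location: str
    text: str

def select_content(items, media, keyword, start, end, location):
    """Keep items that satisfy all three factors."""
    return [
        item for item in items
        if item.medium in media                    # type of content
        and keyword.lower() in item.text.lower()   # inclusion criterion
        and start <= item.published <= end         # date parameter
        and item.location == location              # location parameter
    ]

corpus = [
    ContentItem("Launch review", "blog post", date(2021, 3, 1), "UK",
                "The product launch drew crowds."),
    ContentItem("Market notes", "newspaper", date(2021, 5, 9), "UK",
                "Shares fell after the product launch."),
    ContentItem("Old news", "newspaper", date(2015, 1, 4), "UK",
                "An earlier product launch failed."),
    ContentItem("Travel diary", "blog post", date(2021, 6, 2), "US",
                "We skipped the product launch."),
]

selected = select_content(
    corpus,
    media={"blog post", "newspaper"},
    keyword="product launch",
    start=date(2021, 1, 1),
    end=date(2021, 12, 31),
    location="UK",
)
print([item.title for item in selected])
```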

The next step is to consider your own pre-conception of the questions and identify your biases. This process is referred to as bracketing and allows you to be aware of your biases before you start your research with the result that they’ll be less likely to influence the analysis.

Your next step would be to define the units of meaning that you’ll code. This will, for example, be the number of times a concept appears in the content or the treatment of concepts, words, or themes in the content. You’ll then need to define the set of categories you’ll use for coding, which can be either objective or more conceptual.

Based on the above, you’ll then organize the units of meaning into your defined categories. Apart from this, your coding scheme will also determine how you’ll analyze the data.
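A coding scheme can be as simple as a codebook that maps each category to its definition, plus a record of which category each unit was assigned to. The sketch below assumes the codes were assigned by a human coder; it only checks the assignments against the codebook and tallies them. The codebook entries and assignments are made up for illustration.

```python
from collections import Counter

# Hypothetical codebook: category name -> definition used by coders.
codebook = {
    "PRICE": "Unit mentions cost, discounts, or affordability.",
    "QUALITY": "Unit mentions durability or reliability.",
}

# Hypothetical coder decisions: one code per unit of meaning.
assignments = [
    {"unit_id": 1, "code": "PRICE"},
    {"unit_id": 2, "code": "QUALITY"},
    {"unit_id": 3, "code": "PRICE"},
    {"unit_id": 4, "code": "SERVICE"},  # not defined in the codebook
]

def tally_by_category(assignments, codebook):
    """Tally valid codes and list the units whose code is not in the codebook."""
    valid = [a for a in assignments if a["code"] in codebook]
    rejected = [a["unit_id"] for a in assignments if a["code"] not in codebook]
    return Counter(a["code"] for a in valid), rejected

tallies, rejected_units = tally_by_category(assignments, codebook)
print(tallies)
print(rejected_units)
```

Flagging undefined codes, rather than silently counting them, gives you a prompt to either revise the codebook or recode the unit.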

The next step is to code the content. During this process, you’ll work through the content and record the data according to your coding scheme. It’s also here where conceptual and relational analysis starts to deviate in relation to the process you’ll need to follow.

As mentioned earlier, conceptual analysis aims to identify the number of times a specific concept, idea, word, or phrase appears in the content. So, here, you’ll need to decide what level of analysis you’ll implement.

In contrast, with relational analysis, you’ll need to decide what type of relational analysis you’ll use. So, you’ll need to determine whether you’ll use affect extraction, proximity analysis, cognitive mapping, or a combination of these approaches.

Once you’ve coded the data, you’ll be able to analyze it and draw conclusions from the data based on your research questions.

Content analysis offers an inexpensive and flexible way to identify trends and patterns in communication content. In addition, it’s unobtrusive, which eliminates many ethical concerns and inaccuracies in research data. However, to be most effective, a content analysis must be planned and used carefully in order to ensure reliability and validity.

There are two general types of content analysis: conceptual and relational analysis. Although these two types follow largely similar processes, their outcomes differ. As such, each of these types can provide different results, interpretations, and conclusions.

In qualitative research, coding means categorizing concepts, words, and themes within your content to create a basis for analyzing the results. While coding, you work through the content and record the data according to your coding scheme.

Content analysis is the process of analyzing content and its features with the aim of identifying patterns and the presence of words, themes, and concepts within the content. The goal of a content analysis is to present the trends, patterns, concepts, and ideas in content as objective, quantitative or qualitative data, depending on the specific use case.

Content analysis is a qualitative method of data analysis and can be used in many different fields. It is particularly popular in the social sciences.

It is possible to do qualitative analysis without coding, but content analysis as a method of qualitative analysis requires coding or categorizing data to then analyze it according to your coding scheme in the next step.

  • Environmentalist Thought and Ideology (Environmental Science)
  • Management of Land and Natural Resources (Environmental Science)
  • Natural Disasters (Environmental Science)
  • Nuclear Issues (Environmental Science)
  • Pollution and Threats to the Environment (Environmental Science)
  • Social Impact of Environmental Issues (Environmental Science)
  • History of Science and Technology
  • Browse content in Materials Science
  • Ceramics and Glasses
  • Composite Materials
  • Metals, Alloying, and Corrosion
  • Nanotechnology
  • Browse content in Mathematics
  • Applied Mathematics
  • Biomathematics and Statistics
  • History of Mathematics
  • Mathematical Education
  • Mathematical Finance
  • Mathematical Analysis
  • Numerical and Computational Mathematics
  • Probability and Statistics
  • Pure Mathematics
  • Browse content in Neuroscience
  • Cognition and Behavioural Neuroscience
  • Development of the Nervous System
  • Disorders of the Nervous System
  • History of Neuroscience
  • Invertebrate Neurobiology
  • Molecular and Cellular Systems
  • Neuroendocrinology and Autonomic Nervous System
  • Neuroscientific Techniques
  • Sensory and Motor Systems
  • Browse content in Physics
  • Astronomy and Astrophysics
  • Atomic, Molecular, and Optical Physics
  • Biological and Medical Physics
  • Classical Mechanics
  • Computational Physics
  • Condensed Matter Physics
  • Electromagnetism, Optics, and Acoustics
  • History of Physics
  • Mathematical and Statistical Physics
  • Measurement Science
  • Nuclear Physics
  • Particles and Fields
  • Plasma Physics
  • Quantum Physics
  • Relativity and Gravitation
  • Semiconductor and Mesoscopic Physics
  • Browse content in Psychology
  • Affective Sciences
  • Clinical Psychology
  • Cognitive Psychology
  • Cognitive Neuroscience
  • Criminal and Forensic Psychology
  • Developmental Psychology
  • Educational Psychology
  • Evolutionary Psychology
  • Health Psychology
  • History and Systems in Psychology
  • Music Psychology
  • Neuropsychology
  • Organizational Psychology
  • Psychological Assessment and Testing
  • Psychology of Human-Technology Interaction
  • Psychology Professional Development and Training
  • Research Methods in Psychology
  • Social Psychology
  • Browse content in Social Sciences
  • Browse content in Anthropology
  • Anthropology of Religion
  • Human Evolution
  • Medical Anthropology
  • Physical Anthropology
  • Regional Anthropology
  • Social and Cultural Anthropology
  • Theory and Practice of Anthropology
  • Browse content in Business and Management
  • Business Ethics
  • Business Strategy
  • Business History
  • Business and Technology
  • Business and Government
  • Business and the Environment
  • Comparative Management
  • Corporate Governance
  • Corporate Social Responsibility
  • Entrepreneurship
  • Health Management
  • Human Resource Management
  • Industrial and Employment Relations
  • Industry Studies
  • Information and Communication Technologies
  • International Business
  • Knowledge Management
  • Management and Management Techniques
  • Operations Management
  • Organizational Theory and Behaviour
  • Pensions and Pension Management
  • Public and Nonprofit Management
  • Strategic Management
  • Supply Chain Management
  • Browse content in Criminology and Criminal Justice
  • Criminal Justice
  • Criminology
  • Forms of Crime
  • International and Comparative Criminology
  • Youth Violence and Juvenile Justice
  • Development Studies
  • Browse content in Economics
  • Agricultural, Environmental, and Natural Resource Economics
  • Asian Economics
  • Behavioural Finance
  • Behavioural Economics and Neuroeconomics
  • Econometrics and Mathematical Economics
  • Economic History
  • Economic Systems
  • Economic Methodology
  • Economic Development and Growth
  • Financial Markets
  • Financial Institutions and Services
  • General Economics and Teaching
  • Health, Education, and Welfare
  • History of Economic Thought
  • International Economics
  • Labour and Demographic Economics
  • Law and Economics
  • Macroeconomics and Monetary Economics
  • Microeconomics
  • Public Economics
  • Urban, Rural, and Regional Economics
  • Welfare Economics
  • Browse content in Education
  • Adult Education and Continuous Learning
  • Care and Counselling of Students
  • Early Childhood and Elementary Education
  • Educational Equipment and Technology
  • Educational Strategies and Policy
  • Higher and Further Education
  • Organization and Management of Education
  • Philosophy and Theory of Education
  • Schools Studies
  • Secondary Education
  • Teaching of a Specific Subject
  • Teaching of Specific Groups and Special Educational Needs
  • Teaching Skills and Techniques
  • Browse content in Environment
  • Applied Ecology (Social Science)
  • Climate Change
  • Conservation of the Environment (Social Science)
  • Environmentalist Thought and Ideology (Social Science)
  • Natural Disasters (Environment)
  • Social Impact of Environmental Issues (Social Science)
  • Browse content in Human Geography
  • Cultural Geography
  • Economic Geography
  • Political Geography
  • Browse content in Interdisciplinary Studies
  • Communication Studies
  • Museums, Libraries, and Information Sciences
  • Browse content in Politics
  • African Politics
  • Asian Politics
  • Chinese Politics
  • Comparative Politics
  • Conflict Politics
  • Elections and Electoral Studies
  • Environmental Politics
  • European Union
  • Foreign Policy
  • Gender and Politics
  • Human Rights and Politics
  • Indian Politics
  • International Relations
  • International Organization (Politics)
  • International Political Economy
  • Irish Politics
  • Latin American Politics
  • Middle Eastern Politics
  • Political Behaviour
  • Political Economy
  • Political Institutions
  • Political Methodology
  • Political Communication
  • Political Philosophy
  • Political Sociology
  • Political Theory
  • Politics and Law
  • Public Policy
  • Public Administration
  • Quantitative Political Methodology
  • Regional Political Studies
  • Russian Politics
  • Security Studies
  • State and Local Government
  • UK Politics
  • US Politics
  • Browse content in Regional and Area Studies
  • African Studies
  • Asian Studies
  • East Asian Studies
  • Japanese Studies
  • Latin American Studies
  • Middle Eastern Studies
  • Native American Studies
  • Scottish Studies
  • Browse content in Research and Information
  • Research Methods
  • Browse content in Social Work
  • Addictions and Substance Misuse
  • Adoption and Fostering
  • Care of the Elderly
  • Child and Adolescent Social Work
  • Couple and Family Social Work
  • Developmental and Physical Disabilities Social Work
  • Direct Practice and Clinical Social Work
  • Emergency Services
  • Human Behaviour and the Social Environment
  • International and Global Issues in Social Work
  • Mental and Behavioural Health
  • Social Justice and Human Rights
  • Social Policy and Advocacy
  • Social Work and Crime and Justice
  • Social Work Macro Practice
  • Social Work Practice Settings
  • Social Work Research and Evidence-based Practice
  • Welfare and Benefit Systems
  • Browse content in Sociology
  • Childhood Studies
  • Community Development
  • Comparative and Historical Sociology
  • Economic Sociology
  • Gender and Sexuality
  • Gerontology and Ageing
  • Health, Illness, and Medicine
  • Marriage and the Family
  • Migration Studies
  • Occupations, Professions, and Work
  • Organizations
  • Population and Demography
  • Race and Ethnicity
  • Social Theory
  • Social Movements and Social Change
  • Social Research and Statistics
  • Social Stratification, Inequality, and Mobility
  • Sociology of Religion
  • Sociology of Education
  • Sport and Leisure
  • Urban and Rural Studies
  • Browse content in Warfare and Defence
  • Defence Strategy, Planning, and Research
  • Land Forces and Warfare
  • Military Administration
  • Military Life and Institutions
  • Naval Forces and Warfare
  • Other Warfare and Defence Issues
  • Peace Studies and Conflict Resolution
  • Weapons and Equipment

The Oxford Handbook of Qualitative Research

18 Content Analysis

Lindsay Prior, School of Sociology, Social Policy, and Social Work, Queen's University

  • Published: 04 August 2014

In this chapter, the focus is on ways in which content analysis can be used to investigate and describe interview and textual data. The chapter opens with a contextualization of the method and then proceeds to an examination of the role of content analysis in relation to both quantitative and qualitative modes of social research. Following the introductory sections, four kinds of data are subjected to content analysis. These include data derived from a sample of qualitative interviews (N = 54), textual data derived from a sample of health policy documents (N = 6), data derived from a single interview relating to a “case” of traumatic brain injury, and data gathered from 54 abstracts of academic papers on the topic of “well-being.” Using a distinctive and somewhat novel style of content analysis that calls upon the notion of semantic networks, the chapter shows how the method can be used either independently or in conjunction with other forms of inquiry (including various styles of discourse analysis) to analyze data, and also how it can be used to verify and underpin claims that arise out of analysis. The chapter ends with an overview of the different ways in which the study of “content”—especially the study of document content—can be positioned in social scientific research projects.

What is Content Analysis?

In his 1952 text on the subject of content analysis, Bernard Berelson traces the origins of the method to communication research and then lists what he calls six distinguishing features of the approach. As one might expect, the six defining features reflect the concerns of social science as taught in the 1950s, an age in which the calls for an “objective,” “systematic,” and “quantitative” approach to the study of communication data were first heard. The reference to the field of “communication” was of course nothing less than a reflection of a substantive social scientific interest over the previous decades in what was called public opinion, and specifically attempts to understand why and how a potential source of critical, rational judgement on political leaders (i.e., the views of the public) could be turned into something to be manipulated by dictators and demagogues. In such a context, it is perhaps not so surprising that in one of the more popular research methods texts of the decade, the terms content analysis and communication analysis are used interchangeably (see Goode & Hatt, 1952:325).

Academic fashions and interests naturally change with available technology, and these days we are more likely to focus on the individualization of communications through Twitter and the like, rather than on mass newspaper readership or mass radio audiences, yet the prevailing discourse on content analysis has remained much the same as it was in Berelson’s day. Thus Neuendorf (2002:1), for example, continues to define content analysis as “the systematic, objective, quantitative analysis of message characteristics.” Clearly the centrality of communication as a basis for understanding and using content analysis continues to hold, but in this article I will try to show that, rather than locate the use of content analysis in disembodied “messages” and distantiated “media,” we would do better to focus on the fact that communication is a building block of social life itself and not merely a system of messages that are transmitted, in whatever form, from sender to receiver. To put that statement in another guise, we need to note that communicative action (to use the phraseology of Habermas, 1987) rests at the very base of the lifeworld, and one very important way of coming to grips with that world is to study the content of what people say and write in the course of their everyday lives.

My aim is to demonstrate various ways in which content analysis (henceforth CTA) can be used and developed to analyze social scientific data as derived from interviews and documents. It is not my intention to cover the history of CTA or to venture into forms of literary analysis or to demonstrate each and every technique that has ever been deployed by content analysts. (Many of the standard textbooks deal with those kinds of issues much more fully than is possible here. See, for example, Babbie, 2013; Berelson, 1952; Bryman, 2008; Krippendorff, 2004; Neuendorf, 2002; and Weber, 1990.) Instead I seek to recontextualize the use of the method in a framework of network thinking and to link the use of CTA to specific problems of data analysis. As will become evident, my exposition of the method is grounded in real world problems. Those problems are drawn from my own research projects and tend to reflect my particular academic interests, which are almost entirely related to the analysis of the ways in which people talk and write about aspects of health, illness, and disease. However, lest the reader be deterred from going any further, I should emphasise that the substantive issues that I elect to examine are secondary if not tertiary to my main objective, which is to demonstrate how CTA can be integrated into a range of research designs and add depth and rigour to the analysis of interview and inscription data. To that end, in the next section I aim to clear our path to analysis by dealing with some issues that touch on the general position of CTA in the research armory, and especially its location in the schism that has developed between quantitative and qualitative modes of inquiry.

The Methodological Context of Content Analysis

Content analysis is usually associated with the study of inscription contained in published reports, newspapers, adverts, books, web pages, journals, and other forms of documentation. Hence, nearly all of Berelson’s (1952) illustrations and references to the method relate to the analysis of written records of some kind, and where speech is mentioned it is almost always in the form of broadcast and published political speeches (such as State of the Union addresses). This association of content analysis with text and documentation is further underlined in modern textbook discussions of the method. Thus Bryman (2008), for example, defines content analysis as “an approach to the analysis of documents and texts, that seek to quantify content in terms of pre-determined categories” (2008:274, emphasis in original), while Babbie (2013) states that content analysis is “the study of recorded human communications” (2013:295), and Weber refers to it as a method to make “valid inferences from text” (1990:9). It is clear then that CTA is viewed as a text-based method of analysis, though extensions of the method to other forms of inscriptional material are also referred to in some discussions. Thus Neuendorf (2002), for example, rightly refers to analyses of film and television images as legitimate fields for the deployment of CTA, and by implication analyses of still (as well as moving) images such as photographs and billboard adverts. Oddly, in the traditional or standard paradigm of content analysis, the method is solely used to capture the “message” of a text or speech; it is not used for the analysis of a recipient’s response to or understanding of the message (which is normally accessed via interview data and analyzed in other and often less rigorous ways; see, e.g., Merton, 1968). So in this article I suggest that we can take things at least one small step further by using CTA to analyse speech (especially interview data) as well as text.

Standard textbook discussions of CTA usually refer to it as a “non-reactive” or “unobtrusive” method of investigation (see, e.g., Babbie, 2013:294), and a large part of the reason for that designation is its focus on already existing text (i.e., text gathered without intrusion into a research setting). More importantly, however, (and to underline the obvious) CTA is primarily a method of analysis rather than of data collection. Its use therefore has to be integrated into wider frames of research design that embrace systematic forms of data collection as well as forms of data analysis. Thus routine strategies for sampling data are often required in designs that call upon CTA as a method of analysis. These latter can either be built around random sampling methods, or even techniques of “theoretical sampling” (Glaser & Strauss, 1967) so as to identify a suitable range of materials for content analysis. CTA can also be linked to styles of ethnographic inquiry and to the use of various purposive or non-random sampling techniques. For an example, see Altheide (1987).

Of course, the use of CTA in a research design does not preclude the use of other forms of analysis in the same study, for it is a technique that can be deployed in parallel with other methods or in sequence with them. For example, and as I will demonstrate in the following sections, one might use CTA as a preliminary analytical strategy to get a grip on the available data before moving into specific forms of discourse analysis. In this respect it may be as well to think of using CTA in, say, the frame of a priority/sequence model of research design as described by Morgan (1998).

As I shall explain, there is a sense in which content analysis rests at the base of all forms of qualitative data analysis, yet the paradox is that the analysis of content is usually considered to be a quantitative (numerically based) method. In terms of the qualitative/quantitative divide, however, it is probably best to think of CTA as a hybrid method, and some writers have in the past argued that it is necessarily so (Kracauer, 1952). That was probably easier to do in an age when many recognised the strictly drawn boundaries between qualitative and quantitative styles of research to be inappropriate. Thus in their widely used text on “Methods in Social Research,” Goode and Hatt (1952:313), for example, asserted that, “[M]odern research must reject as a false dichotomy the separation between ‘qualitative’ and ‘quantitative’ studies, or between the ‘statistical’ and the ‘non-statistical’ approach.” It was a position advanced on the grounds that all good research must meet adequate standards of validity and reliability whatever its style, and it is a message well worth preserving. However, there is a more fundamental reason why it is nonsensical to draw a division between the qualitative and the quantitative. It is simply this: all acts of social observation depend on the deployment of qualitative categories, whether gender, class, race, or even age; there is no descriptive category in use in the social sciences that connects to a world of “natural kinds.” In short, all categories are made, and therefore when we seek to count “things” in the world, we are dependent on the existence of socially constructed divisions. How the categories take the shape that they do (how definitions are arrived at, how inclusion and exclusion criteria are decided upon, and how taxonomic principles are deployed) constitute interesting research questions in themselves. From our starting point, however, we need only note that “sorting things out” (to use a phrase from Bowker & Star, 1999) and acts of “counting,” whether it be of chromosomes or people (Martin & Lynch, 2009), are activities that connect to the social world of organized interaction rather than to unsullied observation of the external world.

Of course, some writers deny the strict division between the qualitative and quantitative on grounds of empirical practice rather than of ontological reasoning. For example, Bryman (2008) argues that qualitative researchers also call upon quantitative thinking but tend to use somewhat vague, imprecise terms rather than numbers and percentages, referring to frequencies via the use of phrases such as “more than” and “less than.” Kracauer (1952) advanced various arguments against the view that CTA was strictly a quantitative method, suggesting that very often we wished to assess content as being negative or positive with respect to some political, social, or economic thesis and that such evaluations could never be merely statistical. He further argued that we often wished to study “underlying” messages or latent content of documentation and that in consequence we needed to interpret content as well as count items of content. Morgan (1993) has argued that, given the emphasis that is placed on “coding” in almost all forms of qualitative data analysis, the deployment of counting techniques is essential and that we ought therefore to think in terms of what he calls qualitative as well as quantitative content analysis. Naturally, some of these positions create more problems than they seemingly solve (as is the case with considerations of “latent content”), but given the twenty-first-century predilection for “mixed-methods” research (Creswell, 2007), it is clear that CTA has a role to play in integrating quantitative and qualitative modes of analysis in a systematic rather than merely an ad hoc and piecemeal fashion. In the sections that follow, I will provide some examples of the ways in which “qualitative” analysis can be combined with systematic modes of counting. First, however, we need to focus on what is analyzed in CTA.

Units of Analysis

So what is the unit of analysis in CTA? A brief answer to that question is that analysis can be focused on words, sentences, grammatical structures, tenses, clauses, ratios (of, say, nouns to verbs), or even “themes.” Berelson (1952) gives some examples of all of the above and also recommends a form of thematic analysis (cf. Braun & Clarke, 2006) as a viable option. Other possibilities include counting column length (of speeches and newspaper articles), amounts of (advertising) space, or frequency of images. For our purposes, however, it might be useful to consider a specific (and somewhat traditional) example. Here it is. It is an extract from what has turned out to be one of the most important political speeches of the current century.

“Iraq continues to flaunt its hostility toward America and to support terror. The Iraqi regime has plotted to develop anthrax and nerve gas and nuclear weapons for over a decade. This is a regime that has already used poison gas to murder thousands of its own citizens, leaving the bodies of mothers huddled over their dead children. This is a regime that agreed to international inspections then kicked out the inspectors. This is a regime that has something to hide from the civilized world. States like these, and their terrorist allies, constitute an axis of evil, arming to threaten the peace of the world. By seeking weapons of mass destruction, these regimes pose a grave and growing danger. They could provide these arms to terrorists, giving them the means to match their hatred. They could attack our allies or attempt to blackmail the United States. In any of these cases, the price of indifference would be catastrophic.” (George W. Bush, State of the Union address, January 29, 2002)

A number of possibilities arise for analysing the content of a speech such as the one above. Clearly, words and sentences must play a part in any such analysis, but in addition to words there are structural features of the speech that could also figure. For example, the extract takes the form of a simple narrative, pointing to a past, a present, and an ominous future (catastrophe), and could therefore be analysed as such. There are, in addition, a number of interesting oppositions in the speech (such as those between “regimes” and the “civilized” world), as well as a set of interconnected present participles such as “plotting,” “hiding,” “arming,” and “threatening” that are associated both with Iraq and with other states that “constitute an axis of evil.” Evidently, simple word counts would fail to capture the intricacies of a speech of this kind. Indeed, our example serves another purpose: to highlight the difficulty that often arises in dissociating content analysis from discourse analysis (of which narrative analysis and the analysis of rhetoric and trope are subspecies). So how might we deal with these problems?
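To make the limitation of simple word counts concrete, here is a minimal Python sketch that tallies word frequencies in the first five sentences of the extract. The stop list and the function name `word_counts` are assumptions introduced purely for illustration; they are not part of any method described in the chapter.

```python
import re
from collections import Counter

# First five sentences of the extract quoted above.
EXTRACT = (
    "Iraq continues to flaunt its hostility toward America and to support terror. "
    "The Iraqi regime has plotted to develop anthrax and nerve gas and nuclear "
    "weapons for over a decade. This is a regime that has already used poison gas "
    "to murder thousands of its own citizens, leaving the bodies of mothers huddled "
    "over their dead children. This is a regime that agreed to international "
    "inspections then kicked out the inspectors. This is a regime that has "
    "something to hide from the civilized world."
)

# Hypothetical stop list (an assumption); a real study would justify its own.
STOP_WORDS = {
    "a", "and", "for", "from", "has", "is", "its", "of", "over",
    "that", "the", "their", "then", "this", "to",
}

def word_counts(text, stop_words=STOP_WORDS):
    """Count lower-cased alphabetic tokens, ignoring stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(token for token in tokens if token not in stop_words)

counts = word_counts(EXTRACT)
print(counts.most_common(3))
```

On this excerpt the most frequent remaining token is "regime", a result that registers the repetition but says nothing about the narrative structure or the oppositions noted above.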

One approach that can be adopted is to focus on what is referenced in text and speech. That is, to concentrate on the characters or elements that are recruited into the text and to examine the ways in which they are connected or co-associated. I shall provide some examples of this form of analysis shortly. Let us merely note for the time being that in the previous example we have a speech in which various “characters”—including weapons in general, specific weapons (such as nerve gas), threats, plots, hatred, evil and mass destruction—play a role. Be aware that we need not be concerned with the veracity of what is being said—whether it is true or false—but simply with what is in the speech and how what is in there is associated. (We may leave the task of assessing truth and falsity to the jurists). Be equally aware that it is a text that is before us and not an insight into the ex-President’s mind, nor his thinking, nor his beliefs, nor any other subjective property that he may have possessed.
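The idea of examining which "characters" are recruited into a text and how they are co-associated can be sketched, under strong simplifying assumptions, as a count of concept pairs that appear in the same sentence. The concept dictionary below, the use of the sentence as the unit of co-occurrence, and the crude substring matching are all assumptions made for illustration; they are not the chapter's own coding scheme.

```python
from collections import Counter
from itertools import combinations

# Four sentences from the extract quoted earlier.
SENTENCES = [
    "The Iraqi regime has plotted to develop anthrax and nerve gas and "
    "nuclear weapons for over a decade.",
    "States like these, and their terrorist allies, constitute an axis of "
    "evil, arming to threaten the peace of the world.",
    "By seeking weapons of mass destruction, these regimes pose a grave and "
    "growing danger.",
    "They could provide these arms to terrorists, giving them the means to "
    "match their hatred.",
]

# Hypothetical coding dictionary: concept label -> substrings that signal it.
CONCEPTS = {
    "regime": ("regime",),
    "weapons": ("weapon", "arms", "nerve gas", "anthrax"),
    "terror": ("terror",),
    "evil": ("evil",),
    "threat": ("threat", "danger"),
    "hatred": ("hatred",),
}

def concepts_in(sentence):
    """Return the set of concept labels signalled in one sentence."""
    text = sentence.lower()
    return {
        label
        for label, stems in CONCEPTS.items()
        if any(stem in text for stem in stems)
    }

def co_associations(sentences):
    """Count how often each pair of concepts shares a sentence."""
    pairs = Counter()
    for sentence in sentences:
        for pair in combinations(sorted(concepts_in(sentence)), 2):
            pairs[pair] += 1
    return pairs

pairs = co_associations(SENTENCES)
for pair, n in pairs.most_common():
    print(pair, n)
```

Each counted pair can be read as a weighted link between two nodes, which is one simple way of representing co-association as a network. Note that the sketch records only what is in the text and how it is associated, not whether any of it is true.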

In the introductory paragraph, I made brief reference to some ideas of the German philosopher Jürgen Habermas (1987). It is not my intention here to expand on the detailed twists and turns of his claims with respect to the role of language in the “lifeworld.” However, I do intend to borrow what I regard as some particularly useful ideas from his work. The first is his claim, influenced by a strong line of twentieth-century philosophical thinking, that language and culture are constitutive of the lifeworld (1987:125), and in that sense we might say that things (including individuals and societies) are made in language. That of course is a simple justification for focusing on what people say rather than what they “think” or “believe” or “feel” or “mean” (all of which have been suggested at one time or another as points of focus for social inquiry and especially qualitative forms of inquiry). Second, Habermas argues that speakers and therefore hearers (and one might add writers and therefore readers), in what he calls their speech acts, necessarily adopt a pragmatic relation to one of three worlds: entities in the objective world, things in the social world, and elements of a subjective world. In practice, Habermas (1987:120) suggests all three worlds are implicated in any speech act but that there will be a predominant orientation to one of these. To rephrase this in a crude form, when speakers engage in communication, they refer to things and facts and observations relating to external nature, to aspects of interpersonal relations, and to aspects of private inner subjective worlds (thoughts, feelings, beliefs, etc.). One of the problems with locating CTA in “communication research” has been that the communications referred to are but a special and limited form of action (often what Habermas would call strategic acts). In other words, television, newspaper, video, and internet communications are just particular forms (with particular features) of action in general. Again we might note in passing that the adoption of the Habermasian perspective on speech acts implies that much of qualitative analysis in particular has tended to focus only on one dimension of communicative action: the subjective and private. In this respect, I would argue that it is much better to look at speeches such as George W. Bush’s 2002 State of the Union address as an “account” and to examine what has been recruited into the account, and how what has been recruited is connected or co-associated, rather than to use the data to form insights into his (or his adviser’s) thoughts, feelings, and beliefs.

In the sections that follow, and with an emphasis on the ideas that I have just expounded, I intend to demonstrate how CTA can be deployed to advantage in almost all forms of inquiry that call upon either interview (or speech-based) data or textual data. In my first example, I will show how CTA can be used to analyze a group of interviews. In the second example, I will show how it can be used to analyze a group of policy documents. In the third, I shall focus on a single interview (a “case”), and in the fourth and final example, I will show how CTA can be used to track the biography of a concept. In each instance, I shall briefly introduce the context of the “problem” on which the research was based, outline the methods of data collection, discuss how the data were analyzed and presented, and underline the ways in which content analysis has sharpened the analytical strategy.

Analyzing a Sample of Interviews: Looking at Concepts and Their Co-Associations in a Semantic Network

My first example of using CTA is based on a research study that was initially undertaken in the early 2000s. It was a project aimed at understanding why older people might reject the offer to be immunized against influenza (at no cost to them). The ultimate objective was to improve rates of immunization in the study area. The first phase of the research was based on interviews with 54 older people in South Wales. The sample included people who had never been immunized, some who had refused immunization, and some who had accepted immunization. Within each category, respondents were randomly selected from primary care physician patient lists, and the data were initially analyzed “thematically” and published accordingly (Evans, Prout, Prior, et al., 2007). A few years later, however, I returned to the same data set to look at a different question: how (older) lay people talked about colds and flu, especially how they distinguished between the two illnesses and how they understood the causes of the two illnesses (see Prior, Evans, & Prout, 2011). Fortunately, in the original interview schedule, we had asked people about how they saw the “differences between cold and flu” and what caused flu, so it was possible to reanalyze the data with such questions in mind. In that frame, the example that follows demonstrates not only how CTA might be used on interview data, but also how it might be used to undertake a secondary analysis of a pre-existing data set (Bryman, 2008).

As with all talk about illness, talk about colds and flu is routinely set within a mesh of concerns—about causes, symptoms, and consequences. Such talk comprises the base elements of what has at times been referred to as the “explanatory model” of an illness ( Kleinman, Eisenberg, & Good, 1978 ). In what follows, I shall focus almost entirely on issues of causation as understood from the viewpoint of older people; the analysis is based on the answers that respondents made in response to the question, “How do you think people catch flu?”

Semi-structured interviews of the kind undertaken for a study such as this are widely used and are often characterized as akin to “a conversation with a purpose” ( Kahn & Cannell, 1957 :97). One of the problems of analyzing the consequent data is that, although the interviewer holds to a planned schedule, the respondents often reflect in a somewhat unstructured way about the topic of investigation, so it is not always easy to unravel the web of talk about, say, “causes” that occurs in the interview data. In this example, causal agents of flu, inhibiting agents, and means of transmission were often conflated by the respondents. Nevertheless, in their talk people did answer the questions that were posed, and in the study referred to here, that talk made reference to things such as “bugs” (and “germs”) as well as viruses; but the most commonly referred to causes were “the air” and the “atmosphere.” The interview data also pointed toward means of transmission as “cause”—so coughs and sneezes and mixing in crowds figured in the causal mix. Most interesting perhaps was the fact that lay people made a nascent distinction between facilitating factors (such as bugs and viruses) and inhibiting factors (such as being resistant, immune, or healthy), so that in the presence of the latter, the former are seen to have very little effect. Here are some shorter examples of typical question-response pairs from the original interview data.

(R:32): “How do you catch it [the flu]? Well, I take it its through ingesting and inhaling bugs from the atmosphere. Not from sort of contact or touching things. Sort of airborne bugs. Is that right?”

(R:3): “I suppose it’s [the cause of flu] in the air. I think I get more diseases going to the surgery than if I stayed home. Sometimes the waiting room is packed and you’ve got little kids coughing and spluttering and people sneezing, and air conditioning I think is a killer by and large I think air conditioning in lots of these offices”.

(R:46): “I think you catch flu from other people. You know in enclosed environments in air conditioning which in my opinion is the biggest cause of transferring diseases is air conditioning. Worse thing that was ever invented that was. I think so, you know. It happens on aircraft exactly the same you know.”

Alternatively, it was clear that for some people being cold, wet, or damp could also serve as a direct cause of flu; thus:

Interviewer: “OK, good. How do you think you catch the flu?” (R:39): “Ah. The 65 dollar question. Well, I would catch it if I was out in the rain and I got soaked through. Then I would get the flu. I mean my neighbour up here was soaked through and he got pneumonia and he died. He was younger than me: well, 70. And he stayed in his wet clothes and that’s fatal. Got pneumonia and died, but like I said, if I get wet, especially if I get my head wet, then I can get a nasty head cold and it could develop into flu later.”

As I suggested earlier, despite the presence of bugs and germs, viruses, the air, and wetness or dampness, “catching” the flu is not a matter of simple exposure to causative agents. Thus some people hypothesized that within each person there is a measure of immunity or resistance or healthiness that comes into play and that is capable of counteracting the effects of external agents. For example, being “hardened” to germs and harsh weather can prevent a person getting colds and flu. Being “healthy” can itself negate the effects of any causative agents, and healthiness is often linked to aspects of “good” nutrition and diet and not smoking cigarettes. These mitigating and inhibiting factors can either mollify the effects of infection or prevent a person “catching” the flu entirely. Thus (R:45) argued that it was almost impossible for him to catch flu or cold “[c]os I got all this resistance.” Interestingly respondents often used possessive pronouns in their discussion of immunity and resistance (“my immunity” and “my resistance”)—and tended to view them as personal assets (or capital) that might be compromised by mixing with crowds.

By implication, having a weak immune system can heighten the risk of contracting cold and flu and might therefore spur one on to take preventive measures such as accepting a flu jab. There are some, of course, who believe that it is the flu jab that can cause the flu and other illnesses. An example of what might be called lay “epidemiology” ( Davison, Davey-Smith, & Frankel, 1991 ) is evident in the following extract.

(R:4): “Well, now it’s coincidental you know that [my brother] died after the jab, but another friend of mine, about 8 years ago, the same happened to her. She had the jab and about six months later, she died, so I know they’re both coincidental, but to me there’s a pattern.”

Normally, results from studies such as this are presented in exactly the same way as has just been set out. Thus the researcher highlights given themes that are said to have emerged out of the data and then provides appropriate extracts from the interviews to illustrate and substantiate the relevant themes. However, one very reasonable question that any critic might ask about the selected data extracts concerns the extent to which they are “representative” of the material in the data set as a whole. Maybe, for example, the author has been unduly selective in his or her use of both themes and quotations. Perhaps, as a consequence, the author has ignored or left out talk that does not fit their arguments or extracts that might be considered dull and uninteresting compared to more exotic material. And these kinds of issues and problems are certainly common to the reporting of almost all forms of qualitative research. However, the adoption of CTA techniques can help to mitigate such problems. This is so because by using CTA we can indicate the extent to which we have used all or just some of the data, and we can provide a view of the content of the entire sample of interviews rather than just the content and flavor of one or two interviews. In this light, we need to consider Figure 18.1. The figure is based on counting the number of references in the 54 interviews to the various “causes” of the flu (though references to the flu jab, i.e., inoculation, as a cause of flu have been ignored for the purpose of this discussion). The node sizes reflect the relative importance of each cause as determined by the concept count (frequency of occurrence). The links between nodes reflect the degree to which causes are co-associated in interview talk and are calculated according to a co-occurrence index (see, e.g., SPSS, 2007:183).

Given this representation, we can immediately assess the relative importance of the different causes as referred to in the interview data. Thus we can see that such things as (poor) “hygiene” and “foreigners” were mentioned as a potential cause of flu—but mention of hygiene and foreigners was nowhere near so important as references to “the air” or to “crowds” or to “coughs and sneezes.” In addition, we can also determine the strength of the connections that interviewees made between one cause and another. Thus there are relatively strong links between “resistance” and “coughs and sneezes,” for example.

In fact, Figure 18.1 divides causes into the “external” and the “internal,” or the facilitating and the impeding (lighter and darker nodes). Among the former I have placed such things as crowds, coughs, sneezes, and the air while among the latter I have included “resistance,” “immunity,” and “health.” That division, of course, is a product of my conceptualizing and interpreting the data, but whichever way we organize the findings, it is evident that talk about the causes of flu belongs in a web or mesh of concerns that would be difficult to represent by the use of individual interview extracts alone. Indeed, it would be impossible to demonstrate how the semantics of causation belong to a culture (rather than to individuals) in any other way. In addition I would argue that the counting involved in the construction of the diagram functions as a kind of check on researcher interpretations and provides a source of visual support for claims that an author might make about, say, the relative importance of “damp” and “air” as perceived causes of disease. Finally, the use of CTA techniques allied with aspects of conceptualization and interpretation has enabled us to approach the interview data as a set and to consider the respondents as belonging to a community rather than regarding them merely as isolated and disconnected individuals, each with their own views. It has also enabled us to squeeze some new findings out of old data, and I would argue that it has done so with advantage. There are of course other advantages to using CTA to explore data sets, which I highlight in the next section.

Figure 18.1. What causes flu? A lay perspective. Factors listed as causes of colds and flu in 54 interviews. Node size is proportional to number of references “as causes.” Line thickness is proportional to co-occurrence of any two “causes” in the set of interviews.
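The counting that underlies a diagram such as Figure 18.1 can be sketched in a few lines of Python. What follows is an illustration only and rests on several assumptions: the three coded “interviews” are invented, each cause is counted once per interview rather than once per reference, and the Jaccard index stands in for the co-occurrence index used by the text-mining software, whose exact formula is not given here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical coded data: the set of "causes" mentioned in each interview.
# These three interviews are invented for illustration; they are not study data.
interviews = [
    {"air", "bugs", "crowds"},
    {"air", "coughs and sneezes", "crowds"},
    {"damp", "resistance"},
]

def cause_counts(coded):
    """Number of interviews mentioning each cause (a proxy for node size)."""
    counts = Counter()
    for causes in coded:
        counts.update(causes)
    return counts

def co_occurrence(coded):
    """Jaccard index for each pair of causes mentioned together (link weight).

    Assumption: Jaccard is used as a stand-in co-occurrence index.
    """
    counts = cause_counts(coded)
    together = Counter()
    for causes in coded:
        for pair in combinations(sorted(causes), 2):
            together[pair] += 1
    return {
        (a, b): n / (counts[a] + counts[b] - n)
        for (a, b), n in together.items()
    }

nodes = cause_counts(interviews)
links = co_occurrence(interviews)
```

In this toy data, “air” and “crowds” always appear together and so receive the strongest possible link, whereas “air” and “bugs” share only one of the two interviews that mention “air.”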

Analyzing a Sample of Documents: Using Content Analysis to Verify Claims

Policy analysis is a difficult business. For a start, it is never entirely clear where (social, health, economic, environmental) policy actually is. Is it in documents (as published by governments, think tanks, and research centres), in action (what people actually do), or in speech (what people say)? Perhaps it rests in a mixture of all three realms. Yet wherever it may be, it is always possible, at the very least, to identify a range of policy texts and to focus on the conceptual or semantic webs in terms of which government officials and other agents (such as politicians) talk about the relevant policy issues. Furthermore, in so far as policy is recorded—in speeches, pamphlets, and reports—we may begin to speak of specific policies as having a history or a pedigree that unfolds through time (think, e.g., of US or UK health policies during the Clinton years or the Obama years). And in so far as we consider “policy” as having a biography or a history, we can also think of studying policy narratives.

Though firmly based in the world of literary theory, narrative method has been widely used for both the collection and the analysis of data concerning ways in which individuals come to perceive and understand various states of health, ill health, and disability (Frank, 1995; Hydén, 1997). Narrative techniques have also been adapted for use in clinical contexts and allied to concepts of healing (Charon, 2006). In both social scientific and clinical work, however, the focus is invariably on individuals and on how individuals “tell” stories of health and illness. Yet narratives can also belong to collectives (such as political parties and ethnic and religious groups) just as much as to individuals, and in the case of collectives there is a need to collect and analyze data that are dispersed across a much wider range of materials than can be obtained from the personal interview. In this context, Roe (1994) has demonstrated how narrative method can be applied to an analysis of national budgets, animal rights, and environmental policies.

An extension of the concept of narrative to policy discourse is undoubtedly useful (Newman & Vidler, 2006), but how might such narratives be analyzed? What strategies can be used to unravel the form and content of a narrative, especially in circumstances where the narrative might be contained in multiple (policy) documents, authored by numerous individuals, and published across a span of time rather than in a single, unified text such as a novel? Roe (1994), unfortunately, is not in any way specific about analytical procedures apart from offering the useful rule to “never stray too far from the data” (1994:xii). So in this example I will outline a strategy for tackling such complexities. In essence, it is a strategy that combines techniques of linguistically (rule) based content analysis with a theoretical and conceptual frame that enables us to unravel and identify the core features of a policy narrative. My substantive focus is on documents concerning health service delivery policies published 2000–2009 in the constituent countries of the UK (that is, England, Scotland, Wales, and Northern Ireland, all of which have different political administrations).

Narratives can be described and analyzed in various ways, but for our purposes we can say that they have three key features: they point to a chronology, they have a plot and they contain “characters.”

Chronology : All narratives have beginnings; they also have middles and endings, and these three stages are often seen as comprising the fundamental structure of narrative text. Indeed, in his masterly analysis of time and narrative, Ricoeur (1984) argues that it is in the unfolding chronological structure of a narrative that one finds its explanatory (and not merely descriptive) force. By implication, one of the simplest strategies for the examination of policy narratives is to locate and then divide a narrative into its three constituent parts—beginning, middle, and end.

Unfortunately, while it can sometimes be relatively easy to locate or choose a beginning to a narrative, it can be much more difficult to locate an end point. Thus in any illness narrative, a narrator might be quite capable of locating the start of an illness process (in an infection, accident, or other event) but unable to see how events will be resolved in an ongoing and constantly unfolding life. As a consequence, both narrators and researchers usually find themselves in the midst of an emergent present—a present without a known and determinate end (see, e.g., Frank, 1995 ). Similar considerations arise in the study of policy narratives where chronology is perhaps best approached in terms of (past) beginnings, (present) middles, and projected futures.

Plot : According to Ricoeur (1984) , our basic ideas about narrative are best derived from the work and thought of Aristotle who in his Poetics sought to establish “first principles” of composition. For Ricoeur, as for Aristotle, plot ties things together. It “brings together factors as heterogeneous as agents, goals, means, interactions, circumstances, unexpected results” (1984:65) into the narrative frame. For Aristotle, it is the ultimate untying or unraveling of the plot that releases the dramatic energy of the narrative.

Character : Characters are most commonly thought of as individuals, but they can be considered in much broader terms. Thus the French semiotician A. J. Greimas (1970), for example, suggested that, rather than think of characters as people, it would be better to think in terms of what he called “actants” and of the functions that such actants fulfill within a story. In this sense geography, climate, and capitalism can be considered as characters every bit as much as aggressive wolves and Little Red Riding Hood. Further, he argued that the same character (actant) can be considered to fulfill many functions and the same function can be performed by many characters. Whatever else, the deployment of the term actant certainly helps us to think in terms of narratives as functioning and creative structures. It also serves to widen our understanding of the ways in which concepts, ideas, and institutions, as well as “things” in the material world, can influence the direction of unfolding events every bit as much as conscious human subjects. Thus, for example, the “American people,” “the nation,” “the constitution,” “the West,” “tradition,” and “Washington” can all serve as characters in a policy story.

As I have already suggested, narratives can unfold across many media and in numerous arenas—speech and action, as well as text. Here, however, my focus is solely on official documents—all of which are UK government policy statements as listed in Table 18.1 . The question is how might CTA help us unravel the narrative frame?

It might be argued that a simple reading of any document should familiarize the researcher with elements of all three policy narrative components (plot, chronology, and character). However, in most policy research, we are rarely concerned with a single and unified text as is the case with a novel, but rather with multiple documents written at distinctly different times by multiple (usually anonymous) authors that notionally can range over a wide variety of issues and themes. In the full study, some 19 separate publications were analyzed across England, Wales, Scotland, and Northern Ireland.

Naturally, listing word frequencies, still less identifying co-occurrences and semantic webs in large data sets (covering hundreds of thousands of words and footnotes), cannot be done manually but rather requires the deployment of complex algorithms and text-mining procedures. To this end I analyzed the 19 documents using “Text Mining for Clementine” (SPSS, 2007).

Text-mining procedures begin by providing an initial list of concepts based on the lexicon of the text but which can be weighted according to word frequency and which take account of elementary word associations. For example, learning disability, mental health, and performance management indicate three concepts, not six words. Using such procedures on the aforementioned documents gives the researcher an initial grip on the most important concepts in the document set of each country. Note that this is much more than a straightforward concordance analysis of the text and is more akin to what Ryan & Bernard (2000) have referred to as “semantic analysis” and Carley (1993) has referred to as “concept” and “mapping” analysis.
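The difference between counting words and counting concepts can be illustrated with a minimal Python sketch. It is not a reconstruction of the procedures used by the text-mining package: the small lexicon, the sample sentence, and the function name `count_concepts` are all assumptions introduced for illustration, and the “longest term first” matching rule is simply one plausible way of ensuring that a multi-word concept is counted once rather than as separate words.

```python
import re
from collections import Counter

# Hypothetical concept lexicon: multi-word terms count as single concepts.
LEXICON = ["learning disability", "mental health", "performance management", "health"]

def count_concepts(text, lexicon=LEXICON):
    """Count lexicon concepts in a text, matching the longest terms first."""
    remaining = text.lower()
    counts = Counter()
    for term in sorted(lexicon, key=len, reverse=True):
        pattern = r"\b" + re.escape(term) + r"\b"
        counts[term] = len(re.findall(pattern, remaining))
        # Blank out matched spans so that a shorter term (e.g. "health")
        # is not counted again inside a longer one (e.g. "mental health").
        remaining = re.sub(pattern, " ", remaining)
    return counts

# Invented sample text; not taken from the policy documents.
sample = ("Mental health services and learning disability services are "
          "subject to performance management. Mental health and health "
          "inequalities both matter.")
concepts = count_concepts(sample)
```

Here “mental health” is counted twice as a concept in its own right, while the bare concept “health” is counted only once, for the occurrence that stands alone.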

So the first task was to identify and then extract the core concepts, thus identifying what might be called “key” characters or actants in each of the policy narratives. For example, in the Scottish documents such actants included “Scotland” and the “Scottish people,” as well as “health” and the “NHS,” among others; while in the Welsh documents it was “the people of Wales” and “Wales” that figured largely—thus emphasizing how national identity can play every bit as important a role in a health policy narrative as concepts such as “health,” “hospitals,” and “wellbeing.”

Having identified key concepts it was then possible to track concept clusters in which particular actants or characters are embedded. Such cluster analysis is dependent on the use of co-occurrence rules and the analysis of synonyms, whereby it is possible to get a grip on the strength of the relationships between the concepts, as well as the frequency with which the concepts appear in the collected texts. In Figure 18.2 , I provide an example of a concept cluster. The diagram indicates the nature of the conceptual and semantic web in which various actants are discussed. The diagrams further indicate strong (solid line) and weaker (dotted line) connections between the various elements in any specific mix, and the numbers indicate frequency counts for the individual concepts. Using Clementine , the researcher is unable to specify in advance which clusters will emerge from the data. One cannot, for example, choose to have an NHS cluster. In that respect, these diagrams not only provide an array in terms of which concepts are located, but also serve as a check on and to some extent validation of the interpretations of the researcher. Of course none of this tells us what the various narratives contained within the documents might be. They merely point to key characters and relationships both within and between the different narratives. So having indicated the techniques used to identify the essential parts of the four policy narratives, it is now time to sketch out their substantive form.
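The logic of tracing a cluster around a single concept can be sketched as follows. This is a toy illustration under stated assumptions: the synonym table, the four text units, the function name `links_to`, and the threshold of 0.5 separating “strong” from “weak” links are all hypothetical, and the measure used (the share of units containing the focal concept in which a partner concept also appears) is a simple stand-in for whatever co-occurrence rules the software applies.

```python
from collections import Counter

# Hypothetical synonym table mapping surface terms to one canonical concept.
SYNONYMS = {"patients": "patient", "choices": "choice", "nhs": "NHS"}

# Hypothetical text units (e.g., paragraphs), already reduced to concept lists.
units = [
    ["care", "choice", "patients"],
    ["care", "choices", "nhs"],
    ["care", "choice"],
    ["nhs", "improvement"],
]

def links_to(focal, units, strong_at=0.5):
    """Classify each concept that co-occurs with `focal` as strong or weak.

    Strength is the share of units containing `focal` in which the partner
    concept also appears; `strong_at` is an arbitrary illustrative cut-off.
    """
    canonical = [{SYNONYMS.get(c, c) for c in unit} for unit in units]
    with_focal = [unit for unit in canonical if focal in unit]
    partners = Counter(c for unit in with_focal for c in unit if c != focal)
    return {
        concept: ("strong" if n / len(with_focal) >= strong_at else "weak")
        for concept, n in partners.items()
    }

care_links = links_to("care", units)
```

In this invented example “choice” accompanies “care” in every unit and is therefore a strong link, while “patient” and “NHS” each appear alongside “care” only once and are classed as weak.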

It may be useful to note that Aristotle recommended brevity in matters of narrative, deftly summarising the whole of the Odyssey in just seven lines. In what follows, I attempt (albeit somewhat weakly) to emulate that example by summarising a key narrative of English health services policy in just four paragraphs. The citations are of Department of Health publications (by year) as listed in Table 18.1. Note how the narrative unfolds in relation to the dates of publication. In the English case (though not so much in the other UK countries), it is a narrative that is concerned to introduce market forces into what is and has been a state-managed health service. Market forces are justified in terms of improving opportunities for the consumer (i.e., the patients in the service), and the pivot of the newly envisaged system is something called “patient choice” or “choice.” This is how the story unfolds as told through the policy documents between 2000 and 2008 (see Table 18.1).

The advent of the NHS in 1948 was a “seminal event” (2000:8), but under successive Conservative administrations the NHS was seriously underfunded (2006:3). The (New Labour) government will invest (2000) or already has (2003:4) invested extensively in infrastructure and staff, and the NHS is now on a “journey of major improvement” (2004:2). But “more money is only a starting point” (2000:2), and the journey is far from finished. Continuation requires some fundamental changes of “culture” (2003:6). In particular, the NHS remains unresponsive to patient need, and “[a]ll too often, the individual needs and wishes are secondary to the convenience of the services that are available. This ‘one size fits all’ approach is neither responsive, equitable nor person-centred” (2003:17). In short, the NHS is a 1940s system operating in a twenty-first-century world (2000:26). Change is therefore needed across the “whole system” (2005:3) of care and treatment.

Above all, we have to recognize that we “live in a consumer age” (2000:26). People’s expectations have changed dramatically (2006:129), and people want more choice, more independence, and more control (2003:12) over their affairs. Patients are no longer, and should not be considered as, “passive recipients” of care (2003:62), but wish to be and should be (2006:81) actively “involved” in their treatments (2003:38, 2005:18)—indeed, engaged in a partnership (2003:22) of respect with their clinicians. Furthermore, most people want a personalized service “tailor made to their individual needs” (2000:17, 2003:15, 2004:1, 2006:83)—“[a] service which feels personal to each and every individual within a framework of equity and good use of public money” (2003:6).

To advance the necessary changes, “patient choice” needs to be and “will be strengthened” (2000:89). “Choice” must be made to “happen” (2003), and it must be “real” (2003:3, 2004:5, 2005:20, 2006:4). Indeed, it must be “underpinned” (2003:7) and “widened and deepened” (2003:6) throughout the entire system of care.

If “we” expand and underpin patient choice in appropriate ways and engage patients in their treatment systems, then levels of patient satisfaction will increase (2003:39), and their choices will lead to a more “efficient” (2003:5, 2004:2, 2006:16) and effective (2003:62, 2005:8) use of resources. Above all, the promotion of choice will help to drive up “standards” of care and treatment (2000:4, 2003:12, 2004:3, 2005:7, 2006:3). Furthermore, the expansion of choice will serve to negate the effects of the “inverse care law,” whereby those who need services most tend to get catered for the least (2000:107, 2003:5, 2006:63), and it will thereby help in moderating the extent of health inequalities in the society in which we live. “The overall aim of all our reforms,” therefore, “is to turn the NHS from a top down monolith into a responsive service that gives the patient the best possible experience. We need to develop an NHS that is both fair to all of us, and personal to each of us” (2003:5).

Figure 18.2. Concept cluster for “care” in six English policy documents, 2000–2007. Line thickness is proportional to the strength of the co-occurrence coefficient. Node size reflects relative frequency of concept, and (numbers) refer to the frequency of concept. Solid lines indicate relationships between terms within the same cluster, and dotted lines indicate relationships between terms in different clusters.

We can see how most—though not all—of the elements of this story are represented in Figure 18.2 . In particular we can see strong (co-occurrence) links between “care” and “choice” and how partnership, performance, control, and improvement have a prominent profile. There are of course some elements of the web that have a strong profile (in terms of node size and links) but to which we have not referred; access, information, primary care, and waiting times are four. As anyone well versed in English health care policy would know, these have important roles to play in the wider, consumer-driven narrative. However, by rendering the excluded as well as included elements of that wider narrative visible, the concept web provides a degree of verification on the content of the policy story as told herein and on the scope of its “coverage.”

In following through on this example, we have of course moved from content analysis to a form of discourse analysis (in this instance narrative analysis). That shift underlines aspects of both the versatility of CTA and some of its weaknesses: versatility in the sense that CTA can be readily combined with other methods of analysis and in the way in which the results of the CTA help us to check and verify the claims of the researcher. The weakness of the diagram compared to the narrative is that CTA on its own is a somewhat one-dimensional and static form of analysis, and while it is possible to introduce time and chronology into the diagrams, the diagrams themselves remain lifeless in the absence of some form of discursive overview. (For a fuller analysis of these data, see Prior, Hughes, & Peckham, 2012.)

Analyzing a Single Interview: The Role of Content Analysis in a Case Study

So far I have focused on using content analysis on a sample of interviews and on a sample of documents. In the first instance, I recommended CTA for its capacity to tell us something about what is seemingly central to interviewees and for demonstrating how what is said is linked (in terms of a concept network). In the second instance, I reaffirmed the virtues of co-occurrence and network relations, but this time in the context of a form of discourse analysis. I also suggested that CTA can serve an important role in the process of verification of a narrative and its academic interpretation. In this section, however, I am going to link the use of CTA to another style of research—case study—to show how CTA might be used to analyze a single “case.”

Case study is a term used in multiple and often ambiguous ways. However, Gerring (2004:342) defines it as “an intensive study of a single unit for the purpose of understanding a larger class of (similar) units.” As Gerring points out, case study does not necessarily imply a focus on N = 1, although that is indeed the most logical number for case study research (Ragin & Becker, 1992). Naturally, an N of 1 can be immensely informative, and whether we like it or not we often have only one N to study (think, e.g., of the 1986 Challenger shuttle disaster, or of the 9/11 attack on the World Trade Center). In the clinical sciences, of course, case studies are widely used to represent the “typical” features of a wider class of phenomena, and are often used to define a kind or syndrome (as in the field of clinical genetics). Indeed, at the risk of mouthing a tautology, one can say that the distinctive feature of case study is its focus on a case in all of its complexity, rather than on individual variables and their inter-relationships, which tend to be the point of focus for large N research.

There was a time when case study was central to the science of psychology. Breuer and Freud’s (2001) famous studies of “hysteria” (orig. 1895) provide an early and outstanding example of the genre in this respect, but as with many of the other styles of social science research, the influence of case studies waned with the rise of much more powerful investigative techniques, including experimental methods, driven by the deployment of new statistical technologies. Idiographic studies consequently gave way to the current fashion for statistically driven forms of analysis that focus on causes and cross-sectional associations between variables rather than idiographic complexity.

In the example that follows, we will look at the consequences of a traumatic brain injury (TBI) on just one individual. The analysis is based on an interview with a person suffering from such an injury, and it was one of 32 interviews carried out with people who had experienced a TBI. The objective of the original research was to develop an outcome measure for TBI that was sensitive to the sufferer’s (rather than the health professional’s) point of view. In our original study (see Morris, Prior, Deb et al., 2005), interviews were also undertaken with 27 carers of the injured with the intention of comparing their perceptions of TBI to those of the people for whom they cared. A sample survey was also undertaken to elicit views about TBI from a much wider population of patients than was studied via interview.

In the introduction, I referred to Habermas and the concept of the “lifeworld.” Lifeworld ( Lebenswelt ) is a concept that first arose out of twentieth-century German philosophy. It constituted a specific focus for the work of Alfred Schutz (see, e.g., Schutz and Luckman, 1974 ). Schutz described the lifeworld as “that province of reality which the wide-awake and normal adult simply takes-for-granted in an attitude of common sense” (1974:3). Indeed, it was the routine and taken-for-granted quality of such a world that fascinated Schutz. As applied to the worlds of those with head injuries, the concept has particular resonance because head injuries often result in that taken-for-granted quality being disrupted and fragmented, ending in what Russian neuropsychologist A.R. Luria once described as “shattered” worlds ( Luria, 1975 ). As well as providing another excellent example of a case study, Luria’s work is also pertinent because he sometimes argued for a “romantic science” of brain injury—that is, a science that sought to grasp the world view of the injured patient by paying attention to an unfolding and detailed personal “story” of the head injured as well as to the neurological changes and deficits associated with the injury itself. In what follows, I shall attempt to demonstrate how CTA might be used to underpin such an approach.

In the original research, we began analysis by a straightforward reading of the interview transcripts. Unfortunately, a simple reading of a text or an interview can, strangely, mislead the reader into thinking that some issues or themes are actually more important than is warranted by the actual contents of the text. How that comes about is not always clear, but it probably has something to do with a desire to develop “findings” and our natural capacity to overlook the familiar in favor of the unusual. For that reason alone, it is always useful to subject any text to some kind of concordance analysis—that is, generating a simple frequency list of words used in an interview or text. Given the current state of technology, one might even speak these days of using text-mining procedures such as the aforementioned Clementine to undertake such a task. By using Clementine, and as we have seen, it is also possible to measure the strength of co-occurrence links between elements (i.e., words and concepts) in the entire data set (in this example, 32 interviews), though for a single interview these aims can just as easily be achieved using much simpler, low-tech strategies.
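For a single interview, the kind of “low-tech” frequency list mentioned above needs very little machinery. The following Python sketch is offered purely as an illustration: the snippet of text is invented rather than taken from a transcript, and the function name `word_frequencies` and the rule for what counts as a word (runs of letters and apostrophes) are assumptions.

```python
import re
from collections import Counter

def word_frequencies(text, top=5):
    """Return the most common words in a text, ignoring case and punctuation."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(top)

# Invented snippet for illustration; not taken from the interview transcripts.
snippet = "I was at work. I was driving. Now I am at home and I am tired."
freq = word_frequencies(snippet, top=3)
```

A list of this kind says nothing about meaning, but it does offer a quick check on whether the themes that strike a reader as prominent are in fact prominent in the text.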

Once all 32 interviews had been entered into the database, a number of common themes emerged. For example, it was clear that “time” entered into the semantic web in a prominent manner, and it was clearly linked to such things as “change,” “injury,” “the body,” and what can only be called the “I was.” Indeed, time runs through the 32 stories in many guises, and the centrality of time is of course a reflection of storytelling and narrative recounting in general; chronology, as we have noted, is a defining feature of all storytelling (Ricoeur, 1984). Thus sufferers both recounted the events surrounding their injury and provided accounts as to how the injuries affected their present life and future hopes. As to time present, much of the patient story circled around activities of daily living: walking, working, talking, looking, feeling, remembering, and so forth.

Understandably, the word and the concept of “injury” featured largely in the interviews, though it was a word most commonly associated with discussions of physical consequences of injury. There were many references in that respect to injured arms, legs, hands, and eyes. There were also references to “mind”—though with far lesser frequency than with references to the body and to body parts. Perhaps none of this is surprising. However, one of the most frequent concepts in the semantic mix was the “I was” (716 references). The statement “I was,” or “I used to” was in turn strongly connected to terms such as “the accident” and “change.” Interestingly, the “I was” overwhelmingly eclipsed the “I am” in the interview data (the latter with just 63 references). This focus on the “I was” appears in many guises. For example, it is often associated with the use of the passive voice: “I was struck by a car;” “I was put on the toilet;” “I was shipped from there then, transferred to [Cityville];” “I got told that I would never be able...;” “I was sat in a room,” and so forth. In short, the “I was” is often associated with things, people, and events acting upon the injured person. More importantly, however, the appearance of the “I was” is often used to preface statements signifying a state of loss or change in the person’s course of life—that is, as an indicator for talk about the patient’s shattered world. For example, Patient 7122 stated, “The main (effect) at the moment is I’m not actually with my children, I can’t really be their mum at the moment. I was a caring Mum, but I can’t sort of do the things that I want to be able to do like take them to school. I can’t really do a lot on my own. Like crossing the roads.”
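Comparing how often phrases such as “I was” and “I am” occur can be done with a plain phrase count. The sketch below is an assumption-laden illustration rather than the method used in the study: `count_phrase` is a hypothetical helper, and the two sample transcripts are short invented fragments loosely modeled on the quotations above.

```python
import re

def count_phrase(transcripts, phrase):
    """Count case-insensitive, whole-word occurrences of a phrase across transcripts."""
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
    return sum(len(pattern.findall(text)) for text in transcripts)

# Invented sample data; real input would be the full interview transcripts.
transcripts = [
    "I was a caring mum, but I can't do the things I want to do.",
    "I was struck by a car. I am still in recovery.",
]

counts = {p: count_phrase(transcripts, p) for p in ("I was", "I am", "I used to")}
for phrase, n in counts.items():
    print(f"{phrase!r}: {n}")
```

A count like this says nothing about what each phrase is doing in context (passive voice, loss, change); that still has to be read off the surrounding text.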

Another patient stated, “Everything is completely changed. The way I was... I can’t really do anything at the moment. I mean my German, my English, everything’s gone. Job possibilities is out the window. Everything is just out of the window... I just think about it all the time actually every day you know. You know it has destroyed me anyway, but if I really think about what has happened I would just destroy myself.”

Each of these quotations in its own way serves to emphasize how life has changed and how the patient’s world has changed. In that respect, we can say that one of the major outcomes arising from TBI may be substantial “biographical disruption” (Bury, 1982), whereupon key features of an individual’s life course are radically altered forever. Indeed, as Becker (1997:37) argues in relation to a wide array of life events, “When their health is suddenly disrupted, people are thrown into chaos. Illness challenges one’s knowledge of one’s body. It defies orderliness. People experience the time before their illness and its aftermath as two separate entities.” This notion of a cusp in personal biography is particularly well illustrated by Luria’s patient Zasetsky; the latter often refers to being a “newborn creature” (Luria, 1975:24, 88), a shadow of a former self (1975:25), and as having his past “wiped out” (1975:116).

However, none of this tells us about how these factors come together in the life and experience of one individual. When we focus on an entire set of interviews, we necessarily lose the rich detail of personal experience and tend instead to rely on a conceptual rather than a graphic description of effects and consequences (to focus on, say, “memory loss,” rather than loss of memory about family life). The contents of Figure 18.3 attempt to correct that vision. The figure records all of the things that a particular respondent (Patient 7011) used to do and liked doing. It records all of the things that he says he can no longer do (at one year after injury), and it records all of the consequences that he suffered from his head injury at the time of interview. Thus we see references to epilepsy (his “fits”), paranoia (the patient spoke of his suspicions concerning other people, people scheming behind his back, and his inability to trust others), deafness, depression, and so forth. Note that, although I have inserted a future tense into the web (“I will”), such a statement never appeared in the transcript. I have set it there for emphasis and to show how for this person the future fails to connect to any of the other features of his world except in a negative way. Thus he states at one point that he cannot think of the future because it makes him feel depressed (see Fig. 18.3). The line thickness of the arcs reflects the emphasis that the subject placed on the relevant “outcomes” in relation to the “I was” and the “now” during the interview. Thus we see that factors affecting his concentration and balance loom large but that he is also concerned about his being dependent on others, his epileptic fits, and his being unable to work and drive a vehicle. The schism in his life between what he used to do, what he cannot now do, and his current state of being is nicely represented in the CTA diagram.

What have we gained from executing this kind of analysis? For a start, we have moved away from a focus on variables, frequencies, and causal connections (e.g., a focus on the proportion of people with TBI who suffer from memory problems or memory problems and speech problems) and refocused on how the multiple consequences of a TBI link together in one person. In short, instead of developing a narrative of acting variables, we have emphasized a narrative of an acting individual (Abbott, 1992:62). Second, it has enabled us to see how the consequences of a TBI connect to an actual lifeworld (and not simply an injured body). So the patient is not viewed just as having a series of discrete problems such as balancing, or staying awake, which is the usual way of assessing outcomes, but is seen as someone struggling to come to terms with an objective world of changed things, people, and activities (missing work is not, for example, routinely considered an “outcome” of head injury). Third, by focusing on what the patient was saying, we gain insight into something that is simply not visible by concentrating on single outcomes or symptoms alone, namely the void that rests at the center of the interview, what I have called the “I was.” Fourth, we have contributed to understanding a type, for the case that we have read about is not simply a case of “John” or “Jane” but a case of TBI, and in that respect it can add to many other accounts of what it is like to experience head injury, including one of the most well documented of all TBI cases, that of Zasetsky. Finally, we have opened up the possibility of developing and comparing cognitive maps (Carley, 1993) for different individuals, and thereby gained insight into how alternative cognitive frames of the world arise and operate.

Figure 18.3. The shattered world of patient 7011. Thickness of lines (arcs) is proportional to the frequency of reference to the “outcome” by the patient during interview.

Tracing the biography of a concept

In the previous sections, I emphasised the virtues of CTA for its capacity to link into a data set in its entirety—and how the use of CTA can counter any tendency of a researcher to be selective and partial in the presentation and interpretation of information contained in interviews and documents. However, that does not mean that we always have to take an entire document or interview as the data source. Indeed, it is possible to select (on rational and explicit grounds) sections of documentation and to conduct the CTA on the chosen portions. In the example that follows, I do just that. The sections that I chose to concentrate on are titles and abstracts of academic papers—rather than the full texts. The research on which the following is based is concerned with a biography of a concept and is being conducted in conjunction with a PhD student of mine, Joanne Wilson. Joanne thinks of this component of the study more in terms of a “scoping study” than of a biographical study, and that too is a useful framework for structuring the context in which CTA can be used. Scoping studies ( Arksey & O’Malley, 2005 ) are increasingly used in health related research to “map the field” and to get a sense of the range of work that has been conducted on a given topic. Such studies can also be used to refine research questions and research designs. In our investigation the scoping study was centred on the concept of “well-being.” During the past decade or so, “well-being” has emerged as an important research target for governments and corporations as well as for academics, yet it is far from clear to what the term refers. Given the ambiguity of meaning, it is clear that a scoping review, rather than either a systematic review or a narrative review of available literature, would be best suited to our goals.

The origins of the concept of well-being can be traced at least as far back as the fourth century B.C., when philosophers produced normative explanations of the good life (e.g., eudaimonia, hedonia, and harmony). However, contemporary interest in the concept seems to have been regenerated by the concerns of economists and, most recently, psychologists. These days governments are equally concerned with measuring well-being to inform policy and conduct surveys of well-being to assess the state of the nation (see, e.g., Office for National Statistics [ONS], 2012). But what are they assessing?

We adopted a two-step process to address the research question, “What is the meaning of ‘well-being’ in the context of public policy?” First, we explored the existing thesauri of eight databases to establish those higher-order headings (if any) under which articles with relevance to well-being might be catalogued. Thus we searched the following databases: Cumulative Index of Nursing and Allied Health Literature [CINAHL], EconLit, Health Management Information Consortium [HMIC], MEDLINE, Philosopher’s Index, PsycINFO, Sociological Abstracts, and Worldwide Political Science Abstracts (WPSA). Each of these databases adopts keyword-controlled vocabularies. In other words, they use inbuilt statistical procedures to link core terms to a set lexis of phrases that depict the concepts contained in the database. Table 18.2 shows each database and its associated taxonomy. The contents of the table point toward a linguistic infrastructure in terms of which academic discourse is conducted, and our task was to extract from this infrastructure the semantic web wherein the concept of “well-being” is situated. We limited the thesaurus terms to “well-being” and its variants (i.e., wellbeing or well being). If the term was returned, it was then exploded to identify any associated terms.

CINAHL = Cumulative Index of Nursing and Allied Health Literature; HMIC = Health Management Information Consortium; WPSA = Worldwide Political Science Abstracts.

To develop the conceptual map, we conducted a free-text search for well-being and its variants within the context of public policy across the same databases. We orchestrated these searches across five separate timeframes: January 1990 to December 1994, January 1995 to December 1999, January 2000 to December 2004, January 2005 to December 2009, and January 2010 to October 2011. Naturally, different disciplines use different words to refer to well-being, each of which may wax and wane in usage over time. The searches thus sought to quantitatively capture any changes in the use and subsequent prevalence of well-being and any referenced terms (i.e., to trace a biography).
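The tallying of terms across the five timeframes can be sketched as follows. This is a hypothetical illustration, not the template or database tooling used in the study: the record format (publication date plus abstract text), the function names, the sample records, and the use of plain substring matching are all assumptions.

```python
from collections import Counter
from datetime import date

# The five search windows described above, with inclusive bounds.
TIMEFRAMES = [
    ("1990-1994", date(1990, 1, 1), date(1994, 12, 31)),
    ("1995-1999", date(1995, 1, 1), date(1999, 12, 31)),
    ("2000-2004", date(2000, 1, 1), date(2004, 12, 31)),
    ("2005-2009", date(2005, 1, 1), date(2009, 12, 31)),
    ("2010-2011", date(2010, 1, 1), date(2011, 10, 31)),
]

def timeframe_for(published):
    """Return the label of the window containing the date, or None if outside all windows."""
    for label, start, end in TIMEFRAMES:
        if start <= published <= end:
            return label
    return None

def term_prevalence(records, terms):
    """Count, per timeframe, how many abstracts mention each term at least once."""
    table = {label: Counter() for label, _, _ in TIMEFRAMES}
    for published, abstract in records:
        label = timeframe_for(published)
        if label is None:
            continue  # published outside the search windows
        text = abstract.lower()
        for term in terms:
            if term in text:  # substring match: a deliberate simplification
                table[label][term] += 1
    return table

# Invented records for illustration only.
records = [
    (date(1992, 5, 1), "Economic growth and well-being in public policy."),
    (date(2007, 3, 1), "Subjective well-being and happiness as policy goals."),
    (date(2010, 11, 1), "Happiness and subjective well-being measures."),
    (date(2012, 1, 1), "Well-being after the search window."),
]
prevalence = term_prevalence(records, ["well-being", "happiness", "subjective"])
for label, counter in prevalence.items():
    print(label, dict(counter))
```

A fuller version would also need to fold the spelling variants (wellbeing, well being) into a single term before counting.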

It is important to note that we did not intend to provide an exhaustive, systematic search of all the relevant literature. Rather we wanted to establish the prevalence of well-being and any referenced (i.e., allied) terms within the context of public policy. This has the advantage of ensuring that any identified words are grounded in the literature (i.e., they represent words actually used by researchers to talk and write about well-being in policy settings). We limited the searches to abstracts to increase the specificity with which we could identify relevant articles, albeit at some expense to sensitivity.

We also employed inclusion/exclusion criteria to facilitate the process by which we selected articles, thereby minimizing any potential bias arising from our subjective interpretations. We included independent, standalone investigations relevant to the study’s objectives (i.e., concerned with well-being in the context of public policy), which focused on well-being as a central outcome or process and which made explicit reference to “well-being” and “public policy” in either the title or the abstract. We excluded articles that were irrelevant to the study’s objectives, used noun adjuncts to focus on the well-being of specific populations (i.e., children, elderly, women) and contexts (e.g., retirement village), or that focused on deprivation or poverty unless poverty indices were used to understand well-being as opposed to social exclusion. We also excluded book reviews and abstracts describing a compendium of studies.

Using these criteria, Joanne Wilson conducted the review and recorded the results on a template developed specifically for the project, organized chronologically across each database and timeframe. Results were scrutinized by two other colleagues to ensure the validity of the search strategy and the findings. Any concerns regarding the eligibility of studies for inclusion were discussed amongst the research team. I then analyzed the co-occurrence of the key terms in the database. The resultant conceptual map is shown in Figure 18.4 .
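A co-occurrence analysis of this general kind can be sketched as below. It is not the software or algorithm used to produce the conceptual map; the function name, the tokenization, the sample abstracts, and the reading of “any phrase of three words” (taken from the figure caption) as two key terms lying at most two word positions apart are all assumptions.

```python
import re
from collections import Counter
from itertools import combinations

def cooccurrence(abstracts, key_terms, window=3):
    """Return (node_counts, edge_counts) for key terms across a list of abstracts.

    node_counts: how often each key term occurs (a basis for node size).
    edge_counts: how often two different key terms fall within the same
                 window of consecutive words (a basis for line thickness).
    """
    key_terms = set(key_terms)
    node_counts, edge_counts = Counter(), Counter()
    for abstract in abstracts:
        # Keep hyphenated words such as "well-being" as single tokens.
        tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", abstract.lower())
        positions = [(i, t) for i, t in enumerate(tokens) if t in key_terms]
        node_counts.update(t for _, t in positions)
        for (i, a), (j, b) in combinations(positions, 2):
            if a != b and j - i < window:
                edge_counts[tuple(sorted((a, b)))] += 1
    return node_counts, edge_counts

# Invented abstracts for illustration only.
abstracts = [
    "Subjective well-being and happiness are policy goals.",
    "Measures of subjective well-being differ from economic measures.",
]
terms = {"well-being", "subjective", "happiness", "economic", "measures"}
nodes, edges = cooccurrence(abstracts, terms)
print(nodes.most_common())
print(edges.most_common())
```

The two counters map directly onto the two visual properties of a network diagram: term frequency for node size and pair frequency for line thickness.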

The diagram can be interpreted as a visualization of a conceptual space. So when academics write about “well-being” in the context of public policy, they tend to connect the discussion to the other terms in the matrix. “Happiness,” “health,” “economic,” and “subjective,” for example, are relatively dominant terms in the matrix. The node size of these words suggests that references to such entities are only slightly less frequent than references to well-being itself. However, when we come to analyse how well-being is talked about in detail, we see specific connections come to the fore. Thus the data imply that talk of “subjective well-being” far outweighs discussion of “social well-being,” or “economic well-being.” Happiness tends to act as an independent node (there is only one co-occurrence of happiness and well-being), probably suggesting that “happiness” is acting as a synonym for well-being. Quality of life (QoL) is poorly represented in the abstracts, and its connection to most of the other concepts in the space is very weak, which perhaps confirms that QoL is unrelated to contemporary discussions of well-being and happiness. The existence of “measures” points to a distinct concern to assess and to quantify expressions of happiness, well-being, economic growth, and gross domestic product. More important and underlying this detail, there are grounds for suggesting that there are in fact a number of tensions in the literature on well-being.

On one hand, the results point toward an understanding of well-being as a property of individuals—as something that they feel or experience. Such a discourse is reflected through the use of words like “happiness,” “subjective,” and “individual.” This individualistic and subjective frame has grown in influence over the past decade in particular, and one of the problems with it is that it tends toward a somewhat content-free conceptualisation of well-being. To feel a sense of well-being one merely states that one is in a state of well-being; to be happy, one merely proclaims that one is happy (cf. ONS, 2012 ). It is reminiscent of the conditions portrayed in Aldous Huxley’s Brave New World , wherein the rulers of a closely managed society gave their priority to maintaining order and ensuring the happiness of the greatest number—in the absence of attention to justice or freedom of thought or any sense of duty and obligation to others, many of whom were systematically bred in “the hatchery” as slaves.

Figure 18.4. The position of a concept in a network: a study of “wellbeing.” Node size is proportional to the frequency of terms in 54 selected abstracts. Line thickness is proportional to the co-occurrence of two terms in any phrase of three words (e.g., subjective well-being, economics of well-being, well-being and development).

On the other hand, there is some intimation in our web that the notion of well-being cannot be captured entirely by reference to individuals alone and that there are other dimensions to the concept—that well-being is the outcome or product of, say, access to reasonable incomes, to safe environments, to “development,” and to health and welfare. It is a vision hinted at by the inclusion of those very terms in the network. These different concepts necessarily give rise to important differences concerning how well-being is identified and measured and therefore what policies are most likely to advance well-being. In the first kind of conceptualization, we might improve well-being merely by dispensing what Huxley referred to as “soma” (a super drug that ensured feelings of happiness and elation); in the other case, however, we would need to invest in economic, human, and social capital as the infrastructure for well-being. In any event and even at this nascent level, we can see how content analysis can begin to tease out conceptual complexities and theoretical positions in what is otherwise routine textual data.

Putting the Content of Documents in Their Place

I suggested in my introduction that CTA was a method of analysis, not a method of data collection or a form of research design. As such, it does not necessarily inveigle us into any specific forms of either design or data collection, though designs and methods that rely on quantification are dominant. In this closing section, however, I want to raise the issue as to how we should position a study of content in our research strategies as a whole. For we need to keep in mind that documents and records always exist in a context, and that while what is “in” the document may be considered central, a good research plan can often encompass a variety of ways of looking at how content links to context. Hence in what follows I intend to outline how an analysis of content might be combined with other ways of looking at a record or text, and even how the analysis of content might be positioned as secondary to an examination of a document or record. The discussion calls upon a much broader analysis as presented in Prior (2011).

I have already stated that basic forms of CTA can serve as an important point of departure for many different types of data analysis, such as discourse analysis. Naturally, whenever “discourse” is invoked, there is at least some recognition of the notion that words might actually play a part in structuring the world rather than merely reporting on it or describing it (as is the case with the 2002 State of the Nation address that was quoted in Section “Units of Analysis”). Thus, for example, there is a considerable tradition within social studies of science and technology for examining the place of scientific rhetoric in structuring notions of “nature” and the position of human beings (especially as scientists) within nature (see, e.g., work by Bazerman, 1988; Gilbert & Mulkay, 1984; and Kay, 2000). Nevertheless, little if any of that scholarship situates documents as anything other than inert objects, either constructed by or waiting patiently to be activated by scientists.

However, in the tradition of the ethnomethodologists (Heritage, 1991) and some adherents of discourse analysis, it is also possible to argue that documents might be more fruitfully approached as a “topic” (Zimmerman and Pollner, 1971) rather than a “resource” (to be scanned for content), in which case the focus would be on the ways in which any given document came to assume its present content and structure. In the field of documentation, these latter approaches are akin to what Foucault (1970) might have called an “archaeology of documentation” and are well represented in studies of such things as how crime, suicide, and other statistics and associated official reports and policy documents are routinely generated. That too is a legitimate point of research focus, and it can often be worth examining the genesis of, say, suicide statistics or statistics about the prevalence of mental disorder in a community as well as using such statistics as a basis for statistical modeling.

Unfortunately, the distinction between topic and resource is not always easy to maintain, especially in the hurly-burly of doing empirical research (see, e.g., Prior, 2003). Putting an emphasis on “topic,” however, can open up a further dimension of research, and that concerns the ways in which documents function in the everyday world. And as I have already hinted, when we focus on function, it becomes apparent that documents serve not merely as containers of content but very often as active agents in episodes of interaction and schemes of social organization. In this vein, one can begin to think of an ethnography of documentation. Therein, the key research questions revolve around the ways in which documents are used and integrated into specific kinds of organizational settings, as well as how documents are exchanged and how they circulate within such settings. Clearly, documents carry content (words, images, plans, ideas, patterns, and so forth), but the manner in which such material is actually called upon and manipulated, and the way in which it functions, cannot be determined (though it may be constrained) by an analysis of content. Thus, Harper’s (1998) study of the use of economic reports inside the International Monetary Fund provides various examples of how “reports” can function to both differentiate and cohere work groups. In the same way, Henderson (1995) illustrates how engineering sketches and drawings can serve as what she calls conscription devices on the workshop floor.

Of course, documents constitute a form of what Latour (1986) would refer to as “immutable mobiles,” and with an eye on the mobility of documents, it is worth noting an emerging interest in histories of knowledge that seek to examine how the same documents have been received and absorbed quite differently by different cultural networks (see, e.g., Burke, 2000 ). A parallel concern has arisen with regard to the newly emergent “geographies of knowledge” (see, e.g., Livingstone, 2005 ). In the history of science, there has also been an expressed interest in the biography of scientific objects ( Latour, 1987 :262) or of “epistemic things” ( Rheinberger, 2000 )—tracing the history of objects independent of the “inventors” and “discoverers” to which such objects are conventionally attached. It is an approach that could be easily extended to the study of documents and is partly reflected in the earlier discussion concerning the meaning of the concept of well-being. Note how in all of these cases a key consideration is how words and documents as “things” circulate and translate from one culture to another; issues of content are secondary.

Clearly, studying how documents are used and how they circulate can constitute an important area of research in its own right. Yet even those who focus on document use can be overly anthropocentric and subsequently overemphasize the potency of human action in relation to written text. In that light, it is interesting to consider ways in which we might reverse that emphasis and instead study the potency of text and the manner in which documents can influence organizational activities as well as reflect them. Thus Dorothy Winsor (1999) has, for example, examined the ways in which work orders drafted by engineers not only shape and fashion the practices and activities of engineering technicians but construct “two different worlds” on the workshop floor.

In light of this, I will suggest a typology (Table 18.3 ) of the ways in which documents have come to be and can be considered in social research.

While accepting that no form of categorical classification can capture the inherent fluidity of the world, its actors, and its objects, Table 18.3 aims to offer some understanding of the various ways in which documents have been dealt with by social researchers. Thus approaches that fit into cell 1 have been dominant in the history of social science generally. Therein documents (especially as text) have been analyzed and coded for what they contain in the way of descriptions, reports, images, representations, and accounts. In short, they have been scoured for evidence. Data-analysis strategies concentrate almost entirely on what is in the “text” (via various forms of content analysis). This emphasis on content is carried over into cell 2 type approaches, with the key difference that analysis is concerned with how document content comes into being. The attention here is usually on the conceptual architecture and socio-technical procedures by means of which written reports, descriptions, statistical data, and so forth are generated. Various kinds of discourse analysis have been used to unravel the conceptual issues, while a focus on socio-technical and rule-based procedures by means of which clinical, police, social work, and other forms of records and reports are constructed has been well represented in the work of ethnomethodologists (see Prior, 2011). In contrast, and in cell 3, the research focus is on the ways in which documents are called upon as a resource by various and different kinds of “user.” Here concerns with document content or how a document has come into being are marginal, and the analysis concentrates on the relationship between specific documents and their use or recruitment by identifiable human actors for purposeful ends. I have already pointed to some studies of the latter kind in earlier paragraphs (e.g., Henderson, 1995). Finally, the approaches that fit into cell 4 also position content as secondary. The emphasis here is on how documents as “things” function in schemes of social activity and on how such things can drive, rather than be driven by, human actors. In short, the spotlight is on the vita activa of documentation, and I have provided numerous examples of documents as actors in other publications (see Prior, 2003; 2008; 2011).

Content analysis was a method originally developed to analyze mass media “messages” in an age of radio and newspaper print, and well before the digital age. Unfortunately, it struggles to break free of its origins and continues to be associated with the quantitative analysis of “communication.” Yet as I have argued, there is no rational reason why its use has to be restricted to such a narrow field, for it can be used to analyze printed text and interview data (as well as other forms of inscription) in various settings. What it cannot overcome is the fact that it is a method of analysis and not a method of data collection. However, as I have shown, it is an analytical strategy that can be integrated into a variety of research designs and approaches—cross-sectional and longitudinal survey designs, ethnography and other forms of qualitative design, and secondary analysis of pre-existing data sets. Even as a method of analysis it is flexible and can be used either independent of other methods or in conjunction with them. As we have seen, it is easily merged with various forms of discourse analysis and can be used as an exploratory method or as a means of verification. Above all, perhaps, it crosses the divide between “quantitative” and “qualitative” modes of inquiry in social research and offers a new dimension to the meaning of mixed-methods research. I recommend it.

Source: Prior (2008).

Abbott, A. ( 1992 ). What do cases do? In C. C. Ragin , and H. S. Becker (Eds.). What is a case? Exploring the foundations of social inquiry . Cambridge: Cambridge University Press, 53–82.

Altheide, D. L. ( 1987 ). Ethnographic Content Analysis.   Qualitative Sociology , 10 (1): 65–77.

Arksey, H., O’Malley, L. (2005). Scoping studies: Towards a methodological framework. International Journal of Social Research Methodology, 8: 19–32.

Babbie, E. ( 2013 ). The practice of social research. 13th ed . Belmont, CA: Wadsworth.

Bazerman, C. ( 1988 ). Shaping written knowledge. The genre and activity of the experimental article in science . Madison, WI: University of Wisconsin Press.

Becker, G. ( 1997 ). Disrupted lives. How people create meaning in a chaotic world . London: University of California Press.

Berelson, B. ( 1952 ). Content analysis in communication research . Glencoe, IL: Free Press.

Bowker, G. C. and Star, S. L. ( 1999 ). Sorting things out. Classification and its consequences . Cambridge, MA: MIT Press.

Braun, V. , Clarke, V. ( 2006 ). Using Thematic Analysis in Psychology.   Qualitative Research in Psychology , 3 : 77–101.

Breuer, J. , Freud, S. ( 2001 ). Studies on Hysteria. In Strachey, L. (Ed.). The standard edition of the complete psychological works of Sigmund Freud . Vol. 2 . London: Vintage.

Bryman, A. ( 2008 ). Social research methods . 3rd Ed. Oxford: Oxford University Press.

Burke, P. (2000). A social history of knowledge. From Gutenberg to Diderot. Cambridge: Polity Press.

Bury, M. ( 1982 ). Chronic illness as biographical disruption.   Sociology of Health and Illness , 4 : 167–182.

Carley, K. ( 1993 ). Coding choices for textual analysis. A comparison of content analysis and map analysis.   Sociological Methodology , 23 : 75–126.

Charon, R. ( 2006 ). Narrative medicine. Honoring the stories of illness . New York: Oxford University Press.

Creswell, J. W. ( 2007 ). Designing and conducting mixed methods research . Thousand Oaks, CA: Sage.

Davison, C. , Davey-Smith, G. , Frankel, S. ( 1991 ). Lay epidemiology and the prevention paradox.   Sociology of Health & Illness , 13 (1): 1–19.

Evans, M. , Prout, H. , Prior, L. , Tapper-Jones, L. , Butler, C. ( 2007 ). A qualitative Study of Lay Beliefs about Influenza,   British Journal of General Practice , 57 :352–358.

Foucault, M. ( 1970 ). The Order of things. An archaeology of the human sciences . London: Tavistock.

Frank, A. ( 1995 ). The wounded storyteller: Body, illness, and ethics . Chicago: University of Chicago Press.

Gerring, J. ( 2004 ). What is a case study, and what is it good for?   The American Political Science Review , 98 (2): 341–354.

Gilbert, G.N. , Mulkay, M. ( 1984 ). Opening Pandora’s box. A sociological analysis of scientists’ discourse . Cambridge: Cambridge University Press.

Glaser, B.G. , Strauss, A.L. ( 1967 ). The discovery of grounded theory. Strategies for qualitative research . New York: Aldine De Gruyter.

Goode, W. J. , Hatt, P. K. ( 1952 ). Methods in social research . New York: McGraw-Hill.

Greimas, A. J. (1970). Du Sens. Essais sémiotiques. Paris: Éditions du Seuil.

Habermas, J. ( 1987 ). The theory of communicative action. Vol.2. A critique of functionalist reason . ( T. McCarthy , trans.). Cambridge: Polity Press.

Harper, R. ( 1998 ). Inside the IMF. An ethnography of documents, technology, and organizational action . London: Academic Press.

Henderson, K. ( 1995 ). The political career of a prototype. Visual representation in design engineering,   Social Problems , 42 (2): 274–299.

Heritage, J. ( 1991 ). Garfinkel and ethnomethodology . Cambridge: Polity Press.

Hydén, L-C. ( 1997 ). Illness and narrative,   Sociology of Health & Illness , 19 (1): 48–69.

Kahn, R. , Cannell, C. ( 1957 ). The dynamics of interviewing. Theory, technique and cases . New York: Wiley.

Kay, L. E. ( 2000 ). Who wrote the book of life? A history of the genetic code . Stanford, CA: Stanford University Press.

Kleinman, A. , Eisenberg, L. , Good, B. ( 1978 ). Culture, illness & care, clinical lessons from anthropologic and cross-cultural research.   Annals of Internal Medicine , 88 (2): 251–258.

Kracauer, S. ( 1952 ). The Challenge of Qualitative Content Analysis,   Public Opinion Quarterly, Special Issue on International Communications Research (1952–53) , 16 ( 4 ): 631–642.

Krippendorff, K. ( 2004 ). Content Analysis: An introduction to its methodology, 2nd ed . Thousand Oaks, CA: Sage Publications.

Latour, B. ( 1986 ). Visualization and Cognition: Thinking with Eyes and Hands,   Knowledge and Society, Studies in Sociology of Culture, Past and Present , 6 : 1–40.

Latour, B. ( 1987 ). Science in Action. How to Follow Scientists and Engineers through Society . Milton Keynes: Open University Press.

Livingstone, D. N. ( 2005 ). Text, talk, and testimony: geographical reflections on scientific habits. An afterword,   British Society for the History of Science . 38 (1): 93–100.

Luria, A.R. ( 1975 ). The man with the shattered world. A history of a brain wound . (Trans. L. Solotaroff ). Harmondsworth: Penguin.

Martin, A. , and Lynch, M. ( 2009 ). Counting things and counting people: The practices and politics of counting,   Social Problems , 56 (2): 243–266.

Merton, R.K. ( 1968 ). Social theory and social structure . New York: Free Press.

Morgan, D. L. ( 1993 ). Qualitative content analysis. A guide to paths not taken,   Qualitative Health Research , 2 : 112–121.

Morgan, D. L. ( 1998 ). Practical Strategies for combining qualitative and quantitative methods,   Qualitative Health Research , 8 (3): 362–376.

Morris, P. G. , Prior, L. , Deb, S. , Lewis, G. , et al. ( 2005 ). Patients’ views on outcome following head injury: a qualitative study,   BMC Family Practice , 6 :30.

Neuendorf, K. A. ( 2002 ). The content analysis guidebook . Thousand Oaks, CA: Sage.

Newman, J. , and Vidler, E. ( 2006 ). Discriminating customers, responsible patients, empowered users: consumerism and the modernisation of health care,   Journal of Social Policy , 35 (2): 193–210.

Office for National Statistics ( 2012 ) First ONS Annual Experimental Subjective Well-being Results . London: ONS. Available at: http://www.ons.gov.uk/ons/dcp171766_272294.pdf . Accessed July 2013.

Prior, L. ( 2003 ). Using documents in social research . London: Sage.

Prior, L. ( 2008 ). Repositioning Documents in Social Research.   Sociology. Special Issue on Research Methods , 42 : 821–836.

Prior, L. ( 2011 ). Using documents and records in social research . 4 Vols . London: Sage.

Prior, L. , Hughes, D. , Peckham, S. ( 2012 ). The discursive turn in policy analysis and the validation of policy stories,   Journal of Social Policy , 41 (2): 271–289.

Prior, L. , Evans, M. , Prout, H. ( 2011 ). Talking about colds and flu: The lay diagnosis of two common illnesses among older British people,   Social Science and Medicine , 73 : 922–928.

Ragin, C. C. , Becker, H. S. ( 1992 ). What is a case? Exploring the foundations of social inquiry . Cambridge: Cambridge University Press.

Rheinberger, H.-J. ( 2000 ). Cytoplasmic Particles. The Trajectory of a Scientific Object. In Daston, L. (Ed.). Biographies of scientific objects . Chicago: Chicago University Press, 270–294.

Ricoeur, P. ( 1984 ). Time and narrative . Vol. 1 . ( McLaughlin K. , Pellauer D. trans.) Chicago: University of Chicago Press.

Roe, E. ( 1994 ). Narrative policy analysis, theory and practice . Durham, NC: Duke University Press.

Ryan, G.W. , Bernard, H. R. ( 2000 ). Data management and analysis methods. In Denzin, N.K. , Lincoln, Y.S. (Eds.). Handbook of qualitative research . 2nd ed . Thousand Oaks, CA: Sage, 769–802.

Schutz, A. , Luckmann, T. ( 1974 ). The structures of the life-world . ( Zaner, R. M. , Engelhardt, H.T. , trans.). London: Heinemann.

SPSS. ( 2007 ). Text Mining for Clementine . 12.0 User’s Guide. Chicago: SPSS.

Weber, R.P. ( 1990 ). Basic content analysis . Newbury Park, CA: Sage.

Winsor, D. ( 1999 ). Genre and activity systems. The role of documentation in maintaining and changing engineering activity systems.   Written Communication , 16 (2): 200–224.

Zimmerman, D. H. , Pollner, M. ( 1971 ). The everyday world as a phenomenon. In Douglas, J. D. (Ed). Understanding everyday life . London: Routledge and Kegan Paul, 80–103.



The Ultimate Guide to Qualitative Research - Part 2: Handling Qualitative Data


Content analysis

Qualitative data collection usually leads to a strictly qualitative data analysis, but that need not always be the case. If a required analysis involves quantifying data, there are a number of data organization and data analysis methods that might be helpful in giving structure to raw data for frequency or statistical analysis.

This part of the guide will explore the idea of quantitative content analysis. Where quantitative analysis is useful, there are tools in qualitative data analysis software like ATLAS.ti that can reorganize your data for a content analysis that can supplement your use of qualitative research methods. Let's explore content analysis by providing a brief overview of this approach, then by looking at the quantitative aspects of content analysis.

What is meant by content analysis?

Content analysis, in its simplest form, is a research method for interpreting and quantifying textual data, such as speeches, interviews, articles, social media posts, and so on. It allows researchers to sift through large volumes of data to identify patterns, themes, or biases and turn these into quantifiable variables that can be further analyzed.

At its core, content analysis combines elements of both qualitative and quantitative research methods. The method itself is systematic and replicable, aiming to condense a significant amount of text into fewer content categories based on explicit rules of coding. Yet, the interpretive component of understanding the context, nuances, and underlying meanings of the content being analyzed remains essential, borrowing heavily from qualitative research traditions.

This flexibility makes content analysis a versatile research approach applicable to numerous disciplines, such as communication, marketing, sociology, psychology, and political science, among others. Its uses include studying cultural shifts over time, media representation of specific groups, political speeches, sentiments expressed in social media, and much more.

Differences from other research methods

The uniqueness of content analysis primarily stems from its ability to convert qualitative textual data into quantitative data, which can then be systematically examined. This capability sets it apart from many other research methodologies, each of which has its strengths and weaknesses.

Content analysis offers a less intrusive way of understanding a subject matter or phenomenon than more interpretive approaches. Unlike with an ethnographic or observational approach, there's no direct involvement with the study's subjects. Instead, the researcher examines texts and communications to uncover patterns, themes, or biases. This can be especially advantageous when researching sensitive topics or populations that are difficult to access.

Contrasting with quantitative methods associated with surveys and experiments, content analysis allows for a more contextual and nuanced understanding of data. While surveys and experiments can yield numerical data about attitudes, behaviors, and opinions, they often lack depth and fail to capture the richness of subjective experiences. Content analysis, on the other hand, provides more depth by enabling the researcher to delve into the intricacies of language and communication.

In comparison to discourse analysis, another method for studying a text, content analysis is more focused on the manifest content - the actual text - rather than the underlying discourses or power dynamics. Discourse analysis typically explores the relationships between text, context, and societal structures.

Lastly, unlike thematic analysis, which identifies, analyzes, and reports themes within data, content analysis goes a step further by transforming these themes into measurable variables. This quantification allows researchers to perform statistical analyses, giving content analysis an edge in examining the relationships between variables.

In essence, content analysis straddles the line between qualitative and quantitative methodologies, extracting the best of both worlds. It allows researchers to maintain the depth and richness of qualitative data while taking advantage of the numerical robustness of quantitative analysis. This makes content analysis a valuable addition to the researcher's toolkit.

Advantages of content analysis

Content analysis offers several advantages that make it a valuable tool for researchers in various disciplines. These advantages extend across its methodological flexibility, analytical depth, and practical adaptability.

  • Methodological flexibility: Content analysis allows for both qualitative and quantitative research, enabling researchers to explore themes in-depth while also making quantifiable comparisons. It's a versatile method, adaptable to a variety of research questions and data sources.
  • Rich, in-depth analysis: Content analysis provides a rich, textured understanding of data. Coding and categorizing allow researchers to delve into the complexities of language and communication, exploring nuanced meanings and connotations.
  • Unobtrusive method: As content analysis involves studying existing texts and communications, it is an unobtrusive method that does not require interaction with research participants. This can make it an excellent choice for sensitive research topics.
  • Ability to handle large data sets: Content analysis can manage large volumes of textual data, making it suitable for studies involving extensive texts or long timeframes. As we will see later in this section, the coding process in a content analysis approach can be relatively straightforward, which helps at this scale.
  • Replicability: The systematic nature of content analysis lends itself to replicability. By creating explicit rules for coding and categorizing, other researchers can reproduce the study, enhancing its reliability.
  • Longitudinal analysis: Content analysis allows for longitudinal studies, as it can examine texts and communication over extended periods. This ability can be invaluable for tracking changes and trends over time.
  • Cost-effective: Compared to many other research methods, content analysis can be a cost-effective approach. Since it primarily involves analyzing existing texts, it often requires fewer resources than methods involving primary data collection.

The flexibility, depth, and practicality of content analysis make it a powerful tool for answering a range of research questions. Despite some limitations, which we will explore in the next section, the advantages of content analysis often make it an appealing choice for researchers.

Disadvantages of content analysis

While content analysis is a valuable tool, it's essential to acknowledge its limitations. These include:

  • Dependence on the quality of source materials: Content analysis relies on the quality of the source materials. If the documents or texts used for analysis are biased, incomplete, or inaccurate, it can lead to skewed results.
  • Contextual understanding: Texts often derive their meaning from context. Isolating texts for analysis can sometimes result in the loss of crucial contextual information, which may affect the overall interpretation of the results.
  • Coding and categorization limitations: The process of coding and categorizing can be time-consuming and prone to bias or error, potentially affecting the reliability and validity of the results.
  • Lack of depth compared to other qualitative methods: While content analysis allows for in-depth analysis, it may not reach the same level of depth as methods such as interviews or participant observations, particularly when exploring participants' feelings, thoughts, or motivations.
  • Difficulty in establishing causality: Content analysis can identify patterns and associations in the data but establishing causality can be challenging due to its primarily descriptive nature. As a result, conducting conceptual and relational analysis can prove challenging.
  • Focus on manifest content: Content analysis typically focuses on manifest content - the visible, surface content. Latent content, which refers to the underlying meanings or connotations, can sometimes be overlooked, limiting the depth of analysis.

Despite these limitations, with careful consideration and thoughtful application, content analysis remains a useful method. Understanding its potential drawbacks helps researchers apply the method more effectively and interpret their findings with due consideration. The next section will introduce qualitative content analysis, a specific type of content analysis that, while sharing some of the limitations mentioned here, offers unique advantages of its own.

What is qualitative content analysis?

Qualitative content analysis is a specific type of content analysis that primarily focuses on the interpretation and understanding of textual data. While it shares some similarities with its quantitative counterpart—such as the use of systematic and replicable methods—qualitative content analysis tends to dive deeper into the nuances, meanings, and contexts of the data.

At the heart of a qualitative analysis is the process of categorizing and coding data to identify patterns, themes, and relationships. The categories are usually derived inductively—that is, they emerge from the data itself rather than being pre-established. This approach offers a higher degree of flexibility and is especially beneficial when exploring a new or under-researched area.

An excellent example of the application of qualitative content analysis can be seen in qualitative health research. Consider a study examining patients' experiences with a chronic disease, such as diabetes. Here, qualitative content analysis would not only identify and categorize themes related to the disease experience, such as challenges in managing the condition, the impact on daily life, or interactions with healthcare professionals. It could also delve into the patients' psychological or emotional state regarding the management of their condition, as well as their attitudinal and behavioral responses to their condition and the healthcare system. For instance, the analysis might uncover feelings of frustration or resignation, proactive strategies for disease management, or attitudes toward healthcare advice.

Another distinctive characteristic of qualitative content analysis is its emphasis on context. Rather than viewing data in isolation, it considers the broader context in which the communication occurs. It takes into account aspects like the social, cultural, and historical background, the intention of the speaker, and the perception of the audience. This contextual understanding provides a richer, more nuanced analysis.

Also noteworthy is the iterative nature of qualitative content analysis. The process of coding, categorizing, and interpreting the data is not linear but recursive. As the analysis progresses, the researcher may revise the coding scheme, refine categories, and re-interpret the data, gradually enhancing the depth and precision of the analysis.

While qualitative content analysis provides an in-depth understanding of textual data, it can be more time-consuming and require more interpretative skill than quantitative content analysis. However, as we will explore in the next sections, both methods have their unique strengths and can complement each other in providing a comprehensive understanding of the data.


Quantitative content analysis

Having explored content analysis in its broad scope and delved into qualitative analysis methods behind content analysis, we now shift our focus to quantitative content analysis. This approach retains the systematic, objective nature of content analysis but introduces a more numerical, count-based method of analyzing textual data. As such, it stands at the intersection of qualitative and quantitative research paradigms, offering the opportunity to transform the same data used in a qualitative analysis into a form that can be statistically analyzed.

In the subsequent subsections, we will define this research technique, detail the steps involved in its implementation, discuss its benefits and limitations, and illustrate its practical application with some examples. By the end of this section, you should have a solid understanding of quantitative content analysis and its role in your research toolkit.

Defining quantitative content analysis

Quantitative content analysis, also known as deductive or 'classical' content analysis, is used to quantify patterns in textual data. It systematically transforms a text into numerical data, allowing for statistical analysis: the content is categorized and counted to provide an objective, quantifiable overview of its characteristics.

Quantitative content analysis is predominantly concerned with manifest content—the visible, obvious components of the text. It examines what the text explicitly says rather than delving into possible latent meanings or underlying connotations. The text's elements—such as words, phrases, sentences, or specific themes—are coded into predefined categories, and the frequency of these categories is then quantified. This quantification allows for a more precise and broad-scale analysis of the data.

It's important to note that while quantification is a fundamental aspect of this approach, quantitative analysis still involves an element of interpretation. For instance, the development of coding schemes and the categorization of data require the researcher to understand and interpret the content. As such, even though it's labeled as 'quantitative,' this approach maintains a crucial qualitative component.

Despite this, the predominant focus of a quantitative approach is on the numerical, allowing it to provide a structured, replicable, and count-based exploration of textual data. The value of this approach lies in its ability to deliver an empirical, data-driven understanding of the content, enabling researchers to make statistical inferences and comparisons. In the next subsection, we will discuss the steps involved in conducting quantitative content analysis.

Steps in conducting quantitative content analysis

The process typically involves several key steps:

  • Define the research question: The research question should be suitable for a quantitative approach. It should examine the frequency or patterns of certain aspects in a body of text.
  • Select the sample: Based on the research question, decide what texts to analyze. The texts could be anything from newspaper articles, social media posts, and speeches to transcripts of interviews or focus groups. Make sure to define a clear and replicable strategy for sample selection.
  • Define categories and develop a coding scheme: This step involves identifying the aspects of the text you are interested in and developing a set of categories to classify these aspects. Each category should be clearly defined, mutually exclusive, and collectively exhaustive.
  • Pilot-test the coding scheme: Before you start the actual analysis, it is advisable to pilot-test the coding scheme on a smaller subset of the sample. This helps ensure that your categories cover all relevant aspects of the content and that the coding scheme is reliable.
  • Code the content: In this step, the selected content is coded according to the coding scheme. Each part of the content that corresponds to a category is counted as a 'unit.' The units could be individual words, phrases, sentences, paragraphs, or even entire documents, depending on the research question and the nature of the categories.
  • Analyze and interpret the data: The coded data is then analyzed, often using statistical methods. You can calculate the frequencies of each category, compare frequencies between different parts of the text or different texts, or examine the relationships between categories. The analysis should be linked back to the research question and the wider context of the research.
  • Present the findings: Finally, the findings are reported in a clear and comprehensible manner, often using tables or graphs to display the frequencies of categories. It's also important to discuss the findings in the context of the research question and existing literature.
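To make the coding and counting steps concrete, here is a minimal Python sketch of a keyword-based coding scheme applied to a handful of text units. The category names, keywords, and sample reviews are all hypothetical and chosen only for illustration; a real coding scheme would come out of your research question and pilot testing. As a simplification, the sketch lets one unit fall into several categories, which treats each category as a yes/no variable per unit rather than enforcing mutual exclusivity.

```python
import re
from collections import Counter

# Hypothetical coding scheme: each category maps to keywords that signal it.
CODING_SCHEME = {
    "waiting_time": {"wait", "waited", "delay", "late"},
    "communication": {"explained", "listened", "rude", "friendly"},
}

def code_unit(text, scheme=CODING_SCHEME):
    """Return the set of categories whose keywords appear in one coding unit."""
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    return {category for category, keywords in scheme.items() if tokens & keywords}

def category_frequencies(units, scheme=CODING_SCHEME):
    """Count how many units were assigned to each category."""
    counts = Counter({category: 0 for category in scheme})
    for unit in units:
        counts.update(code_unit(unit, scheme))
    return counts

# Made-up units of analysis (here, one review per unit).
reviews = [
    "I waited an hour and nobody explained the delay.",
    "The nurse listened carefully and was friendly.",
    "Parking was easy.",
]
frequencies = category_frequencies(reviews)
for category, count in frequencies.items():
    print(category, count)
```

The third review matches no category at all. In a pilot test, units like this are a signal that the scheme may not yet be collectively exhaustive and needs another category or a broader definition.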

These steps provide a general framework for conducting quantitative content analysis. However, depending on the specifics of your research project, you may need to adapt or expand on these steps. For instance, if your research involves a large volume of text or multiple coders, you may need to include additional steps to ensure the consistency and reliability of the coding process.
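When several coders apply the same scheme, one common way to check consistency is to have two of them code the same units and compute a chance-corrected agreement statistic. The sketch below uses Cohen's kappa, which is one option among several and is an assumption here, not something the steps above prescribe. The category labels and coder decisions are invented for illustration.

```python
from collections import Counter

def cohens_kappa(codes_a, codes_b):
    """Chance-corrected agreement between two coders on the same units."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must code the same non-empty set of units.")
    n = len(codes_a)
    # Share of units on which the two coders chose the same category.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Agreement expected by chance, from each coder's category proportions.
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # Both coders used a single identical category throughout.
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned by two coders to the same six units.
coder_a = ["care", "care", "wait", "wait", "care", "wait"]
coder_b = ["care", "wait", "wait", "wait", "care", "care"]
kappa = cohens_kappa(coder_a, coder_b)
print(round(kappa, 2))
```

Here the coders agree on four of six units, but half of that agreement would be expected by chance, so kappa is well below the raw agreement rate. A low value usually points back to category definitions that need sharpening before the main coding phase.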

Practical examples of quantitative content analysis

With the process outlined above in mind, here are a few practical examples of content analysis applied quantitatively.

One common application of quantitative content analysis is in media studies. For instance, a researcher might use it to examine the representation of gender roles in a sample of popular movies. The researcher could define a set of categories reflecting different aspects of gender representation, such as the occupation, behaviors, or speech of male and female characters. By coding and quantifying these categories, the researcher could provide an empirical, data-driven analysis of gender representation in movies.

In political science, a researcher might use quantitative content analysis to analyze politicians' speeches. For example, they could examine the frequency of certain themes or keywords to gain insights into a politician's focus areas or ideological leanings. This approach allows for a systematic, objective assessment of political communication.
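Raw keyword counts are hard to compare when speeches differ in length, so a common adjustment is to express them as a rate. The following sketch computes keyword occurrences per 1,000 words; the keyword set and the synthetic "speech" are hypothetical placeholders, and the per-1,000 normalization is one convention among several.

```python
import re

def rate_per_thousand(text, keywords):
    """Occurrences of any keyword per 1,000 words of text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in keywords)
    return 1000 * hits / len(tokens)

# Hypothetical theme keywords and a synthetic 100-word speech.
economy_keywords = {"jobs", "wages", "taxes"}
speech = " ".join(["jobs"] * 5 + ["other"] * 95)
economy_rate = rate_per_thousand(speech, economy_keywords)
print(economy_rate)
```

Computing the same rate for each speaker or each year gives a set of comparable numbers that can then be tabulated or tested statistically.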

In health research, quantitative content analysis could be used to analyze patient reviews of healthcare providers. Categories could be developed to capture aspects like the quality of care, communication skills, waiting times, etc. By coding and quantifying these categories, the researcher could identify patterns and trends in patient satisfaction.

These examples illustrate the breadth of applications for quantitative content analysis. Whether you're exploring social issues, political discourses, customer reviews, or any other type of textual data, quantitative content analysis provides a method for systematically coding, categorizing, and quantifying your data. By offering a way to transform qualitative data into a form that can be statistically analyzed, it adds a valuable tool to your research toolkit.

Using ATLAS.ti for content analysis

ATLAS.ti is particularly useful to researchers who want to conduct content analysis from both quantitative and qualitative approaches. If your research inquiry relies more on interpretation to identify patterns and their prevalence in the data, thematic analysis may be more appropriate for your study.

On the other hand, when you are relying on counting words or phrases to determine key insights, a quantitative approach to content analysis will be a useful component of your study's methodology. To facilitate your analysis, a number of tools in ATLAS.ti will provide you with the ability to conduct a quantitative inquiry.

Word Frequencies and Concepts

A word cloud is a common but meaningful visualization in qualitative research, as it shows what words appear more often than others. The greater the frequency of a word, the closer to the center of the cloud that word is placed. While a word cloud relies on statistics, it presents the analysis in a visual manner that allows your research audience to quickly grasp the meaning.

ATLAS.ti's Word Cloud tool determines the frequencies and creates the visualization quickly and easily. All the researcher needs to do is select the documents they want to analyze. They can then refine their word cloud by including or excluding certain classes of words, such as adverbs or determiners, or by setting a required minimum frequency for the word to appear in the cloud.
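The counting behind a word cloud can be sketched in a few lines of Python. This is not how ATLAS.ti works internally; it is a simplified illustration in which a small hand-written stop-word list stands in for filtering by word class, and `min_count` plays the role of the minimum-frequency setting. The documents are invented.

```python
import re
from collections import Counter

# A tiny, hypothetical stop-word list standing in for word-class filters.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "is", "in", "we", "our"}

def word_frequencies(documents, stop_words=STOP_WORDS, min_count=2):
    """Count words across documents, dropping stop words and rare words."""
    counts = Counter()
    for doc in documents:
        tokens = re.findall(r"[a-z]+", doc.lower())
        counts.update(token for token in tokens if token not in stop_words)
    return {word: n for word, n in counts.most_common() if n >= min_count}

docs = [
    "We must preserve the forest and preserve the river.",
    "The forest is our future.",
]
top_words = word_frequencies(docs)
print(top_words)
```

The resulting word-to-count mapping is the kind of input a word cloud renderer needs; raising `min_count` or extending the stop-word list trims the cloud in the same spirit as the refinements described above.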

The Concepts tool works similarly to Word Clouds, except it relies on collocations of words to determine which phrases are more prevalent in your data than others.


Once the researcher selects the data they want to analyze, the words included in the most common concepts will appear in a visualization resembling a word cloud. Hovering over any of these words will show which phrases are relevant to that word and where those phrases can be found in the data. This allows the researcher to look at the phrase in context and add codes as necessary.
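A very rough approximation of collocation counting is to tally adjacent word pairs (bigrams) and keep the ones that recur. The sketch below does only that; it is an assumption-laden illustration, not a description of the Concepts tool, which the text above does not specify in detail. The helper `phrases_containing` is a hypothetical stand-in for looking up the phrases linked to one word.

```python
import re
from collections import Counter

def bigram_counts(documents):
    """Count adjacent word pairs within each document."""
    counts = Counter()
    for doc in documents:
        tokens = re.findall(r"[a-z]+", doc.lower())
        counts.update(zip(tokens, tokens[1:]))
    return counts

def phrases_containing(word, counts, min_count=2):
    """List recurring two-word phrases that include the given word."""
    return sorted(
        " ".join(pair) for pair, n in counts.items()
        if word in pair and n >= min_count
    )

docs = [
    "Climate change affects water supply.",
    "Water supply depends on climate change policy.",
]
pairs = bigram_counts(docs)
print(phrases_containing("climate", pairs))
```

Because each document here is a single sentence, no pair spans a sentence boundary. With longer documents you would likely split into sentences first so that the last word of one sentence is not paired with the first word of the next.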

Text Search

Most people are familiar with a text search function in a word processor or a web browser. ATLAS.ti's Text Search tool has a similar search capability but also employs language models developed through machine learning to help you expand your search quickly and efficiently.

When entering a word to search, the researcher can also choose from a list of synonyms they can include in their search. In research on sustainability, for example, the words "preserve" and "save" might be similar enough to be included in one inquiry. As a result, ATLAS.ti allows the researcher to choose related words relevant to their search.


Searching for inflected forms is also important to a quantitative approach to content analysis. Given that "preserves," "preserving," and "preservation" all come from the word "preserve," it's only appropriate to include them in one search. The option in ATLAS.ti to search for inflected forms makes it easy to search the data for all possible versions of a word. And in Text Search, all results can be easily coded so those codes can be used in content analysis.
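Outside a dedicated tool, a crude way to catch several forms of a word is to search for a shared stem with a regular expression. The sketch below assumes a hand-picked stem ("preserv") and simple prefix matching; this is a simplification, since proper handling of inflected forms generally relies on lemmatization or a language model rather than prefixes. The documents are invented.

```python
import re

def find_word_family(stem, documents):
    """Find every word that starts with the stem, with its location."""
    pattern = re.compile(rf"\b{re.escape(stem)}\w*", re.IGNORECASE)
    hits = []
    for doc_index, doc in enumerate(documents):
        for match in pattern.finditer(doc):
            # Record document index, character offset, and the matched word.
            hits.append((doc_index, match.start(), match.group(0)))
    return hits

docs = [
    "Preserving wetlands preserves biodiversity.",
    "Forest preservation helps us preserve water; reserves are not enough.",
]
hits = find_word_family("preserv", docs)
for doc_index, offset, word in hits:
    print(doc_index, offset, word)
```

The word boundary keeps "reserves" out of the results, but prefix matching can still over-match (for example, "preservative" would be caught), so the hits are best reviewed in context before they are coded and counted.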


Content Analysis, Qualitative

  • By: Margrit Schreier | Edited by: Paul Atkinson, Sara Delamont, Alexandru Cernat, Joseph W. Sakshaug & Richard A. Williams
  • Publisher: SAGE Publications Ltd
  • Publication year: 2019
  • Online pub date: September 17, 2019
  • Discipline: Anthropology, Business and Management, Communication and Media Studies, Computer Science, Counseling and Psychotherapy, Criminology and Criminal Justice, Economics, Education, Engineering, Geography, Health, History, Marketing, Mathematics, Medicine, Nursing, Political Science and International Relations, Psychology, Social Policy and Public Policy, Science, Social Work, Sociology, Technology
  • Methods: Content analysis, Qualitative data analysis
  • Length: 10k+ Words
  • DOI: https://doi.org/10.4135/9781526421036753373
  • Online ISBN: 9781529745276

This entry focuses on qualitative content analysis as a rule-guided method for describing and conceptualizing the meaning of qualitative data. Following a brief introduction to core characteristics of the method, the history of the method is described, including its origins in the quantitative version of the method as well as the divergent history of qualitative content analysis in the English and German literature. Next, core defining characteristics of the method are described, with a focus on qualitative content analysis being at once systematic, flexible, and reducing the amount of qualitative data. Based on these defining characteristics, different variants of qualitative content analysis are introduced and compared, such as deductive and inductive, thematic and formal, and type-building qualitative content analysis. The main part of the contribution focuses on describing and illustrating the steps in qualitative content analysis: deciding on a research question and selecting material; creating a preliminary version of the coding frame, including strategies for arriving at main categories and subcategories and how to define categories; piloting and modifying the coding frame, including a discussion of quality criteria, especially reliability and validity; the main coding phase (i.e., applying the coding frame to the entire material); and various strategies for presenting the results of qualitative content analysis. To conclude, recent developments concerning the method are described. These include attempts to strengthen the specifically qualitative elements of the method and discussing the role of qualitative content analysis in the context of big data and text mining.

Introduction

Qualitative content analysis is a rule-guided method for analysing qualitative data. This analysis is conducted by creating a coding frame consisting of categories that capture relevant meaning and assigning passages from the material to these categories. Categories can be developed either inductively, based on the material, or deductively, drawing on information such as theory, prior research, or an interview guide. Typically, a combination of data- and concept-driven approaches is used in creating categories. The method is suitable primarily for analysing textual data of various kinds, such as interviews, focus groups, documents, tweets, and the like, whereas its applicability to visual material is more limited. If the data have been collected as part of a study (e.g., interviews), the material is usually transcribed prior to conducting qualitative content analysis. The method helps to condense the data and typically provides an overview of core issues addressed and of key concepts. Qualitative content analysis is used widely across the social sciences—for example, in media and communication studies, educational research, psychology, sociology, political science, and nursing and health research.

Versions of qualitative content analysis differ in terms of how categories are created, which kinds of categories are created, which validation procedures are used, and how the results of the coding process are used. Most versions, however, share a common core that simultaneously sets them apart from other methods for qualitative data analysis: First, qualitative content analysis is not rooted in any specific ontology or epistemology, making it highly flexible and suited to a variety of research contexts and questions. Moreover, in applying the method, the researcher follows a number of steps, making the application systematic and providing researchers, especially novice researchers, with guidance. Finally, applying the method reduces and abstracts the material, providing an overview. This reductive nature of the method accounts for its popularity in analysing large amounts of data as they typically occur in qualitative research, such as interview transcripts.

This entry starts by outlining the history of qualitative content analysis, followed by a definition of the method and an overview of its different variants. The core of the contribution consists of a description of the steps in conducting qualitative content analysis. This is followed by an outline of current developments and discussions.

The Origins and Definition of Qualitative Content Analysis

The Origins of Qualitative Content Analysis

Qualitative content analysis originates in the quantitative version of the method (on quantitative content analysis, see Krippendorff, 2012; Neuendorf, 2016). Quantitative content analysis in turn is rooted in communication studies: As newspapers began to appear in the second half of the 19th century, the quantitative analysis of newspaper content also gained ground. Three factors were crucial for the subsequent development of quantitative content analysis into a method proper during the first half of the 20th century: the rise of popularity of the social sciences; the advent of radio and television, accompanied by an increasing interest in advertising; and the concern with the analysis of Nazi propaganda in the context of the Second World War under the directorship of Harold Lasswell. All three factors contributed to increasing the theoretical grounding of content analysis and to widening the focus of the analysis. Whereas early quantitative newspaper analysis had limited itself to the analysis of media content, quantitative content analysis increasingly included the contexts of production and reception in the analysis (on the history of content analysis, see Krippendorff, 2012, chapter 1).

Based on these developments, Bernard Berelson (1952) published what would subsequently become the foundational textbook of quantitative content analysis in which he defined the method as “a research technique for the systematic, objective, and quantitative description of the manifest content of communication” (p. 18). This definition highlights some of the core features of the quantitative version of the method: The focus is on manifest content, limiting the amount of interpretation involved in determining textual meaning. It is systematic (i.e., carried out in an invariant manner regardless of the research question and the material). Along the same lines, it adheres to the ideal of objectivity (i.e., the results of the method are assumed to be independent of the researchers and coders). Finally, the results are presented in a quantitative format, typically providing the frequencies of selected words and themes.

Over the subsequent years, this original definition has been modified so as to include latent content, to contextualize media content, and to focus on the interrelations of themes more than on absolute frequencies. Nevertheless, the orientation towards providing systematic, intersubjective (as a proxy for objective), and quantitative descriptions has remained in place and continues to define the quantitative version of the method. Also, quantitative content analysis is situated within a deductive framework and employs mostly concept-driven categories, sometimes for hypothesis testing.

This quantitative type of content analysis was criticized by Siegfried Kracauer as early as 1952, the same year when Berelson published his textbook. Kracauer objected to both the focus on manifest content and on frequencies, arguing that meaning is complex, context dependent, and invariably requires some degree of interpretation and that frequency is only an imperfect indicator of importance. His line of reasoning was subsequently taken up by George (1959) who advocated a nonfrequency type of content analysis.

In spite of these early criticisms and suggestions for developing a qualitative version of the method, in the English-speaking literature qualitative content analysis has only recently gained prominence as a method in its own right (Bengtsson, 2016; Forman & Damschroeder, 2007; Hsieh & Shannon, 2005; Mayring, 2000; 2014; Schreier, 2012; 2014). There are a number of reasons for this. First, with quantitative content analysis opening up towards latent content, the dividing lines between the quantitative and qualitative versions of the method are not always clear. Also, some authors (e.g., Bernard & Ryan, 2016), when presenting what they term qualitative content analysis, actually describe quantitative content analysis. Second, there are different versions of qualitative content analysis, obscuring the profile of the method. Finally, the distinction between qualitative content analysis and other methods for qualitative data analysis is not always clear. Kuckartz (2014), for example, presents his textbook on qualitative content analysis under the generic heading of qualitative text analysis. There is also considerable overlap between qualitative content analysis and thematic analysis (Boyatzis, 1998; Vaismoradi, Jones, Turunen, & Snelgrove, 2016) and between qualitative content analysis and certain types of coding (e.g., emotion coding or descriptive coding, according to Saldana, 2016).

In Europe, however, especially in Germany, there exists a long-standing tradition of qualitative content analysis, taking up and elaborating on the criticism of the quantitative version of the method put forward by Kracauer, with Mayring (2000; 2014) as the major proponent of qualitative content analysis in Germany since the 1990s.

Definition of Qualitative Content Analysis

Considering the number of qualitative approaches to content analysis, it is not surprising that there has been some confusion about the exact definition and core of the method (Stamann, Janssen, & Schreier, 2016). A closer look shows, however, that differences between current approaches concern only the details of what emerges as a mostly invariant procedure (Schreier, 2014). At its core, qualitative content analysis is concerned with systematically describing and conceptualizing textual meaning that is at least partly latent and requires some degree of interpretation. Following the specification of a research question and data collection (these are sometimes considered part of the process of qualitative content analysis), this typically involves developing a coding frame, pilot testing and modifying the frame, applying it to all the material, and presenting and interpreting the results. When categories are developed inductively, the process is usually iterative, continuously revising the frame as additional material is read and examined for pertinent meaning. The prototype of this variant of qualitative content analysis has been generically presented as “qualitative content analysis” (e.g., Bengtsson, 2016; Elo et al., 2014; Schreier, 2012). It likewise corresponds to structuring or summarizing content analysis according to Mayring (2014), thematic content analysis according to Kuckartz (2014), and conventional and directed content analysis according to Hsieh and Shannon (2005).

The method is at once systematic, reductive, and flexible. It is systematic to the extent that all relevant material is included in the analysis, that a codebook explaining each category is required, and that at least part of the material is typically double-coded or examined by more than one researcher. By describing meaning based on a coding frame consisting of a number of interrelated categories, meaning is abstracted and thereby reduced. This entails the loss of individual meaning in each case (e.g., interview). At the same time, additional meaning and information is gained, since abstracting and conceptualizing meaning in terms of categories allows for a comparison across cases. Finally, qualitative content analysis is flexible insofar as categories are adapted to the research question and the material at hand. Unlike quantitative content analysis, for which generic coding frames have been developed, qualitative content analysis always involves some data-driven categories to ensure a good representation of the material. Whereas the core quality criterion in quantitative content analysis is reliability, qualitative content analysis places at least as much emphasis on the validity of the coding frame as on its reliability. Reliability here refers to the extent to which the double coding of the same material using the same coding frame results in similar coding. Validity refers to the extent to which the coding frame is able to capture and adequately represent the meaning of the material.

Variants of Qualitative Content Analysis

The various types of qualitative content analysis differ primarily concerning the following aspects: how categories are generated, the types of categories, and the combination with other procedures.

As for how categories are developed, Hsieh and Shannon (2005) distinguish between conventional and directed qualitative content analysis, whereby the conventional variant refers to creating categories in a data-driven process and the directed variant refers to developing categories in a concept-driven way. A similar, though not identical, distinction is made by Mayring (2014) who differentiates between strategies for developing categories inductively (e.g., summarizing) and deductively (e.g., structuring). The distinction between developing categories inductively and deductively is common in the literature on qualitative content analysis, and some authors subsume both strategies of developing categories under the generic heading of qualitative content analysis.

The second distinction reflected in the different variants of qualitative content analysis refers to the types of categories or, putting it differently, to the features of the material that are described in those categories. The most common type of category is thematic (i.e., it reflects what the material is about). The term thematic is used in a broad sense here, as thematic categories are not limited to themes in the strict sense but often involve some degree of conceptual abstraction. A second type of category involves some degree of scaling (Mayring, 2014; termed evaluative content analysis by Kuckartz, 2014). Building on a prior thematic analysis, additional categories are created so as to capture the direction of an attitude or the extent to which an attitude is expressed or a feature is present in the material (e.g., not at all, somewhat, strongly). Mayring (2000) further mentions formal categories which refer to stylistic and other formal features of the material (e.g., the types of contributions to an argumentative discussion, such as advancing a new thesis or supporting someone else’s argument, or to the camera distance in analysing visual material: close-up, medium, long distance).

A final distinction is highlighted by what Kuckartz (2014) calls type-building qualitative content analysis (see also Mayring, 2014, chapter 6.6.2). Here, generic thematic qualitative content analysis is used and followed by a type-building procedure. Typologies and qualitative content analysis complement each other well. Types are based on the relationships between the categories of the coding frame. They reduce the findings of qualitative content analysis and thus the data even more, creating an easily accessible summary of core findings. At the same time, individual cases (e.g., individual interviewees) can be described in terms of the types (e.g., as typical cases or outliers) and presented in detail. This compensates for the reductive nature of qualitative content analysis.

Most types of qualitative content analysis in the literature are thus variations of thematic content analysis. But there are some exceptions. David Altheide, for example, coming from a qualitative media studies perspective, has developed what he terms ethnographic content analysis, later presented under the heading of “qualitative media analysis” (Altheide & Schneider, 2013). Jochen Gläser and Grit Laudel (2013) have suggested a purely deductive version of qualitative content analysis suitable in particular for research in political science.

Steps in Qualitative Content Analysis

Qualitative content analysis involves the following steps: (1) deciding on a research question and selecting material (sampling), (2) creating a preliminary version of the coding frame, (3) piloting and modifying the coding frame, (4) the main coding phase, and (5) presenting the findings. Up until the main coding phase, these steps are iterative: the researcher will move back and forth between the material and the frame, modifying the frame until it fits the research question and represents the material well. All steps can be supported by computer-assisted qualitative data analysis software (Silver & Lewins, 2014).

Deciding on a Research Question and Selecting Material

This first step is not specific to qualitative content analysis. Any empirical study requires deciding on a research question and selecting or generating suitable data. Yet both aspects are worth stressing here as they are of particular relevance to qualitative content analysis compared to other qualitative research methods.

Many qualitative approaches are characterized by an open, flexible procedure that allows for adjusting the research question in the course of the study. This is typically not the case in qualitative content analysis. Qualitative content analysis entails analysing the material from the perspective of the research question. This means that the core of the research question should be in place at the beginning of the study. Some adjustment is possible and, depending on the material, can even be indicated. But this would merely involve modifying the question or including additional aspects of the topic, not a complete change of emphasis. In any case, deciding on a research question is crucial in qualitative content analysis, as the research question provides the angle from which the data are analysed and thus the starting point for developing the coding frame.

It has been argued that, unlike many other qualitative research methods, qualitative content analysis is not grounded in any specific approach or any specific ontology or epistemology (Forman & Damschroeder, 2007; Janssen, Stamann, Krug & Nägele, 2017). Therefore, the specific sampling strategies and corresponding restrictions, such as theoretical sampling in grounded theory methodology or the replication logic of the case study, do not apply to qualitative content analysis. Instead, material is selected or generated in line with the requirements of the research question, the methodological approach, and the nature of the material.

When material is selected, such material is often from the media, such as newspaper articles, blog contributions, tweets, and the like. For this kind of material, sampling strategies from communication studies are indicated, often involving comparatively large samples and the use of random sampling strategies. Also, selecting material from the media typically requires sampling on several levels. In analysing newspaper articles, for example, the first step involves choosing the newspapers, a suitable time frame, and relevant keywords; from among the articles meeting these criteria, a random sample is then drawn. Depending on the research question and the desired sample size, any sampling strategy may be indicated: random, convenience, or purposive (Schreier, 2018). In this type of study, sample size is typically determined before data analysis begins.
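The multi-level selection of newspaper articles described above can be sketched in code. The following is a minimal illustration only: the `Article` record, the function name `sample_articles`, the newspaper names, and the fixed random seed are all assumptions made for the example, not part of any established procedure.

```python
import random
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Article:
    """Hypothetical record for one newspaper article."""
    newspaper: str
    published: date
    text: str


def sample_articles(articles, newspapers, start, end, keywords, n, seed=0):
    """Level 1: keep articles matching newspaper, time frame, and keywords.
    Level 2: draw a random sample of size n from the eligible articles."""
    eligible = [
        a for a in articles
        if a.newspaper in newspapers
        and start <= a.published <= end
        and any(k.lower() in a.text.lower() for k in keywords)
    ]
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    return rng.sample(eligible, min(n, len(eligible)))


# Invented toy corpus for illustration.
corpus = [
    Article("Daily A", date(2020, 1, 5), "Protest in the city park"),
    Article("Daily A", date(2020, 2, 1), "Football results"),
    Article("Daily B", date(2020, 1, 20), "Park protest continues"),
    Article("Daily C", date(2020, 1, 10), "Protest coverage"),
    Article("Daily B", date(2021, 1, 1), "Protest anniversary"),
]
sample = sample_articles(
    corpus,
    newspapers={"Daily A", "Daily B"},
    start=date(2020, 1, 1),
    end=date(2020, 12, 31),
    keywords=["protest"],
    n=2,
)
```

Recording the selection criteria and the seed alongside the sample makes it possible to document how the material was chosen before analysis begins.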

If the data are generated during the study, for example, through conducting interviews, focus groups, or observation, qualitative content analysis is merely the method used for data analysis within the wider framework of the study. This requires sampling decisions much earlier during the research process, namely before data collection. Sampling is part of the approach that informs the study and not specifically related to qualitative content analysis. Most likely, a purposive sampling strategy will be used, such as criterion sampling, maximum variation, respondent, or theoretical sampling (Patton, 2015; Schreier, 2018). Sample size will usually be small, ranging from 1 to around 40 cases. It can either be determined in advance or sampling is continued until the point of thematic saturation, that is, until the data no longer yield new information that would require generating additional categories.
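The idea of thematic saturation can be made concrete with a small sketch. The stopping rule used here (stop after a fixed number of consecutive cases that add no new category, called `patience`) is an assumption chosen for illustration; the text above does not prescribe a specific rule, and the category names are invented.

```python
def cases_until_saturation(coded_cases, patience=2):
    """Return the number of cases analysed when `patience` consecutive
    cases have added no new category, or None if that never happens.

    coded_cases: iterable of sets, one set of category names per case.
    patience: assumed stopping threshold, not taken from the literature.
    """
    seen = set()
    without_new = 0
    for count, categories in enumerate(coded_cases, start=1):
        new = set(categories) - seen
        seen |= new
        without_new = 0 if new else without_new + 1
        if without_new >= patience:
            return count
    return None  # saturation not reached within the available cases


# Invented example: categories found in five successive interviews.
cases = [
    {"injustice", "solidarity"},
    {"solidarity", "empowerment"},
    {"injustice"},
    {"empowerment"},
    {"police"},
]
stop_at = cases_until_saturation(cases)
```

In this toy example the third and fourth interviews add nothing new, so the rule would stop sampling after the fourth case. In practice the judgement that "nothing new" has emerged is interpretive and cannot be reduced to set comparison.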

Creating a Preliminary Version of the Coding Frame

The coding frame constitutes the core of qualitative content analysis. Hence, the majority of steps in applying the method relate to creating, piloting, modifying, and applying the frame. This section focuses on creating a preliminary version of the frame. This process can also be broken down into subsidiary steps, namely (1) familiarizing, (2) selecting, (3) structuring and generating, (4) defining, and (5) revising and expanding. Before describing what these steps involve, an initial section focuses on the coding frame and its structure.

The Coding Frame

The purpose of the coding frame in qualitative content analysis is to conceptualize and assess relevant meanings, with relevance depending on the research question. All potentially relevant meanings must be represented as a category in the coding frame. Conversely, in analysing the material, it should be possible to determine the meaning of all relevant parts of the material by assigning them to a category in the frame. To be able to determine the meaning of a passage in the material, each category is conceptualized so that it includes certain meanings (those that are covered by the respective category) and excludes others (those that are covered by other categories; Forman & Damschroeder, 2007).

Coding frames consist of at least one main category and several subcategories. Main categories (also called dimensions) refer to aspects in the data that are of interest and that the researcher wants to know more about. Subcategories specify what is said concerning these aspects. The focus is usually on the subcategories, that is, on what is said in the material concerning the dimensions of interest. Yasemin Acar and Özden Melis Uluğ (2016), for example, used qualitative content analysis to analyse 34 interviews they had conducted with activists in Turkey from the Gezi Park protests. They analysed their data with respect to the following three dimensions/main categories, which were all derived from the research questions: (1) reasons for protest participation, (2) solidarity experiences, and (3) empowerment experiences. The subcategories specify what the reasons of the participants are for joining the protests (e.g., the culmination of recent events, experiencing injustice as a common group), how they experienced solidarity during the protests (e.g., newfound appreciation of other groups, interaction and solidarity across groups), and in what ways the protesters felt empowered by their participation (e.g., seeing similarities with other groups, changing their perspectives in line with those of other disadvantaged groups).

With their distinction between main categories and subcategories, coding frames in qualitative content analysis have a hierarchical structure. This implies that main categories and subcategories are not absolute, but relative, functional terms. A category is not per se a main category or a subcategory, but it functions as such in relation to other categories. In principle, the hierarchical structure of a coding frame can contain several levels. In practice, however, more than three or four levels become difficult to handle and apply. Coding usually takes place at the lowest level (i.e., that of the sub-subcategories).

Coding frames and categories should be conceptualized such that they are unidimensional, exhaustive, mutually exclusive, reliable, and valid (on criteria for coding frames, see Mayring, 2014, chapter 7; Schreier, 2012, chapter 9). Categories are unidimensional if they capture only one aspect of meaning at a time. A coding frame is exhaustive to the extent that each relevant passage can in fact be assigned to one of the categories in the frame, that is, to the extent that the meaning of the passage is covered by the frame. Mutual exclusiveness refers to the relationship between the subcategories of the same main category: The meaning of the subcategories within the same main category should be conceptually distinct, and to clearly determine its meaning, each passage should be assigned to only one subcategory within the same main category. Because different main categories refer to different meaning dimensions, assigning a passage to subcategories from different main categories does not present any problems—it may in fact be necessary to do so. The requirement that categories be mutually exclusive thus applies only to the subcategories within one main category, but not to subcategories across different main categories. The remaining requirements for coding frames (reliability, validity, and saturation) are discussed later in this entry.
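The requirements of exhaustiveness and mutual exclusiveness can be illustrated with a small check over coded passages. This is a sketch under stated assumptions: the frame is represented as a plain dictionary of main categories and subcategories (abbreviated from the example above), and the function name `check_coding` and the passage identifiers are hypothetical.

```python
from collections import Counter

# Main category -> set of subcategories (two-level frame, abbreviated).
frame = {
    "reasons for participation": {"culmination of events", "experiencing injustice"},
    "solidarity experiences": {"appreciation of other groups", "solidarity across groups"},
}


def check_coding(frame, coding):
    """Return a list of (passage, problem) pairs.

    coding maps a passage id to a list of (main, sub) assignments.
    A passage may carry subcategories from several main categories,
    but at most one subcategory per main category.
    """
    problems = []
    for passage, assignments in coding.items():
        if not assignments:
            problems.append((passage, "not exhaustive: no category fits"))
        for main, sub in assignments:
            if sub not in frame.get(main, set()):
                problems.append((passage, f"unknown category: {main}/{sub}"))
        per_main = Counter(main for main, _ in assignments)
        for main, k in per_main.items():
            if k > 1:
                problems.append((passage, f"not mutually exclusive within: {main}"))
    return problems


coding = {
    # Fine: one subcategory from each of two different main categories.
    "p1": [("reasons for participation", "experiencing injustice"),
           ("solidarity experiences", "solidarity across groups")],
    # Violation: two subcategories from the same main category.
    "p2": [("reasons for participation", "experiencing injustice"),
           ("reasons for participation", "culmination of events")],
    # Violation: a relevant passage that no category covers.
    "p3": [],
}
problems = check_coding(frame, coding)
```

Passage `p1` passes because its two codes belong to different main categories, whereas `p2` and `p3` are flagged. Such a check only detects formal violations; whether subcategories are conceptually distinct remains a matter of judgement.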

Familiarizing

Qualitative data analysis, including qualitative content analysis, is all about determining and conceptualizing meaning. Therefore, the first step in qualitative content analysis is to become familiar with the material, to gain a sense of relevant topics and concepts. This is done by reading through the material, several times if possible. Throughout this initial reading process, the researcher notes any ideas concerning the analysis and potential categories to be included in the coding frame. If software is used, the memo function can be used to do this. If the study is case-oriented rather than variable-oriented (e.g., a case study, an interview study), this initial phase can also be used to create a short summary for each case (Kuckartz, 2014).

Selecting Relevant Material

The next step in creating a coding frame is to select relevant material. This refers to distinguishing between material that is relevant to the research question and material that is not. Interviewees, for example, will sometimes go off topic. Likewise, material that was created for reasons unrelated to the research (e.g., material on social media, like Facebook profiles or tweets) will likely contain irrelevant parts. It is therefore necessary to define what falls under the research question and what does not. In this, it is advisable to err on the side of caution: If uncertain whether a given topic is relevant or not, it is wise to include it. If possible, it is beneficial to do this step together with other researchers: Others may bring a different perspective to the topic that complements the perspective of the main researcher. Software, if used, can support this step: After saving the original file, all relevant passages can be assigned the basic code “relevant” and colour-coded accordingly, if this function is available. In this way, the relevant material is clearly visible in its original context.

In addition to distinguishing between relevant and irrelevant material, it is important to select material for developing the initial part of the coding frame. Qualitative data are usually rich, and even with the research question providing a focus, capturing all meanings that are relevant can be a daunting task. As a result, coding frames quickly expand and become difficult to handle. Therefore, creating the coding frame one step at a time—for example, beginning with one subtopic—can be advantageous. Likewise, if the data consist of several subsets (e.g., interviews with different groups of participants, data from different time frames), it is helpful to develop the frame for one subset of the material first and then gradually to expand it.

There are no hard-and-fast rules as to how much material is needed to develop the preliminary version of a coding frame. As mentioned earlier, the coding frame should be exhaustive, that is, it should be possible to assign all relevant passages to one of the subcategories. Consequently, all the different meanings that are to be represented in the frame should also be present in the material used for developing the frame. If, on the one hand, the material is fairly homogeneous (if, e.g., all interviewees address similar points), 20–30% of the material may be sufficient for creating an exhaustive frame. If, on the other hand, the material is quite heterogeneous, encompassing a wide variety of meanings, it may be necessary to use all the available material until an exhaustive frame has been developed.

Structuring and Generating

Main categories provide a basic structure from which the material is analysed. Therefore, developing main categories is referred to as structuring, whereas creating subcategories is called generating.

There are two basic strategies for developing the categories in a coding frame: concept-driven and data-driven procedures. When using a concept-driven strategy, researchers rely on previous knowledge. This can be knowledge that has been formalized in theoretical terms. Here, theoretical concepts are turned into categories. But less formalized knowledge is also suitable, such as the findings of previous studies on the topic. If the data were collected as part of a study and a topic or observation guide was used, the data collection instrument can also serve as previous knowledge in developing a coding frame (e.g., interview questions). Concept-driven procedures are especially useful for structuring the material (i.e., creating main categories). They can also be used to add subcategories. But to obtain a valid and exhaustive frame, using a concept-driven strategy only usually will not be sufficient and data-driven categories have to be added.

Data-driven categories are generated based on the material. Strategies for developing data-driven categories fall into two groups. The first group comprises strategies for generating subcategories with a view to a given main category, and the second group consists of strategies for developing an entire coding frame in a data-driven manner. The most commonly used strategy for generating subcategories to a given main category is subsumption (Mayring, 2014): The material is examined until a first passage that falls under the given main category is found, and a subcategory that captures the core theme or concept underlying this passage is created. The next passage that falls within the scope of the main category is examined as to whether its meaning is already captured by the existing subcategories. If this is not the case, a new subcategory is created. Deciding whether the meaning of a passage has already been captured requires a constant (re)assessment of similarity and difference of meaning. It is generally useful to generate a new subcategory if the meaning in question is either conceptually relevant or if it occurs repeatedly throughout the material. A variant of this strategy that is especially useful for contrasting cases is presented by Boyatzis (1998).
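The passage-by-passage procedure for generating subcategories under a given main category can be sketched as a simple loop. The two judgements involved (whether an existing subcategory already covers a passage, and what to call a new subcategory) are interpretive acts by the researcher; here they are replaced by toy keyword functions purely so that the sketch runs. All function names, keywords, and passages are invented for illustration.

```python
def generate_subcategories(passages, is_covered_by, propose_label):
    """Walk through passages belonging to one main category.

    A passage is assigned to the first existing subcategory that covers
    its meaning; otherwise a new subcategory is created for it.
    Returns a dict: subcategory name -> list of passages.
    """
    subcategories = {}
    for passage in passages:
        for name, members in subcategories.items():
            if is_covered_by(passage, name):
                members.append(passage)
                break
        else:  # no existing subcategory captured the meaning
            subcategories.setdefault(propose_label(passage), []).append(passage)
    return subcategories


# Toy stand-ins for the researcher's judgement (keyword lookup only).
TOY_TOPICS = {
    "violence": "reaction to police violence",
    "park": "protecting the park",
}


def propose_label(passage):
    text = passage.lower()
    return next((label for word, label in TOY_TOPICS.items() if word in text),
                "unclassified")


def is_covered_by(passage, name):
    return propose_label(passage) == name


passages = [
    "The police violence made me angry",
    "I was angry about the violence",
    "The park was being destroyed",
    "They wanted to destroy our park",
]
result = generate_subcategories(passages, is_covered_by, propose_label)
```

The loop mirrors the order of the procedure: the first passage creates the first subcategory, and each later passage either joins an existing subcategory or opens a new one. Keyword matching is a crude substitute for the constant reassessment of similarity and difference of meaning that the method actually requires.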

If not much previous knowledge is available or if already existing concepts and theories are considered to be insufficient, it may be necessary to develop an entire coding frame, including both main categories and subcategories, in a data-driven manner. One strategy for doing this is summarizing (Mayring, 2014): Relevant passages are identified and progressively paraphrased until only the core proposition remains. Identical or sufficiently similar core propositions are grouped together under the same label into one category, and categories are placed in relation to each other to create a structure. Alternatively, open coding strategies from grounded theory methodology have also been used to generate entire coding frames (Mayring, 2014; Schreier, 2012).

At the stage of creating the frame, it is especially important that categories be conceptualized as unidimensional and that the subcategories within any one main category be conceptually distinct and hence mutually exclusive. Coding frames will often be conceptually richer if they are developed by a research team. One way of doing this is to have each researcher within the team apply the data-driven strategy of choice to a selected part of the material and to compare and integrate the resulting categories.

Software can be very helpful in supporting this step. It allows for creating category names and arranging them in a hierarchical structure. While creating the frame, categories and the respective text passages can be linked by “coding” the text passages accordingly. If this is done, it is important to keep in mind that coding at this stage does not have the function of analysing the data; it merely serves the purpose of creating the frame, and analysis proper occurs only after the frame has been piloted, modified, and finalized. Categories are easily moved from one part of the structure to another if necessary. It is also possible to merge categories, if this is indicated. When deciding whether to divide one category into subcategories, it can be helpful to have the software display all passages that are linked to the same code (i.e., fall within the same category). Comparing these passages shows whether their meanings are sufficiently diverse to justify creating subcategories. Again, the memo function of software is useful for noting ideas as they occur while reading through the material, such as ideas for additional categories, questions to ask of the material, or concerns that the meaning of a category may have shifted over the course of creating a frame and should be checked before piloting.

Defining Categories

Once a preliminary structure of the coding frame has been developed, all categories in the frame have to be defined. This is important from both a conceptual and a methodological perspective. Definitions provide the intension and extension of the underlying concepts, especially in relation to each other. From a methodological perspective, defining categories is a prerequisite for making the coding frame reliable and valid. Reliability is assessed by double coding part of the material—for example, having two coders independently of each other assign relevant passages to the categories of the coding frame and then comparing their coding of the material. This can only be done to the extent that the coders are aware of the exact meaning of the categories they are using, and this meaning is specified by the definitions. Moreover, coding frames are required to be valid; that is, they are required to adequately describe the meaning of the material. Comparing the range of meaning of the material with the range of meaning of the categories requires that their meaning be specified.
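Comparing the coding of two independent coders can be illustrated numerically. The sketch below computes simple percent agreement and, as an additional chance-corrected coefficient, Cohen's kappa. Kappa is not mentioned in the text above; it is brought in here as one commonly used option, and the codes assigned by the two coders are invented.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of passages to which both coders assigned the same category."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must code the same non-empty set of passages")
    return sum(a == b for a, b in zip(coder_a, coder_b)) / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(coder_a)
    p_observed = percent_agreement(coder_a, coder_b)
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    p_expected = sum(
        counts_a[c] * counts_b[c] for c in counts_a.keys() | counts_b.keys()
    ) / n ** 2
    if p_expected == 1:  # both coders used a single identical category
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


# Invented double coding of ten passages within one main category.
coder_1 = ["injustice", "injustice", "events", "events", "injustice",
           "events", "events", "injustice", "injustice", "events"]
coder_2 = ["injustice", "injustice", "events", "events", "injustice",
           "events", "injustice", "injustice", "events", "events"]

agreement = percent_agreement(coder_1, coder_2)  # 8 of 10 passages match
kappa = cohens_kappa(coder_1, coder_2)
```

Here the coders agree on 8 of 10 passages; because each uses both categories equally often, half of that agreement would be expected by chance, and kappa comes out at about 0.6. Disagreements such as these are usually most informative as pointers to category definitions that need sharpening.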

Category definitions consist of (a) a label, (b) a description, (c) examples, and (d) decision rules. The label consists of a name that concisely summarizes what the category refers to. The description constitutes the definition proper. It can be conceptualized as a coding instruction, specifying to the coder when the category is to be used and what kinds of meanings in the material fall under the heading of the respective category. This can include enumerations, hypothetical examples, and if appropriate, indicators. Indicators are words that often occur in passages that exemplify the category, although they are not definitive: A passage containing the indicator may fall under a different category, and passages that do not contain the indicator may be assigned to the respective category all the same. Examples of passages that fall under the respective category serve to illustrate its meaning and make it more vivid. Finally, decision rules are needed where there is potential overlap between two or more subcategories within the same main category. To ensure the mutual exclusiveness of the subcategories, decision rules specify the conditions for coding a passage under the various subcategories. All subcategories in the coding frame should be defined in this way. For main categories, only brief descriptions are necessary.
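The four parts of a category definition can be sketched as a small data structure. The following is a minimal illustration only: the class name `CategoryDefinition`, the `missing_parts` helper, and the example category text are hypothetical and are not taken from any particular software package.

```python
from dataclasses import dataclass, field


@dataclass
class CategoryDefinition:
    """One entry in a coding frame (hypothetical structure for illustration)."""
    label: str                                          # (a) concise name of the category
    description: str                                    # (b) the definition proper, i.e. the coding instruction
    examples: list = field(default_factory=list)        # (c) passages that illustrate the category
    decision_rules: list = field(default_factory=list)  # (d) needed only where subcategories overlap
    indicators: list = field(default_factory=list)      # optional cue words, never definitive

    def missing_parts(self, is_subcategory=True):
        """Return the names of required parts that are still empty."""
        missing = []
        if not self.label.strip():
            missing.append("label")
        if not self.description.strip():
            missing.append("description")
        # Main categories need only a brief description; subcategories also need examples.
        if is_subcategory and not self.examples:
            missing.append("examples")
        return missing


# Hypothetical subcategory that has not yet been given any examples.
solidarity = CategoryDefinition(
    label="Experiences of solidarity",
    description="Passages in which a participant describes mutual support among protesters.",
)
print(solidarity.missing_parts())                      # ['examples']
print(solidarity.missing_parts(is_subcategory=False))  # []
```

Decision rules are deliberately not treated as required here, because they are needed only where subcategories within the same main category may overlap.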

Software allows for the easy modification of category definitions, and the memo function is helpful in noting additional concerns that may need to be included in a given definition, such as the potential overlap between subcategories and the resulting need to add decision rules. Previous versions of the frame should always be saved in case one wants to revert to the earlier version of a category definition.

Revising and Expanding the Frame

The steps outlined so far result in a coding frame with main categories and subcategories that have all been defined. At this stage, it is worthwhile to go over the frame again, examine it in terms of the core requirements of unidimensionality and mutual exclusiveness of subcategories, and revise it accordingly.

If, following this suggestion, only the part of the material referring to a given topic or only material for a given subset of the data was selected, the frame needs to be expanded. First, other subsets providing data on the same topic, such as another group of interviewees, are included. If, for example, interviews were conducted with persons from three different European countries, such as France, Italy, and Hungary, frame development might have started out with data on Topic 1 from France. In a next step, the frame is expanded to include interview data on Topic 1 with interviewees from Italy. In principle, all the steps described in this section are gone through again: A suitable number of interviewees speaking about the topic are selected. Next, the researcher checks whether any additional categories and subcategories are required to adequately capture the meaning of the new material. If so, the new categories are defined, or else previous definitions may require modification. These steps are repeated for every subset of data on the same topic. Once all subsets have been included, the resulting frame is again checked for compliance with the criteria of unidimensionality and mutual exclusiveness of subcategories. The entire process is then repeated for the next topic, and this sequence of steps continues until the frame is complete.

Piloting and Modifying the Coding Frame

Before applying it to the entire material, the coding frame should be piloted, evaluated, and modified accordingly. Doing so helps to identify inconsistencies in the frame, overlaps between subcategories, unclear definitions, and other problems. Trying out the coding frame occurs during the pilot phase. This phase consists of the following steps: (1) selecting material for the pilot phase, (2) dividing the material into units, (3) performing the trial coding, (4) comparing and discussing the trial coding, and (5) modifying the coding frame.

Selecting Material for the Pilot Phase

Because the coding frame is usually modified on the basis of the pilot coding, it is tried out on part of the material only (so as to avoid having to code large amounts of material again, following the modification). It is therefore necessary to select material for this step. In line with the purpose of the pilot phase, namely to identify any inconsistencies or unclear passages in the coding frame, the material should be selected so that as many main categories and subcategories as possible can in fact be applied. If the material contains data from different subsets, material from all subsets should be included here. On the other hand, as the pilot phase usually results in some modification of the coding frame, the material that is used for the pilot phase will have to be coded again once the final version of the frame is applied to the entire material. As recoding data requires additional work, the extent of this additional work should be limited as far as possible. Selecting material for the pilot phase therefore will be a compromise between selecting as much material as is necessary to evaluate the coding frame and as little material as possible so as to keep any recoding to a minimum.

Dividing the Material Into Units

All the data collected for any one case (e.g., the full text of the interview conducted with one person; all documents collected that pertain to a single law) are referred to as the unit of analysis. But the categories in the coding frame usually do not refer to the entire unit of analysis but to shorter passages within the material. Each passage whose meaning can be subsumed under a (sub)category of the coding frame is referred to as a unit of coding or a segment. Because one is usually interested in the meaning of these smaller passages in qualitative content analysis, it is necessary to identify and mark these in the material before applying the coding frame. It is important to carry out this segmentation of the material before the pilot coding, since the pilot coding involves the comparison of two rounds of coding. If the units of coding are not specified in advance, then different passages might be coded in each round and the two rounds of coding would not be comparable.

Division of the material into units of coding is usually based on thematic criteria. According to a thematic criterion, one unit ends and another one begins when there is a change of topic, with topic referring to the focus of a given main category. When coding for reasons why interviewees decided to join the Gezi Park protests (Acar & Uluc, 2016), for example, each unit of coding would correspond to one reason for participating in the protest. A new unit would begin where there is either a change of topic to a different reason or where a participant starts to talk about something completely different that is captured by a different main category, such as experiences of solidarity, or empowerment, or any other experiences. Thus, units of coding always refer to a given main category, and different main categories may require units of coding of different sizes. If software is used, units of coding can be marked by assigning each unit to the respective main category.

There are different ways of handling units of coding in qualitative content analysis. Some authors, such as Kuckartz (2014), prefer larger units of coding that can stand on their own; they include all the context information that is necessary to understand the meaning of the passage in question. Other authors, such as Schreier (2012), distinguish between units of coding and context units. The unit of coding is here defined as any passage whose meaning matches the definition of a given subcategory. This may also be one word as part of a longer enumeration. The additional context that is needed to understand the meaning of the passage is called the context unit. This context is read, but it is not coded.

Performing the Trial Coding

Once all the material for the pilot phase has been divided into units of coding, the actual trial coding is carried out by coding each unit twice (double coding), that is, assigning each unit of coding to one of the (sub)categories in the coding frame. As noted previously, ideally, this is done by two coders independently of each other. If this is not possible, one coder can apply the coding frame to the same material twice, with sufficient time in between (10 to 14 days). If two coders work on the material, one of them is usually the researcher while the other coder is a person who has either been involved in developing the coding frame or has had sufficient time to familiarize themselves with the research topic, the frame, and the material. The double coding is typically carried out using two separate files, with the units of coding clearly marked. Coders make a note of any thoughts on the coding frame, the definitions of the categories, uncertainties about which category to choose, or any other comments on any part of the coding process. When using software, such notes can be made through the memo function.

Comparing and Discussing the Trial Coding

The trial coding is followed by a comparison between the two rounds of coding, identifying and marking all units that were assigned to one category in Round 1 (or by Coder 1) and to a different category in Round 2 (or by Coder 2). If two coders performed the trial coding, the information about units that are coded differently forms the basis for an exchange between the coders, whereby the coders explain the reasons underlying their decisions. This exchange as well as any notes that the coders wrote down are important tools in the subsequent modification and improvement of the coding frame: Coders point out which parts of the category definition are unclear, where there is overlap between categories, and where additional categories may be missing, and the coding frame is modified accordingly. If one person coded the same material twice, units that were coded differently will usually show that certain categories are especially likely to be used interchangeably, indicating that the definitions of these categories are not yet sufficiently distinct.
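The comparison of two rounds of coding can be sketched in a few lines. This is a minimal illustration, assuming each round is stored as a mapping from a unit identifier to the assigned category; the function name `find_disagreements`, the unit identifiers, and the category names are hypothetical.

```python
from collections import Counter


def find_disagreements(round1, round2):
    """Return the units coded differently and how often each category pair was confused."""
    if set(round1) != set(round2):
        raise ValueError("Both rounds must code the same units of coding.")
    differing = {
        unit: (round1[unit], round2[unit])
        for unit in round1
        if round1[unit] != round2[unit]
    }
    # Sort each pair so that (A, B) and (B, A) count as the same confusion.
    confused_pairs = Counter(tuple(sorted(pair)) for pair in differing.values())
    return differing, confused_pairs


round1 = {"u1": "solidarity", "u2": "empowerment", "u3": "solidarity", "u4": "other"}
round2 = {"u1": "solidarity", "u2": "solidarity", "u3": "empowerment", "u4": "other"}

differing, confused_pairs = find_disagreements(round1, round2)
print(differing)       # {'u2': ('empowerment', 'solidarity'), 'u3': ('solidarity', 'empowerment')}
print(confused_pairs)  # Counter({('empowerment', 'solidarity'): 2})
```

Category pairs with high counts are candidates for the kind of interchangeable use described above, and their definitions may need to be made more distinct.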

The handling and use of information from the trial coding highlights different traditions and different quality criteria within qualitative content analysis (Forman & Damschroeder, 2007). Within a predominantly qualitative framework, the core criterion for evaluating the coding frame is validity in a broader sense. Here, the main concern is to arrive at a conceptually valid coding frame, and any differences in opinion between coders that result in a conceptually more elaborate coding frame only contribute to increasing the validity of the frame (Kuckartz, 2014). From this perspective, the exchange between the coders and the subsequent modification of the coding frame are the goals and the end points of the pilot phase. From a quantitatively informed tradition in qualitative content analysis, however, the information about units that were coded differently in the two rounds of coding is used to quantitatively assess reliability by calculating a coefficient such as κ (Mayring, 2014).
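As an illustration of the quantitative route, the sketch below computes Cohen's κ for two rounds of coding. It assumes that the coefficient meant is Cohen's κ for two coders and nominal categories, which the text does not specify; the function name and the example codes are hypothetical.

```python
from collections import Counter


def cohens_kappa(codes1, codes2):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    if len(codes1) != len(codes2) or not codes1:
        raise ValueError("Both rounds must code the same non-empty list of units.")
    n = len(codes1)
    observed = sum(a == b for a, b in zip(codes1, codes2)) / n
    freq1, freq2 = Counter(codes1), Counter(codes2)
    # Chance agreement: probability that both rounds pick the same category independently.
    expected = sum(freq1[c] * freq2[c] for c in freq1) / (n * n)
    if expected == 1.0:
        raise ValueError("Kappa is undefined when both rounds use one identical category only.")
    return (observed - expected) / (1 - expected)


coder1 = ["A", "A", "B", "B"]
coder2 = ["A", "A", "B", "A"]
print(cohens_kappa(coder1, coder2))  # 0.5
```

In this toy example the coders agree on three of four units (0.75), chance agreement is 0.5, and κ is therefore 0.5.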

From within both a predominantly quantitative and a predominantly qualitative perspective, the trial coding can also be used to assess the frame in terms of validity in the narrow sense, that is, the extent to which the coding frame adequately describes the material (Schreier, 2012). High coding frequencies of any “miscellaneous” categories are especially important here. They indicate that the substantive categories are not yet able to sufficiently describe and conceptualize the material and that more categories need to be added.
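A simple way to monitor this is to compute the share of units that ended up in residual categories during the trial coding. The sketch below is illustrative only: the function name, the label "miscellaneous", and the example codes are hypothetical, and the text does not prescribe a threshold above which the share counts as high.

```python
from collections import Counter


def residual_share(codes, residual_labels=("miscellaneous",)):
    """Share of coded units that were assigned to a residual category."""
    if not codes:
        raise ValueError("No coded units supplied.")
    counts = Counter(codes)
    residual = sum(counts[label] for label in residual_labels)
    return residual / len(codes)


trial_codes = ["solidarity", "miscellaneous", "empowerment", "miscellaneous", "solidarity"]
share = residual_share(trial_codes)
print(f"{share:.0%} of units were coded as miscellaneous")  # 40% of units were coded as miscellaneous
```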

Modifying the Coding Frame

Next, the coding frame is modified based on the coders’ notes during the pilot phase, their exchange, any inter- or intrarater coefficients, and validity assessments. If only minor changes are necessary at this stage, the modified frame can then be applied to the entire material. If major changes are made, involving structural changes in how main categories and subcategories relate to each other and the addition of several new categories, a second round of pilot coding should be carried out if possible.

The Main Coding Phase

Once the coding frame has been finalized, it is applied to all the material. Accordingly, this requires dividing all the material into units of coding and performing the actual coding by assigning the units of coding to the categories of the coding frame. At this stage, the coding frame is generally no longer modified, except in minor respects if necessary. If additional important information emerges from the material that is not yet included in the coding frame, one option is to extend the frame accordingly and conduct another round of pilot coding before continuing with the main coding. Another option is to make a note of this information and report it as a starting point for further research.

Because the quality of the coding frame has been established during the pilot phase and the subsequent modification of the frame, it is not necessary to double code the full set of materials. To make sure that categories are applied consistently throughout, however, it is useful to double code part of the material. Depending on the degree of difference between the two rounds of coding during the pilot phase and the extent of modifications necessary, up to around one third of the material can be double-coded. Following the double coding, the coding of these units is again compared. If a unit is assigned to different categories, coders give their reasons and, if possible, reach an agreement as to how the unit is to be coded. If they cannot reach an agreement, a third person familiar with the research topic can be brought in, or the unit is left out of any subsequent analyses. If this happens repeatedly, this may indicate that the material is in large parts polyvalent—that is, carries different meanings simultaneously—and that qualitative content analysis is not the best method for data analysis.
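Selecting the part of the material to be double-coded can be sketched as a reproducible random draw. This is one possible approach under stated assumptions: random selection is not prescribed by the text, and the function name, the default share of one third, and the seed are hypothetical choices.

```python
import random
from fractions import Fraction


def select_for_double_coding(unit_ids, share=Fraction(1, 3), seed=0):
    """Draw a reproducible random subset of units for double coding."""
    unit_ids = list(unit_ids)
    if not unit_ids:
        raise ValueError("No units supplied.")
    if not 0 < share <= 1:
        raise ValueError("share must be greater than 0 and at most 1.")
    # Round down, but always double code at least one unit.
    k = max(1, int(len(unit_ids) * share))
    rng = random.Random(seed)  # fixed seed so the selection can be documented and repeated
    return sorted(rng.sample(unit_ids, k))


units = [f"u{i:02d}" for i in range(30)]
subset = select_for_double_coding(units)
print(len(subset))  # 10
```

A smaller share can be passed when the two rounds of pilot coding differed little and few modifications were necessary.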

Presenting the Results

There are three main strategies for presenting the findings of qualitative content analysis. The first strategy is descriptive and uses a narrative format. The second strategy involves making use of visual displays, and the third strategy uses the findings of qualitative content analysis to conduct further analyses (see Kuckartz, 2014, chapters 5 and 6; Schreier, 2012, chapter 11).

Descriptive Presentation in a Narrative Format

The main coding can be the end point of the analysis, followed by a presentation of the findings. Typically, this takes the form of presenting the coding frame in detail, illustrated by quotations from the material. This format of presenting the results is especially suitable if the analysis is predominantly data based and the coding frame emerges as the main finding of the study. If coding frequencies are given, researchers should be careful about the scope of their conclusions. They should, for example, be careful not to generalize from a small sample of interviewees to an entire population, and they should not overemphasize differences in coding frequencies between subgroups in the data—unless they have indeed tested for statistical difference.

An alternative format is to descriptively present the findings with an emphasis on the cases. Instead of featuring the coding frame by successively describing and illustrating each category, the focus here is on presenting a profile of each case included in the study. This draws upon the summary information collated for each case before beginning qualitative content analysis as well as the subcategories coded for the case within each of the main categories. This presentation format is especially suitable if only a few cases have been included in the study and if there is a strong emphasis on within-case information, such as following change across time, or triangulation of data within cases.

Software supports this strategy by displaying all passages of the material that have been assigned to the same category. This display can be filtered so as to show these passages only for a subset of the data or even one particular case. This makes it easy to select suitable quotations to include in the presentation.

Using Visual Displays

A narrative descriptive presentation can be enhanced by making use of visual displays, especially matrices, that is, tables containing textual information (Miles, Huberman, & Saldana, 2014). This can be information about the categories, such as category definitions, information about participants, such as details about the subgroup to which they belong, and quotations from the material. Like descriptive presentations, matrices can be designed as cross-case or within-case displays. Cross-case displays are especially suitable for illustrating differences between subsets of data, such as different groups of participants or material from different time frames. By providing quotations from the material for the different subsets that were assigned to the same category, for instance, subtle differences of expression can be captured. Within-case displays supplement a case-oriented format of presenting results, showing how selected categories are expressed within this particular case. The great advantage of matrices compared to a purely narrative presentation format is that matrices combine a succinct overview and summary of the material with the vivid illustration of selected parts of that material through the use of quotations. Unlike a descriptive presentation in a narrative format, however, visual displays do not constitute a stand-alone strategy of presenting the findings of qualitative content analysis but are used in combination with a descriptive narrative presentation. Software again supports this strategy: Commercial software packages allow the user to construct matrices from within the software.
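The logic of a cross-case matrix can be sketched as a nested mapping from category to subgroup to quotations. This is a minimal illustration and not how any particular software package builds matrices; the function name, the group labels, and the placeholder quotations are hypothetical.

```python
def build_cross_case_matrix(coded_segments):
    """Arrange (subgroup, category, quotation) triples as category -> subgroup -> quotations."""
    matrix = {}
    for subgroup, category, quotation in coded_segments:
        matrix.setdefault(category, {}).setdefault(subgroup, []).append(quotation)
    return matrix


# Placeholder quotations, not real data.
segments = [
    ("Group A", "solidarity", "placeholder quotation 1"),
    ("Group B", "solidarity", "placeholder quotation 2"),
    ("Group A", "empowerment", "placeholder quotation 3"),
]

matrix = build_cross_case_matrix(segments)
for category, by_group in matrix.items():
    for subgroup, quotes in by_group.items():
        print(f"{category} | {subgroup} | {'; '.join(quotes)}")
```

Reading one category across subgroups then shows how the same category is expressed in different subsets of the data.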

Conducting Further Analyses

Finally, the coding conducted in qualitative content analysis can in turn serve as the starting point for further analyses which can be qualitative or quantitative, or both.

The core strategy for conducting further qualitative analyses is to examine the findings for any co-occurrences between subcategories, that is, subcategories that are often coded together or often follow each other and thus constitute a pattern. This is analogous to what is referred to in the literature on (inductive) coding as second-order coding (Saldana, 2016). Software is essential for carrying out this step: Especially with large amounts of material and a comprehensive coding frame, looking for patterns “by hand” would be almost impossible. Commercial software packages typically provide several sophisticated search strategies. These can be used to follow up on hunches or to inductively search for any patterns that emerge from the data.
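The search for subcategories that are often coded together can be sketched as a pairwise count across cases. This minimal illustration covers only co-occurrence within a case, not subcategories that follow each other in sequence; the function name, case identifiers, and subcategory names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(codes_by_case):
    """Count in how many cases each pair of subcategories was coded together."""
    counts = Counter()
    for codes in codes_by_case.values():
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(set(codes)), 2):
            counts[pair] += 1
    return counts


codes_by_case = {
    "case1": {"solidarity", "empowerment", "fear"},
    "case2": {"solidarity", "empowerment"},
    "case3": {"fear"},
}

counts = cooccurrence_counts(codes_by_case)
print(counts.most_common(1))  # [(('empowerment', 'solidarity'), 2)]
```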

If any such patterns can be identified, they can in turn serve as a starting point for creating empirically grounded types and typologies (Kluge, 2000). A type consists of a group of cases that are similar to each other (internally homogeneous) and different from other groups (externally heterogeneous). Based on qualitative content analysis, cases can be considered similar if they share a pattern of co-occurrences of categories (i.e., cases are similar to the extent that they have been coded with the same subcategories and to the extent that these subcategories occur together and constitute a pattern). Taken together, the types constitute a typology in the way in which they are distributed across the space of categories and their co-occurrences. Constructing typologies is thus based on identifying co-occurrences and patterns in the data and takes this second-order coding one step further in identifying groups of cases (Kuckartz, 2014). On one hand, typologies expand on the findings of qualitative content analysis by requiring an examination of co-occurrences and patterns in the coding. On the other hand, constructing types serves to further reduce the findings, making key results and differences between groups of cases quickly accessible.

Qualitative content analysis is essentially a variable-based method for data analysis, whereby categories serve as the variables in a cross-case comparison. Because of this variable orientation, the method lends itself well to being combined with subsequent quantitative, statistical analysis. In the simplest case, this consists of a descriptive listing of coding frequencies, either across the entire sample or separately for subsets of the data. If the sample is sufficiently large, inferential statistical tests can also follow after qualitative content analysis. Because of the distribution characteristics, these will usually be tests for nominal scale data, such as χ² tests for differences between subgroups. But more sophisticated analyses, such as cluster or configuration frequency analysis, are also an option. The combination of qualitative content analysis and subsequent statistical analysis is again supported by software: Here, software packages are needed that support the export of frequency data to a file format (e.g., Microsoft Excel) that is compatible with statistical software packages.
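To make the χ² test for subgroup differences concrete, the sketch below computes the Pearson χ² statistic and its degrees of freedom for a table of coding frequencies (rows are subgroups, columns are categories). The function name and the frequencies are hypothetical, and the p-value is left to a statistical package. A library routine such as `scipy.stats.chi2_contingency` applies a continuity correction to 2×2 tables by default, so its statistic would differ from this uncorrected value.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a frequency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    if total == 0 or 0 in row_totals or 0 in col_totals:
        raise ValueError("Every row and column needs at least one coded unit.")
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Frequency expected if subgroup and category were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical coding frequencies: two subgroups (rows) by two categories (columns).
table = [[10, 20], [20, 10]]
statistic, dof = chi_square_statistic(table)
print(round(statistic, 2), dof)  # 6.67 1
```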

Recent Developments

Qualitative content analysis has its origin in quantitative content analysis, and it has retained some of its quantitative characteristics, such as the focus on data reduction, variable orientation, and the emphasis on double coding to the point of reliability assessment. As a consequence, it has sometimes been described as a hybrid combining qualitative and quantitative features (Fielding & Schreier, 2001; Groeben & Rustemeyer, 1994). This raises the question to what extent qualitative content analysis is indeed a genuinely qualitative research method. Criticisms concern, for example, the lack of ontological and epistemological grounding, the limited ability of the method to reconstruct and describe latent content, and the scarcity of reflections on the concept of “category,” a lack that is all the more surprising considering that categories constitute the core of the method (Janssen et al., 2017; Stamann, Janssen, & Schreier, 2016). Recently, suggestions have been made to elaborate upon the qualitative elements of the method. These include, for instance, strengthening the role of case summaries in applying the method (Kuckartz, 2014), stressing the iterative procedure inherent in developing a coding frame (Kuckartz, 2014), and reflecting upon the process of developing the coding frame and including intersubjective elements in this process (Stamann et al., 2016).

Initially, validity in qualitative content analysis referred to the validity of the coding frame (i.e., measurement validity as conceptualized in the quantitative research tradition). Recently, several suggestions have been made for applying quality criteria from the qualitative research tradition, such as credibility, dependability, and transferability, to the entire research process in qualitative content analysis (Bengtsson, 2016; Elo et al., 2014). At the same time, this is yet another example of the tradition of elaborating upon the specifically qualitative elements of the method.

A final important development concerns the uses of qualitative content analysis in the era of big data and text mining. On one hand, the automated analysis of large amounts of textual data with the goal of making inferences from the text seems to push qualitative content analysis even further to the margins, encouraging the integration with quantitative content analysis (see, e.g., Wiedemann, 2013, on the computer-assisted analysis of textual data in the social sciences). On the other hand, authors like Welles (2014) have argued in favour of combining text mining of large data sets with the in-depth analysis of smaller subsets of the data. This opens up a space for a mixed-methods type of combination of text mining and qualitative content analysis in which both methods complement each other. Suggestions have also been made for developing new variants of the method that are especially suitable for the analysis of large data sets, such as computational hermeneutics (Mohr, Wagner-Pacifici, & Breiger, 2015).

Content Analysis

One type of research that has played a major role in the Pew Research Center’s work over time, particularly in our ongoing work focused on journalism and news, has been content analysis, a tool that allows us to look at the way messages change over time and vary across mediums and outlets.

Content analysis has been defined as “the systematic, objective, quantitative analysis of message characteristics.” [Kimberly Neuendorf, The Content Analysis Guidebook, Sage Publications] At Pew Research Center, much of our content analysis has been used to study news reporting and social media, but the methodology can be applied to many different forms of communication, from transcripts of speeches to Twitter feeds. We have measured the “news agenda” (the topics being covered by the news media), the framing of conversations and many other characteristics of messages.

At Pew Research, we began our content analysis work under the guidance of some of the nation’s top content analysis methodologists and with a large team of human coders. We have always followed rigorous standards of validity and replicability by explaining our methods and conducting intercoder testing. In recent years, we have begun experimenting with the potential for computer coding. The center’s use of these new computer algorithms creates new opportunities and challenges for our work. In the following sections, we lay out some of the key methods underlying our content analysis and coding work.

About Pew Research Center

Pew Research Center is a nonpartisan fact tank that informs the public about the issues, attitudes and trends shaping the world. It conducts public opinion polling, demographic research, media content analysis and other empirical social science research. Pew Research Center does not take policy positions. It is a subsidiary of The Pew Charitable Trusts.

Research Methodologies Guide

Content Analysis

Content analysis is defined as "the systematic reading of a body of texts, images, and symbolic matter, not necessarily from an author's or user's perspective" (Krippendorff, 2004).

Content analysis is distinguished from other kinds of social science research in that it does not require the collection of data from people. Like documentary research, content analysis is the study of recorded information, or information which has been recorded in texts, media, or physical items. 

For more information about content analysis, review the resources below:

Books and articles

Listed below are a few tools and online guides that can help you start your content analysis research. They include free online resources and resources available only through the ISU Library.

  • Quantitative Content Analysis, by Kate Huxley (publication date: 2020). This entry examines quantitative content analysis, which is a method based on the systematic coding and quantification of content, be that written, visual, or oral content.
  • Qualitative Content Analysis. The article describes an approach of systematic, rule-guided qualitative text analysis, which tries to preserve some methodological strengths of quantitative content analysis and widen them to a concept of qualitative procedure.
  • Basic Content Analysis, by Robert Philip Weber (call number: H61 W422 1990; publication date: 1990).

Additional Resources

  • An Introduction to Content Analysis A tutorial-type guide to content analysis from Colorado State University.
  • Overview of Content Analysis An article from the peer-reviewed online journal, Practical Assessment, Research & Evaluation by Steve Stemler of Yale University.
  • Last Updated: Dec 19, 2023 2:12 PM
  • URL: https://instr.iastate.libguides.com/researchmethods

What Is a Research Methodology? | Steps & Tips

Published on August 25, 2022 by Shona McCombes and Tegan George. Revised on November 20, 2023.

Your research methodology discusses and explains the data collection and analysis methods you used in your research. A key part of your thesis, dissertation, or research paper, the methodology chapter explains what you did and how you did it, allowing readers to evaluate the reliability and validity of your research and your dissertation topic.

It should include:

  • The type of research you conducted
  • How you collected and analyzed your data
  • Any tools or materials you used in the research
  • How you mitigated or avoided research biases
  • Why you chose these methods
  • Your methodology section should generally be written in the past tense.
  • Academic style guides in your field may provide detailed guidelines on what to include for different types of studies.
  • Your citation style might provide guidelines for your methodology section (e.g., an APA Style methods section).

Table of contents

  • How to write a research methodology
  • Why is a methods section important?
  • Step 1: Explain your methodological approach
  • Step 2: Describe your data collection methods
  • Step 3: Describe your analysis method
  • Step 4: Evaluate and justify the methodological choices you made
  • Tips for writing a strong methodology chapter
  • Other interesting articles
  • Frequently asked questions about methodology

Your methods section is your opportunity to share how you conducted your research and why you chose the methods you chose. It’s also the place to show that your research was rigorously conducted and can be replicated.

It gives your research legitimacy and situates it within your field, and also gives your readers a place to refer to if they have any questions or critiques in other sections.

You can start by introducing your overall approach to your research. You have two options here.

Option 1: Start with your “what”

What research problem or question did you investigate?

  • Aim to describe the characteristics of something?
  • Explore an under-researched topic?
  • Establish a causal relationship?

And what type of data did you need to achieve this aim?

  • Quantitative data, qualitative data, or a mix of both?
  • Primary data collected yourself, or secondary data collected by someone else?
  • Experimental data gathered by controlling and manipulating variables, or descriptive data gathered via observations?

Option 2: Start with your “why”

Depending on your discipline, you can also start with a discussion of the rationale and assumptions underpinning your methodology. In other words, why did you choose these methods for your study?

  • Why is this the best way to answer your research question?
  • Is this a standard methodology in your field, or does it require justification?
  • Were there any ethical considerations involved in your choices?
  • What are the criteria for validity and reliability in this type of research? How did you prevent bias from affecting your data?

Once you have introduced your reader to your methodological approach, you should share full details about your data collection methods.

Quantitative methods

For your results to be considered generalizable, describe your quantitative research methods in enough detail for another researcher to replicate your study.

Here, explain how you operationalized your concepts and measured your variables. Discuss your sampling method or inclusion and exclusion criteria, as well as any tools, procedures, and materials you used to gather your data.

Surveys: Describe where, when, and how the survey was conducted.

  • How did you design the questionnaire?
  • What form did your questions take (e.g., multiple choice, Likert scale)?
  • Were your surveys conducted in-person or virtually?
  • What sampling method did you use to select participants?
  • What was your sample size and response rate?

Experiments: Share full details of the tools, techniques, and procedures you used to conduct your experiment.

  • How did you design the experiment?
  • How did you recruit participants?
  • How did you manipulate and measure the variables?
  • What tools did you use?

Existing data: Explain how you gathered and selected the material (such as datasets or archival data) that you used in your analysis.

  • Where did you source the material?
  • How was the data originally produced?
  • What criteria did you use to select material (e.g., date range)?

The survey consisted of 5 multiple-choice questions and 10 questions measured on a 7-point Likert scale.

The goal was to collect survey responses from 350 customers visiting the fitness apparel company’s brick-and-mortar location in Boston on July 4–8, 2022, between 11:00 and 15:00.

Here, a customer was defined as a person who had purchased a product from the company on the day they took the survey. Participants were given 5 minutes to fill in the survey anonymously. In total, 408 customers responded, but not all surveys were fully completed. Due to this, 371 survey results were included in the analysis.
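The figures in this example can be checked with a short calculation. The sketch below is illustrative only: the variable names are assumptions, and the numbers are the ones reported in the example. It computes a completion rate (usable surveys out of returned surveys), not a response rate, since the example does not say how many customers were approached.

```python
# Figures reported in the survey example.
target_responses = 350     # planned number of responses
total_responses = 408      # customers who responded
complete_responses = 371   # fully completed surveys included in the analysis

# Surveys excluded because they were not fully completed.
excluded = total_responses - complete_responses

# Share of returned surveys that were usable, as a percentage.
completion_rate = 100 * complete_responses / total_responses

# Whether the usable sample met the original target.
met_target = complete_responses >= target_responses

print(f"Excluded surveys: {excluded}")
print(f"Completion rate: {completion_rate:.1f}%")
print(f"Target met: {met_target}")
```

Reporting these numbers alongside the exclusion rule lets a reader see how much data was dropped and why.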

  • Information bias
  • Omitted variable bias
  • Regression to the mean
  • Survivorship bias
  • Undercoverage bias
  • Sampling bias

Qualitative methods

In qualitative research, methods are often more flexible and subjective. For this reason, it’s crucial to robustly explain the methodology choices you made.

Be sure to discuss the criteria you used to select your data, the context in which your research was conducted, and the role you played in collecting your data (e.g., were you an active participant or a passive observer?).

Interviews or focus groups: Describe where, when, and how the interviews were conducted.

  • How did you find and select participants?
  • How many participants took part?
  • What form did the interviews take (structured, semi-structured, or unstructured)?
  • How long were the interviews?
  • How were they recorded?

Participant observation: Describe where, when, and how you conducted the observation or ethnography.

  • What group or community did you observe? How long did you spend there?
  • How did you gain access to this group? What role did you play in the community?
  • How long did you spend conducting the research? Where was it located?
  • How did you record your data (e.g., audiovisual recordings, note-taking)?

Existing data: Explain how you selected case study materials for your analysis.

  • What type of materials did you analyze?
  • How did you select them?

In order to gain better insight into possibilities for future improvement of the fitness store’s product range, semi-structured interviews were conducted with 8 returning customers.

Here, a returning customer was defined as someone who usually bought products at least twice a week from the store.

Surveys were used to select participants. Interviews were conducted in a small office next to the cash register and lasted approximately 20 minutes each. Answers were recorded by note-taking, and seven interviews were also filmed with consent. One interviewee preferred not to be filmed.

  • The Hawthorne effect
  • Observer bias
  • The placebo effect
  • Response bias and Nonresponse bias
  • The Pygmalion effect
  • Recall bias
  • Social desirability bias
  • Self-selection bias

Mixed methods

Mixed methods research combines quantitative and qualitative approaches. If a standalone quantitative or qualitative study is insufficient to answer your research question, mixed methods may be a good fit for you.

Mixed methods are less common than standalone analyses, largely because they require a great deal of effort to pull off successfully. If you choose to pursue mixed methods, it’s especially important to robustly justify your methods.


Next, you should indicate how you processed and analyzed your data. Avoid going into too much detail: you should not start introducing or discussing any of your results at this stage.

In quantitative research, your analysis will be based on numbers. In your methods section, you can include:

  • How you prepared the data before analyzing it (e.g., checking for missing data, removing outliers, transforming variables)
  • Which software you used (e.g., SPSS, Stata or R)
  • Which statistical tests you used (e.g., two-tailed t test, simple linear regression)
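The preparation and testing steps listed above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a recommended pipeline: it uses Python's standard library rather than SPSS, Stata or R, the function names and the two groups of scores are hypothetical, missing values are represented as `None`, outliers are flagged with the common 1.5 × IQR rule, and the test statistic is Welch's variant of the t test. The sketch stops at the statistic and degrees of freedom; a two-tailed p-value would come from a t distribution in your statistics software.

```python
import math
import statistics


def drop_missing(values):
    """Remove missing entries, represented here as None."""
    return [v for v in values if v is not None]


def remove_outliers_iqr(values, k=1.5):
    """Keep values within k * IQR of the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def welch_t(a, b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical scores for two groups, with missing values and one outlier.
group_a = [4, 5, None, 6, 5, 4, 30]
group_b = [6, 7, 7, None, 8, 6, 7]

clean_a = remove_outliers_iqr(drop_missing(group_a))
clean_b = remove_outliers_iqr(drop_missing(group_b))
t_stat, dof = welch_t(clean_a, clean_b)

print(clean_a, clean_b)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

Whatever rules you apply, state them in the methods section (for example, the outlier threshold and how missing data were handled) so that another researcher could repeat the same steps.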

In qualitative research, your analysis will be based on language, images, and observations (often involving some form of textual analysis).

Specific methods might include:

  • Content analysis: Categorizing and discussing the meaning of words, phrases and sentences
  • Thematic analysis: Coding and closely examining the data to identify broad themes and patterns
  • Discourse analysis: Studying communication and meaning in relation to their social context
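To make the idea of categorizing words concrete, here is a small sketch of keyword-based coding, the most mechanical form of content analysis. Everything in it is an assumption made for illustration: the codebook, the code names, and the sample responses are hypothetical. In practice, coding usually involves human judgment and a documented coding scheme rather than simple keyword matching.

```python
import re
from collections import Counter

# Hypothetical codebook: each code maps to keywords that signal it.
CODEBOOK = {
    "price": {"cheap", "expensive", "price", "cost"},
    "quality": {"durable", "quality", "comfortable"},
}


def code_text(text, codebook=CODEBOOK):
    """Count how many words in the text match each code's keywords."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for word in words:
        for code, keywords in codebook.items():
            if word in keywords:
                counts[code] += 1
    return counts


# Hypothetical open-ended responses.
responses = [
    "The leggings are comfortable but expensive.",
    "Great quality for the price.",
]

# Aggregate code frequencies across all responses.
totals = Counter()
for response in responses:
    totals += code_text(response)

print(dict(totals))
```

If you use an approach like this, the methods section should describe how the codebook was developed and how ambiguous cases were resolved.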

Mixed methods combine the above two research methods, integrating both qualitative and quantitative approaches into one coherent analytical process.

Above all, your methodology section should clearly make the case for why you chose the methods you did. This is especially true if you did not take the most standard approach to your topic. In this case, discuss why other methods were not suitable for your objectives, and show how this approach contributes new knowledge or understanding.

In any case, it should be overwhelmingly clear to your reader that you set yourself up for success in terms of your methodology’s design. Show how your methods should lead to results that are valid and reliable, while leaving the analysis of the meaning, importance, and relevance of your results for your discussion section.

  • Quantitative: Lab-based experiments cannot always accurately simulate real-life situations and behaviors, but they are effective for testing causal relationships between variables.
  • Qualitative: Unstructured interviews usually produce results that cannot be generalized beyond the sample group, but they provide a more in-depth understanding of participants’ perceptions, motivations, and emotions.
  • Mixed methods: Despite issues systematically comparing differing types of data, a solely quantitative study would not sufficiently incorporate the lived experience of each participant, while a solely qualitative study would be insufficiently generalizable.

Remember that your aim is not just to describe your methods, but to show how and why you applied them. Again, it’s critical to demonstrate that your research was rigorously conducted and can be replicated.

1. Focus on your objectives and research questions

The methodology section should clearly show why your methods suit your objectives and convince the reader that you chose the best possible approach to answering your problem statement and research questions.

2. Cite relevant sources

Your methodology can be strengthened by referencing existing research in your field. This can help you to:

  • Show that you followed established practice for your type of research
  • Discuss how you decided on your approach by evaluating existing research
  • Present a novel methodological approach to address a gap in the literature

3. Write for your audience

Consider how much information you need to give, and avoid getting too lengthy. If you are using methods that are standard for your discipline, you probably don’t need to give a lot of background or justification.

Regardless, your methodology should be a clear, well-structured text that makes an argument for your approach, not just a list of technical details and procedures.

If you want to know more about statistics, methodology, or research bias, make sure to check out some of our other articles with explanations and examples.

Statistics

  • Normal distribution
  • Measures of central tendency
  • Chi square tests
  • Confidence interval
  • Quartiles & Quantiles

Methodology

  • Cluster sampling
  • Stratified sampling
  • Thematic analysis
  • Cohort study
  • Peer review
  • Ethnography

Research bias

  • Implicit bias
  • Cognitive bias
  • Conformity bias
  • Hawthorne effect
  • Availability heuristic
  • Attrition bias

Methodology refers to the overarching strategy and rationale of your research project. It involves studying the methods used in your field and the theories or principles behind them, in order to develop an approach that matches your objectives.

Methods are the specific tools and procedures you use to collect and analyze data (for example, experiments, surveys, and statistical tests).

In shorter scientific papers, where the aim is to report the findings of a specific study, you might simply describe what you did in a methods section.

In a longer or more complex research project, such as a thesis or dissertation, you will probably include a methodology section, where you explain your approach to answering the research questions and cite relevant sources to support your choice of methods.

In a scientific paper, the methodology always comes after the introduction and before the results, discussion and conclusion. The same basic structure also applies to a thesis, dissertation, or research proposal.

Depending on the length and type of document, you might also include a literature review or theoretical framework before the methodology.

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to systematically measure variables and test hypotheses. Qualitative methods allow you to explore concepts and experiences in more detail.

Reliability and validity are both about how well a method measures something:

  • Reliability refers to the consistency of a measure (whether the results can be reproduced under the same conditions).
  • Validity refers to the accuracy of a measure (whether the results really do represent what they are supposed to measure).

If you are doing experimental research, you also have to consider the internal and external validity of your experiment.
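One common way to put a number on reliability is to repeat a measurement under the same conditions and correlate the two sets of results. The sketch below assumes a test-retest design and uses the Pearson correlation coefficient; the function name and the scores are hypothetical, and other reliability measures may suit your study better.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical scores for the same five participants, measured twice.
first_session = [12, 15, 9, 20, 17]
second_session = [13, 14, 10, 19, 18]

r = pearson_r(first_session, second_session)
print(f"Test-retest correlation: {r:.2f}")
```

A value close to 1 suggests the measure gives consistent results. It says nothing about validity, which has to be argued separately.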

A sample is a subset of individuals from a larger population. Sampling means selecting the group that you will actually collect data from in your research. For example, if you are researching the opinions of students in your university, you could survey a sample of 100 students.

In statistics, sampling allows you to test a hypothesis about the characteristics of a population.
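The student example can be sketched as a simple random sample. This is only one of several sampling methods, and the details here are assumptions: the population size of 5,000 and the use of ID numbers as a sampling frame are hypothetical.

```python
import random

# Hypothetical sampling frame: ID numbers for 5,000 enrolled students.
population = list(range(1, 5001))

# A fixed seed makes the draw reproducible, which helps with replication.
rng = random.Random(42)

# Draw 100 students without replacement (simple random sampling).
sample = rng.sample(population, k=100)

print(len(sample), "students selected")
```

Recording the sampling frame, the method, and the seed in your methods section lets readers judge how well the sample represents the population.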

Cite this Scribbr article


McCombes, S. & George, T. (2023, November 20). What Is a Research Methodology? | Steps & Tips. Scribbr. Retrieved March 12, 2024, from https://www.scribbr.com/dissertation/methodology/


Shona McCombes



COMMENTS

  1. Content Analysis

    Content analysis is a research method used to identify patterns in recorded communication. To conduct content analysis, you systematically collect data from a set of texts, which can be written, oral, or visual: Books, newspapers and magazines. Speeches and interviews. Web content and social media posts. Photographs and films.

  2. Content Analysis Method and Examples

    When done well, is considered a relatively "exact" research method. Content analysis is a readily-understood and an inexpensive research method. A more powerful tool when combined with other research methods such as interviews, observation, and use of archival records. It is very useful for analyzing historical material, especially for ...

  3. Content Analysis

    Content analysis is a research method used to analyze and interpret the characteristics of various forms of communication, such as text, images, or audio. It involves systematically analyzing the content of these materials, identifying patterns, themes, and other relevant features, and drawing inferences or conclusions based on the findings.

  4. Content Analysis

    Content analysis is a research method used to identify patterns in recorded communication. To conduct content analysis, you systematically collect data from a set of texts, which can be written, oral, or visual: Books, newspapers, and magazines; Speeches and interviews;

  5. A hands-on guide to doing content analysis

    Many articles and books are available that describe qualitative research methods and provide overviews of content analysis procedures ... When reflecting on the proposed study aim together with the student, we often suggest content analysis methodology as the best fit for the study and the student, especially the novice researcher. ...

  6. Sage Research Methods

    The Fourth Edition has been completely revised to offer readers the most current techniques and research on content analysis, including new information on reliability and social media. Readers will also gain practical advice and experience for teaching academic and commercial researchers how to conduct content analysis.

  7. Qualitative Content Analysis 101 (+ Examples)

    Content analysis is a qualitative analysis method that focuses on recorded human artefacts such as manuscripts, voice recordings and journals. Content analysis investigates these written, spoken and visual artefacts without explicitly extracting data from participants - this is called unobtrusive research. In other words, with content ...

  8. Chapter 17. Content Analysis

    Chapter 17. Content Analysis Introduction. Content analysis is a term that is used to mean both a method of data collection and a method of data analysis. Archival and historical works can be the source of content analysis, but so too can the contemporary media coverage of a story, blogs, comment posts, films, cartoons, advertisements, brand packaging, and photographs posted on Instagram or ...

  9. How to do a content analysis [7 steps]

    In research, content analysis is the process of analyzing content and its features with the aim of identifying patterns and the presence of words, themes, and concepts within the content. Simply put, content analysis is a research method that aims to present the trends, patterns, concepts, and ideas in content as objective, quantitative or ...

  10. Content Analysis

    In his 1952 text on the subject of content analysis, Bernard Berelson traced the origins of the method to communication research and then listed what he called six distinguishing features of the approach. As one might expect, the six defining features reflect the concerns of social science as taught in the 1950s, an age in which the calls for an "objective," "systematic," and ...

  11. Reflexive Content Analysis: An Approach to Qualitative Data Analysis

    These problems underscore a need for a qualitative content analysis method that is not only well-defined but also clearly applicable within qualitative research frameworks. This method should enable the effective reduction and description of data, while establishing a clear distinction from other qualitative methods.

  12. (PDF) Content Analysis: A Flexible Methodology

    Content analysis is a highly flexible research method that has been widely used in library and information science (LIS) studies with varying research goals and objectives. The research method is ...

  13. Content Analysis

    Abstract. In this chapter, the focus is on ways in which content analysis can be used to investigate and describe interview and textual data. The chapter opens with a contextualization of the method and then proceeds to an examination of the role of content analysis in relation to both quantitative and qualitative modes of social research.

  14. How to plan and perform a qualitative study using content analysis

    Abstract. This paper describes the research process - from planning to presentation, with the emphasis on credibility throughout the whole process - when the methodology of qualitative content analysis is chosen in a qualitative study. The groundwork for the credibility initiates when the planning of the study begins.

  15. Sage Research Methods

    Content analysis is one of the most important but complex research methodologies in the social sciences. In this thoroughly updated Second Edition of The Content Analysis Guidebook, author Kimberly Neuendorf draws on examples from across numerous disciplines to clarify the complicated aspects of content analysis through step-by-step instruction and practical advice.

  16. Exploring Content Analysis in Qualitative Research

    Content analysis, in its simplest form, is a research method for interpreting and quantifying textual data, such as speeches, interviews, articles, social media posts, and so on. It allows researchers to sift through large volumes of data to identify patterns, themes, or biases and turn these into quantifiable variables that can be further ...

  17. Sage Research Methods Foundations

    The main part of the contribution focuses on describing and illustrating the steps in qualitative content analysis: deciding on a research question and selecting material; creating a preliminary version of the coding frame, including strategies for arriving at main categories and subcategories and how to define categories; piloting and ...

  18. Content Analysis

    Content analysis has been defined as "the systematic, objective, quantitative analysis of message characteristics." [Kimberly Neuendorf, The Content Analysis Guidebook, Sage Publications] At Pew Research Center, much of our content analysis has been used to study news reporting and social media, but the methodology can be applied to many ...

  19. Three Approaches to Qualitative Content Analysis

    Content analysis is a widely used qualitative research technique. Rather than being a single method, current applications of content analysis show three distinct approaches: conventional, directed, or summative. All three approaches are used to interpret meaning from the content of text data and, hence, adhere to the naturalistic paradigm.

  20. Content Analysis

    Content Analysis. Content analysis is defined as. "the systematic reading of a body of texts, images, and symbolic matter, not necessarily from an author's or user's perspective" (Krippendorff, 2004). Content analysis is distinguished from other kinds of social science research in that it does not require the collection of data from people.

  21. What Is a Research Methodology?

    Qualitative methods. In qualitative research, your analysis will be based on language, images, and observations (often involving some form of textual analysis). Specific methods might include: Content analysis: Categorizing and discussing the meaning of words, phrases and sentences

  22. A hands-on guide to doing content analysis

    Many articles and books are available that describe qualitative research methods and provide overviews of content analysis procedures [1], [2], ... When reflecting on the proposed study aim together with the student, we often suggest content analysis methodology as the best fit for the study and the student, especially the novice researcher. ...

  23. (PDF) Content Analysis: a short overview

    Content analysis (CA) is a research methodology to make sense of the (often unstructured) content of messages - be they texts, images, symbols or audio data. In short it could be said to

  24. Research on modified lumped parameter method in torsional vibration

    MLPM is derived from the lumped parameter method (LPM) and incorporates corrections to the stiffness matrix based on the multi-degree-of-freedom modal calculation using a finite element model. This allows MLPM to combine the high accuracy of finite element method (FEM) with the fast computing speed of LPM.

  25. BayesianSSA: a Bayesian statistical model based on structural ...

    Chemical bioproduction has attracted attention as a key technology in a decarbonized society. In computational design for chemical bioproduction, it is necessary to predict changes in metabolic fluxes when up-/down-regulating enzymatic reactions, that is, responses of the system to enzyme perturbations. Structural sensitivity analysis (SSA) was previously developed as a method to predict ...