Statology

Statistics Made Easy

How to Write Hypothesis Test Conclusions (With Examples)

A hypothesis test is used to test whether or not some hypothesis about a population parameter is true.

To perform a hypothesis test in the real world, researchers obtain a random sample from the population and perform a hypothesis test on the sample data, using a null and alternative hypothesis:

  • Null Hypothesis (H0): The sample data occurs purely from chance.
  • Alternative Hypothesis (HA): The sample data is influenced by some non-random cause.

If the p-value of the hypothesis test is less than some significance level (e.g. α = .05), then we reject the null hypothesis.

Otherwise, if the p-value is not less than the significance level, then we fail to reject the null hypothesis.

When writing the conclusion of a hypothesis test, we typically include:

  • Whether we reject or fail to reject the null hypothesis.
  • The significance level.
  • A short explanation in the context of the hypothesis test.

For example, we would write:

We reject the null hypothesis at the 5% significance level. There is sufficient evidence to support the claim that…

Or, we would write:

We fail to reject the null hypothesis at the 5% significance level. There is not sufficient evidence to support the claim that…
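
The two templates above can be sketched as a small helper. This is only an illustration and not part of the original tutorial: the function name `write_conclusion` and its parameters are assumptions chosen for this example, and the p-values passed in at the bottom are the ones used in the two examples that follow.

```python
def write_conclusion(p_value: float, alpha: float, claim: str) -> str:
    """Return a two-sentence hypothesis test conclusion.

    `claim` is the alternative hypothesis stated in plain words.
    (Hypothetical helper written for illustration.)
    """
    if not 0 <= p_value <= 1:
        raise ValueError("p_value must be between 0 and 1")
    if not 0 < alpha < 1:
        raise ValueError("alpha must be between 0 and 1")

    level = f"{alpha * 100:g}%"  # e.g. 0.05 -> "5%"
    if p_value < alpha:
        decision = "reject"
        evidence = "There is sufficient evidence"
    else:
        decision = "fail to reject"
        evidence = "There is not sufficient evidence"

    return (
        f"We {decision} the null hypothesis at the {level} significance level. "
        f"{evidence} to support the claim that {claim}."
    )


print(write_conclusion(0.002, 0.05, "the fertilizer increases mean plant growth"))
print(write_conclusion(0.27, 0.10, "the new method changes the mean number of defects"))
```

The helper only compares the p-value to the significance level; wording the claim in the context of the study is still up to the writer.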

The following examples show how to write a hypothesis test conclusion in both scenarios.

Example 1: Reject the Null Hypothesis Conclusion

Suppose a biologist believes that a certain fertilizer will cause plants to grow more during a one-month period than the current average of 20 inches. To test this, she applies the fertilizer to each of the plants in her laboratory for one month.

She then performs a hypothesis test at a 5% significance level using the following hypotheses:

  • H0: μ = 20 inches (the fertilizer will have no effect on the mean plant growth)
  • HA: μ > 20 inches (the fertilizer will cause mean plant growth to increase)

Suppose the p-value of the test turns out to be 0.002.

Here is how she would report the results of the hypothesis test:

We reject the null hypothesis at the 5% significance level. There is sufficient evidence to support the claim that this particular fertilizer causes plants to grow more during a one-month period than they normally do.
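
The example does not say which test the biologist ran or what her data looked like. As a rough sketch of how a right-tailed test of H0: μ = 20 might be carried out, the code below uses made-up growth measurements and a normal approximation from Python's standard library. Both are assumptions for illustration: with a sample this small, a t distribution would be the more usual choice, and the resulting p-value is not meant to reproduce the 0.002 quoted above.

```python
import math
from statistics import NormalDist, mean, stdev

# Hypothetical one-month growth measurements in inches (not from the article).
growth = [21.3, 22.1, 20.4, 23.0, 21.8, 20.9, 22.5, 21.1, 22.7, 21.6]

mu_0 = 20.0   # mean growth under H0
alpha = 0.05  # significance level from the example

n = len(growth)
sample_mean = mean(growth)
standard_error = stdev(growth) / math.sqrt(n)

# Test statistic: how many standard errors the sample mean lies above mu_0.
test_stat = (sample_mean - mu_0) / standard_error

# Right-tailed p-value, because HA is mu > 20.
# Normal approximation; a t distribution with n - 1 degrees of freedom
# would be the more usual choice for a sample this small.
p_value = 1 - NormalDist().cdf(test_stat)

decision = "reject" if p_value < alpha else "fail to reject"
print(f"mean = {sample_mean:.2f}, statistic = {test_stat:.2f}, p = {p_value:.4g}")
print(f"We {decision} the null hypothesis at the {alpha:.0%} significance level.")
```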

Example 2: Fail to Reject the Null Hypothesis Conclusion

Suppose the manager of a manufacturing plant wants to test whether or not some new method changes the number of defective widgets produced per month, which is currently 250. To test this, he measures the mean number of defective widgets produced before and after using the new method for one month.

He performs a hypothesis test at a 10% significance level using the following hypotheses:

  • H0: μ_after = μ_before (the mean number of defective widgets is the same before and after using the new method)
  • HA: μ_after ≠ μ_before (the mean number of defective widgets produced is different before and after using the new method)

Suppose the p-value of the test turns out to be 0.27.

Here is how he would report the results of the hypothesis test:

We fail to reject the null hypothesis at the 10% significance level. There is not sufficient evidence to support the claim that the new method leads to a change in the number of defective widgets produced per month.
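
The example does not specify the test or the data behind the p-value of 0.27. One way to test a two-sided "before vs. after" hypothesis without distributional tables is an exact sign-flip permutation test on paired differences, sketched below. The paired counts and the idea of eight production lines are hypothetical, and this is a swapped-in technique for illustration rather than the manager's actual method.

```python
from itertools import product
from statistics import mean

# Hypothetical paired monthly defect counts for eight production lines
# (illustrative numbers, not from the article).
before = [252, 247, 255, 249, 251, 246, 253, 250]
after = [249, 250, 251, 252, 247, 248, 250, 251]

alpha = 0.10  # significance level from the example

diffs = [a - b for a, b in zip(after, before)]
observed = abs(mean(diffs))

# Under H0 (no change), each difference is equally likely to be positive or
# negative, so enumerate every sign pattern and count how often the mean
# difference is at least as extreme as the one observed (two-sided).
extreme = 0
total = 0
for signs in product((1, -1), repeat=len(diffs)):
    flipped = mean(s * d for s, d in zip(signs, diffs))
    if abs(flipped) >= observed - 1e-12:
        extreme += 1
    total += 1

p_value = extreme / total
decision = "reject" if p_value < alpha else "fail to reject"
print(f"mean difference = {mean(diffs):.3f}, p = {p_value:.3f}")
print(f"We {decision} the null hypothesis at the {alpha:.0%} significance level.")
```

Enumerating all sign patterns is only practical for small samples (here 2^8 = 256 patterns); larger samples would call for random sampling of patterns or a parametric test.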

Additional Resources

The following tutorials provide additional information about hypothesis testing:

  • Introduction to Hypothesis Testing
  • 4 Examples of Hypothesis Testing in Real Life
  • How to Write a Null Hypothesis

Hey there. My name is Zach Bobbitt. I have a Master of Science degree in Applied Statistics and I’ve worked on machine learning algorithms for professional businesses in both healthcare and retail. I’m passionate about statistics, machine learning, and data visualization and I created Statology to be a resource for both students and teachers alike.  My goal with this site is to help you learn statistics through using simple terms, plenty of real-world examples, and helpful illustrations.

Wordvice

How to Write a Research Hypothesis: Good & Bad Examples

What is a research hypothesis?

A research hypothesis is an attempt at explaining a phenomenon or the relationships between phenomena/variables in the real world. Hypotheses are sometimes called “educated guesses”, but they are in fact (or let’s say they should be) based on previous observations, existing theories, scientific evidence, and logic. A research hypothesis is also not a prediction; rather, predictions are (or should be) based on clearly formulated hypotheses. For example, “We tested the hypothesis that KLF2 knockout mice would show deficiencies in heart development” is an assumption or prediction, not a hypothesis.

The research hypothesis at the basis of this prediction is “the product of the KLF2 gene is involved in the development of the cardiovascular system in mice”—and this hypothesis is probably (hopefully) based on a clear observation, such as that mice with low levels of Kruppel-like factor 2 (which KLF2 codes for) seem to have heart problems. From this hypothesis, you can derive the idea that a mouse in which this particular gene does not function cannot develop a normal cardiovascular system, and then make the prediction that we started with. 

What is the difference between a hypothesis and a prediction?

You might think that these are very subtle differences, and you will certainly come across many publications that do not contain an actual hypothesis or do not make these distinctions correctly. But considering that the formulation and testing of hypotheses is an integral part of the scientific method, it is good to be aware of the concepts underlying this approach. The two hallmarks of a scientific hypothesis are falsifiability (an evaluation standard that was introduced by the philosopher of science Karl Popper in 1934) and testability: if you cannot use experiments or data to decide whether an idea is true or false, then it is not a hypothesis (or at least a very bad one).

So, in a nutshell, you (1) look at existing evidence/theories, (2) come up with a hypothesis, (3) make a prediction that allows you to (4) design an experiment or data analysis to test it, and (5) come to a conclusion. Of course, not all studies have hypotheses (there is also exploratory or hypothesis-generating research), and you do not necessarily have to state your hypothesis as such in your paper. 

But for the sake of understanding the principles of the scientific method, let’s first take a closer look at the different types of hypotheses that research articles refer to and then give you a step-by-step guide for how to formulate a strong hypothesis for your own paper.

Types of Research Hypotheses

Hypotheses can be simple, which means they describe the relationship between one single independent variable (the one you observe variations in or plan to manipulate) and one single dependent variable (the one you expect to be affected by the variations/manipulation). If there are more variables on either side, you are dealing with a complex hypothesis. You can also distinguish hypotheses according to the kind of relationship between the variables you are interested in (e.g., causal or associative). But apart from these variations, we are usually interested in what is called the “alternative hypothesis” and, in contrast to that, the “null hypothesis”. If you think these two should be listed the other way round, then you are right, logically speaking: the alternative should surely come second. However, since this is the hypothesis we (as researchers) are usually interested in, let’s start from there.

Alternative Hypothesis

If you predict a relationship between two variables in your study, then the research hypothesis that you formulate to describe that relationship is your alternative hypothesis (usually H1 in statistical terms). The goal of your hypothesis testing is thus to demonstrate that there is sufficient evidence that supports the alternative hypothesis, rather than evidence for the possibility that there is no such relationship. The alternative hypothesis is usually the research hypothesis of a study and is based on the literature, previous observations, and widely known theories. 

Null Hypothesis

The hypothesis that describes the other possible outcome, that is, that your variables are not related, is the null hypothesis (H0). Based on your findings, you choose between the two hypotheses. Usually that means that if your prediction was correct, you reject the null hypothesis and accept the alternative. Make sure, however, that you are not getting lost at this step of the thinking process: If your prediction is that there will be no difference or change, then you are trying to find support for the null hypothesis and reject H1.

Directional Hypothesis

While the null hypothesis is obviously “static”, the alternative hypothesis can specify a direction for the observed relationship between variables—for example, that mice with higher expression levels of a certain protein are more active than those with lower levels. This is then called a one-tailed hypothesis. 

Another example for a directional one-tailed alternative hypothesis would be that 

H1: Attending private classes before important exams has a positive effect on performance. 

Your null hypothesis would then be that

H0: Attending private classes before important exams has no/a negative effect on performance.

Nondirectional Hypothesis

A nondirectional hypothesis does not specify the direction of the potentially observed effect, only that there is a relationship between the studied variables—this is called a two-tailed hypothesis. For instance, if you are studying a new drug that has shown some effects on pathways involved in a certain condition (e.g., anxiety) in vitro in the lab, but you can’t say for sure whether it will have the same effects in an animal model or maybe induce other/side effects that you can’t predict and potentially increase anxiety levels instead, you could state the two hypotheses like this:

H1: The only lab-tested drug (somehow) affects anxiety levels in an anxiety mouse model.

You then test this nondirectional alternative hypothesis against the null hypothesis:

H0: The only lab-tested drug has no effect on anxiety levels in an anxiety mouse model.
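
The practical difference between a directional (one-tailed) and a nondirectional (two-tailed) hypothesis shows up in how the p-value is computed from the same test statistic. The sketch below assumes a test statistic that follows a standard normal distribution under H0; the function name `p_values` and the value 1.8 are hypothetical choices for illustration.

```python
from statistics import NormalDist


def p_values(z: float) -> dict[str, float]:
    """One- and two-tailed p-values for a standard normal test statistic z."""
    std_normal = NormalDist()
    upper = 1 - std_normal.cdf(z)      # H1: effect is positive
    lower = std_normal.cdf(z)          # H1: effect is negative
    two_sided = 2 * min(upper, lower)  # H1: effect in either direction
    return {"greater": upper, "less": lower, "two-sided": two_sided}


for name, p in p_values(1.8).items():  # 1.8 is an arbitrary illustrative value
    print(f"{name:>9}: p = {p:.4f}")
```

For this value, the one-tailed p-value in the predicted direction falls below 0.05 while the two-tailed p-value does not, which is why the direction of the hypothesis should be fixed before looking at the data.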

How to Write a Hypothesis for a Research Paper

Now that we understand the important distinctions between different kinds of research hypotheses, let’s look at a simple process of how to write a hypothesis.

Writing a Hypothesis Step 1:

Ask a question, based on earlier research. Research always starts with a question, but one that takes into account what is already known about a topic or phenomenon. For example, if you are interested in whether people who have pets are happier than those who don’t, do a literature search and find out what has already been demonstrated. You will probably realize that yes, there is quite a bit of research that shows a relationship between happiness and owning a pet, and even studies that show that owning a dog is more beneficial than owning a cat! Let’s say you are so intrigued by this finding that you wonder:

What is it that makes dog owners even happier than cat owners? 

Let’s move on to Step 2 and find an answer to that question.

Writing a Hypothesis Step 2:

Formulate a strong hypothesis by answering your own question. Again, you don’t want to make things up, take unicorns into account, or repeat/ignore what has already been done. Looking at the dog-vs-cat papers your literature search returned, you see that most studies are based on self-report questionnaires on personality traits, mental health, and life satisfaction. What you don’t find is any data on actual (mental or physical) health measures, and no experiments. You therefore decide to come up with the carefully thought-through hypothesis that it’s maybe the lifestyle of the dog owners, which includes walking their dog several times per day, engaging in fun and healthy activities such as agility competitions, and taking them on trips, that gives them that extra boost in happiness. You could therefore answer your question in the following way:

Dog owners are happier than cat owners because of the dog-related activities they engage in.

Now you have to verify that your hypothesis fulfills the two requirements we introduced at the beginning of this resource article: falsifiability and testability. If it can’t be wrong and can’t be tested, it’s not a hypothesis. We are lucky, however, because yes, we can test whether owning a dog but not engaging in any of those activities leads to lower levels of happiness or well-being than owning a dog and playing and running around with them or taking them on trips.

Writing a Hypothesis Step 3:

Make your predictions and define your variables. We have verified that we can test our hypothesis, but now we have to define all the relevant variables, design our experiment or data analysis, and make precise predictions. You could, for example, decide to study dog owners (not surprising at this point), let them fill in questionnaires about their lifestyle as well as their life satisfaction (as other studies did), and then compare two groups of active and inactive dog owners. Alternatively, if you want to go beyond the data that earlier studies produced and analyzed and directly manipulate the activity level of your dog owners to study the effect of that manipulation, you could invite them to your lab, select groups of participants with similar lifestyles, make them change their lifestyle (e.g., couch potato dog owners start agility classes, very active ones have to refrain from any fun activities for a certain period of time) and assess their happiness levels before and after the intervention. In both cases, your independent variable would be “level of engagement in fun activities with dog” and your dependent variable would be happiness or well-being.
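
To make the first design (comparing active and inactive dog owners) concrete, here is a minimal sketch of a two-group comparison using a random permutation test. The life-satisfaction scores are invented for illustration, and the permutation test is just one possible analysis, not one prescribed by this article.

```python
import random
from statistics import fmean

# Hypothetical life-satisfaction scores on a 0-10 scale; not real data.
active = [7.8, 8.1, 6.9, 7.5, 8.4, 7.2, 7.9, 8.0]    # owners who do dog activities
inactive = [6.5, 7.0, 6.8, 7.4, 6.1, 7.1, 6.6, 6.9]  # owners who do not

# Dependent variable: happiness score. Independent variable: activity group.
observed = fmean(active) - fmean(inactive)

rng = random.Random(0)  # fixed seed so the run is repeatable
pooled = active + inactive
n_active = len(active)
n_shuffles = 10_000
at_least_as_large = 0

for _ in range(n_shuffles):
    # Under H0 the group labels are interchangeable, so reshuffle them.
    rng.shuffle(pooled)
    diff = fmean(pooled[:n_active]) - fmean(pooled[n_active:])
    # One-sided: the prediction is that active owners score higher.
    if diff >= observed:
        at_least_as_large += 1

# Add-one correction keeps the estimated p-value strictly above zero.
p_value = (at_least_as_large + 1) / (n_shuffles + 1)
print(f"observed difference = {observed:.3f}, p = {p_value:.4f}")
```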

Examples of a Good and Bad Hypothesis

Let’s look at a few examples of good and bad hypotheses to get you started.

Good Hypothesis Examples

Bad Hypothesis Examples

Tips for Writing a Research Hypothesis

If you understood the distinction between a hypothesis and a prediction we made at the beginning of this article, then you will have no problem formulating your hypotheses and predictions correctly. To refresh your memory: We have to (1) look at existing evidence, (2) come up with a hypothesis, (3) make a prediction, and (4) design an experiment. For example, you could summarize your dog/happiness study like this:

(1) While research suggests that dog owners are happier than cat owners, there are no reports on what factors drive this difference. (2) We hypothesized that it is the fun activities that many dog owners (but very few cat owners) engage in with their pets that increases their happiness levels. (3) We thus predicted that preventing very active dog owners from engaging in such activities for some time and making very inactive dog owners take up such activities would lead to a decrease and an increase in their overall self-ratings of happiness, respectively. (4) To test this, we invited dog owners into our lab, assessed their mental and emotional well-being through questionnaires, and then assigned them to an “active” and an “inactive” group, depending on…

Note that you use “we hypothesize” only for your hypothesis, not for your experimental prediction, and “would” or “if – then” only for your prediction, not your hypothesis. A hypothesis that states that something “would” affect something else sounds as if you don’t have enough confidence to make a clear statement, in which case you can’t expect your readers to believe in your research either. Write in the present tense, don’t use modal verbs that express varying degrees of certainty (such as may, might, or could), and remember that you are not drawing a carefully hedged conclusion but making a clear statement that you then, in a way, try to disprove. And if that happens, that is not something to fear but an important part of the scientific process.

Similarly, don’t use “we hypothesize” when you explain the implications of your research or make predictions in the conclusion section of your manuscript, since these are clearly not hypotheses in the true sense of the word. As we said earlier, you will find that many authors of academic articles do not seem to care too much about these rather subtle distinctions, but thinking very clearly about your own research will not only help you write better but also ensure that even that infamous Reviewer 2 will find fewer reasons to nitpick about your manuscript. 

How to Write a Strong Hypothesis | Guide & Examples

Published on 6 May 2022 by Shona McCombes.

A hypothesis is a statement that can be tested by scientific research. If you want to test a relationship between two or more variables, you need to write hypotheses before you start your experiment or data collection.

Table of contents

  • What is a hypothesis
  • Developing a hypothesis (with example)
  • Hypothesis examples
  • Frequently asked questions about writing hypotheses

A hypothesis states your predictions about what your research will find. It is a tentative answer to your research question that has not yet been tested. For some research projects, you might have to write several hypotheses that address different aspects of your research question.

A hypothesis is not just a guess – it should be based on existing theories and knowledge. It also has to be testable, which means you can support or refute it through scientific research methods (such as experiments, observations, and statistical analysis of data).

Variables in hypotheses

Hypotheses propose a relationship between two or more variables. An independent variable is something the researcher changes or controls. A dependent variable is something the researcher observes and measures.

In this example, the independent variable is exposure to the sun – the assumed cause. The dependent variable is the level of happiness – the assumed effect.

Step 1: Ask a question

Writing a hypothesis begins with a research question that you want to answer. The question should be focused, specific, and researchable within the constraints of your project.

Step 2: Do some preliminary research

Your initial answer to the question should be based on what is already known about the topic. Look for theories and previous studies to help you form educated assumptions about what your research will find.

At this stage, you might construct a conceptual framework to identify which variables you will study and what you think the relationships are between them. Sometimes, you’ll have to operationalise more complex constructs.

Step 3: Formulate your hypothesis

Now you should have some idea of what you expect to find. Write your initial answer to the question in a clear, concise sentence.

Step 4: Refine your hypothesis

You need to make sure your hypothesis is specific and testable. There are various ways of phrasing a hypothesis, but all the terms you use should have clear definitions, and the hypothesis should contain:

  • The relevant variables
  • The specific group being studied
  • The predicted outcome of the experiment or analysis

Step 5: Phrase your hypothesis in three ways

To identify the variables, you can write a simple prediction in if … then form. The first part of the sentence states the independent variable and the second part states the dependent variable.

In academic research, hypotheses are more commonly phrased in terms of correlations or effects, where you directly state the predicted relationship between variables.

If you are comparing two groups, the hypothesis can state what difference you expect to find between them.
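
The three phrasings described in this step can be sketched as templates filled from the same variables. The class name `Hypothesis`, its fields, and the sentence templates are assumptions made for illustration, and the templates assume a positive relationship between the variables.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """Hypothetical container for the parts of a refined hypothesis."""

    independent: str  # variable the researcher changes or controls
    dependent: str    # variable the researcher observes and measures
    group: str        # specific group being studied

    def if_then(self) -> str:
        # Simple prediction in "if ... then" form.
        return (f"If {self.group} increase their {self.independent}, "
                f"then their {self.dependent} will increase.")

    def effect(self) -> str:
        # Phrased directly as a predicted relationship between variables.
        return (f"Among {self.group}, {self.independent} has a positive "
                f"effect on {self.dependent}.")

    def group_difference(self) -> str:
        # Phrased as an expected difference between two groups.
        return (f"{self.group.capitalize()} with more {self.independent} "
                f"report higher {self.dependent} than those with less.")


h = Hypothesis(independent="sun exposure", dependent="happiness", group="adults")
print(h.if_then())
print(h.effect())
print(h.group_difference())
```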

Step 6: Write a null hypothesis

If your research involves statistical hypothesis testing, you will also have to write a null hypothesis. The null hypothesis is the default position that there is no association between the variables. The null hypothesis is written as H0, while the alternative hypothesis is H1 or Ha.

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses, by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.

A hypothesis is not just a guess. It should be based on existing theories and knowledge. It also has to be testable, which means you can support or refute it through scientific research methods (such as experiments, observations, and statistical analysis of data).

A research hypothesis is your proposed answer to your research question. The research hypothesis usually includes an explanation (‘x affects y because …’).

A statistical hypothesis, on the other hand, is a mathematical statement about a population parameter. Statistical hypotheses always come in pairs: the null and alternative hypotheses. In a well-designed study, the statistical hypotheses correspond logically to the research hypothesis.

Cite this Scribbr article

McCombes, S. (2022, May 06). How to Write a Strong Hypothesis | Guide & Examples. Scribbr. Retrieved 22 April 2024, from https://www.scribbr.co.uk/research-methods/hypothesis-writing/

Learn How To Write A Hypothesis For Your Next Research Project!

Research plays a crucial role in substantiating or refuting our assumptions. These assumptions act as potential answers to our questions. Such assumptions, also known as hypotheses, are considered key aspects of research. In this blog, we delve into the significance of hypotheses and provide insights on how to write them effectively. So, let’s dive in and explore the art of writing hypotheses together.

What is a Hypothesis?

A hypothesis is a crucial starting point in scientific research. It is an educated guess about the relationship between two or more variables. In other words, a hypothesis acts as a foundation for a researcher to build their study.

Here are some examples of well-crafted hypotheses:

  • Increased exposure to natural sunlight improves sleep quality in adults.

This hypothesis proposes a positive relationship between natural sunlight exposure and sleep quality in adults.

  • Playing puzzle games on a regular basis enhances problem-solving abilities in children.

Engaging in frequent puzzle gameplay leads to improved problem-solving skills in children.

  • Students and improved learning hacks.

Students using online paper writing service platforms (as a learning tool for receiving personalized feedback and guidance) will demonstrate improved writing skills compared to those who do not use such platforms.

  • The use of APA format in research papers.

Using the APA format helps students stay organized when writing research papers. Organized students can focus better on their topics and, as a result, produce better quality work.

The Building Blocks of a Hypothesis

To better understand the concept of a hypothesis, let’s break it down into its basic components:

  • Variables: A hypothesis involves at least two variables, an independent variable and a dependent variable. The independent variable is the one being changed or manipulated, while the dependent variable is the one being measured or observed.
  • Relationship: A hypothesis proposes a relationship or connection between the variables. This could be a cause-and-effect relationship or a correlation between them.
  • Testability: A hypothesis should be testable and falsifiable, meaning it can be supported or refuted through experimentation or observation.

Types of Hypotheses

When learning how to write a hypothesis, it’s essential to understand its main types: alternative hypotheses and null hypotheses. In the following section, we explore both types of hypotheses with examples.

Alternative Hypothesis (H1)

This kind of hypothesis suggests a relationship or effect between the variables. It is the main focus of the study, and the researcher looks for evidence that supports or contradicts it. Many researchers divide this hypothesis into two subtypes:

  • Directional

This type of H1 predicts a specific outcome, including the direction of the effect. Many researchers use it to explore the relationship between variables rather than differences between groups.

  • Non-directional

As the name suggests, this type of H1 predicts that a relationship or effect exists but does not specify its direction.

Here are some examples for your better understanding of how to write a hypothesis.

  • Consuming caffeine improves cognitive performance.  (This hypothesis predicts that there is a positive relationship between caffeine consumption and cognitive performance.)
  • Aerobic exercise leads to reduced blood pressure.  (This hypothesis suggests that engaging in aerobic exercise results in lower blood pressure readings.)
  • Exposure to nature reduces stress levels among employees.  (Here, the hypothesis proposes that employees exposed to natural environments will experience decreased stress levels.)
  • Listening to classical music while studying increases memory retention.  (This hypothesis speculates that studying with classical music playing in the background boosts students’ ability to retain information.)
  • Early literacy intervention improves reading skills in children.  (This hypothesis claims that providing early literacy assistance to children results in enhanced reading abilities.)
  • Time management in nursing students. (Students who use a nursing research paper writing service have more time to focus on their studies and can achieve better grades in other subjects.)

Null Hypothesis (H0)

A null hypothesis assumes no relationship or effect between the variables. If the data do not provide sufficient evidence for the alternative hypothesis, the researcher fails to reject the null hypothesis; this does not prove that the null hypothesis is true. Usually a null hypothesis states that there is no correlation between the defined variables.

Here are some examples:

  • The consumption of herbal tea has no effect on sleep quality.  (This hypothesis assumes that herbal tea consumption does not impact the quality of sleep.)
  • The number of hours spent playing video games is unrelated to academic performance.  (Here, the null hypothesis suggests that no relationship exists between video gameplay duration and academic achievement.)
  • Implementing flexible work schedules has no influence on employee job satisfaction.  (This hypothesis contends that providing flexible schedules does not affect how satisfied employees are with their jobs.)
  • Writing ability of a 7th grader is not affected by reading an editorial example. (There is no relationship between reading an editorial example and improving a 7th grader’s writing abilities.)
  • The type of lighting in a room does not affect people’s mood.  (In this null hypothesis, there is no connection between the kind of lighting in a room and the mood of those present.)
  • The use of social media during break time does not impact productivity at work.  (This hypothesis proposes that social media usage during breaks has no effect on work productivity.)

As you learn how to write a hypothesis, remember that aiming for clarity, testability, and relevance to your research question is vital. By mastering this skill, you’re well on your way to conducting impactful scientific research. Good luck!

Importance of a Hypothesis in Research

A well-structured hypothesis is a vital part of any research project for several reasons:

  • It provides clear direction for the study by setting its focus and purpose.
  • It outlines expectations of the research, making it easier to measure results.
  • It helps identify any potential limitations in the study, allowing researchers to refine their approach.

In conclusion, a hypothesis plays a fundamental role in the research process. By understanding its concept and constructing a well-thought-out hypothesis, researchers lay the groundwork for a successful, scientifically sound investigation.

How to Write a Hypothesis?

Here are five steps that you can follow to write an effective hypothesis. 

Step 1: Identify Your Research Question

The first step in learning how to compose a hypothesis is to clearly define your research question. This question is the central focus of your study and will help you determine the direction of your hypothesis.

Step 2: Determine the Variables

When exploring how to write a hypothesis, it’s crucial to identify the variables involved in your study. You’ll need at least two variables:

  • Independent variable: The factor you manipulate or change in your experiment.
  • Dependent variable: The outcome or result you observe or measure, which is influenced by the independent variable.

Step 3: Build the Hypothetical Relationship

In understanding how to compose a hypothesis, constructing the relationship between the variables is key. Based on your research question and variables, predict the expected outcome or connection. This prediction should be specific, testable, and, if possible, expressed in the “If…then” format.

Step 4: Write the Null Hypothesis

When mastering how to write a hypothesis, it’s important to create a null hypothesis as well. The null hypothesis assumes no relationship or effect between the variables, acting as a counterpoint to your primary hypothesis.

Step 5: Review Your Hypothesis

Finally, when learning how to compose a hypothesis, it’s essential to review your hypothesis for clarity, testability, and relevance to your research question. Make any necessary adjustments to ensure it provides a solid basis for your study.

In conclusion, understanding how to write a hypothesis is crucial for conducting successful scientific research. By focusing on your research question and carefully building relationships between variables, you will lay a strong foundation for advancing research and knowledge in your field.

Hypothesis vs. Prediction: What’s the Difference?

Understanding the differences between a hypothesis and a prediction is crucial in scientific research. Often, these terms are used interchangeably, but they have distinct meanings and functions. This segment aims to clarify these differences and explain how to compose a hypothesis correctly, helping you improve the quality of your research projects.

Hypothesis: The Foundation of Your Research

A hypothesis is an educated guess about the relationship between two or more variables. It provides the basis for your research question and is a starting point for an experiment or observational study.

The critical elements for a hypothesis include:

  • Specificity: A clear and concise statement that describes the relationship between variables.
  • Testability: The ability to test the hypothesis through experimentation or observation.

To learn how to write a hypothesis, it’s essential to identify your research question first and then predict the relationship between the variables.

Prediction: The Expected Outcome

A prediction is a statement about a specific outcome you expect to see in your experiment or observational study. It’s derived from the hypothesis and provides a measurable way to test the relationship between variables.

Here’s an example of how to write a hypothesis and a related prediction:

  • Hypothesis: Consuming a high-sugar diet leads to weight gain.
  • Prediction: People who consume a high-sugar diet for six weeks will gain more weight than those who maintain a low-sugar diet during the same period.

Key Differences Between a Hypothesis and a Prediction

While a hypothesis and prediction are both essential components of scientific research, there are some key differences to keep in mind:

  • A hypothesis is an educated guess that suggests a relationship between variables, while a prediction is a specific and measurable outcome based on that hypothesis.
  • A single hypothesis can give rise to multiple predictions, each of which can be tested in a separate experiment or observational study.

To conclude, understanding the differences between a hypothesis and a prediction, and learning how to write a hypothesis, are essential steps to form a robust foundation for your research. By creating clear, testable hypotheses along with specific, measurable predictions, you lay the groundwork for scientifically sound investigations.

That wraps up this guide on how to write a hypothesis. We understand that many students struggle with writing up their school research, and we hope to keep helping through our blog tutorials on the different parts of academic assignments.

PrepScholar

What Is a Hypothesis and How Do I Write One?

Think about something strange and unexplainable in your life. Maybe you get a headache right before it rains, or maybe you think your favorite sports team wins when you wear a certain color. If you wanted to see whether these are just coincidences or scientific fact, you would form a hypothesis, then create an experiment to see whether that hypothesis is true or not.

But what is a hypothesis, anyway? If you’re not sure about what a hypothesis is--or how to test for one!--you’re in the right place. This article will teach you everything you need to know about hypotheses, including: 

  • Defining the term “hypothesis” 
  • Providing hypothesis examples 
  • Giving you tips for how to write your own hypothesis

So let’s get started!

What Is a Hypothesis?

Merriam-Webster defines a hypothesis as “an assumption or concession made for the sake of argument.” In other words, a hypothesis is an educated guess. Scientists make a reasonable assumption--or a hypothesis--then design an experiment to test whether it’s true or not. Keep in mind that in science, a hypothesis should be testable. You have to be able to design an experiment that tests your hypothesis in order for it to be valid.

As you could assume from that statement, it’s easy to make a bad hypothesis. But when you’re holding an experiment, it’s even more important that your guesses be good...after all, you’re spending time (and maybe money!) to figure out more about your observation. That’s why we refer to a hypothesis as an educated guess--good hypotheses are based on existing data and research to make them as sound as possible.

Hypotheses are one part of what’s called the scientific method. Every (good) experiment or study is based on the scientific method. The scientific method gives order and structure to experiments and ensures that interference from scientists or outside influences does not skew the results. It’s important that you understand the concepts of the scientific method before holding your own experiment. Though it may vary among scientists, the scientific method is generally made up of six steps (in order):

  • Observation
  • Asking questions
  • Forming a hypothesis
  • Conducting an experiment
  • Analyzing the data
  • Communicating your results

You’ll notice that the hypothesis comes pretty early on when conducting an experiment. That’s because experiments work best when they’re trying to answer one specific question. And you can’t conduct an experiment until you know what you’re trying to prove!

Independent and Dependent Variables 

After doing your research, you’re ready for another important step in forming your hypothesis: identifying variables. Variables are basically any factor that could influence the outcome of your experiment. Variables have to be measurable and related to the topic being studied.

There are two types of variables: independent variables and dependent variables. Independent variables are not affected by the other variables in your study. For example, age is an independent variable; nothing else in the study changes it, and researchers can compare people of different ages to see whether age has an effect on the dependent variable.

Speaking of dependent variables... dependent variables are subject to the influence of the independent variable , meaning that they are not constant. Let’s say you want to test whether a person’s age affects how much sleep they need. In that case, the independent variable is age (like we mentioned above), and the dependent variable is how much sleep a person gets. 

Variables will be crucial in writing your hypothesis. You need to be able to identify which variable is which, as both the independent and dependent variables will be written into your hypothesis. For instance, in a study about exercise, the independent variable might be the speed at which the respondents walk for thirty minutes, and the dependent variable would be their heart rate. In your study and in your hypothesis, you’re trying to understand the relationship between the two variables.
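To make the walking-speed example concrete, here is a minimal Python sketch of how the two variables might be recorded and how the strength of their relationship could be summarized. The measurements are invented for illustration, and the Pearson correlation coefficient is just one common way to describe a linear relationship; the article itself does not prescribe it.

```python
from math import sqrt

# Hypothetical measurements, invented purely for illustration.
walking_speed_mph = [2.0, 2.5, 3.0, 3.5, 4.0]  # independent variable
heart_rate_bpm = [88, 95, 101, 110, 118]       # dependent variable


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r = pearson_r(walking_speed_mph, heart_rate_bpm)
print(f"r = {r:.3f}")
```

A value of r close to 1 would suggest that heart rate rises as walking speed rises in this made-up sample. It would not, on its own, show that one causes the other.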

Elements of a Good Hypothesis

The best hypotheses start by asking the right questions. For instance, if you’ve observed that the grass is greener when it rains twice a week, you could ask what kind of grass it is, what elevation it’s at, and if the grass across the street responds to rain in the same way. Any of these questions could become the backbone of experiments to test why the grass gets greener when it rains fairly frequently.

As you’re asking more questions about your first observation, make sure you’re also making more observations. If it doesn’t rain for two weeks and the grass still looks green, that’s an important observation that could influence your hypothesis. You'll continue observing all throughout your experiment, but until the hypothesis is finalized, every observation should be noted.

Finally, you should consult secondary research before writing your hypothesis. Secondary research is comprised of results found and published by other people. You can usually find this information online or at your library. Additionally, make sure the research you find is credible and related to your topic. If you’re studying the correlation between rain and grass growth, it would help you to research rain patterns over the past twenty years for your county, published by a local agricultural association. You should also research the types of grass common in your area, the type of grass in your lawn, and whether anyone else has conducted experiments about your hypothesis. Also be sure you’re checking the quality of your research. Research done by a middle school student about what minerals can be found in rainwater would be less useful than an article published by a local university.

Writing Your Hypothesis

Once you’ve considered all of the factors above, you’re ready to start writing your hypothesis. Hypotheses usually take a certain form when they’re written out in a research report.

When you boil down your hypothesis statement, you are writing down your best guess and not the question at hand. This means that your statement should be written as if it is fact already, even though you are simply testing it.

The reason for this is that, after you have completed your study, you'll either accept or reject your if-then or your null hypothesis. All hypothesis testing examples should be measurable and able to be confirmed or denied. You cannot confirm a question, only a statement! 

In fact, you come up with hypothesis examples all the time! For instance, when you guess on the outcome of a basketball game, you don’t say, “Will the Miami Heat beat the Boston Celtics?” but instead, “I think the Miami Heat will beat the Boston Celtics.” You state it as if it is already true, even if it turns out you’re wrong. You do the same thing when writing your hypothesis.

Additionally, keep in mind that hypotheses can range from very specific to very broad. A narrow question calls for a specific hypothesis, but if your hypothesis testing examples involve a broad range of causes and effects, your hypothesis can also be broad.

The Two Types of Hypotheses

Now that you understand what goes into a hypothesis, it’s time to look more closely at the two most common types of hypothesis: the if-then hypothesis and the null hypothesis.

#1: If-Then Hypotheses

First of all, if-then hypotheses typically follow this formula:

If ____ happens, then ____ will happen.

The goal of this type of hypothesis is to test the causal relationship between the independent and dependent variable. It’s fairly simple, and each hypothesis can vary in how detailed it can be. We create if-then hypotheses all the time with our daily predictions. Here are some examples of hypotheses that use an if-then structure from daily life: 

  • If I get enough sleep, I’ll be able to get more work done tomorrow.
  • If the bus is on time, I can make it to my friend’s birthday party. 
  • If I study every night this week, I’ll get a better grade on my exam. 

In each of these situations, you’re making a guess on how an independent variable (sleep, time, or studying) will affect a dependent variable (the amount of work you can do, making it to a party on time, or getting better grades). 

You may still be asking, “What is an example of a hypothesis used in scientific research?” Take one of the hypothesis examples from a real-world study on whether using technology before bed affects children’s sleep patterns. The hypothesis reads:

“We hypothesized that increased hours of tablet- and phone-based screen time at bedtime would be inversely correlated with sleep quality and child attention.”

It might not look like it, but this is an if-then statement. The researchers basically said, “If children have more screen usage at bedtime, then their quality of sleep and attention will be worse.” The sleep quality and attention are the dependent variables and the screen usage is the independent variable. (Usually, the independent variable comes after the “if” and the dependent variable comes after the “then,” as it is the independent variable that affects the dependent variable.) This is an excellent example of how flexible hypothesis statements can be, as long as the general idea of “if-then” and the independent and dependent variables are present.

#2: Null Hypotheses

Your if-then hypothesis is not the only one needed to complete a successful experiment, however. You also need a null hypothesis to test it against. In its most basic form, the null hypothesis is the opposite of your if-then hypothesis. When you write your null hypothesis, you are writing a hypothesis that suggests that your guess is not true, and that the independent and dependent variables have no relationship.

One null hypothesis for the cell phone and sleep study from the last section might say: 

“If children have more screen usage at bedtime, their quality of sleep and attention will not be worse.” 

In this case, this is a null hypothesis because it states the opposite of the original hypothesis!

Conversely, if your if-then hypothesis suggests that your two variables have no relationship, then your null hypothesis would suggest that there is one. So, pretend that there is a study that is asking the question, “Does the number of followers on Instagram influence how long people spend on the app?” The independent variable is the number of followers, and the dependent variable is the time spent. But if you, as the researcher, don’t think there is a relationship between the number of followers and time spent, you might write an if-then hypothesis that reads:

“If people have many followers on Instagram, they will not spend more time on the app than people who have fewer.”

Here, the if-then suggests there isn’t a relationship between the variables, so one of the null hypothesis examples might say:

“If people have many followers on Instagram, they will spend more time on the app than people who have fewer.”

You then test both the if-then and the null hypothesis to gauge if there is a relationship between the variables, and if so, how much of a relationship. 

4 Tips to Write the Best Hypothesis

If you’re going to take the time to hold an experiment, whether in school or by yourself, you’re also going to want to take the time to make sure your hypothesis is a good one. The best hypotheses have four major elements in common: plausibility, defined concepts, observability, and general explanation.

#1: Plausibility

At first glance, this quality of a hypothesis might seem obvious. When your hypothesis is plausible, that means it’s possible given what we know about science and general common sense. However, improbable hypotheses are more common than you might think. 

Imagine you’re studying weight gain and television watching habits. If you hypothesize that people who watch more than twenty hours of television a week will gain two hundred pounds or more over the course of a year, this might be improbable (though it’s potentially possible). Consequently, common sense can tell us the results of the study before the study even begins.

Improbable hypotheses generally go against science, as well. Take this hypothesis example:

“If a person smokes one cigarette a day, then they will have lungs just as healthy as the average person’s.” 

This hypothesis is obviously untrue, as studies have shown again and again that cigarettes negatively affect lung health. You must be careful that your hypotheses do not reflect your own personal opinion more than they do scientifically-supported findings. This plausibility points to the necessity of research before the hypothesis is written to make sure that your hypothesis has not already been disproven.

#2: Defined Concepts

The more advanced you are in your studies, the more likely that the terms you’re using in your hypothesis are specific to a limited set of knowledge. One of the hypothesis testing examples might include the readability of printed text in newspapers, where you might use words like “kerning” and “x-height.” Unless your readers have a background in graphic design, it’s likely that they won’t know what you mean by these terms. Thus, it’s important to either write what they mean in the hypothesis itself or in the report before the hypothesis.

Here’s what we mean. Which of the following sentences makes more sense to the common person?

If the kerning is greater than average, more words will be read per minute.

If the space between letters is greater than average, more words will be read per minute.

For people reading your report that are not experts in typography, simply adding a few more words will be helpful in clarifying exactly what the experiment is all about. It’s always a good idea to make your research and findings as accessible as possible. 

Good hypotheses ensure that you can observe the results. 

#3: Observability

In order to measure the truth or falsity of your hypothesis, you must be able to see your variables and the way they interact. For instance, if your hypothesis is that the flight patterns of satellites affect the strength of certain television signals, yet you don’t have a telescope to view the satellites or a television to monitor the signal strength, you cannot properly observe your hypothesis and thus cannot continue your study.

Some variables may seem easy to observe, but if you do not have a system of measurement in place, you cannot observe your hypothesis properly. Here’s an example: if you’re experimenting on the effect of healthy food on overall happiness, but you don’t have a way to monitor and measure what “overall happiness” means, your results will not reflect the truth. Monitoring how often someone smiles for a whole day is not reasonably observable, but having the participants state how happy they feel on a scale of one to ten is more observable. 

In writing your hypothesis, always keep in mind how you'll execute the experiment.

#4: Generalizability 

Perhaps you’d like to study what color your best friend wears the most often by observing and documenting the colors she wears each day of the week. This might be fun information for her and you to know, but beyond you two, there aren’t many people who could benefit from this experiment. When you start an experiment, you should note how generalizable your findings may be if they are confirmed. Generalizability is basically how common a particular phenomenon is to other people’s everyday life.

If you’re asking a question about the health benefits of eating an apple for one day only, you need to realize that the experiment may be too specific to be helpful. It does not help to explain a phenomenon that many people experience. If you find yourself with too specific of a hypothesis, go back to asking the big question: what is it that you want to know, and what do you think will happen between your two variables?

Hypothesis Testing Examples

We know it can be hard to write a good hypothesis unless you’ve seen some good hypothesis examples. We’ve included four hypothesis examples based on some made-up experiments. Use these as templates or launch pads for coming up with your own hypotheses.

Experiment #1: Students Studying Outside (Writing a Hypothesis)

You are a student at PrepScholar University. When you walk around campus, you notice that, when the temperature is above 60 degrees, more students study in the quad. You want to know when your fellow students are more likely to study outside. With this information, how do you make the best hypothesis possible?

You must remember to make additional observations and do secondary research before writing your hypothesis. In doing so, you notice that no one studies outside when it’s 75 degrees and raining, so this should be included in your experiment. Also, studies done on the topic beforehand suggested that students are more likely to study in temperatures less than 85 degrees. With this in mind, you feel confident that you can identify your variables and write your hypotheses:

If-then: “If the temperature in Fahrenheit is less than 60 degrees, significantly fewer students will study outside.”

Null: “If the temperature in Fahrenheit is less than 60 degrees, the same number of students will study outside as when it is more than 60 degrees.”

These hypotheses are plausible, as the temperatures are reasonably within the bounds of what is possible. The number of people in the quad is also easily observable. It is also not a phenomenon specific to only one person or at one time, but instead can explain a phenomenon for a broader group of people.

To complete this experiment, you pick the month of October to observe the quad. Every day (except on the days where it’s raining) from 3 to 4 PM, when most classes have released for the day, you observe how many people are on the quad. You measure how many people come and how many leave. You also write down the temperature on the hour.

After writing down all of your observations and putting them on a graph, you find that the most students study on the quad when it is 70 degrees outside, and that the number of students drops a lot once the temperature reaches 60 degrees or below. In this case, your research report would state that you accept or “failed to reject” your first hypothesis with your findings.
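The comparison described above can be sketched in a few lines of Python. The temperature and head-count pairs are hypothetical, made up to mirror the pattern in the story, and the 60-degree threshold comes from the if-then hypothesis. Comparing two averages is only a descriptive first look; deciding whether "significantly fewer" students study outside would need a formal statistical test.

```python
from statistics import mean

# Hypothetical (temperature in °F, students counted on the quad) pairs,
# invented for illustration only.
observations = [
    (52, 4), (58, 6), (59, 5),
    (63, 14), (66, 17), (70, 22), (72, 19), (74, 18),
]

THRESHOLD_F = 60  # threshold taken from the if-then hypothesis

# Split the head counts by which side of the threshold the temperature fell on.
below = [count for temp, count in observations if temp < THRESHOLD_F]
at_or_above = [count for temp, count in observations if temp >= THRESHOLD_F]

mean_below = mean(below)
mean_at_or_above = mean(at_or_above)

print(f"Mean students below {THRESHOLD_F}°F: {mean_below}")
print(f"Mean students at or above {THRESHOLD_F}°F: {mean_at_or_above}")
```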

Experiment #2: The Cupcake Store (Forming a Simple Experiment)

Let’s say that you work at a bakery. You specialize in cupcakes, and you make only two colors of frosting: yellow and purple. You want to know what kind of customers are more likely to buy what kind of cupcake, so you set up an experiment. Your independent variable is the customer’s gender, and the dependent variable is the color of the frosting. What is an example of a hypothesis that might answer the question of this study?

Here’s what your hypotheses might look like: 

If-then: “If customers’ gender is female, then they will buy more yellow cupcakes than purple cupcakes.”

Null: “If customers’ gender is female, then they will be just as likely to buy purple cupcakes as yellow cupcakes.”

This is a pretty simple experiment! It passes the test of plausibility (there could easily be a difference), defined concepts (there’s nothing complicated about cupcakes!), observability (both color and gender can be easily observed), and general explanation (this would potentially help you make better business decisions).

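As a rough illustration of how the cupcake data might be analyzed, the Python sketch below computes a chi-square statistic for a 2x2 table of purchase counts. The counts are hypothetical, and the chi-square test of independence is a technique brought in for illustration: it tests the broader null that frosting color is unrelated to gender, which is close to, but not identical to, the null hypothesis written above. The cutoff 3.841 is the standard critical value for one degree of freedom at the 5% significance level.

```python
# Hypothetical purchase counts, invented for illustration only.
counts = {
    "female": {"yellow": 30, "purple": 20},
    "male": {"yellow": 18, "purple": 32},
}

CRITICAL_VALUE_DF1_ALPHA_05 = 3.841  # chi-square critical value, df = 1, alpha = 0.05


def chi_square_statistic(table):
    """Return the chi-square statistic for a table given as nested dicts."""
    rows = list(table)
    cols = list(next(iter(table.values())))
    row_totals = {r: sum(table[r].values()) for r in rows}
    col_totals = {c: sum(table[r][c] for r in rows) for c in cols}
    grand_total = sum(row_totals.values())
    statistic = 0.0
    for r in rows:
        for c in cols:
            # Count expected in this cell if gender and color were unrelated.
            expected = row_totals[r] * col_totals[c] / grand_total
            statistic += (table[r][c] - expected) ** 2 / expected
    return statistic


chi_square = chi_square_statistic(counts)
reject_null = chi_square > CRITICAL_VALUE_DF1_ALPHA_05
print(f"chi-square = {chi_square:.3f}, reject null: {reject_null}")
```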
Experiment #3: Backyard Bird Feeders (Integrating Multiple Variables and Rejecting the If-Then Hypothesis)

While watching your backyard bird feeder, you realized that different birds come on the days when you change the types of seeds. You decide that you want to see more cardinals in your backyard, so you decide to see what type of food they like the best and set up an experiment. 

However, one morning, you notice that, while some cardinals are present, blue jays are eating out of your backyard feeder filled with millet. You decide that, of all of the other birds, you would like to see the blue jays the least. This means you'll have more than one variable in your hypothesis. Your new hypotheses might look like this: 

If-then: “If sunflower seeds are placed in the bird feeders, then more cardinals will come than blue jays. If millet is placed in the bird feeders, then more blue jays will come than cardinals.”

Null: “If either sunflower seeds or millet are placed in the bird feeders, equal numbers of cardinals and blue jays will come.”

Through simple observation, you actually find that cardinals come as often as blue jays when sunflower seeds or millet is in the bird feeder. In this case, you would reject your “if-then” hypothesis and “fail to reject” your null hypothesis. You cannot accept your first hypothesis, because it’s clearly not true. Instead you found that there was actually no relation between your different variables. Consequently, you would need to run more experiments with different variables to see if the new variables impact the results.

Experiment #4: In-Class Survey (Including an Alternative Hypothesis)

You’re about to give a speech in one of your classes about the importance of paying attention. You want to take this opportunity to test a hypothesis you’ve had for a while: 

If-then: If students sit in the first two rows of the classroom, then they will listen better than students who do not.

Null: If students sit in the first two rows of the classroom, then they will not listen better or worse than students who do not.

You give your speech and then ask your teacher if you can hand out a short survey to the class. On the survey, you’ve included questions about some of the topics you talked about. When you get back the results, you’re surprised to see that not only do the students in the first two rows not pay better attention, but they also scored worse than students in other parts of the classroom! Here, both your if-then and your null hypotheses are not representative of your findings. What do you do?

This is when you reject both your if-then and null hypotheses and instead create an alternative hypothesis. This type of hypothesis is used in the rare circumstance that neither of your hypotheses is able to capture your findings. Now you can use what you’ve learned to draft new hypotheses and test again!

Key Takeaways: Hypothesis Writing

The more comfortable you become with writing hypotheses, the better they will become. The structure of hypotheses is flexible and may need to be changed depending on what topic you are studying. The most important thing to remember is the purpose of your hypothesis and the difference between the if-then and the null. From there, in forming your hypothesis, you should constantly be asking questions, making observations, doing secondary research, and considering your variables. After you have written your hypothesis, be sure to edit it so that it is plausible, clearly defined, observable, and helpful in explaining a general phenomenon.

Writing a hypothesis is something that everyone, from elementary school children competing in a science fair to professional scientists in a lab, needs to know how to do. Hypotheses are vital in experiments and in properly executing the scientific method. When done correctly, hypotheses will set up your studies for success and help you to understand the world a little better, one experiment at a time.

What’s Next?

If you’re studying for the science portion of the ACT, there’s definitely a lot you need to know. We’ve got the tools to help, though! Start by checking out our ultimate study guide for the ACT Science subject test. Once you read through that, be sure to download our recommended ACT Science practice tests, since they’re one of the most foolproof ways to improve your score. (And don’t forget to check out our expert guide book, too.)

If you love science and want to major in a scientific field, you should start preparing in high school. Here are the science classes you should take to set yourself up for success.

If you’re trying to think of science experiments you can do for class (or for a science fair!), here’s a list of 37 awesome science experiments you can do at home.

Ashley Sufflé Robinson has a Ph.D. in 19th Century English Literature. As a content writer for PrepScholar, Ashley is passionate about giving college-bound students the in-depth information they need to get into the school of their dreams.

SciSpace Resources

The Craft of Writing a Strong Hypothesis

Deeptanshu D

Writing a hypothesis is one of the essential elements of a scientific research paper. It needs to be to the point, clearly communicating what your research is trying to accomplish. A blurry, drawn-out, or complexly-structured hypothesis can confuse your readers. Or worse, the editor and peer reviewers.

A captivating hypothesis is not too intricate. This blog will take you through the process so that, by the end of it, you have a better idea of how to convey your research paper's intent in just one sentence.

What is a Hypothesis?

The first step in your scientific endeavor, a hypothesis, is a strong, concise statement that forms the basis of your research. It is not the same as a thesis statement, which is a brief summary of your research paper.

The sole purpose of a hypothesis is to predict your paper's findings, data, and conclusion. It comes from a place of curiosity and intuition. When you write a hypothesis, you're essentially making an educated guess based on prior scientific knowledge and evidence, which is then supported or refuted through the scientific method.

The reason for undertaking research is to observe a specific phenomenon. A hypothesis, therefore, lays out what the said phenomenon is. And it does so through two variables, an independent and dependent variable.

The independent variable is the cause behind the observation, while the dependent variable is the effect of the cause. A good example of this is “mixing red and blue forms purple.” In this hypothesis, mixing red and blue is the independent variable as you're combining the two colors at your own will. The formation of purple is the dependent variable as, in this case, it is conditional to the independent variable.

Different Types of Hypotheses‌

Some would stand by the notion that there are only two types of hypotheses: a null hypothesis and an alternative hypothesis. While that may have some truth to it, it is better to distinguish the most common forms, because these terms come up often and can be confusing without context.

Apart from null and alternative, there are complex, simple, directional, non-directional, statistical, and associative and causal hypotheses. They don't necessarily have to be exclusive, as one hypothesis can tick many boxes, but knowing the distinctions between them will make it easier for you to construct your own.

1. Null hypothesis

A null hypothesis proposes no relationship between two variables. Denoted by H0, it is a negative statement like “Attending physiotherapy sessions does not affect athletes' on-field performance.” Here, the author claims physiotherapy sessions have no effect on on-field performance. Even if a difference is observed, it is attributed to chance.

2. Alternative hypothesis

Considered to be the opposite of a null hypothesis, an alternative hypothesis is denoted as H1 or Ha. It explicitly states that the independent variable affects the dependent variable. A good alternative hypothesis example is “Attending physiotherapy sessions improves athletes' on-field performance.” or “Water evaporates at 100 °C.” The alternative hypothesis further branches into directional and non-directional.

  • Directional hypothesis: A hypothesis that states the direction of the result, either positive or negative, is called a directional hypothesis. It accompanies H1 with either the ‘<' or ‘>' sign.
  • Non-directional hypothesis: A non-directional hypothesis only claims an effect on the dependent variable. It does not clarify whether the result would be positive or negative. The sign for a non-directional hypothesis is ‘≠.'
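
To make the difference between the ‘<', ‘>' and ‘≠' alternatives concrete, here is a minimal Python sketch. It assumes a test statistic that follows a standard normal distribution (a z-test); the function name `z_test_p_value` and the value of `z` are illustrative and not part of the original article.

```python
from statistics import NormalDist


def z_test_p_value(z: float, alternative: str) -> float:
    """Return the p-value of a z statistic for the chosen alternative hypothesis."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":    # directional, H1 uses '>'
        return 1.0 - std_normal.cdf(z)
    if alternative == "less":       # directional, H1 uses '<'
        return std_normal.cdf(z)
    if alternative == "two-sided":  # non-directional, H1 uses '≠'
        return 2.0 * (1.0 - std_normal.cdf(abs(z)))
    raise ValueError(f"unknown alternative: {alternative!r}")


z = 1.8  # hypothetical test statistic, chosen only for illustration
for alt in ("greater", "less", "two-sided"):
    print(alt, round(z_test_p_value(z, alt), 4))
```

The same statistic gives a smaller p-value under the matching directional alternative than under the non-directional one, which is why the direction should be fixed before looking at the data.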

3. Simple hypothesis

A simple hypothesis is a statement made to reflect the relation between exactly two variables. One independent and one dependent. Consider the example, “Smoking is a prominent cause of lung cancer." The dependent variable, lung cancer, is dependent on the independent variable, smoking.

4. Complex hypothesis

In contrast to a simple hypothesis, a complex hypothesis implies a relationship between multiple independent and dependent variables. For instance, “Individuals who eat more fruits tend to have higher immunity, lower cholesterol, and higher metabolism.” The independent variable is eating more fruits, while the dependent variables are higher immunity, lower cholesterol, and higher metabolism.

5. Associative and causal hypothesis

Associative and causal hypotheses don't specify how many variables there will be. They define the relationship between the variables. In an associative hypothesis, changing any one variable, dependent or independent, affects the others. In a causal hypothesis, the independent variable directly affects the dependent variable.

6. Empirical hypothesis

Also referred to as the working hypothesis, an empirical hypothesis claims a theory's validation via experiments and observation. This way, the statement appears justifiable and different from a wild guess.

Say, the hypothesis is “Women who take iron tablets face a lesser risk of anemia than those who take vitamin B12.” This is an example of an empirical hypothesis where the researcher arrives at the statement after assessing a group of women who take iron tablets and charting the findings.

7. Statistical hypothesis

The point of a statistical hypothesis is to test an already existing hypothesis by studying a population sample. A hypothesis like “44% of the Indian population belong in the age group of 22-27” leverages evidence to prove or disprove a particular statement.
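
As a rough illustration of how a sample could be used to test a claim like the 44% figure above, here is a sketch of a two-sided one-proportion z-test in Python. The survey numbers (470 of 1,000 respondents) are invented for the example, and the function name `one_proportion_z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes: int, n: int, p0: float) -> tuple[float, float]:
    """Two-sided z-test of H0: p = p0 against H1: p != p0.

    Returns the z statistic and the p-value. Uses the normal
    approximation, so it assumes a reasonably large sample.
    """
    p_hat = successes / n              # observed sample proportion
    se = sqrt(p0 * (1 - p0) / n)       # standard error assuming H0 is true
    z = (p_hat - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical survey: 470 of 1,000 sampled people are aged 22-27.
z, p_value = one_proportion_z_test(successes=470, n=1000, p0=0.44)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

Whether the claim is rejected then depends on comparing the p-value with a significance level chosen before the data were collected.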

Characteristics of a Good Hypothesis

Writing a hypothesis is essential as it can make or break your research for you. That includes your chances of getting published in a journal. So when you're designing one, keep an eye out for these pointers:

  • A research hypothesis has to be simple yet clear to look justifiable enough.
  • It has to be testable: your research would be rendered pointless if the hypothesis is too far-fetched or cannot be tested with available technology.
  • It has to be precise about the results: what you are trying to do and achieve through it should come out in your hypothesis.
  • A research hypothesis should be self-explanatory, leaving no doubt in the reader's mind.
  • If you are developing a relational hypothesis, you need to include the variables and establish an appropriate relationship among them.
  • A hypothesis must keep and reflect the scope for further investigations and experiments.

Separating a Hypothesis from a Prediction

Outside of academia, hypothesis and prediction are often used interchangeably. In research writing, this is not only confusing but also incorrect. And although a hypothesis and prediction are guesses at their core, there are many differences between them.

A hypothesis is an educated guess or even a testable prediction validated through research. It aims to analyze the gathered evidence and facts to define a relationship between variables and put forth a logical explanation behind the nature of events.

Predictions are assumptions or expected outcomes made without any backing evidence. They are more fictionally inclined regardless of where they originate from.

For this reason, a hypothesis holds much more weight than a prediction. It sticks to the scientific method rather than pure guesswork. "Planets revolve around the Sun." is an example of a hypothesis as it is based on previous knowledge and observed trends. Additionally, we can test it through the scientific method.

Whereas "COVID-19 will be eradicated by 2030." is a prediction. Even though it results from past trends, we can't prove or disprove it. So, the only way this gets validated is to wait and watch if COVID-19 cases end by 2030.

Finally, How to Write a Hypothesis

1.  Be clear about your research question

A hypothesis should instantly address the research question or the problem statement. To do so, you need to ask a question. Understand the constraints of your undertaken research topic and then formulate a simple and topic-centric problem. Only after that can you develop a hypothesis and further test for evidence.

2. Carry out a recce

Once you have your research's foundation laid out, it would be best to conduct preliminary research. Go through previous theories, academic papers, data, and experiments before you start curating your research hypothesis. It will give you an idea of your hypothesis's viability or originality.

Making use of references from relevant research papers helps draft a good research hypothesis. SciSpace Discover offers a repository of over 270 million research papers to browse through and gain a deeper understanding of related studies on a particular topic. Additionally, you can use SciSpace Copilot, your AI research assistant, to read any lengthy research paper and get a summarized context of it. A hypothesis can be formed after evaluating many such summarized research papers. Copilot also explains theories and equations, restates a paper in simpler terms, and lets you highlight any text or clip math equations and tables for a clearer understanding of what is being said. This can improve the hypothesis by helping you identify potential research gaps.

3. Create a 3-dimensional hypothesis

Variables are an essential part of any reasonable hypothesis. So, identify your independent and dependent variable(s) and form a correlation between them. The ideal way to do this is to write the hypothetical assumption in the ‘if-then' form. If you use this form, make sure that you state the predefined relationship between the variables.

In another way, you can choose to present your hypothesis as a comparison between two variables. Here, you must specify the difference you expect to observe in the results.

4. Write the first draft

Now that everything is in place, it's time to write your hypothesis. For starters, create the first draft. In this version, write what you expect to find from your research.

Clearly separate your independent and dependent variables and the link between them. Don't fixate on syntax at this stage. The goal is to ensure your hypothesis addresses the issue.

5. Proof your hypothesis

After preparing the first draft of your hypothesis, you need to inspect it thoroughly. It should tick all the boxes, like being concise, straightforward, relevant, and accurate. Your final hypothesis has to be well-structured as well.

Research projects are an exciting and crucial part of being a scholar. And once you have your research question, you need a great hypothesis to begin conducting research. Thus, knowing how to write a hypothesis is very important.

Now that you have a firmer grasp on what a good hypothesis constitutes, the different kinds there are, and what process to follow, you will find it much easier to write your hypothesis, which ultimately helps your research.

Now it's easier than ever to streamline your research workflow with SciSpace Discover. Its integrated, comprehensive end-to-end platform for research allows scholars to easily discover, write and publish their research and fosters collaboration.

It includes everything you need, including a repository of over 270 million research papers across disciplines, SEO-optimized summaries and public profiles to show your expertise and experience.

If you found these tips on writing a research hypothesis useful, head over to our blog on Statistical Hypothesis Testing to learn about the top researchers, papers, and institutions in this domain.

Frequently Asked Questions (FAQs)

1. What is the definition of a hypothesis?

According to the Oxford dictionary, a hypothesis is defined as “An idea or explanation of something that is based on a few known facts, but that has not yet been proved to be true or correct”.

2. What is an example of hypothesis?

The hypothesis is a statement that proposes a relationship between two or more variables. An example: "If we increase the number of new users who join our platform by 25%, then we will see an increase in revenue."

3. What is an example of null hypothesis?

A null hypothesis is a statement that there is no relationship between two variables. The null hypothesis is written as H0. The null hypothesis states that there is no effect. For example, if you're studying whether or not a particular type of exercise increases strength, your null hypothesis will be "there is no difference in strength between people who exercise and people who don't."

4. What are the types of research?

• Fundamental research

• Applied research

• Qualitative research

• Quantitative research

• Mixed research

• Exploratory research

• Longitudinal research

• Cross-sectional research

• Field research

• Laboratory research

• Fixed research

• Flexible research

• Action research

• Policy research

• Classification research

• Comparative research

• Causal research

• Inductive research

• Deductive research

5. How to write a hypothesis?

• Your hypothesis should be able to predict the relationship and outcome.

• Avoid wordiness by keeping it simple and brief.

• Your hypothesis should contain observable and testable outcomes.

• Your hypothesis should be relevant to the research question.

6. What are the 2 types of hypothesis?

• Null hypotheses are used to test the claim that "there is no difference between two groups of data".

• Alternative hypotheses test the claim that "there is a difference between two data groups".

7. Difference between research question and research hypothesis?

A research question is a broad, open-ended question you will try to answer through your research. A hypothesis is a statement based on prior research or theory that you expect to be true due to your study. Example - Research question: What are the factors that influence the adoption of the new technology? Research hypothesis: There is a positive relationship between age, education and income level with the adoption of the new technology.

8. What is plural for hypothesis?

The plural of hypothesis is hypotheses. Here's an example of how it would be used in a statement, "Numerous well-considered hypotheses are presented in this part, and they are supported by tables and figures that are well-illustrated."

9. What is the red queen hypothesis?

The red queen hypothesis in evolutionary biology states that species must constantly evolve to avoid extinction because if they don't, they will be outcompeted by other species that are evolving. Leigh Van Valen first proposed it in 1973; since then, it has been tested and substantiated many times.

10. Who is known as the father of null hypothesis?

The father of the null hypothesis is Sir Ronald Fisher. His 1925 book Statistical Methods for Research Workers popularized significance testing, and he introduced the term "null hypothesis" in The Design of Experiments (1935).

11. When to reject null hypothesis?

To reject the null hypothesis, you need to find a statistically significant difference between your two populations. You can check for one by running a statistical test such as an independent-samples t-test or a dependent-samples (paired) t-test. Reject the null hypothesis if the p-value is less than your chosen significance level, commonly 0.05.
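
The decision rule above can be sketched as a small Python helper that turns a p-value and a significance level into a written conclusion. The wording follows the template given at the start of this page; the function name `write_conclusion` and the example claim are illustrative assumptions.

```python
def write_conclusion(p_value: float, alpha: float, claim: str) -> str:
    """Return a one-line hypothesis test conclusion.

    Rejects the null hypothesis only when p_value is strictly less than alpha.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")

    level = f"{alpha:.0%}"  # e.g. 0.05 is rendered as "5%"
    if p_value < alpha:
        return (
            f"We reject the null hypothesis at the {level} significance level. "
            f"There is sufficient evidence to support the claim that {claim}."
        )
    return (
        f"We fail to reject the null hypothesis at the {level} significance level. "
        f"There is not sufficient evidence to support the claim that {claim}."
    )


# Hypothetical example using the fertilizer p-value from earlier on this page.
print(write_conclusion(0.002, 0.05, "the fertilizer increases mean plant growth"))
```

Note that a p-value equal to the significance level does not lead to rejection here, matching the "less than" wording of the rule.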

The Scientific Method by Science Made Simple

Understanding and using the scientific method.

The Scientific Method is a process used to design and perform experiments. It's important to minimize experimental errors and bias, and increase confidence in the accuracy of your results.

In the previous sections, we talked about how to pick a good topic and specific question to investigate. Now we will discuss how to carry out your investigation.

Steps of the Scientific Method

  • Observation/Research
  • Hypothesis
  • Prediction
  • Experimentation
  • Conclusion

Now that you have settled on the question you want to ask, it's time to use the Scientific Method to design an experiment to answer that question.

If your experiment isn't designed well, you may not get the correct answer. You may not even get any definitive answer at all!

The Scientific Method is a logical and rational order of steps by which scientists come to conclusions about the world around them. The Scientific Method helps to organize thoughts and procedures so that scientists can be confident in the answers they find.

OBSERVATION is the first step, so that you know how you want to go about your research.

HYPOTHESIS is the answer you think you'll find.

PREDICTION is your specific belief about the scientific idea: If my hypothesis is true, then I predict we will discover this.

EXPERIMENT is the tool that you invent to answer the question, and

CONCLUSION is the answer that the experiment gives.

Don't worry, it isn't that complicated. Let's take a closer look at each one of these steps. Then you can understand the tools scientists use for their science experiments, and use them for your own.

OBSERVATION

This step could also be called "research." It is the first stage in understanding the problem.

After you decide on a topic, and narrow it down to a specific question, you will need to research everything that you can find about it. You can collect information from your own experiences, books, the internet, or even smaller "unofficial" experiments.

Let's continue the example of a science fair idea about tomatoes in the garden. You like to garden, and notice that some tomatoes are bigger than others and wonder why.

Because of this personal experience and an interest in the problem, you decide to learn more about what makes plants grow.

For this stage of the Scientific Method, it's important to use as many sources as you can find. The more information you have on your science fair topic, the better the design of your experiment is going to be, and the better your science fair project is going to be overall.

Also try to get information from your teachers or librarians, or professionals who know something about your science fair project. They can help to guide you to a solid experimental setup.

HYPOTHESIS

The next stage of the Scientific Method is known as the "hypothesis." This word basically means "a possible solution to a problem, based on knowledge and research."

The hypothesis is a simple statement that defines what you think the outcome of your experiment will be.

All of the first stage of the Scientific Method -- the observation, or research stage -- is designed to help you express a problem in a single question ("Does the amount of sunlight in a garden affect tomato size?") and propose an answer to the question based on what you know. The experiment that you will design is done to test the hypothesis.

Using the example of the tomato experiment, here is an example of a hypothesis:

TOPIC: "Does the amount of sunlight a tomato plant receives affect the size of the tomatoes?"

HYPOTHESIS: "I believe that the more sunlight a tomato plant receives, the larger the tomatoes will grow."

This hypothesis is based on:

(1) Tomato plants need sunshine to make food through photosynthesis, and logically, more sun means more food, and;

(2) Through informal, exploratory observations of plants in a garden, those with more sunlight appear to grow bigger.

PREDICTION

The hypothesis is your general statement of how you think the scientific phenomenon in question works.

Your prediction lets you get specific -- how will you demonstrate that your hypothesis is true? The experiment that you will design is done to test the prediction.

An important thing to remember during this stage of the scientific method is that once you develop a hypothesis and a prediction, you shouldn't change it, even if the results of your experiment show that you were wrong.

An incorrect prediction does NOT mean that you "failed." It just means that the experiment brought some new facts to light that maybe you hadn't thought about before.

Continuing our tomato plant example, a good prediction would be: Increasing the amount of sunlight tomato plants in my experiment receive will cause an increase in their size compared to identical plants that received the same care but less light.

EXPERIMENT

This is the part of the scientific method that tests your hypothesis. An experiment is a tool that you design to find out if your ideas about your topic are right or wrong.

It is absolutely necessary to design a science fair experiment that will accurately test your hypothesis. The experiment is the most important part of the scientific method. It's the logical process that lets scientists learn about the world.

On the next page, we'll discuss the ways that you can go about designing a science fair experiment idea.

CONCLUSION

The final step in the scientific method is the conclusion. This is a summary of the experiment's results, and how those results match up to your hypothesis.

You have two options for your conclusions: based on your results, either:

(1) YOU CAN REJECT the hypothesis, or

(2) YOU CAN NOT REJECT the hypothesis.

This is an important point!

You can not PROVE the hypothesis with a single experiment, because there is a chance that you made an error somewhere along the way.

What you can say is that your results SUPPORT the original hypothesis.

If your original hypothesis didn't match up with the final results of your experiment, don't change the hypothesis.

Instead, try to explain what might have been wrong with your original hypothesis. What information were you missing when you made your prediction? What are the possible reasons the hypothesis and experimental results didn't match up?

Remember, a science fair experiment isn't a failure simply because it does not agree with your hypothesis. No one will take points off if your prediction wasn't accurate. Many important scientific discoveries were made as a result of experiments gone wrong!

A science fair experiment is only a failure if its design is flawed. A flawed experiment is one that (1) doesn't keep its variables under control, and (2) doesn't sufficiently answer the question that you asked of it.


Copyright © 2006 - 2023, Science Made Simple, Inc. All Rights Reserved.

The science fair projects & ideas, science articles and all other material on this website are covered by copyright laws and may not be reproduced without permission.

How to Write a Great Hypothesis

Hypothesis Definition, Format, Examples, and Tips

Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."

Amy Morin, LCSW, is a psychotherapist and international bestselling author. Her books, including "13 Things Mentally Strong People Don't Do," have been translated into more than 40 languages. Her TEDx talk,  "The Secret of Becoming Mentally Strong," is one of the most viewed talks of all time.

A hypothesis is a tentative statement about the relationship between two or more variables. It is a specific, testable prediction about what you expect to happen in a study. It is a preliminary answer to your question that helps guide the research process.

Consider a study designed to examine the relationship between sleep deprivation and test performance. The hypothesis might be: "This study is designed to assess the hypothesis that sleep-deprived people will perform worse on a test than individuals who are not sleep-deprived."

At a Glance

A hypothesis is crucial to scientific research because it offers a clear direction for what the researchers are looking to find. This allows them to design experiments to test their predictions and add to our scientific knowledge about the world. This article explores how a hypothesis is used in psychology research, how to write a good hypothesis, and the different types of hypotheses you might use.

The Hypothesis in the Scientific Method

In the scientific method, whether it involves research in psychology, biology, or some other area, a hypothesis represents what the researchers think will happen in an experiment. The scientific method involves the following steps:

  • Forming a question
  • Performing background research
  • Creating a hypothesis
  • Designing an experiment
  • Collecting data
  • Analyzing the results
  • Drawing conclusions
  • Communicating the results

The hypothesis is a prediction, but it involves more than a guess. Most of the time, the hypothesis begins with a question which is then explored through background research. At this point, researchers then begin to develop a testable hypothesis.

Unless you are creating an exploratory study, your hypothesis should always explain what you expect to happen.

In a study exploring the effects of a particular drug, the hypothesis might be that researchers expect the drug to have some type of effect on the symptoms of a specific illness. In psychology, the hypothesis might focus on how a certain aspect of the environment might influence a particular behavior.

Remember, a hypothesis does not have to be correct. While the hypothesis predicts what the researchers expect to see, the goal of the research is to determine whether this guess is right or wrong. When conducting an experiment, researchers might explore numerous factors to determine which ones might contribute to the ultimate outcome.

In many cases, researchers may find that the results of an experiment do not support the original hypothesis. When writing up these results, the researchers might suggest other options that should be explored in future studies.

In many cases, researchers might draw a hypothesis from a specific theory or build on previous research. For example, prior research has shown that stress can impact the immune system. So a researcher might hypothesize: "People with high-stress levels will be more likely to contract a common cold after being exposed to the virus than people who have low-stress levels."

In other instances, researchers might look at commonly held beliefs or folk wisdom. "Birds of a feather flock together" is one example of folk adage that a psychologist might try to investigate. The researcher might pose a specific hypothesis that "People tend to select romantic partners who are similar to them in interests and educational level."

Elements of a Good Hypothesis

So how do you write a good hypothesis? When trying to come up with a hypothesis for your research or experiments, ask yourself the following questions:

  • Is your hypothesis based on your research on a topic?
  • Can your hypothesis be tested?
  • Does your hypothesis include independent and dependent variables?

Before you come up with a specific hypothesis, spend some time doing background research. Once you have completed a literature review, start thinking about potential questions you still have. Pay attention to the discussion section in the journal articles you read. Many authors will suggest questions that still need to be explored.

How to Formulate a Good Hypothesis

To form a hypothesis, you should take these steps:

  • Collect as many observations about a topic or problem as you can.
  • Evaluate these observations and look for possible causes of the problem.
  • Create a list of possible explanations that you might want to explore.
  • After you have developed some possible hypotheses, think of ways that you could confirm or disprove each hypothesis through experimentation. This is known as falsifiability.

Falsifiability of a Hypothesis

In the scientific method, falsifiability is an important part of any valid hypothesis. In order to test a claim scientifically, it must be possible that the claim could be proven false.

Students sometimes take falsifiability to mean that something is false, which is not the case. What falsifiability means is that if something were false, then it would be possible to demonstrate that it is false.

One of the hallmarks of pseudoscience is that it makes claims that cannot be refuted or proven false.

The Importance of Operational Definitions

A variable is a factor or element that can be changed and manipulated in ways that are observable and measurable. However, the researcher must also define how the variable will be manipulated and measured in the study.

Operational definitions are specific definitions for all relevant factors in a study. This process helps make vague or ambiguous concepts detailed and measurable.

For example, a researcher might operationally define the variable "test anxiety" as the results of a self-report measure of anxiety experienced during an exam. A "study habits" variable might be defined by the amount of studying that actually occurs as measured by time.

These precise descriptions are important because many things can be measured in various ways. Clearly defining these variables and how they are measured helps ensure that other researchers can replicate your results.

Replicability

One of the basic principles of any type of scientific research is that the results must be replicable.

Replication means repeating an experiment in the same way to produce the same results. By clearly detailing the specifics of how the variables were measured and manipulated, other researchers can better understand the results and repeat the study if needed.

Some variables are more difficult than others to define. For example, how would you operationally define a variable such as aggression? For obvious ethical reasons, researchers cannot create a situation in which a person behaves aggressively toward others.

To measure this variable, the researcher must devise a measurement that assesses aggressive behavior without harming others. The researcher might utilize a simulated task to measure aggressiveness in this situation.

Hypothesis Checklist

  • Does your hypothesis focus on something that you can actually test?
  • Does your hypothesis include both an independent and dependent variable?
  • Can you manipulate the variables?
  • Can your hypothesis be tested without violating ethical standards?

Hypothesis Types

The hypothesis you use will depend on what you are investigating and hoping to find. Some of the main types of hypotheses that you might use include:

  • Simple hypothesis : This type of hypothesis suggests there is a relationship between one independent variable and one dependent variable.
  • Complex hypothesis : This type suggests a relationship between three or more variables, such as two independent variables and one dependent variable.
  • Null hypothesis : This hypothesis suggests no relationship exists between two or more variables.
  • Alternative hypothesis : This hypothesis states the opposite of the null hypothesis.
  • Statistical hypothesis : This hypothesis uses statistical analysis to evaluate a representative population sample and then generalizes the findings to the larger group.
  • Logical hypothesis : This hypothesis assumes a relationship between variables without collecting data or evidence.

Hypothesis Format

A hypothesis often follows a basic format of "If {this happens} then {this will happen}." One way to structure your hypothesis is to describe what will happen to the dependent variable if you change the independent variable.

The basic format might be: "If {these changes are made to a certain independent variable}, then we will observe {a change in a specific dependent variable}."
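
As a small illustration of this "if-then" template, the Python sketch below stores the two parts of a hypothesis separately and assembles the sentence from them. The class name `Hypothesis` and its fields are hypothetical, chosen only to mirror the template; the example wording adapts the breakfast example used later in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """A hypothesis split into its two if-then parts."""

    independent_change: str  # the change made to the independent variable
    dependent_change: str    # the change expected in the dependent variable

    def statement(self) -> str:
        """Assemble the hypothesis in 'If ..., then ...' form."""
        return f"If {self.independent_change}, then {self.dependent_change}."


breakfast = Hypothesis(
    independent_change="students eat breakfast before a math exam",
    dependent_change="they will score higher than students who skip breakfast",
)
print(breakfast.statement())
```

Keeping the two parts separate makes it easy to check that a draft hypothesis actually names both an independent and a dependent variable.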

Hypothesis Examples

A few examples of simple hypotheses:

  • "Students who eat breakfast will perform better on a math exam than students who do not eat breakfast."
  • "Students who experience test anxiety before an English exam will get lower scores than students who do not experience test anxiety."​
  • "Motorists who talk on the phone while driving will be more likely to make errors on a driving course than those who do not talk on the phone."
  • "Children who receive a new reading intervention will have higher reading scores than students who do not receive the intervention."

Examples of a complex hypothesis include:

  • "People with high-sugar diets and sedentary activity levels are more likely to develop depression."
  • "Younger people who are regularly exposed to green, outdoor areas have better subjective well-being than older adults who have limited exposure to green spaces."

Examples of a null hypothesis include:

  • "There is no difference in anxiety levels between people who take St. John's wort supplements and those who do not."
  • "There is no difference in scores on a memory recall task between children and adults."
  • "There is no difference in aggression levels between children who play first-person shooter games and those who do not."

Examples of an alternative hypothesis:

  • "People who take St. John's wort supplements will have less anxiety than those who do not."
  • "Adults will perform better on a memory task than children."
  • "Children who play first-person shooter games will show higher levels of aggression than children who do not." 

Collecting Data on Your Hypothesis

Once a researcher has formed a testable hypothesis, the next step is to select a research design and start collecting data. The research method depends largely on exactly what they are studying. There are two basic types of research methods: descriptive research and experimental research.

Descriptive Research Methods

Descriptive research methods such as case studies, naturalistic observations, and surveys are often used when conducting an experiment is difficult or impossible. These methods are best used to describe different aspects of a behavior or psychological phenomenon.

Once a researcher has collected data using descriptive methods, a correlational study can examine how the variables are related. This research method might be used to investigate a hypothesis that is difficult to test experimentally.

Experimental Research Methods

Experimental methods are used to demonstrate causal relationships between variables. In an experiment, the researcher systematically manipulates a variable of interest (known as the independent variable) and measures the effect on another variable (known as the dependent variable).

Unlike correlational studies, which can only be used to determine if there is a relationship between two variables, experimental methods can be used to determine the actual nature of the relationship: whether changes in one variable actually cause another to change.

The hypothesis is a critical part of any scientific exploration. It represents what researchers expect to find in a study or experiment. In situations where the hypothesis is unsupported by the research, the research still has value. Such research helps us better understand how different aspects of the natural world relate to one another. It also helps us develop new hypotheses that can then be tested in the future.


By Kendra Cherry, MSEd

Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."


How to Write a Conclusion for Research Papers (with Examples)


The conclusion of a research paper is a crucial section that plays a significant role in the overall impact and effectiveness of your research paper. However, this is also the section that typically receives less attention than the introduction and the body of the paper. The conclusion serves to provide a concise summary of the key findings, their significance, their implications, and a sense of closure to the study. Discussing how the findings can be applied in real-world scenarios or inform policy, practice, or decision-making is especially valuable to practitioners and policymakers. The research paper conclusion also provides researchers with clear insights and valuable information for their own work, which they can then build on and contribute to the advancement of knowledge in the field.

The research paper conclusion should explain the significance of your findings within the broader context of your field. It restates how your results contribute to the existing body of knowledge and whether they confirm or challenge existing theories or hypotheses. Identifying unanswered questions or areas requiring further investigation also demonstrates your awareness of the broader research landscape.

Remember to tailor the research paper conclusion to the specific needs and interests of your intended audience, which may include researchers, practitioners, policymakers, or a combination of these.


A conclusion in a research paper is the final section where you summarize and wrap up your research, presenting the key findings and insights derived from your study. The research paper conclusion is not the place to introduce new information or data that was not discussed in the main body of the paper. When working on how to conclude a research paper, remember to stick to summarizing and interpreting existing content. The research paper conclusion serves the following purposes [1]:

  • Warn readers of the possible consequences of not attending to the problem.
  • Recommend specific course(s) of action.
  • Restate key ideas to drive home the ultimate point of your research paper.
  • Provide a “take-home” message that you want the readers to remember about your study.


Types of conclusions for research papers

In research papers, the conclusion provides closure to the reader. The type of research paper conclusion you choose depends on the nature of your study, your goals, and your target audience. Three common types of conclusions are:

A summarizing conclusion is the most common type of conclusion in research papers. It involves summarizing the main points, reiterating the research question, and restating the significance of the findings. This common type of research paper conclusion is used across different disciplines.

An editorial conclusion is less common but can be used in research papers that are focused on proposing or advocating for a particular viewpoint or policy. It involves presenting a strong editorial or opinion based on the research findings and offering recommendations or calls to action.

An externalizing conclusion is a type of conclusion that extends the research beyond the scope of the paper by suggesting potential future research directions or discussing the broader implications of the findings. This type of conclusion is often used in more theoretical or exploratory research papers.


The conclusion in a research paper serves several important purposes:

  • Offers Implications and Recommendations : Your research paper conclusion is an excellent place to discuss the broader implications of your research and suggest potential areas for further study. It’s also an opportunity to offer practical recommendations based on your findings.
  • Provides Closure : A good research paper conclusion provides a sense of closure to your paper. It should leave the reader with a feeling that they have reached the end of a well-structured and thought-provoking research project.
  • Leaves a Lasting Impression : Writing a well-crafted research paper conclusion leaves a lasting impression on your readers. It’s your final opportunity to leave them with a new idea, a call to action, or a memorable quote.


Writing a strong conclusion for your research paper is essential to leave a lasting impression on your readers. Here’s a step-by-step process to help you decide what to put in the conclusion of a research paper [2]:

  • Research Statement : Begin your research paper conclusion by restating your research statement. This reminds the reader of the main point you’ve been trying to prove throughout your paper. Keep it concise and clear.
  • Key Points : Summarize the main arguments and key points you’ve made in your paper. Avoid introducing new information in the research paper conclusion. Instead, provide a concise overview of what you’ve discussed in the body of your paper.
  • Address the Research Questions : If your research paper is based on specific research questions or hypotheses, briefly address whether you’ve answered them or achieved your research goals. Discuss the significance of your findings in this context.
  • Significance : Highlight the importance of your research and its relevance in the broader context. Explain why your findings matter and how they contribute to the existing knowledge in your field.
  • Implications : Explore the practical or theoretical implications of your research. How might your findings impact future research, policy, or real-world applications? Consider the “so what?” question.
  • Future Research : Offer suggestions for future research in your area. What questions or aspects remain unanswered or warrant further investigation? This shows that your work opens the door for future exploration.
  • Closing Thought : Conclude your research paper conclusion with a thought-provoking or memorable statement. This can leave a lasting impression on your readers and wrap up your paper effectively. Avoid introducing new information or arguments here.
  • Proofread and Revise : Carefully proofread your conclusion for grammar, spelling, and clarity. Ensure that your ideas flow smoothly and that your conclusion is coherent and well-structured.


Remember that a well-crafted research paper conclusion is a reflection of the strength of your research and your ability to communicate its significance effectively. It should leave a lasting impression on your readers and tie together all the threads of your paper. Now that you know how to start the conclusion of a research paper and what elements to include to make it impactful, let’s look at a research paper conclusion sample.


How to write a research paper conclusion with Paperpal?

A research paper conclusion is not just a summary of your study, but a synthesis of the key findings that ties the research together and places it in a broader context. A research paper conclusion should be concise, typically around one paragraph in length. However, some complex topics may require a longer conclusion to ensure the reader is left with a clear understanding of the study’s significance. Paperpal, an AI writing assistant trusted by over 800,000 academics globally, can help you write a well-structured conclusion for your research paper. 

  • Sign Up or Log In: Create a new Paperpal account or log in with your details.
  • Navigate to Features : Once logged in, head over to the features’ side navigation pane. Click on Templates and you’ll find a suite of generative AI features to help you write better, faster.  
  • Generate an outline: Under Templates, select ‘Outlines’. Choose ‘Research article’ as your document type.  
  • Select your section: Since you’re focusing on the conclusion, select this section when prompted.  
  • Choose your field of study: Identifying your field of study allows Paperpal to provide more targeted suggestions, ensuring the relevance of your conclusion to your specific area of research. 
  • Provide a brief description of your study: Enter details about your research topic and findings. This information helps Paperpal generate a tailored outline that aligns with your paper’s content. 
  • Generate the conclusion outline: After entering all necessary details, click on ‘generate’. Paperpal will then create a structured outline for your conclusion, to help you start writing and build upon the outline.  
  • Write your conclusion: Use the generated outline to build your conclusion. The outline serves as a guide, ensuring you cover all critical aspects of a strong conclusion, from summarizing key findings to highlighting the research’s implications. 
  • Refine and enhance: Paperpal’s ‘Make Academic’ feature can be particularly useful in the final stages. Select any paragraph of your conclusion and use this feature to elevate the academic tone, ensuring your writing is aligned to the academic journal standards. 

By following these steps, Paperpal not only simplifies the process of writing a research paper conclusion but also ensures it is impactful, concise, and aligned with academic standards. Sign up with Paperpal today and write your research paper conclusion 2x faster .  

The research paper conclusion is a crucial part of your paper as it provides the final opportunity to leave a strong impression on your readers. In the research paper conclusion, summarize the main points of your research paper by restating your research statement, highlighting the most important findings, addressing the research questions or objectives, explaining the broader context of the study, discussing the significance of your findings, providing recommendations if applicable, and emphasizing the takeaway message. The main purpose of the conclusion is to remind the reader of the main point or argument of your paper and to provide a clear and concise summary of the key findings and their implications. All these elements should feature on your list of what to put in the conclusion of a research paper to create a strong final statement for your work.

A strong conclusion is a critical component of a research paper, as it provides an opportunity to wrap up your arguments, reiterate your main points, and leave a lasting impression on your readers. Here are the key elements of a strong research paper conclusion:

  1. Conciseness: A research paper conclusion should be concise and to the point. It should not introduce new information or ideas that were not discussed in the body of the paper.
  2. Summarization: The research paper conclusion should be comprehensive enough to give the reader a clear understanding of the research’s main contributions.
  3. Relevance: Ensure that the information included in the research paper conclusion is directly relevant to the research paper’s main topic and objectives; avoid unnecessary details.
  4. Connection to the Introduction: A well-structured research paper conclusion often revisits the key points made in the introduction and shows how the research has addressed the initial questions or objectives.
  5. Emphasis: Highlight the significance and implications of your research. Why is your study important? What are the broader implications or applications of your findings?
  6. Call to Action: Include a call to action or a recommendation for future research or action based on your findings.

The length of a research paper conclusion can vary depending on several factors, including the overall length of the paper, the complexity of the research, and the specific journal requirements. While there is no strict rule for the length of a conclusion, it’s generally advisable to keep it relatively short. A typical research paper conclusion might be around 5-10% of the paper’s total length. For example, if your paper is 10 pages long, the conclusion might be roughly half a page to one page in length.

In general, you do not need to include citations in the research paper conclusion. Citations are typically reserved for the body of the paper to support your arguments and provide evidence for your claims. However, there may be some exceptions to this rule:

  1. If you are drawing a direct quote or paraphrasing a specific source in your research paper conclusion, you should include a citation to give proper credit to the original author.
  2. If your conclusion refers to or discusses specific research, data, or sources that are crucial to the overall argument, citations can be included to reinforce your conclusion’s validity.

The conclusion of a research paper serves several important purposes:

  1. Summarize the Key Points
  2. Reinforce the Main Argument
  3. Provide Closure
  4. Offer Insights or Implications
  5. Engage the Reader
  6. Reflect on Limitations

Remember that the primary purpose of the research paper conclusion is to leave a lasting impression on the reader, reinforcing the key points and providing closure to your research. It’s often the last part of the paper that the reader will see, so it should be strong and well-crafted.

  1. Makar, G., Foltz, C., Lendner, M., & Vaccaro, A. R. (2018). How to write effective discussion and conclusion sections. Clinical Spine Surgery, 31(8), 345-346.
  2. Bunton, D. (2005). The structure of PhD conclusion chapters. Journal of English for Academic Purposes, 4(3), 207-224.


How to Write Hypothesis Test Conclusions (With Examples)

A hypothesis test is used to test whether or not some hypothesis about a population parameter is true.

To perform a hypothesis test in the real world, researchers obtain a random sample from the population and perform a hypothesis test on the sample data, using a null and alternative hypothesis:

  • Null Hypothesis (H 0 ): The sample data occurs purely from chance.
  • Alternative Hypothesis (H A ): The sample data is influenced by some non-random cause.

If the p-value of the hypothesis test is less than some significance level (e.g. α = .05), then we reject the null hypothesis.

Otherwise, if the p-value is not less than the significance level, then we fail to reject the null hypothesis.

When writing the conclusion of a hypothesis test, we typically include:

  • Whether we reject or fail to reject the null hypothesis.
  • The significance level.
  • A short explanation in the context of the hypothesis test.

For example, we would write:

We reject the null hypothesis at the 5% significance level. There is sufficient evidence to support the claim that…

Or, we would write:

We fail to reject the null hypothesis at the 5% significance level. There is not sufficient evidence to support the claim that…
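The decision rule and the two conclusion templates above can be sketched as a small helper. This is only an illustration: the function name `write_conclusion` and its parameters are hypothetical, and the claim text passed in is a placeholder rather than wording from any particular study.

```python
def write_conclusion(p_value: float, alpha: float, claim: str) -> str:
    """Build a two-sentence conclusion from a p-value and a significance level."""
    # Express alpha as a percentage, e.g. 0.05 -> "5%".
    percent = f"{alpha * 100:g}%"
    if p_value < alpha:
        decision = "We reject the null hypothesis"
        evidence = "There is sufficient evidence"
    else:
        decision = "We fail to reject the null hypothesis"
        evidence = "There is not sufficient evidence"
    return (
        f"{decision} at the {percent} significance level. "
        f"{evidence} to support the claim that {claim}."
    )


# Illustrative inputs only; the claim text is a placeholder.
print(write_conclusion(0.002, 0.05, "the treatment increases the mean"))
print(write_conclusion(0.27, 0.10, "the treatment changes the mean"))
```

The helper only formats the decision. Choosing the hypotheses, the test, and the significance level still has to happen before the data are analyzed.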

The following examples show how to write a hypothesis test conclusion in both scenarios.

Example 1: Reject the Null Hypothesis Conclusion

Suppose a biologist believes that a certain fertilizer will cause plants to grow more during a one-month period than they normally do, which is currently 20 inches. To test this, she applies the fertilizer to each of the plants in her laboratory for one month.

She then performs a hypothesis test at a 5% significance level using the following hypotheses:

  • H 0 : μ = 20 inches (the fertilizer will have no effect on the mean plant growth)
  • H A : μ > 20 inches (the fertilizer will cause mean plant growth to increase)

Suppose the p-value of the test turns out to be 0.002.

Here is how she would report the results of the hypothesis test:

We reject the null hypothesis at the 5% significance level. There is sufficient evidence to support the claim that this particular fertilizer causes plants to grow more during a one-month period than they normally do.

Example 2: Fail to Reject the Null Hypothesis Conclusion

Suppose the manager of a manufacturing plant wants to test whether or not some new method changes the number of defective widgets produced per month, which is currently 250. To test this, he measures the mean number of defective widgets produced before and after using the new method for one month.

He performs a hypothesis test at a 10% significance level using the following hypotheses:

  • H 0 : μ after = μ before (the mean number of defective widgets is the same before and after using the new method)
  • H A : μ after ≠ μ before (the mean number of defective widgets produced is different before and after using the new method)

Suppose the p-value of the test turns out to be 0.27.

Here is how he would report the results of the hypothesis test:

We fail to reject the null hypothesis at the 10% significance level. There is not sufficient evidence to support the claim that the new method leads to a change in the number of defective widgets produced per month.

Additional Resources

The following tutorials provide additional information about hypothesis testing:

  • Introduction to Hypothesis Testing
  • 4 Examples of Hypothesis Testing in Real Life
  • How to Write a Null Hypothesis

9.5 Additional Information and Full Hypothesis Test Examples

  • In a hypothesis test problem, you may see words such as "the level of significance is 1%." The "1%" is the preconceived or preset α.
  • The statistician setting up the hypothesis test selects the value of α to use before collecting the sample data.
  • If no level of significance is given, a common standard to use is α = 0.05.
  • When you calculate the p-value and draw the picture, the p-value is the area in the left tail, the right tail, or split evenly between the two tails. For this reason, we call the hypothesis test left, right, or two tailed.
  • The alternative hypothesis, H a, tells you if the test is left, right, or two-tailed. It is the key to conducting the appropriate test.
  • H a never has a symbol that contains an equal sign.
  • Thinking about the meaning of the p-value: A data analyst (and anyone else) should have more confidence that he made the correct decision to reject the null hypothesis with a smaller p-value (for example, 0.001 as opposed to 0.04) even if using the 0.05 level for alpha. Similarly, for a large p-value such as 0.4, as opposed to a p-value of 0.056 (alpha = 0.05 is less than either number), a data analyst should have more confidence that she made the correct decision in not rejecting the null hypothesis. This makes the data analyst use judgment rather than mindlessly applying rules.
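As a rough sketch of how the tail named by H a determines the p-value, the helper below converts a z test statistic into a left-tail, right-tail, or two-tailed area under the standard normal curve. It assumes a z-test; the function name `p_value_from_z` and the tail labels are hypothetical choices made for this illustration.

```python
from statistics import NormalDist


def p_value_from_z(z: float, tail: str) -> float:
    """Return the p-value for a z statistic; tail is 'left', 'right', or 'two'."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":    # H a uses <
        return std_normal.cdf(z)
    if tail == "right":   # H a uses >
        return 1.0 - std_normal.cdf(z)
    if tail == "two":     # H a uses ≠, area split evenly between both tails
        return 2.0 * std_normal.cdf(-abs(z))
    raise ValueError("tail must be 'left', 'right', or 'two'")


# Illustrative z value only.
for tail in ("left", "right", "two"):
    print(tail, round(p_value_from_z(-1.5, tail), 4))
```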

The following examples illustrate a left-, right-, and two-tailed test.

Example 9.11

H 0 : μ = 5, H a : μ < 5

Test of a single population mean. H a tells you the test is left-tailed. The picture of the p-value is as follows:

Try It 9.11

H 0 : μ = 10, H a : μ < 10

Assume the p-value is 0.0935. What type of test is this? Draw the picture of the p-value.

Example 9.12

H 0 : p ≤ 0.2, H a : p > 0.2

This is a test of a single population proportion. H a tells you the test is right-tailed. The picture of the p-value is as follows:

Try It 9.12

H 0 : μ ≤ 1, H a : μ > 1

Assume the p-value is 0.1243. What type of test is this? Draw the picture of the p-value.

Example 9.13

H 0 : μ = 50, H a : μ ≠ 50

This is a test of a single population mean. H a tells you the test is two-tailed. The picture of the p-value is as follows:

Try It 9.13

H 0 : p = 0.5, H a : p ≠ 0.5

Assume the p-value is 0.2564. What type of test is this? Draw the picture of the p-value.

Full Hypothesis Test Examples

Example 9.14

Jeffrey, as an eight-year-old, established a mean time of 16.43 seconds for swimming the 25-yard freestyle, with a standard deviation of 0.8 seconds. His dad, Frank, thought that Jeffrey could swim the 25-yard freestyle faster using goggles. Frank bought Jeffrey a new pair of expensive goggles and timed Jeffrey for 15 25-yard freestyle swims. For the 15 swims, Jeffrey's mean time was 16 seconds. Frank thought that the goggles helped Jeffrey to swim faster than the 16.43 seconds. Conduct a hypothesis test using a preset α = 0.05. Assume that the swim times for the 25-yard freestyle are normal.

Set up the Hypothesis Test:

Since the problem is about a mean, this is a test of a single population mean.

H 0 : μ = 16.43   H a : μ < 16.43

For Jeffrey to swim faster, his time will be less than 16.43 seconds. The "<" tells you this is left-tailed.

Determine the distribution needed:

Random variable: $\bar{X}$ = the mean time to swim the 25-yard freestyle.

Distribution for the test: $\bar{X}$ is normal (population standard deviation is known: σ = 0.8)

$\bar{X} \sim N\left(\mu, \frac{\sigma_X}{\sqrt{n}}\right)$. Therefore, $\bar{X} \sim N\left(16.43, \frac{0.8}{\sqrt{15}}\right)$.

μ = 16.43 comes from H 0 and not the data. σ = 0.8, and n = 15.

Calculate the p-value using the normal distribution for a mean:

p-value = $P(\bar{x} < 16) = 0.0187$, where the sample mean in the problem is given as 16.

p-value = 0.0187 (This is called the actual level of significance.) The p-value is the area to the left of the sample mean, which is given as 16.

μ = 16.43 comes from H 0. Our assumption is μ = 16.43.

Interpretation of the p-value: If H 0 is true, there is a 0.0187 probability (1.87%) that Jeffrey's mean time to swim the 25-yard freestyle is 16 seconds or less. Because a 1.87% chance is small, the mean time of 16 seconds or less is unlikely to have happened randomly. It is a rare event.

Compare α and the p-value:

α = 0.05, p-value = 0.0187, so α > p-value.

Make a decision: Since α > p-value, reject H 0.

This means that you reject μ = 16.43. In other words, you do not think Jeffrey swims the 25-yard freestyle in 16.43 seconds but faster with the new goggles.

Conclusion: At the 5% significance level, we conclude that Jeffrey swims faster using the new goggles. The sample data show there is sufficient evidence that Jeffrey's mean time to swim the 25-yard freestyle is less than 16.43 seconds.

The p -value can easily be calculated.

Using the TI-83, 83+, 84, 84+ Calculator

Press STAT and arrow over to TESTS. Press 1:Z-Test. Arrow over to Stats and press ENTER. Arrow down and enter 16.43 for μ 0 (null hypothesis), .8 for σ, 16 for the sample mean, and 15 for n. Arrow down to μ: (alternate hypothesis) and arrow over to < μ 0. Press ENTER. Arrow down to Calculate and press ENTER. The calculator not only calculates the p-value (p = 0.0187) but it also calculates the test statistic (z-score) for the sample mean. μ < 16.43 is the alternative hypothesis. Do this set of instructions again except arrow to Draw (instead of Calculate). Press ENTER. A shaded graph appears with z = -2.08 (test statistic) and p = 0.0187 (p-value). Make sure when you use Draw that no other equations are highlighted in Y = and the plots are turned off.

When the calculator does a Z-Test, the Z-Test function finds the p-value by doing a normal probability calculation using the central limit theorem:

$P(\bar{x} < 16)$ = 2nd DISTR normcdf(−10^99, 16, 16.43, 0.8/√15).
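For readers without a TI calculator, the same normal probability can be sketched in Python with the standard library's `statistics.NormalDist`. The variable names are illustrative; the numbers are the ones given in Example 9.14.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 16.43        # mean under H 0
sigma = 0.8         # known population standard deviation
n = 15              # number of timed swims
sample_mean = 16.0  # observed mean time
alpha = 0.05        # preset significance level

# Standard deviation of the sample mean: sigma / sqrt(n).
standard_error = sigma / sqrt(n)

# Left-tailed p-value: area to the left of the sample mean under H 0.
sampling_dist = NormalDist(mu=mu_0, sigma=standard_error)
p_value = sampling_dist.cdf(sample_mean)

# Test statistic (z-score) for the sample mean.
z = (sample_mean - mu_0) / standard_error

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```

Here `sampling_dist.cdf(sample_mean)` plays the role of the calculator's normcdf call with a lower bound of −10^99.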

The Type I and Type II errors for this problem are as follows:

The Type I error is to conclude that Jeffrey swims the 25-yard freestyle, on average, in less than 16.43 seconds when, in fact, he actually swims the 25-yard freestyle, on average, in 16.43 seconds. (Reject the null hypothesis when the null hypothesis is true.)

The Type II error is that there is not evidence to conclude that Jeffrey swims the 25-yard freestyle, on average, in less than 16.43 seconds when, in fact, he actually does swim the 25-yard freestyle, on average, in less than 16.43 seconds. (Do not reject the null hypothesis when the null hypothesis is false.)
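One way to build intuition for the Type I error is a small simulation. The sketch below assumes the null hypothesis is true (swim times are normal with mean 16.43 and standard deviation 0.8), repeats the 15-swim experiment many times, and counts how often the left-tailed z-test rejects at α = 0.05. The function name, trial count, and random seed are arbitrary choices made for this illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def type_one_error_rate(trials: int = 5000, seed: int = 0) -> float:
    """Estimate how often a true H 0 is rejected by the left-tailed z-test."""
    rng = random.Random(seed)
    mu_0, sigma, n, alpha = 16.43, 0.8, 15, 0.05
    sampling_dist = NormalDist(mu=mu_0, sigma=sigma / sqrt(n))
    rejections = 0
    for _ in range(trials):
        # Draw 15 swim times from the distribution that H 0 describes.
        times = [rng.gauss(mu_0, sigma) for _ in range(n)]
        p_value = sampling_dist.cdf(fmean(times))
        if p_value < alpha:
            rejections += 1  # rejecting here is a Type I error
    return rejections / trials


print(type_one_error_rate())
```

Because H 0 is true in every simulated experiment, the printed proportion should land close to α, which is the probability of a Type I error that the test was designed to allow.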

Try It 9.14

The mean throwing distance of a football for Marco, a high school freshman quarterback, is 40 yards, with a standard deviation of two yards. The team coach tells Marco to adjust his grip to get more distance. The coach records the distances for 20 throws. For the 20 throws, Marco’s mean distance was 45 yards. The coach thought the different grip helped Marco throw farther than 40 yards. Conduct a hypothesis test using a preset α = 0.05. Assume the throw distances for footballs are normal.

First, determine what type of test this is, set up the hypothesis test, find the p -value, sketch the graph, and state your conclusion.

Press STAT and arrow over to TESTS. Press 1:Z-Test. Arrow over to Stats and press ENTER. Arrow down and enter 40 for μ 0 (null hypothesis), 2 for σ , 45 for the sample mean, and 20 for n . Arrow down to μ : (alternative hypothesis) and set it either as <, ≠, or >. Press ENTER. Arrow down to Calculate and press ENTER. The calculator not only calculates the p -value but it also calculates the test statistic ( z -score) for the sample mean. Select <, ≠, or > for the alternative hypothesis. Do this set of instructions again except arrow to Draw (instead of Calculate). Press ENTER. A shaded graph appears with test statistic and p -value. Make sure when you use Draw that no other equations are highlighted in Y = and the plots are turned off.

Historical Note (Example 9.14)

The traditional way to compare the two probabilities, α and the p-value, is to compare the critical value (z-score from α) to the test statistic (z-score from data). The calculated test statistic for the p-value is -2.08. (From the Central Limit Theorem, the test statistic formula is $z = \frac{\bar{x} - \mu_X}{\sigma_X / \sqrt{n}}$. For this problem, $\bar{x}$ = 16, $\mu_X$ = 16.43 from the null hypothesis, $\sigma_X$ = 0.8, and n = 15.) You can find the critical value for α = 0.05 in the normal table (see 15.Tables in the Table of Contents). The z-score for an area to the left equal to 0.05 is midway between -1.65 and -1.64 (0.05 is midway between 0.0505 and 0.0495). The z-score is -1.645. Since -1.645 > -2.08 (which demonstrates that α > p-value), reject H 0. Traditionally, the decision to reject or not reject was done in this way. Today, comparing the two probabilities α and the p-value is very common. For this problem, the p-value, 0.0187, is considerably smaller than α, 0.05. You can be confident about your decision to reject. The graph shows α, the p-value, and the test statistic and the critical value.
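The critical-value comparison described in this note can be sketched in Python as well. The example uses `NormalDist.inv_cdf` in place of the printed normal table; the variable names are illustrative and the numbers come from Example 9.14.

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.05

# Critical value: the z-score with area alpha to its left (left-tailed test).
critical_z = NormalDist().inv_cdf(alpha)

# Test statistic: z = (x-bar - mu) / (sigma / sqrt(n)).
test_z = (16.0 - 16.43) / (0.8 / sqrt(15))

# Reject H 0 when the test statistic falls to the left of the critical value.
reject = test_z < critical_z

print(f"critical z = {critical_z:.3f}, test z = {test_z:.2f}, reject H0: {reject}")
```

Comparing `test_z` with `critical_z` always gives the same decision as comparing the p-value with α, because both comparisons describe the same tail area.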

Example 9.15

A college football coach records the mean weight that his players can bench press as 275 pounds , with a standard deviation of 55 pounds . Three of his players thought that the mean weight was more than that amount. They asked 30 of their teammates for their estimated maximum lift on the bench press exercise. The data ranged from 205 pounds to 385 pounds. The actual different weights were (frequencies are in parentheses) 205(3) ; 215(3) ; 225(1) ; 241(2) ; 252(2) ; 265(2) ; 275(2) ; 313(2) ; 316(5) ; 338(2) ; 341(1) ; 345(2) ; 368(2) ; 385(1) .

Conduct a hypothesis test using a 2.5% level of significance to determine if the bench press mean is more than 275 pounds .

Since the problem is about a mean weight, this is a test of a single population mean .

H 0 : μ = 275 H a : μ > 275 This is a right-tailed test.

Calculating the distribution needed:

Random variable: $\bar{X}$ = the mean weight, in pounds, lifted by the football players.

Distribution for the test: It is normal because σ is known.

$\bar{X} \sim N\left(275, \frac{55}{\sqrt{30}}\right)$

$\bar{x} = 286.2$ pounds (from the data).

σ = 55 pounds (Always use σ if you know it.) We assume μ = 275 pounds unless our data shows us otherwise.

Calculate the p -value using the normal distribution for a mean and using the sample mean as input (see Appendix G Notes for the TI-83, 83+, 84, 84+ Calculators for using the data as input):

p -value = $P(\bar{x} > 286.2) = 0.1323$.

Interpretation of the p -value: If H 0 is true, then there is a 0.1323 probability (13.23%) that the football players can lift a mean weight of 286.2 pounds or more. Because a 13.23% chance is large enough, a mean weight lift of 286.2 pounds or more is not a rare event.

α = 0.025 p -value = 0.1323

Make a decision: Since α < p -value, do not reject H 0 .

Conclusion: At the 2.5% level of significance, from the sample data, there is not sufficient evidence to conclude that the true mean weight lifted is more than 275 pounds.

Put the data and frequencies into lists. Press STAT and arrow over to TESTS . Press 1:Z-Test . Arrow over to Data and press ENTER . Arrow down and enter 275 for μ 0 , 55 for σ , the name of the list where you put the data, and the name of the list where you put the frequencies. Arrow down to μ: and arrow over to > μ 0 . Press ENTER . Arrow down to Calculate and press ENTER . The calculator not only calculates the p -value ( p = 0.1331, a little different from the previous calculation - in it we used the sample mean rounded to one decimal place instead of the data) but it also calculates the test statistic ( z -score) for the sample mean, the sample mean, and the sample standard deviation. μ > 275 is the alternative hypothesis. Do this set of instructions again except arrow to Draw (instead of Calculate ). Press ENTER . A shaded graph appears with z = 1.112 (test statistic) and p = 0.1331 ( p -value). Make sure when you use Draw that no other equations are highlighted in Y = and the plots are turned off.
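For readers without a TI calculator, the same Z-Test can be sketched in Python. This is an illustration only, assuming Python 3.8 or later; the helper `right_tail_p` is a hypothetical name introduced here. It runs the test twice, once with the unrounded sample mean and once with the mean rounded to one decimal place, to show where the small difference between the two p -values comes from.

```python
from math import sqrt
from statistics import NormalDist

# Weights (pounds) and their frequencies, as listed in Example 9.15.
weights = [205, 215, 225, 241, 252, 265, 275, 313, 316, 338, 341, 345, 368, 385]
freqs = [3, 3, 1, 2, 2, 2, 2, 2, 5, 2, 1, 2, 2, 1]

mu_0, sigma, alpha = 275, 55, 0.025

n = sum(freqs)
x_bar = sum(w * f for w, f in zip(weights, freqs)) / n  # frequency-weighted mean

def right_tail_p(sample_mean):
    """Return (z, p-value) for a right-tailed z-test of H0: mu = mu_0."""
    z = (sample_mean - mu_0) / (sigma / sqrt(n))
    return z, 1 - NormalDist().cdf(z)

z_data, p_data = right_tail_p(x_bar)                  # unrounded mean from the data
z_rounded, p_rounded = right_tail_p(round(x_bar, 1))  # mean rounded to 286.2

print(f"n = {n}, sample mean = {x_bar:.4f}")
print(f"from data:    z = {z_data:.3f}, p = {p_data:.4f}")
print(f"rounded mean: z = {z_rounded:.3f}, p = {p_rounded:.4f}")
print("reject H0" if p_data < alpha else "do not reject H0")
```

Either way the p -value is far above α = 0.025, so the rounding does not change the decision.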

Example 9.16

Statistics students believe that the mean score on the first statistics test is 65. A statistics instructor thinks the mean score is higher than 65. He samples ten statistics students and obtains the scores 65 ; 65 ; 70 ; 67 ; 66 ; 63 ; 63 ; 68 ; 72 ; 71 . He performs a hypothesis test using a 5% level of significance. The data are assumed to be from a normal distribution.

Set up the hypothesis test:

A 5% level of significance means that α = 0.05. This is a test of a single population mean .

H 0 : μ = 65   H a : μ > 65

Since the instructor thinks the average score is higher, use a ">". The ">" means the test is right-tailed.

Random variable: $\bar{X}$ = average score on the first statistics test.

Distribution for the test: If you read the problem carefully, you will notice that there is no population standard deviation given . You are only given n = 10 sample data values. Notice also that the data come from a normal distribution. This means that the distribution for the test is a Student's t .

Use t df . Therefore, the distribution for the test is t 9 where n = 10 and df = 10 - 1 = 9.

Calculate the p -value using the Student's t -distribution:

p -value = $P(\bar{x} > 67) = 0.0396$ where the sample mean and sample standard deviation are calculated as 67 and 3.1972 from the data.

Interpretation of the p -value: If the null hypothesis is true, then there is a 0.0396 probability (3.96%) that the sample mean is 67 or more.

Compare α and the p -value: Since α = 0.05 and p -value = 0.0396, α > p -value.

This means you reject μ = 65. In other words, you believe the average test score is greater than 65.

Conclusion: At a 5% level of significance, the sample data show sufficient evidence that the mean (average) test score is greater than 65, just as the statistics instructor thinks.

Put the data into a list. Press STAT and arrow over to TESTS . Press 2:T-Test . Arrow over to Data and press ENTER . Arrow down and enter 65 for μ 0 , the name of the list where you put the data, and 1 for Freq: . Arrow down to μ : and arrow over to > μ 0 . Press ENTER . Arrow down to Calculate and press ENTER . The calculator not only calculates the p -value ( p = 0.0396) but it also calculates the test statistic ( t -score) for the sample mean, the sample mean, and the sample standard deviation. μ > 65 is the alternative hypothesis. Do this set of instructions again except arrow to Draw (instead of Calculate ). Press ENTER . A shaded graph appears with t = 1.9781 (test statistic) and p = 0.0396 ( p -value). Make sure when you use Draw that no other equations are highlighted in Y = and the plots are turned off.
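The T-Test above can also be reproduced without a calculator. The sketch below is one possible approach, not the calculator's own algorithm: Python's standard library has no Student's t cumulative distribution function, so the right-tail area is approximated by integrating the t density with Simpson's rule. The helper names `t_pdf` and `t_right_tail` are hypothetical.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev

scores = [65, 65, 70, 67, 66, 63, 63, 68, 72, 71]  # data from Example 9.16
mu_0, alpha = 65, 0.05

n = len(scores)
df = n - 1
x_bar = mean(scores)
s = stdev(scores)  # sample standard deviation (divides by n - 1)
t_stat = (x_bar - mu_0) / (s / sqrt(n))

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_right_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t].

    steps must be even. By symmetry the area to the right of 0 is 0.5,
    so the tail is 0.5 minus the area between 0 and t.
    """
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

p_value = t_right_tail(t_stat, df)
reject = p_value < alpha

print(f"mean = {x_bar}, s = {s:.4f}, t = {t_stat:.4f}, p = {p_value:.4f}")
print("reject H0" if reject else "do not reject H0")
```

In practice a statistics library would normally supply the t distribution directly; the numerical integration is used here only to keep the example free of third-party dependencies.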

Try It 9.16

It is believed that a stock price for a particular company will grow at a rate of $5 per week with a standard deviation of $1. An investor believes the stock won’t grow as quickly. The changes in stock price are recorded for ten weeks and are as follows: $4, $3, $2, $3, $1, $7, $2, $1, $1, $2. Perform a hypothesis test using a 5% level of significance. State the null and alternative hypotheses, find the p -value, state your conclusion, and identify the Type I and Type II errors.

Example 9.17

Joon believes that 50% of first-time brides in the United States are younger than their grooms. She performs a hypothesis test to determine if the percentage is the same or different from 50% . Joon samples 100 first-time brides and 53 reply that they are younger than their grooms. For the hypothesis test, she uses a 1% level of significance.

The 1% level of significance means that α = 0.01. This is a test of a single population proportion .

H 0 : p = 0.50   H a : p ≠ 0.50

The words "is the same or different from" tell you this is a two-tailed test.

Calculate the distribution needed:

Random variable: P′ = the percent of first-time brides who are younger than their grooms.

Distribution for the test: The problem contains no mention of a mean. The information is given in terms of percentages. Use the distribution for P′ , the estimated proportion.

$P' \sim N\left(p, \sqrt{\frac{p \cdot q}{n}}\right)$ Therefore, $P' \sim N\left(0.5, \sqrt{\frac{0.5 \cdot 0.5}{100}}\right)$

where p = 0.50, q = 1− p = 0.50, and n = 100

Calculate the p -value using the normal distribution for proportions:

p -value = P ( p′ < 0.47 or p′ > 0.53) = 0.5485

where x = 53, p′ = $\frac{x}{n} = \frac{53}{100}$ = 0.53.

Interpretation of the p -value: If the null hypothesis is true, there is 0.5485 probability (54.85%) that the sample (estimated) proportion p′ is 0.53 or more OR 0.47 or less (see the graph in Figure 9.10 ).

μ = p = 0.50 comes from H 0 , the null hypothesis.

p′ = 0.53. Since the curve is symmetrical and the test is two-tailed, the p′ for the left tail is equal to 0.50 – 0.03 = 0.47 where μ = p = 0.50. (0.03 is the difference between 0.53 and 0.50.)

Compare α and the p -value: Since α = 0.01 and p -value = 0.5485, α < p -value.

Make a decision: Since α < p -value, you cannot reject H 0 .

Conclusion: At the 1% level of significance, the sample data do not show sufficient evidence that the percentage of first-time brides who are younger than their grooms is different from 50%.

Press STAT and arrow over to TESTS . Press 5:1-PropZTest . Enter .5 for p 0 , 53 for x and 100 for n . Arrow down to Prop and arrow to not equals p 0 . Press ENTER . Arrow down to Calculate and press ENTER . The calculator calculates the p -value ( p = 0.5485) and the test statistic ( z -score). Prop not equals .5 is the alternate hypothesis. Do this set of instructions again except arrow to Draw (instead of Calculate ). Press ENTER . A shaded graph appears with z = 0.6 (test statistic) and p = 0.5485 ( p -value). Make sure when you use Draw that no other equations are highlighted in Y = and the plots are turned off.
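The 1-PropZTest calculation can be sketched in a few lines of Python. This is a minimal illustration, assuming Python 3.8 or later; the variable names are chosen here and are not part of the original example.

```python
from math import sqrt
from statistics import NormalDist

x, n = 53, 100         # successes and sample size from Example 9.17
p_0, alpha = 0.50, 0.01

p_prime = x / n
std_error = sqrt(p_0 * (1 - p_0) / n)  # uses p_0 because H0 is assumed true
z = (p_prime - p_0) / std_error

# Two-tailed test: double the area beyond |z| in one tail.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject = p_value < alpha

print(f"p' = {p_prime}, z = {z:.1f}, p-value = {p_value:.4f}")
print("reject H0" if reject else "do not reject H0")
```

Doubling the one-tail area is valid because the normal curve is symmetric, which is the same reasoning used above to obtain the left-tail cutoff of 0.47.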

The Type I and Type II errors are as follows:

The Type I error is to conclude that the proportion of first-time brides who are younger than their grooms is different from 50% when, in fact, the proportion is actually 50%. (Reject the null hypothesis when the null hypothesis is true).

The Type II error is to conclude that there is not enough evidence that the proportion of first-time brides who are younger than their grooms differs from 50% when, in fact, the proportion does differ from 50%. (Do not reject the null hypothesis when the null hypothesis is false.)

Try It 9.17

A teacher believes that 85% of students in the class will want to go on a field trip to the local zoo. She performs a hypothesis test to determine if the percentage is the same or different from 85%. The teacher samples 50 students and 39 reply that they would want to go to the zoo. For the hypothesis test, use a 1% level of significance.

Example 9.18

Suppose a consumer group suspects that the proportion of households that have three cell phones is 30%. A cell phone company has reason to believe that the proportion is not 30%. Before they start a big advertising campaign, they conduct a hypothesis test. Their marketing people survey 150 households with the result that 43 of the households have three cell phones.

a. The value that helps determine the p -value is p′ . Calculate p′ .

b. What is a success for this problem?

c. What is the level of significance?

d. Draw the graph for this problem. Draw the horizontal axis. Label and shade appropriately. Calculate the p -value.

e. Make a decision. _____________(Reject/Do not reject) H 0 because____________.

H 0 : p = 0.30 H a : p ≠ 0.30

The random variable is P′ = proportion of households that have three cell phones.

The distribution for the hypothesis test is $P' \sim N\left(0.30, \sqrt{\frac{(0.30) \cdot (0.70)}{150}}\right)$

a. p′ = $\frac{x}{n}$ where x is the number of successes and n is the total number in the sample.

x = 43, n = 150

p′ = $\frac{43}{150}$

b. A success is having three cell phones in a household.

c. The level of significance is the preset α . Since α is not given, assume that α = 0.05.

d. p -value = 0.7216

e. Assuming that α = 0.05, α < p -value. The decision is do not reject H 0 because there is not sufficient evidence to conclude that the proportion of households that have three cell phones is not 30%.

Try It 9.18

Marketers believe that 92% of adults in the United States own a cell phone. A cell phone manufacturer believes that number is actually lower. 200 American adults are surveyed, of which, 174 report having cell phones. Use a 5% level of significance. State the null and alternative hypothesis, find the p -value, state your conclusion, and identify the Type I and Type II errors.

The next example is a poem written by a statistics student named Nicole Hart. The solution to the problem follows the poem. Notice that the hypothesis test is for a single population proportion. This means that the null and alternate hypotheses use the parameter p . The distribution for the test is normal. The estimated proportion p ′ is the proportion of fleas killed to the total fleas found on Fido. This is sample information. The problem gives a preconceived α = 0.01, for comparison, and a 95% confidence interval computation. The poem is clever and humorous, so please enjoy it!

Example 9.19

My dog has so many fleas,
They do not come off with ease.
As for shampoo, I have tried many types
Even one called Bubble Hype,
Which only killed 25% of the fleas,
Unfortunately I was not pleased.
I've used all kinds of soap,
Until I had given up hope
Until one day I saw
An ad that put me in awe.
A shampoo used for dogs
Called GOOD ENOUGH to Clean a Hog
Guaranteed to kill more fleas.
I gave Fido a bath
And after doing the math
His number of fleas
Started dropping by 3's!
Before his shampoo
I counted 42.
At the end of his bath,
I redid the math
And the new shampoo had killed 17 fleas.
So now I was pleased.
Now it is time for you to have some fun
With the level of significance being .01,
You must help me figure out
Use the new shampoo or go without?

H 0 : p ≤ 0.25    H a : p > 0.25

In words, CLEARLY state what your random variable $\bar{X}$ or P′ represents.

P′ = The proportion of fleas that are killed by the new shampoo

State the distribution to use for the test.

Normal: $N\left(0.25, \sqrt{\frac{(0.25)(1 - 0.25)}{42}}\right)$

Test Statistic: z = 2.3163

p -value = 0.0103

In one to two complete sentences, explain what the p -value means for this problem.

If the null hypothesis is true (the proportion is 0.25), then there is a 0.0103 probability that the sample (estimated) proportion is 0.4048 $\left(\frac{17}{42}\right)$ or more.

Use the previous information to sketch a picture of this situation. CLEARLY, label and scale the horizontal axis and shade the region(s) corresponding to the p -value.

Indicate the correct decision (“reject” or “do not reject” the null hypothesis), the reason for it, and write an appropriate conclusion, using complete sentences.

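Decision: Do not reject the null hypothesis, because the p -value (0.0103) is greater than α (0.01).

Conclusion: At the 1% level of significance, the sample data do not show sufficient evidence that the percentage of fleas that are killed by the new shampoo is more than 25%.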

Construct a 95% confidence interval for the true mean or proportion. Include a sketch of the graph of the situation. Label the point estimate and the lower and upper bounds of the confidence interval.

Confidence Interval: (0.26,0.55) We are 95% confident that the true population proportion p of fleas that are killed by the new shampoo is between 26% and 55%.

This test result is not very definitive since the p -value is very close to alpha. In reality, one would probably do more tests by giving the dog another bath after the fleas have had a chance to return.
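The numbers in this solution can be reproduced with a short script. The sketch below assumes Python 3.8 or later. It also assumes that the confidence interval is the usual large-sample interval p′ ± z*·√(p′q′/n); the solution above reports only the endpoints, so that formula is an assumption made here, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

killed, total = 17, 42   # fleas killed and fleas counted, from the poem
p_0, alpha = 0.25, 0.01

norm = NormalDist()
p_prime = killed / total

# Hypothesis test: the standard error uses p_0, the value assumed under H0.
z = (p_prime - p_0) / sqrt(p_0 * (1 - p_0) / total)
p_value = 1 - norm.cdf(z)  # right-tailed, since Ha: p > 0.25
reject = p_value < alpha

# 95% confidence interval (assumed formula): the standard error
# uses the sample proportion instead of p_0.
z_star = norm.inv_cdf(0.975)
margin = z_star * sqrt(p_prime * (1 - p_prime) / total)
ci = (p_prime - margin, p_prime + margin)

print(f"z = {z:.4f}, p-value = {p_value:.4f}")
print("reject H0" if reject else "do not reject H0")
print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The test and the interval use different standard errors and different confidence levels (99% one-sided versus 95% two-sided), which is why the interval can exclude 0.25 even though the test at α = 0.01 does not reject.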

Example 9.20

The National Institute of Standards and Technology provides exact data on conductivity properties of materials. Following are conductivity measurements for 11 randomly selected pieces of a particular type of glass.

1.11; 1.07; 1.11; 1.07; 1.12; 1.08; .98; .98; 1.02; .95; .95

Is there convincing evidence that the average conductivity of this type of glass is greater than one? Use a significance level of 0.05. Assume the population is normal.

Let’s follow a four-step process to answer this statistical question.

  • H 0 : μ ≤ 1
  • H a : μ > 1
  • Plan : We are testing a sample mean without a known population standard deviation. Therefore, we need to use a Student's t-distribution. Assume the underlying population is normal.
  • State the Conclusions : Since the p -value ( p = 0.036) is less than our alpha value, we will reject the null hypothesis. It is reasonable to state that the data supports the claim that the average conductivity level is greater than one.

Example 9.21

In a study of 420,019 cell phone users, 172 of the subjects developed brain cancer. Test the claim that cell phone users developed brain cancer at a greater rate than that for non-cell phone users (the rate of brain cancer for non-cell phone users is 0.0340%). Since this is a critical issue, use a 0.005 significance level. Explain why the significance level should be so low in terms of a Type I error.

We will follow the four-step process.

  • H 0 : p ≤ 0.00034
  • H a : p > 0.00034

If we commit a Type I error, we are essentially accepting a false claim. Since the claim describes cancer-causing environments, we want to minimize the chances of incorrectly identifying causes of cancer.

  • We will be testing a sample proportion with x = 172 and n = 420,019. The sample is sufficiently large because we have np = 420,019(0.00034) = 142.8, nq = 420,019(0.99966) = 419,876.2, two independent outcomes, and a fixed probability of success p = 0.00034. Thus we will be able to generalize our results to the population.
  • Since the p -value = 0.0073 is greater than our alpha value = 0.005, we cannot reject the null. Therefore, we conclude that there is not enough evidence to support the claim of higher brain cancer rates for the cell phone users.
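The sample-size check and the p -value in this example can be verified as follows. This is a sketch assuming Python 3.8 or later; the threshold of 5 for the expected counts is a common rule of thumb assumed here, since the example itself only reports the two counts.

```python
from math import sqrt
from statistics import NormalDist

x, n = 172, 420_019
p_0, alpha = 0.00034, 0.005

# Large-sample check: expected successes and failures under H0.
expected_successes = n * p_0
expected_failures = n * (1 - p_0)
large_enough = expected_successes > 5 and expected_failures > 5  # assumed threshold

p_prime = x / n
z = (p_prime - p_0) / sqrt(p_0 * (1 - p_0) / n)
p_value = 1 - NormalDist().cdf(z)  # right-tailed, since Ha: p > p_0
reject = p_value < alpha

print(f"np = {expected_successes:.1f}, nq = {expected_failures:.1f}")
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("reject H0" if reject else "do not reject H0")
```

Note that the same data would lead to rejection at the more common α = 0.05, which illustrates how the choice of a strict significance level guards against a Type I error.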

Example 9.22

According to the US Census there are approximately 268,608,618 residents aged 12 and older. Statistics from the Rape, Abuse, and Incest National Network indicate that, on average, 207,754 rapes occur each year (male and female) for persons aged 12 and older. This translates into a percentage of sexual assaults of 0.078%. In Daviess County, KY, there were reported 11 rapes for a population of 37,937. Conduct an appropriate hypothesis test to determine if there is a statistically significant difference between the local sexual assault percentage and the national sexual assault percentage. Use a significance level of 0.01.

We will follow the four-step plan.

  • We need to test whether the proportion of sexual assaults in Daviess County, KY is significantly different from the national average.
  • H 0 : p = 0.00078
  • H a : p ≠ 0.00078
  • Since the p -value, p = 0.00063, is less than the alpha level of 0.01, the sample data indicates that we should reject the null hypothesis. In conclusion, the sample data support the claim that the proportion of sexual assaults in Daviess County, Kentucky is different from the national average proportion.


This book may not be used in the training of large language models or otherwise be ingested into large language models or generative AI offerings without OpenStax's permission.

Want to cite, share, or modify this book? This book uses the Creative Commons Attribution License and you must attribute OpenStax.

Access for free at https://openstax.org/books/introductory-statistics/pages/1-introduction
  • Authors: Barbara Illowsky, Susan Dean
  • Publisher/website: OpenStax
  • Book title: Introductory Statistics
  • Publication date: Sep 19, 2013
  • Location: Houston, Texas
  • Book URL: https://openstax.org/books/introductory-statistics/pages/1-introduction
  • Section URL: https://openstax.org/books/introductory-statistics/pages/9-5-additional-information-and-full-hypothesis-test-examples

© Jun 23, 2022 OpenStax. Textbook content produced by OpenStax is licensed under a Creative Commons Attribution License . The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.




Understanding the Role of Hypotheses and Conclusions in Mathematical Reasoning

Hypothesis and conclusion.

In the context of mathematics and logic, a hypothesis is a statement or proposition that is assumed to be true for the purpose of a logical argument or investigation. It is usually denoted by “H” or “P” and is the starting point for many mathematical proofs.

For example, let’s consider the hypothesis: “If it is raining outside, then the ground is wet.” This statement assumes that whenever it rains, the ground will be wet.

The conclusion, on the other hand, is the statement or proposition that is inferred or reached by logical reasoning, based on the hypothesis or given information. It is typically denoted by “C” or “Q”.

Using the same example, the conclusion derived from the hypothesis could be: “It is currently raining outside, so the ground is wet.” This conclusion is based on the assumption that the given condition of rain implies a wet ground.

In mathematics, hypotheses and conclusions are commonly used in proofs and logical arguments. By stating a hypothesis and then deducing a conclusion from it, mathematicians can demonstrate the validity of certain mathematical concepts, theorems, or formulas.

It’s important to note that in mathematics, a hypothesis is not the same as a guess or a prediction. It is a statement that is assumed to be true and serves as the basis for logical reasoning, while the conclusion is the logical consequence or outcome that is drawn from the hypothesis.


Module 9: Hypothesis Testing With One Sample

9.5: Additional Information and Full Hypothesis Test Examples

Learning Outcomes

  • Conduct and interpret hypothesis tests for a single population proportion.
  • In a hypothesis test problem, you may see words such as “the level of significance is 1%.” The “1%” is the preconceived or preset α .
  • The statistician setting up the hypothesis test selects the value of α to use before collecting the sample data.
  • If no level of significance is given, a common standard to use is α = 0.05.
  • When you calculate the p -value and draw the picture, the p -value is the area in the left tail, the right tail, or split evenly between the two tails. For this reason, we call the hypothesis test left, right, or two tailed.
  • The alternative hypothesis , H a , tells you if the test is left, right, or two-tailed. It is the key to conducting the appropriate test.
  • H a never has a symbol that contains an equal sign.
  • Thinking about the meaning of the p -value : A data analyst (and anyone else) should have more confidence that he made the correct decision to reject the null hypothesis with a smaller p -value (for example, 0.001 as opposed to 0.04) even if using the 0.05 level for alpha. Similarly, for a large p -value such as 0.4, as opposed to a p -value of 0.056 (alpha = 0.05 is less than either number), a data analyst should have more confidence that she made the correct decision in not rejecting the null hypothesis. This makes the data analyst use judgment rather than mindlessly applying rules.
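The link between the direction of H a and the tail area can be written as a small function. This is a minimal sketch for a test statistic that follows the standard normal distribution, assuming Python 3.8 or later; the function name `p_value_from_z` and the tail labels are hypothetical choices made for this illustration.

```python
from statistics import NormalDist

def p_value_from_z(z, tail):
    """Return the p-value for a z test statistic.

    tail is "left", "right", or "two", matching the symbol in Ha
    (<, >, or ≠ respectively).
    """
    cdf = NormalDist().cdf
    if tail == "left":
        return cdf(z)                 # area to the left of z
    if tail == "right":
        return 1 - cdf(z)             # area to the right of z
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))  # split evenly between the two tails
    raise ValueError(f"unknown tail: {tail!r}")

# The same test statistic gives different p-values depending on Ha.
for tail in ("left", "right", "two"):
    print(tail, round(p_value_from_z(-2.08, tail), 4))
```

For the same z , the left- and right-tailed p -values always add to 1, so choosing the wrong direction for H a can turn strong evidence into apparently none.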

The following examples illustrate a left-, right-, and two-tailed test.

H 0 : μ = 5, H a : μ < 5

Test of a single population mean. H a tells you the test is left-tailed. The picture of the p -value is as follows:

Normal distribution curve of a single population mean with a value of 5 on the x-axis and the p-value points to the area on the left tail of the curve.

H 0 : μ = 10, H a : μ < 10

Assume the p -value is 0.0935. What type of test is this? Draw the picture of the p -value.

left-tailed test


H 0 : p ≤ 0.2   H a : p > 0.2

This is a test of a single population proportion. H a tells you the test is right-tailed . The picture of the p -value is as follows:

Normal distribution curve of a single population proportion with the value of 0.2 on the x-axis. The p-value points to the area on the right tail of the curve.

H 0 : μ ≤ 1, H a : μ > 1

Assume the p -value is 0.1243. What type of test is this? Draw the picture of the p -value.

right-tailed test


H 0 : μ = 50, H a : μ ≠ 50

This is a test of a single population mean. H a tells you the test is two-tailed . The picture of the p -value is as follows.

Normal distribution curve of a single population mean with a value of 50 on the x-axis. The p-value formulas, 1/2(p-value), for a two-tailed test is shown for the areas on the left and right tails of the curve.

H 0 : p = 0.5, H a : p ≠ 0.5

Assume the p -value is 0.2564. What type of test is this? Draw the picture of the p -value.

two-tailed test


Full Hypothesis Test Examples

Jeffrey, as an eight-year old, established a mean time of 16.43 seconds for swimming the 25-yard freestyle, with a standard deviation of 0.8 seconds . His dad, Frank, thought that Jeffrey could swim the 25-yard freestyle faster using goggles. Frank bought Jeffrey a new pair of expensive goggles and timed Jeffrey for 15 25-yard freestyle swims . For the 15 swims, Jeffrey’s mean time was 16 seconds. Frank thought that the goggles helped Jeffrey to swim faster than the 16.43 seconds. Conduct a hypothesis test using a preset α = 0.05. Assume that the swim times for the 25-yard freestyle are normal.

Set up the Hypothesis Test:

Since the problem is about a mean, this is a test of a single population mean .

H 0 : μ = 16.43   H a : μ < 16.43

For Jeffrey to swim faster, his time will be less than 16.43 seconds. The “<” tells you this is left-tailed.

Determine the distribution needed:

Random variable: [latex]\overline{X}[/latex]= the mean time to swim the 25-yard freestyle.

Distribution for the test: [latex]\overline{X}[/latex] is normal (population standard deviation is known: σ = 0.8)

μ = 16.43 comes from H 0 and not the data. σ = 0.8, and n = 15.

Calculate the p -value using the normal distribution for a mean:

p -value = P[latex]\left(\overline{x}<{16}\right)[/latex]= 0.0187 where the sample mean in the problem is given as 16.

p -value = 0.0187 (This is called the actual level of significance .) The p -value is the area to the left of the sample mean, which is given as 16.

Normal distribution curve for the average time to swim the 25-yard freestyle with values 16, as the sample mean, and 16.43 on the x-axis. A vertical upward line extends from 16 on the x-axis to the curve. An arrow points to the left tail of the curve.

μ = 16.43 comes from H 0 . Our assumption is μ = 16.43.

Interpretation of the p -value: If H 0 is true , there is a 0.0187 probability (1.87%) that Jeffrey’s mean time to swim the 25-yard freestyle is 16 seconds or less. Because a 1.87% chance is small, the mean time of 16 seconds or less is unlikely to have happened randomly. It is a rare event.

Compare α and the p -value:

α = 0.05 p -value = 0.0187 α > p -value

Make a decision: Since α > p -value, reject H 0 .

This means that you reject μ = 16.43. In other words, you do not think Jeffrey swims the 25-yard freestyle in 16.43 seconds but faster with the new goggles.

Conclusion: At the 5% significance level, we conclude that Jeffrey swims faster using the new goggles. The sample data show there is sufficient evidence that Jeffrey’s mean time to swim the 25-yard freestyle is less than 16.43 seconds.

The p -value can easily be calculated.

Press STAT and arrow over to TESTS . Press 1:Z-Test . Arrow over to Stats and press ENTER . Arrow down and enter 16.43 for μ 0 (null hypothesis), .8 for σ , 16 for the sample mean, and 15 for n . Arrow down to μ : (alternate hypothesis) and arrow over to < μ 0 . Press ENTER . Arrow down to Calculate and press ENTER . The calculator not only calculates the p -value ( p = 0.0187) but it also calculates the test statistic ( z -score) for the sample mean. μ < 16.43 is the alternative hypothesis. Do this set of instructions again except arrow to Draw (instead of Calculate ). Press ENTER . A shaded graph appears with z = -2.08 (test statistic) and p = 0.0187 ( p -value). Make sure when you use Draw that no other equations are highlighted in Y = and the plots are turned off.

When the calculator does a Z -Test, the Z-Test function finds the p -value by doing a normal probability calculation using the central limit theorem: P[latex]\left(\overline{x}<16\right)[/latex] = 2nd DISTR normcdf([latex]-{10}^{99}[/latex], 16, 16.43, [latex]\frac{0.8}{\sqrt{15}}[/latex]).
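The same normal probability calculation can be sketched outside the calculator. The snippet below is a minimal illustration that uses Python's standard-library `statistics.NormalDist`; the variable names are our own and are not part of the original text.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the goggles example above.
mu_0 = 16.43   # mean under the null hypothesis
sigma = 0.8    # known population standard deviation
x_bar = 16.0   # observed sample mean
n = 15         # sample size

# By the central limit theorem the sample mean is normal with
# mean mu_0 and standard deviation sigma / sqrt(n) when H0 is true.
standard_error = sigma / sqrt(n)
sampling_dist = NormalDist(mu=mu_0, sigma=standard_error)

# Left-tailed test: the p-value is the area to the left of x_bar.
p_value = sampling_dist.cdf(x_bar)
z = (x_bar - mu_0) / standard_error

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

The printed values should agree with the calculator output quoted above up to rounding.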

The Type I and Type II errors for this problem are as follows:

The Type I error is to conclude that Jeffrey swims the 25-yard freestyle, on average, in less than 16.43 seconds when, in fact, he actually swims the 25-yard freestyle, on average, in 16.43 seconds. (Reject the null hypothesis when the null hypothesis is true.)

The Type II error is that there is not evidence to conclude that Jeffrey swims the 25-yard free-style, on average, in less than 16.43 seconds when, in fact, he actually does swim the 25-yard free-style, on average, in less than 16.43 seconds. (Do not reject the null hypothesis when the null hypothesis is false.)

The mean throwing distance of a football for Marco, a high school freshman quarterback, is 40 yards, with a standard deviation of two yards. The team coach tells Marco to adjust his grip to get more distance. The coach records the distances for 20 throws. For the 20 throws, Marco’s mean distance was 45 yards. The coach thought the different grip helped Marco throw farther than 40 yards. Conduct a hypothesis test using a preset α = 0.05. Assume the throw distances for footballs are normal.

First, determine what type of test this is, set up the hypothesis test, find the p -value, sketch the graph, and state your conclusion.

Press STAT and arrow over to TESTS. Press 1:Z-Test. Arrow over to Stats and press ENTER. Arrow down and enter 40 for μ 0 (null hypothesis), 2 for σ , 45 for the sample mean, and 20 for n . Arrow down to μ : (alternative hypothesis) and set it to <, ≠, or >, whichever matches the alternative hypothesis. Press ENTER. Arrow down to Calculate and press ENTER. The calculator not only calculates the p -value but it also calculates the test statistic ( z -score) for the sample mean. Do this set of instructions again except arrow to Draw (instead of Calculate). Press ENTER. A shaded graph appears with test statistic and p -value. Make sure when you use Draw that no other equations are highlighted in Y = and the plots are turned off.

Since the problem is about a mean, this is a test of a single population mean.

H 0 : μ = 40

H a : μ > 40


Because p < α , we reject the null hypothesis. There is sufficient evidence to suggest that the change in grip improved Marco’s throwing distance.

The traditional way to compare the two probabilities, α and the p -value, is to compare the critical value ( z -score from α ) to the test statistic ( z -score from data). The calculated test statistic for the p -value is –2.08. (From the Central Limit Theorem, the test statistic formula is z = [latex]\frac{\overline{x}-{\mu}_{X}}{\left(\frac{{\sigma}_{X}}{\sqrt{n}}\right)}[/latex]. For this problem, [latex]\overline{x}[/latex] = 16, [latex]{\mu}_{X}[/latex] = 16.43 from the null hypothesis, [latex]{\sigma}_{X}[/latex] = 0.8, and n = 15.) You can find the critical value for α = 0.05 in the normal table (see 15.Tables in the Table of Contents). The z -score for an area to the left equal to 0.05 is midway between –1.65 and –1.64 (0.05 is midway between 0.0505 and 0.0495). The z -score is –1.645. Since –1.645 > –2.08 (which demonstrates that α > p -value), reject H 0 . Traditionally, the decision to reject or not reject was done in this way. Today, comparing the two probabilities α and the p -value is very common. For this problem, the p -value, 0.0187, is considerably smaller than α , 0.05. You can be confident about your decision to reject. The graph shows α , the p -value, the test statistic, and the critical value.
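The critical-value comparison described above can be sketched in a few lines. This is an illustrative example only; it uses `NormalDist.inv_cdf` from Python's standard library in place of the normal table, and the variable names are assumptions.

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.05
mu_0, sigma, x_bar, n = 16.43, 0.8, 16.0, 15

# Test statistic: z-score of the sample mean under H0.
z_stat = (x_bar - mu_0) / (sigma / sqrt(n))

# Critical value: z-score with area alpha to its left (left-tailed test).
z_crit = NormalDist().inv_cdf(alpha)

# Reject H0 when the test statistic falls to the left of the critical value.
reject_h0 = z_stat < z_crit

print(f"test statistic = {z_stat:.2f}, critical value = {z_crit:.3f}")
print("reject H0" if reject_h0 else "do not reject H0")
```

Comparing z-scores and comparing probabilities always lead to the same decision, because the normal cumulative distribution function is increasing.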

Distribution curve comparing the α to the p-value. Values of -2.15 and -1.645 are on the x-axis. Vertical upward lines extend from both of these values to the curve. The p-value is equal to 0.0158 and points to the area to the left of -2.15. α is equal to 0.05 and points to the area between the values of -2.15 and -1.645.

A college football coach thought that his players could bench press a mean weight of 275 pounds . It is known that the standard deviation is 55 pounds . Three of his players thought that the mean weight was more than that amount. They asked 30 of their teammates for their estimated maximum lift on the bench press exercise. The data ranged from 205 pounds to 385 pounds. The actual different weights were (frequencies are in parentheses) 205(3), 215(3), 225(1), 241(2), 252(2), 265(2), 275(2), 313(2), 316(5), 338(2), 341(1), 345(2), 368(2), 385(1).

Conduct a hypothesis test using a 2.5% level of significance to determine if the bench press mean is more than 275 pounds .

Since the problem is about a mean weight, this is a test of a single population mean .

H 0 : μ = 275

H a : μ > 275

This is a right-tailed test.

Calculating the distribution needed:

Random variable: [latex]\overline{X}[/latex] = the mean weight, in pounds, lifted by the football players.

Distribution for the test: It is normal because σ is known. [latex]\overline{X}\sim N\left(275,\frac{55}{\sqrt{30}}\right)[/latex]

[latex]\overline{x}[/latex] = 286.2

σ =55 pounds (Always use σ if you know it.) We assume μ = 275 pounds unless our data shows us otherwise.

Calculate the p -value using the normal distribution for a mean and using the sample mean as input. p-value=P[latex]\left(\overline{x}>286.2\right)=0.1323[/latex].

Interpretation of the p -value: If H 0 is true, then there is a 0.1323 probability (13.23%) that the football players can lift a mean weight of 286.2 pounds or more. Because a 13.23% chance is large enough, a mean weight lift of 286.2 pounds or more is not a rare event.

Normal distribution curve of the average weight lifted by football players with values of 275 and 286.2 on the x-axis. A vertical upward line extends from 286.2 to the curve. The p-value points to the area to the right of 286.2.

α = 0.025 p -value = 0.1323

Make a decision: Since α < p -value, do not reject H 0 .

Conclusion: At the 2.5% level of significance, from the sample data, there is not sufficient evidence to conclude that the true mean weight lifted is more than 275 pounds.

Put the data and frequencies into lists. Press STAT and arrow over to TESTS . Press 1:Z-Test . Arrow over to Data and press ENTER . Arrow down and enter 275 for μ 0 , 55 for σ , the name of the list where you put the data, and the name of the list where you put the frequencies. Arrow down to μ: and arrow over to > μ 0 . Press ENTER . Arrow down to Calculate and press ENTER . The calculator not only calculates the p -value ( p = 0.1331, a little different from the previous calculation – in it we used the sample mean rounded to one decimal place instead of the data) but it also calculates the test statistic ( z -score) for the sample mean, the sample mean, and the sample standard deviation. μ > 275 is the alternative hypothesis. Do this set of instructions again except arrow to Draw (instead of Calculate ). Press ENTER . A shaded graph appears with z = 1.112 (test statistic) and p = 0.1331 ( p -value). Make sure when you use Draw that no other equations are highlighted in Y = and the plots are turned off.
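The effect of rounding the sample mean, mentioned in the calculator note above, can be checked with a short sketch. The code below is illustrative only: it rebuilds the sample mean from the weights and frequencies given in the problem and uses Python's standard-library `NormalDist`. The helper name `right_tail_p` is our own.

```python
from math import sqrt
from statistics import NormalDist

# Weights and frequencies as listed in the problem statement.
weights = [205, 215, 225, 241, 252, 265, 275, 313, 316, 338, 341, 345, 368, 385]
freqs = [3, 3, 1, 2, 2, 2, 2, 2, 5, 2, 1, 2, 2, 1]

mu_0 = 275    # mean under H0
sigma = 55    # known population standard deviation
n = sum(freqs)
x_bar = sum(w * f for w, f in zip(weights, freqs)) / n

standard_error = sigma / sqrt(n)

def right_tail_p(sample_mean):
    """Area to the right of sample_mean under the H0 sampling distribution."""
    return 1 - NormalDist(mu_0, standard_error).cdf(sample_mean)

p_from_data = right_tail_p(x_bar)               # unrounded mean
p_from_rounded = right_tail_p(round(x_bar, 1))  # mean rounded to 286.2

print(f"n = {n}, sample mean = {x_bar:.4f}")
print(f"p-value from data         = {p_from_data:.4f}")
print(f"p-value from rounded mean = {p_from_rounded:.4f}")
```

Both p-values are well above α = 0.025, so the decision not to reject H 0 does not depend on the rounding.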

Statistics students believe that the mean score on the first statistics test is 65. A statistics instructor thinks the mean score is higher than 65. He samples ten statistics students and obtains the scores 65 65 70 67 66 63 63 68 72 71. He performs a hypothesis test using a 5% level of significance. The data are assumed to be from a normal distribution.

Set up the hypothesis test:

A 5% level of significance means that α = 0.05. This is a test of a single population mean .

H 0 : μ = 65   H a : μ > 65

Since the instructor thinks the average score is higher, use a “>”. The “>” means the test is right-tailed.

Random variable: [latex]\overline{X}[/latex] = average score on the first statistics test.

Distribution for the test: If you read the problem carefully, you will notice that there is no population standard deviation given . You are only given n = 10 sample data values. Notice also that the data come from a normal distribution. This means that the distribution for the test is a Student’s t .

Use t df . Therefore, the distribution for the test is t 9 where n = 10 and df = 10 – 1 = 9.

Calculate the p -value using the Student’s t -distribution:

p -value = P ([latex]\overline{x}[/latex]> 67) = 0.0396 where the sample mean and sample standard deviation are calculated as 67 and 3.1972 from the data.

Interpretation of the p -value: If the null hypothesis is true, then there is a 0.0396 probability (3.96%) that the sample mean is 67 or more.

Normal distribution curve of average scores on the first statistic tests with 65 and 67 values on the x-axis. A vertical upward line extends from 67 to the curve. The p-value points to the area to the right of 67.

Compare α and the p -value: α = 0.05 and p -value = 0.0396, so α > p -value. Since α > p -value, reject H 0 .

This means you reject μ = 65. In other words, you believe the average test score is more than 65.

Conclusion: At a 5% level of significance, the sample data show sufficient evidence that the mean (average) test score is more than 65, just as the math instructor thinks.
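The t -test above can be reproduced without a calculator. The sketch below is one possible approach, not the textbook's method: because Python's standard library has no Student's t distribution, it integrates the t density numerically with Simpson's rule. The function names `t_pdf` and `t_right_tail` are our own.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev

scores = [65, 65, 70, 67, 66, 63, 63, 68, 72, 71]
mu_0 = 65

n = len(scores)
df = n - 1
x_bar = mean(scores)
s = stdev(scores)  # sample standard deviation (divides by n - 1)
t_stat = (x_bar - mu_0) / (s / sqrt(n))

def t_pdf(t, df):
    """Density of the Student's t distribution with df degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + t * t / df) ** (-(df + 1) / 2)

def t_right_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # The density is symmetric, so the area to the right of 0 is 0.5.
    return 0.5 - total * h / 3

p_value = t_right_tail(t_stat, df)

print(f"sample mean = {x_bar}, s = {s:.4f}, t = {t_stat:.4f}")
print(f"p-value = {p_value:.4f}")
```

In practice a statistics package would supply the t distribution directly; the numerical integration here only keeps the example self-contained.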

It is believed that a stock price for a particular company will grow at a rate of $5 per week with a standard deviation of $1. An investor believes the stock won’t grow as quickly. The changes in stock price are recorded for ten weeks and are as follows: $4, $3, $2, $3, $1, $7, $2, $1, $1, $2. Perform a hypothesis test using a 5% level of significance. State the null and alternative hypotheses, find the p -value, state your conclusion, and identify the Type I and Type II errors.

H 0 : μ = 5

H a : μ < 5

Because p < α , we reject the null hypothesis. There is sufficient evidence to suggest that the stock price of the company grows at a rate less than $5 a week.

Type I Error: To conclude that the stock price is growing slower than $5 a week when, in fact, the stock price is growing at $5 a week (reject the null hypothesis when the null hypothesis is true).

Type II Error: To conclude that the stock price is growing at a rate of $5 a week when, in fact, the stock price is growing slower than $5 a week (do not reject the null hypothesis when the null hypothesis is false).
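The text does not state the p -value for the stock example, so the following is only a sketch of how it could be computed. It assumes a left-tailed z -test, on the grounds that the problem gives a standard deviation of $1, which we treat here as the known population standard deviation.

```python
from math import sqrt
from statistics import NormalDist, mean

weekly_changes = [4, 3, 2, 3, 1, 7, 2, 1, 1, 2]  # dollars per week
mu_0 = 5       # growth rate under H0
sigma = 1      # assumed to be the known population standard deviation
alpha = 0.05

n = len(weekly_changes)
x_bar = mean(weekly_changes)
z_stat = (x_bar - mu_0) / (sigma / sqrt(n))

# Left-tailed test: area to the left of the test statistic.
p_value = NormalDist().cdf(z_stat)

print(f"sample mean = {x_bar}, z = {z_stat:.2f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```

Under these assumptions the test statistic is far in the left tail, which is consistent with the conclusion above that p < α .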

The 1% level of significance means that α = 0.01. This is a test of a single population proportion .

H 0 : p = 0.50   H a : p ≠ 0.50

The words “is the same or different from” tell you this is a two-tailed test.

Calculate the distribution needed:

Random variable: P′ = the percent of first-time brides who are younger than their grooms.

Distribution for the test: The problem contains no mention of a mean. The information is given in terms of percentages. Use the distribution for P′ , the estimated proportion.

A teacher believes that 85% of students in the class will want to go on a field trip to the local zoo. She performs a hypothesis test to determine if the percentage is the same or different from 85%. The teacher samples 50 students and 39 reply that they would want to go to the zoo. For the hypothesis test, use a 1% level of significance.

Suppose a consumer group suspects that the proportion of households that have three cell phones is 30%. A cell phone company has reason to believe that the proportion is not 30%. Before they start a big advertising campaign, they conduct a hypothesis test. Their marketing people survey 150 households with the result that 43 of the households have three cell phones.

Marketers believe that 92% of adults in the United States own a cell phone. A cell phone manufacturer believes that number is actually lower. 200 American adults are surveyed, of which, 174 report having cell phones. Use a 5% level of significance. State the null and alternative hypothesis, find the p -value, state your conclusion, and identify the Type I and Type II errors.
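A test of a single population proportion such as the cell phone ownership exercise can be sketched as follows. This is an illustration under the usual normal approximation for a proportion; the variable names are our own, and the standard error uses the hypothesized proportion from H 0 .

```python
from math import sqrt
from statistics import NormalDist

p_0 = 0.92         # proportion under H0
n = 200            # adults surveyed
successes = 174    # adults who report having a cell phone
alpha = 0.05

p_hat = successes / n
standard_error = sqrt(p_0 * (1 - p_0) / n)
z_stat = (p_hat - p_0) / standard_error

# The manufacturer believes the proportion is lower: left-tailed test.
p_value = NormalDist().cdf(z_stat)

print(f"p' = {p_hat:.2f}, z = {z_stat:.3f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```

For the two-tailed exercises above (the field trip and the three-cell-phone households), the same sketch would apply with the tail area doubled.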

The next example is a poem written by a statistics student named Nicole Hart. The solution to the problem follows the poem. Notice that the hypothesis test is for a single population proportion. This means that the null and alternate hypotheses use the parameter p . The distribution for the test is normal. The estimated proportion p ′ is the proportion of fleas killed to the total fleas found on Fido. This is sample information. The problem gives a preconceived α = 0.01, for comparison, and a 95% confidence interval computation. The poem is clever and humorous, so please enjoy it!

My dog has so many fleas,

They do not come off with ease.

As for shampoo, I have tried many types

Even one called Bubble Hype,

Which only killed 25% of the fleas,

Unfortunately I was not pleased.

I’ve used all kinds of soap,

Until I had given up hope

Until one day I saw

An ad that put me in awe.

A shampoo used for dogs

Called GOOD ENOUGH to Clean a Hog

Guaranteed to kill more fleas.

I gave Fido a bath

And after doing the math

His number of fleas

Started dropping by 3’s!

Before his shampoo

I counted 42.

At the end of his bath,

I redid the math

And the new shampoo had killed 17 fleas.

So now I was pleased.

Now it is time for you to have some fun

With the level of significance being .01,

You must help me figure out

Use the new shampoo or go without?

The National Institute of Standards and Technology provides exact data on conductivity properties of materials. Following are conductivity measurements for 11 randomly selected pieces of a particular type of glass.

1.11; 1.07; 1.11; 1.07; 1.12; 1.08; .98; .98; 1.02; .95; .95

Is there convincing evidence that the average conductivity of this type of glass is greater than one? Use a significance level of 0.05. Assume the population is normal.

In a study of 420,019 cell phone users, 172 of the subjects developed brain cancer. Test the claim that cell phone users developed brain cancer at a greater rate than that for non-cell phone users (the rate of brain cancer for non-cell phone users is 0.0340%). Since this is a critical issue, use a 0.005 significance level. Explain why the significance level should be so low in terms of a Type I error.

According to the US Census there are approximately 268,608,618 residents aged 12 and older. Statistics from the Rape, Abuse, and Incest National Network indicate that, on average, 207,754 rapes occur each year (male and female) for persons aged 12 and older. This translates into a percentage of sexual assaults of 0.078%. In Daviess County, KY, there were reported 11 rapes for a population of 37,937. Conduct an appropriate hypothesis test to determine if there is a statistically significant difference between the local sexual assault percentage and the national sexual assault percentage. Use a significance level of 0.01.

Concept Review

The hypothesis test itself has an established process. This can be summarized as follows:

Notice that in performing the hypothesis test, you use α and not β . β is needed to help determine the sample size of the data that is used in calculating the p -value. Remember that the quantity 1 – β is called the Power of the Test . A high power is desirable. If the power is too low, statisticians typically increase the sample size while keeping α the same. If the power is low, the null hypothesis might not be rejected when it should be.
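To make the link between power and sample size concrete, here is a small sketch for a left-tailed z -test. It reuses μ 0 = 16.43 and σ = 0.8 from the swimming example; the true mean of 16.0 is a hypothetical value chosen only for illustration, and the function name is our own.

```python
from math import sqrt
from statistics import NormalDist

def power_left_tailed(mu_0, mu_true, sigma, n, alpha):
    """Power (1 - beta) of a left-tailed z-test when the true mean is mu_true."""
    standard_error = sigma / sqrt(n)
    # H0 is rejected when the sample mean falls below this cutoff.
    cutoff = mu_0 + NormalDist().inv_cdf(alpha) * standard_error
    # Probability of landing below the cutoff if the true mean is mu_true.
    return NormalDist(mu_true, standard_error).cdf(cutoff)

# mu_true = 16.0 is a hypothetical alternative, not a value from the text.
powers = {n: power_left_tailed(16.43, 16.0, 0.8, n, 0.05) for n in (15, 30, 60)}

for n, power in powers.items():
    print(f"n = {n:2d}: power = {power:.3f}")
```

With α held at 0.05, the power rises as n grows, which is why increasing the sample size is the usual remedy for low power.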

  • Introductory Statistics . Authored by : Barbara Illowski, Susan Dean. Provided by : Open Stax. Located at : http://cnx.org/contents/[email protected] . License : CC BY: Attribution . License Terms : Download for free at http://cnx.org/contents/[email protected]

Z-test vs T-test: the differences and when to use each



The EPAM Anywhere Editorial Team is an international collective of senior software engineers, managers and communications professionals who create, review and share their insights on technology, career, remote work, and the daily life here at Anywhere.

Testing is how you determine effectiveness. Whether you work as a data scientist , statistician, or software developer, to ensure quality, you must measure performance. Without tests, you could deploy flawed code, features, or data points.

With that in mind, the use cases of testing are endless. Machine learning models need statistical tests. Data analysis involves statistical tests to validate assumptions. Optimization of any kind requires evaluation. You even need to test the strength of your hypothesis before you begin an inquiry.

Let's explore two inferential statistics: the Z-test vs the T-test. That way you can understand their differences, their unique purposes, and when to use a Z-test vs T-test.

To start, imagine you have a good idea. At the moment of inception, you have no data to back up your idea. It is an unformed thought. But the idea is an excellent starting point that can launch a full investigation. We consider this starting point a hypothesis.

But what if your hypothesis is off-base? You don’t want to dive into a full-scale search if it is a pointless chase with no reward. That is a waste of resources. You need to determine if you have a workable hypothesis.

Enter hypothesis testing. It is a statistical procedure used to assess whether the data support a hypothesis. The method determines whether there is sufficient evidence to support your idea. If the result shows next to no significance, you do not have a very plausible hypothesis.

To confirm the validity of a hypothesis, you compare it against the status quo (also known as the null hypothesis). Your idea is something new, opposite from normal conditions (also known as the alternative hypothesis). It is zero sum: only one hypothesis between the null and alternate hypothesis can be true in a given set of parameters.

In such a comparison test, you can now determine validity. You can compare and contrast conditions to find meaningful conclusions. Whichever conditions become statistically apparent determines which hypothesis is plausible.

A Z-test is a statistical test based on the Z-statistic. In its basic form it works with two quantities: the sample mean and the population mean. It tests whether these two differ.

With a Z-test, you know the population standard deviation. That is what allows an accurate comparison of the sample mean against the population mean, because the test statistic then follows a normal distribution. In addition, a defining characteristic of a Z-test is that it works with large sample sizes (typically more than 30, so that the sample mean is approximately normally distributed, as described by the central limit theorem). These are two crucial criteria for using a Z-test.

Within hypothesis testing, your null hypothesis states there is no difference between the two groups your Z-test will compare. Your alternative hypothesis will state there is a difference that your Z-test will expose.

How to perform a Z-test

A Z-test occurs in the following standard format:

  • Formulate your hypothesis: First, define the parameters of your alternative and null hypothesis.
  • Choose a significance level: Second, determine how much evidence you require before you reject the null hypothesis. This threshold fixes the critical value that the test statistic must exceed. Common levels are 0.01 (1%) or 0.05 (5%) , values commonly used to balance Type I and Type II errors .
  • Collect samples: Obtain the needed data. The sample must be large enough and random.
  • Calculate a Z-score: Input your data into the standard Z-score formula, Z = (x − μ) / σ, where Z = standard score, x = observed value, μ = population mean, and σ = population standard deviation. When x is a sample mean from n observations, divide by the standard error σ / √n instead of σ.
  • Compare: If the test statistic is more extreme than the critical value, you have achieved statistical significance. The sample mean is far enough from the population mean that you can reject the null hypothesis. Your alternative hypothesis (something other than the status quo) is supported, and that's worth investigating.

There are different variations of a Z-test. Let's explore examples of one-sample and two-sample Z-tests.

One-sample Z-test

A one-sample Z-test looks for either an increase or a decrease. There is one sample group involved, taken from a population. We want to see if there is a difference between those two means.

For example, consider a school principal who believes their students' IQ is higher than the state average. The state average is 100 points (population mean), give or take 20 (the population standard deviation). To test this hypothesis, the principal takes 50 students (the sample size) and finds their IQ scores. To their delight, the students earn an average of 110.

But is the difference statistically significant? The principal then plugs the numbers into a Z-test. A Z-score greater than the critical value would indicate statistical significance, which would support the claim that the students have an above-average IQ.
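The principal's calculation can be sketched with the numbers given in the example. The 5% significance level below is an assumption, since the example does not state one, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

population_mean = 100   # state average IQ
population_sd = 20      # population standard deviation
sample_mean = 110       # average IQ of the sampled students
n = 50                  # sample size
alpha = 0.05            # assumed significance level (not given in the text)

# Z-statistic for a sample mean uses the standard error sigma / sqrt(n).
z_stat = (sample_mean - population_mean) / (population_sd / sqrt(n))

# Right-tailed test: the principal claims the mean is higher.
z_crit = NormalDist().inv_cdf(1 - alpha)
p_value = 1 - NormalDist().cdf(z_stat)

print(f"z = {z_stat:.3f}, critical value = {z_crit:.3f}, p-value = {p_value:.5f}")
print("reject H0" if z_stat > z_crit else "do not reject H0")
```

With these inputs the Z-score exceeds the critical value, so the null hypothesis of no difference would be rejected at the assumed level.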

Two-sample Z-test

A two-sample Z-test compares the means of two independent samples, each drawn from its own population. Its purpose is to determine whether the two population means differ.

For example, our principal wants to compare their students' IQ scores to those of the school across the street. They believe their students' average IQ is higher. They don’t need to know the exact numerical increase or decrease. All they want is evidence that their students' average scores are higher than the other group's.

To assess this hypothesis, the principal will search for statistical significance. They can take a 50-student sample from their school and a 50-student sample from the rival school. Now in possession of both sample groups' average IQ (and the standard deviations), they hope to find that the two averages are not equal. And they need them to be unequal by a significant amount.

If the test statistic comes in less than the critical value, the differences are negligible. There is not enough evidence to say the hypothesis is worth exploring, so the null hypothesis is retained. The principal would not have enough evidence that the IQ levels of the two schools are different.
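A two-sample Z-test for this scenario might look like the sketch below. Only the sample sizes of 50 come from the example; the sample means of 110 and 104, the standard deviation of 20 for both schools, and the 5% significance level are hypothetical values chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical inputs: only n = 50 per school comes from the example.
mean_a, sd_a, n_a = 110, 20, 50   # principal's school
mean_b, sd_b, n_b = 104, 20, 50   # school across the street
alpha = 0.05

# Standard error of the difference between two independent sample means.
standard_error = sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
z_stat = (mean_a - mean_b) / standard_error

# Right-tailed test: the principal believes school A scores higher.
z_crit = NormalDist().inv_cdf(1 - alpha)
p_value = 1 - NormalDist().cdf(z_stat)

print(f"z = {z_stat:.2f}, critical value = {z_crit:.3f}, p-value = {p_value:.4f}")
print("reject H0" if z_stat > z_crit else "do not reject H0")
```

With these made-up numbers the test statistic falls short of the critical value, which is the "not enough evidence" outcome described above.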


A T-test performs the same crucial function as a Z-test: determine if there is a difference between the means of two groups. If there is a significant difference, you have achieved statistical validity for your hypothesis.

However, a T-test involves a different set of factors. Most importantly, a T-test applies when you do not know the population variance of your values. You estimate it from the sample instead and use the T-distribution, which generalizes the standard normal distribution. Plus, there is an expectation that you do not possess all the data in a given scenario.

These conditions better match reality, as it is often hard to collect data from entire populations or always obtain a standard normal distribution. That is why T-tests are more widely applicable than Z-tests, though they operate with less precision.

How to perform a T-test

A T-test occurs in the following standard format:

  • Formulate your hypothesis: First, define the parameters of your null and alternative hypothesis.
  • Choose a critical value: Like a Z-test, determine what you consider a viable difference between your two groups.
  • Collect data: Obtain the needed data. One of the key differences is degrees of freedom in the samples of a T-test, so try to define the typical values and range of values in each group.
  • Calculate your T-score: Input your data into the T-test formula you chose. Here is a one-sample formula: t = (x̄ − μ) / (s / √n), where x̄ = sample mean, μ = hypothesized population mean, s = sample standard deviation, and n = sample size.
  • Compare: If the test statistic is greater than the critical value, you have achieved statistical significance. The sample mean is so far from the population mean that you likely have a useful hypothesis.

There are several different kinds of T-tests as well. Let's go through the standard one-sample and two-sample T-tests.

One-sample T-test

A one-sample T-test looks for an increase or decrease compared to a population mean.

For example, your company just went through sales training. Now, the manager wants to know if the training helped improve sales.

Previous company sales data shows an average of $100 on each transaction from all workers. The null hypothesis would be no change. The alternative hypothesis (which you hope is significant), is that there is an improvement.

To test if there is significance, you take the sales average of 20 salesmen. That is the only available data, and you have no other data from nationwide stores. The average of that sample of salesmen in the past month is $130. We will also assume that the standard deviation is approximately equal .

With this set of factors, you can calculate your T-score with a T-test. You compare the sample result to the critical value. In addition, you assess it against the number of degrees of freedom. Since we know with smaller sample data sizes there is greater uncertainty, we allow more room for our data to vary.

After comparing, we may find a lot of significance. That means the data possesses enough strength to support our hypothesis that sales training likely impacted sales. Of course, this is an estimate, as we only assessed one factor with a small group. Sales could have risen for numerous other reasons. But with our set of assumptions, our hypothesis is valid.
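The sales training example could be worked through as in the sketch below. The example gives the prior average ($100), the sample average ($130), and the sample size (20), but no standard deviation, so the value of $50 used here is purely hypothetical. The critical value 1.729 is the standard table value for a one-tailed test at the 5% level with 19 degrees of freedom.

```python
from math import sqrt

previous_mean = 100   # prior average sale per transaction
sample_mean = 130     # average for the 20 salespeople after training
n = 20
sample_sd = 50        # hypothetical: the example does not give this value

degrees_of_freedom = n - 1
t_stat = (sample_mean - previous_mean) / (sample_sd / sqrt(n))

# One-tailed critical value from a t-table for alpha = 0.05 and df = 19.
t_crit = 1.729

print(f"t = {t_stat:.3f} with {degrees_of_freedom} degrees of freedom")
print("reject H0" if t_stat > t_crit else "do not reject H0")
```

A larger assumed standard deviation would shrink the T-score, so the conclusion depends on the spread of the sample as well as on the $30 difference.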

Two-sample T-test

A two-sample T-test works the same way as a two-sample Z-test: it compares the means of two groups to determine whether they differ.

For example, consider native English speakers and non-native speakers. We want to see the effect of maternal language on test scores inside a country. To do that, we will give both groups a reading test and compare their average scores.

Of course, finding the mean of an entire population of language speakers is impossible. Still, we can make some assumptions and work with smaller samples. We take 15 English speakers and 15 non-native speakers and collect their results. We can decide on a critical value for the test statistic as well. If the difference between the average scores does not exceed that critical value, we fail to reject the null hypothesis. There is no significant difference between the groups, so the impact of maternal language is not worth investigating.
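As a sketch of the computation, the code below calculates a two-sample T-statistic for two groups of 15 scores. The scores are invented for illustration, and the example uses Welch's version of the test (which does not assume equal variances), a choice of ours rather than something stated in the article.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical reading-test scores for 15 speakers in each group.
native = [78, 85, 92, 88, 75, 81, 90, 84, 79, 87, 91, 83, 77, 86, 89]
non_native = [72, 80, 85, 79, 70, 76, 88, 74, 69, 82, 84, 78, 71, 77, 83]

n1, n2 = len(native), len(non_native)
v1, v2 = variance(native), variance(non_native)  # sample variances

# Welch's t-statistic: difference in means over its estimated standard error.
standard_error = sqrt(v1 / n1 + v2 / n2)
t_stat = (mean(native) - mean(non_native)) / standard_error

# Welch-Satterthwaite approximation for the degrees of freedom.
welch_df = (v1 / n1 + v2 / n2) ** 2 / (
    (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
)

print(f"t = {t_stat:.3f}, degrees of freedom = {welch_df:.1f}")
```

The T-statistic would then be compared with the critical value from a t-table at the computed degrees of freedom.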

Both a Z-test and a T-test validate a hypothesis. Both are parametric tests that rely on assumptions. The key difference between Z-test and T-test is in their assumptions (e.g. population variance).

Key differences about the data used result in different applications. You want to use the appropriate tool, otherwise you won’t draw valid conclusions from your data.

So when should you use a Z-test vs a T-test? Here are some factors to consider:

  • Sample size: If the available sample size is small, opt for a T-test. Small sample sizes are more variable, so the greater spread of distribution and estimation of error involved with T-tests is ideal.
  • Knowledge of the population standard deviation: Z-tests are more precise and often simpler to execute. So if you know the standard deviation, use a Z-test.
  • Test purpose: If you are assessing the validity of a mean, a T-test is the best choice. If you are working with a hypothesized population proportion, go for a Z-test.
  • Assumption of normality: A Z-test assumes a normal distribution. This does not apply to all real-world scenarios. If you hope to validate a hypothesis that is not well-defined, opt for a T-test instead.
  • Type of data: You can only work within the constraints of the available data. The more information the better, but that is often not possible given testing and collecting conditions. If you have limited data describing means between groups, opt for a T-test. If you have large data sets comparing means between populations, you can use a Z-test.

Knowing the key differences with each statistical test makes selecting the right tool far easier. Here is a table that can help you compare:

Statistical testing lets you determine the validity of a hypothesis. You discover validity by determining if there is a significant difference between your hypothesis and the status quo. If there is, you have a possible idea worth exploring.

That process has numerous applications in the field of computer science and data analysis . You might want to determine the performance of an app with an A/B test. Or you might need to test if an application fits within the defined limits and compare performance metrics. Z-tests and T-tests can depict whether there is significant evidence in each of these scenarios. With that information, you can take the appropriate measures to fix bugs or optimize processes.

Z-test and T-test are helpful tools, especially for hypothesis testing. For data engineers of the future, knowledge of statistical testing will only help your work and overall career trajectory.


  • Open access
  • Published: 15 April 2024

Powerful and accurate detection of temporal gene expression patterns from multi-sample multi-stage single-cell transcriptomics data with TDEseq

  • Yue Fan 1 , 2 , 3 ,
  • Lei Li 1 , 2 &
  • Shiquan Sun   ORCID: orcid.org/0000-0002-9150-6992 1 , 2 , 3 , 4  

Genome Biology volume 25, Article number: 96 (2024)


We present a non-parametric statistical method called TDEseq that takes full advantage of smoothing splines basis functions to account for the dependence of multiple time points in scRNA-seq studies, and uses hierarchical structure linear additive mixed models to model the correlated cells within an individual. As a result, TDEseq demonstrates powerful performance in identifying four potential temporal expression patterns within a specific cell type. Extensive simulation studies and the analysis of four published scRNA-seq datasets show that TDEseq can produce well-calibrated p -values and up to 20% power gain over the existing methods for detecting temporal gene expression patterns.

Introduction

Advances in single-cell RNA sequencing (scRNA-seq) technologies make it possible to record the temporal dynamics of gene expression over multiple time points or stages, either in the same cell population [ 1 , 2 ] or even in an individual cell without destruction [ 3 ]. Unlike single time point (e.g., snapshot) transcriptome profiling, which orders cells along pseudotime or lineages using purely computational strategies [ 4 , 5 , 6 ], time-course scRNA-seq profiling of the whole transcriptome with respect to real, physical time can provide additional insights into dynamic biological processes [ 2 , 7 ], for example, how cells naturally differentiate into other types or states during development and how cells respond to specific drug treatments [ 8 ], viral infections [ 9 ], etc. Therefore, accurately characterizing the temporal dynamics of gene expression over time points is crucial for developmental biology [ 10 , 11 ], tumor biology [ 12 , 13 , 14 ], and biogerontology [ 15 , 16 , 17 ], as it allows us to decipher the dynamic cellular heterogeneity during cell differentiation [ 18 ], identify cancer driver genes during the status transformation [ 14 ], and investigate the mechanisms of cell senescence during aging [ 15 ]. Although time-course scRNA-seq studies are initially designed for different purposes, they essentially require the same data analysis tools for detecting the temporal dynamics of gene expression [ 19 ]. Statistical modeling of this type of scRNA-seq data to identify temporal gene expression patterns faces significant challenges, namely modeling unwanted variables, accounting for temporal dependencies, and characterizing non-stationary cell populations. However, existing methods do not fully address these challenges, so there remains an urgent need to develop effective tools for detecting temporal dynamics of gene expression in time-course scRNA-seq studies.

Particularly, time-course scRNA-seq data commonly share a fundamental temporal dynamics nature, i.e., the gene expression levels measured at each time point may be influenced by previous time points. Accounting for these temporal dependencies requires specialized statistical and computational tools [ 20 ], and failure to do so can lead to inaccurate gene detections [ 21 , 22 ]. As a result, current temporal gene detection methods for time-course scRNA-seq data can be divided into two categories: the methods that treat time points independently and methods that model the temporal dependencies explicitly. Specifically, the methods that utilize the former approach mostly treated time as a categorical variable, performing the differential expression analysis with pair-wise comparison tools, such as a two-sided Wilcoxon rank-sum test [ 23 , 24 ]. However, neglecting the temporal dependencies among multiple time points will reduce the statistical power and may lead to false-positive results [ 22 ]. On the other hand, the methods that utilize the latter approach are commonly used for addressing the time-course bulk RNA-seq data, such as ImpulseDE2 [ 25 ], DESeq2 [ 26 ], and edgeR [ 27 ]. However, the scRNA-seq data is often sparse along with technical and biological variability, making it difficult to accurately identify true biological gene expression changes over multiple time points [ 28 , 29 ].

Furthermore, time-course scRNA-seq data are often collected from multi-sample multi-stage designs. As a result, there may be unwanted variables that arise due to technical variability, batch effects, or the genetic background of individuals [ 30 ]. These variables can obscure the identification of temporal expression changes that are of interest, making it challenging to detect temporal expression genes accurately. Alternatively, the trajectory-based differential expression analysis methods, such as Monocle2 [ 5 ], tradeSeq [ 31 ], and PseudotimeDE [ 32 ] could detect the temporal dynamics of gene expressions along with pseudotime or a continuous trajectory of cellular states. However, since the gene expression profiles of cells from the same sample/individual are known to be dependent, these methods may not adequately account for technical or biological variability that may present in multi-sample multi-stage designs. In addition, these methods may not fully capture the underlying biological process at specific tipping points or intervals, which could be particularly relevant in understanding the mechanisms of cell fate or differentiation [ 33 , 34 ], and tumor progression [ 14 ].

Here, to properly address the above challenges, we develop an efficient and flexible non-parametric method for detecting temporal expression patterns over multiple time points. We refer to our method as TDEseq , temporal differentially expressed genes of time-course scRNA-seq data. Specifically, TDEseq primarily builds upon a linear additive mixed model (LAMM) framework, with a random effect term to account for correlated cells within an individual. In this model, we typically introduce the quadratic I -splines [ 35 ] and cubic C -splines [ 36 ] as basis functions, which facilitate the detection of four potential temporal gene expression patterns, i.e., growth, recession, peak, or trough. As a result, with extensive simulation studies, we find TDEseq can properly control for type I error rate at the transcriptome-wide level and display powerful performance in detecting temporal expression genes under the power simulations. Finally, we apply the TDEseq to one scRNA-seq dataset which was generated by Well-TEMP-seq [ 23 ], one scRNA-seq dataset generated by Smart-seq2 [ 37 ], and two scRNA-seq datasets generated by 10X Genomics, to benchmark TDEseq against current state-of-the-art methods, regarding human colorectal cancer development [ 23 ], mouse hepatocyte differentiation [ 38 ], human metastatic lung adenocarcinoma [ 14 ], and human COVID-19 progression [ 9 ]. These results highlight that TDEseq is an appropriate tool for detecting temporal gene expression patterns over multiple time points, which leads to an improved understanding of developmental biology, tumor biology, and biogerontology.

Overview of TDEseq

Statistical modeling.

TDEseq is a temporal gene expression analysis approach that is primarily built upon the linear additive mixed models (LAMM) [ 39 ] framework to characterize the temporal gene expression changes for time-course (or longitudinal) scRNA-seq datasets (the “ Materials and methods ” section; Supplementary Text). Typically, we aim to detect one of four possible temporal gene expression patterns (i.e., growth, recession, peak, or trough) over multiple time points using both I -splines [ 35 ] and C -splines [ 36 ] basis functions (Fig.  1 A) and examine one gene at a time. Briefly, in LAMM, we assume the log-normalized gene expression level of raw counts (the “ Materials and methods ” section), i.e., \({y}_{gji}\left(t\right)\) for gene \(g\) , individual \(j\) and cell \(i\) at time point \(t\) is,

\[{y}_{gji}\left(t\right)={{\varvec{w}}}_{gji}^{\prime}{\boldsymbol{\alpha }}_{g}+\sum_{k}{s}_{k}\left(t\right){\beta }_{gk}+{u}_{gji}+{e}_{gji},\]

figure 1

Schematic overview of TDEseq and the methods comparison in simulations. A TDEseq is designed to perform temporal expression gene analysis of time-course scRNA-seq data. For a given gene, TDEseq tests each of four temporal expression patterns, i.e., growth, recession, peak, and trough. TDEseq combines the four p -values using the Cauchy combination rule into a final p -value, facilitating the detection of temporal gene expression patterns. B The quantile–quantile (QQ) plot shows the type I error control under the baseline parameter settings. Well-calibrated p -values are expected to lie on the diagonal line. The p- values generated from Mixed TDEseq (plum) and DESeq2 (brown) are reasonably well-calibrated, while Linear TDEseq (orange), tradeSeq (green), ImpulseDE2 (blue), Wilcoxon test (yellow) and edgeR (dark green) produced p -values that are not well-calibrated. C The average power of 10 simulation replicates for temporal expression gene detection across a range of FDR cutoffs under the baseline parameter settings. Both versions of TDEseq exhibit high detection power of temporal expression genes, followed by DESeq2, edgeR, tradeSeq, and ImpulseDE2. The Wilcoxon test does not fare well, presumably due to bias towards highly expressed genes. The TDEseq methods are highlighted using solid lines, while other methods are represented by dashed lines in the plots. D The comparison of Linear TDEseq, Mixed TDEseq, and ImpulseDE2 in terms of the accuracy of temporal expression pattern detection under the baseline parameter settings, at an FDR of 5%. The temporal expression genes detected by TDEseq demonstrated a higher accuracy than those detected by ImpulseDE2. E The quantile–quantile (QQ) plot shows the type I error control under the large batch effect parameter settings. The p- values generated from Mixed TDEseq coupled with scMerge (purple) and DESeq2 (brown) are reasonably well-calibrated, while Linear TDEseq (orange), Mixed TDEseq (plum), tradeSeq (green), ImpulseDE2 (blue), Wilcoxon test (yellow), and edgeR (dark green) generated inflated p -values. F The average power of 10 simulation replicates for temporal expression gene detection across a range of FDR cutoffs under the large batch effect parameter settings. G The comparison of Linear TDEseq, Mixed TDEseq, Mixed TDEseq coupled with scMerge, and ImpulseDE2 in terms of the accuracy of temporal expression pattern detection under the large batch effect parameter settings, at an FDR of 5%. Since DESeq2, edgeR, tradeSeq, and the Wilcoxon test were not originally designed for pattern-specific detection, we excluded them from the comparison. FDR denotes the false discovery rate

where \({{\varvec{w}}}_{gji}\) is the cell-level or time-level covariate (e.g., cell size, or sequencing read depth), \({\boldsymbol{\alpha }}_{g}\) is its corresponding coefficient;  \({{\varvec{u}}}_{g}\) is a random vector, with variance component \({\tau }_{g}\) , to account for the variations from heterogeneous samples, i.e.,

\[{{\varvec{u}}}_{g}\sim MVN\left(0,{\tau }_{g}{{\varvec{\Sigma}}}_{N\times N}\right),\]

where \({{\varvec{\Sigma}}}_{N\times N}\) is a block diagonal matrix with a total of \(M\) block matrices, in which all elements of each block \({{\varvec{\Sigma}}}_{{n}_{j}\times {n}_{j}}\) are ones;  \({n}_{j}\) is the number of cells for the individual or replicate \(j\) , and \(\sum_{j=1}^{M}{n}_{j}=N\) ; \({e}_{gji}\) is the residual error, an independent and identically distributed variable that follows a normal distribution with mean zero and variance \({\sigma }_{g}^{2}\) to account for independent noise, i.e.,

\[{e}_{gji}\sim N\left(0,{\sigma }_{g}^{2}\right).\]

Particularly, the variable \({s}_{k}\left(t\right)\) is a smoothing spline basis function, which involves either I -splines or C -splines to model monotone patterns (i.e., growth and recession) and quadratic patterns (i.e., peak and trough), respectively [ 40 ] (Fig.  1 A). The I -splines are defined as \({I}_{l}^{k}\left(x\right)={\int }_{{\xi }_{1}}^{x}{M}_{t1}^{k}\left(v\right)dv\) [ 35 ], while C -splines are defined as \({C}_{l}^{k}\left(x\right)={\int }_{{\xi }_{1}}^{x}{I}_{t1}^{k}\left(v\right)dv\)  [ 36 ] based on I -splines, and \({\beta }_{gk}\) is the corresponding coefficient, which is restricted to \({\beta }_{gk}\ge 0\) ; \(l (l=1,2,\cdots ,L)\) indexes the grid points. We set \(L\) , the total number of grid points, equal to the number of time points in scRNA-seq studies; \(k\) denotes the order of the spline function; \(MVN\) denotes the multivariate normal distribution.
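As an illustration of the covariance structure described above, the following minimal Python sketch builds a block diagonal matrix whose blocks are filled with ones, one block per individual. The function name `block_diag_ones` and the toy cell counts are assumptions made for this example only and are not part of the TDEseq package.

```python
def block_diag_ones(cells_per_individual):
    """Return an N x N block diagonal matrix (list of lists).

    Each individual j contributes an n_j x n_j block of ones, so cells from
    the same individual share a random effect and cells from different
    individuals do not.
    """
    n_total = sum(cells_per_individual)
    sigma = [[0] * n_total for _ in range(n_total)]
    start = 0
    for n_j in cells_per_individual:
        for row in range(start, start + n_j):
            for col in range(start, start + n_j):
                sigma[row][col] = 1
        start += n_j
    return sigma


# Toy example (hypothetical): M = 2 individuals with 2 and 3 cells, N = 5.
sigma = block_diag_ones([2, 3])
for row in sigma:
    print(row)
```

In this toy case the first two cells are correlated with each other but not with the remaining three, which is the within-individual dependence the random effect term is meant to capture.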

Hypothesis testing

In the LAMM model described above, we are interested in examining whether a gene shows one of four temporal expression patterns, i.e., growth, recession, peak, or trough (Fig.  1 A). Testing whether a gene displays a temporal expression pattern can be translated into testing the null hypothesis \({{\varvec{H}}}_{0}:{{\varvec{\beta}}}_{g}=0\) . Parameter estimation and hypothesis testing in LAMM are notoriously difficult, as the LAMM likelihood involves M -splines [ 35 ] (non-linear) subject to nonnegative constraints and cannot be solved analytically. To make estimation and inference in LAMM scalable, we developed an approximate inference algorithm based on a cone programming projection algorithm [ 41 , 42 ]. With the parameter estimates, we computed a p -value for each of the four patterns using test statistics [ 43 ] that follow a mixture of beta distributions [ 44 ]. Afterward, we combined these four p -values through the Cauchy combination rule [ 45 ], which allows multiple potentially correlated p -values to be combined into a single p -value that determines whether a gene exhibits a temporal expression pattern (the “ Materials and methods ” section; Additional file 1 : Supplementary Text).
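The Cauchy combination step can be illustrated with a short Python sketch. It follows the standard form of the rule, in which each p -value is mapped to a Cauchy variate, the variates are averaged with weights, and the result is mapped back to a p -value. The equal weights and the function name `cauchy_combine` are assumptions made here for illustration; the weighting actually used in TDEseq is not stated in this section.

```python
import math


def cauchy_combine(p_values, weights=None):
    """Combine p-values (each strictly between 0 and 1) with the Cauchy rule.

    T = sum_i w_i * tan((0.5 - p_i) * pi), then p = 0.5 - arctan(T / sum(w)) / pi.
    Equal weights are assumed when none are supplied.
    """
    if weights is None:
        weights = [1.0 / len(p_values)] * len(p_values)
    statistic = sum(
        w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, p_values)
    )
    return 0.5 - math.atan(statistic / sum(weights)) / math.pi


# Hypothetical p-values for the growth, recession, peak, and trough tests.
combined = cauchy_combine([0.001, 0.5, 0.5, 0.5])
print(combined)
```

One property worth noting is that a single very small p -value dominates the combined result, so a gene that clearly follows one pattern is still detected even when the other three pattern tests are uninformative.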

We refer to the above method as the mixed version of TDEseq (denoted Mixed TDEseq ). Besides the mixed version, we have also developed a linear version of TDEseq (denoted Linear TDEseq to distinguish it from Mixed TDEseq) for modeling time-course scRNA-seq data with little or no sample heterogeneity (Additional file 1 : Supplementary Text). Both versions of TDEseq are implemented in the same R package with multi-threaded computing capability. The TDEseq software, together with the code for reproducing the analyses, is freely available at https://sqsun.github.io/software.html .

TDEseq generates well-calibrated p -values and exhibits powerful gene detection of temporal expression changes in simulations

To benchmark the robustness and performance of TDEseq, we simulated extensive scRNA-seq datasets using the Splatter R package [ 46 ] and compared the two versions of TDEseq with five existing approaches that were not specifically designed for time-course scRNA-seq data analysis: the two-sided Wilcoxon rank-sum test (Wilcoxon test), tradeSeq [ 47 ], ImpulseDE2 [ 25 ], edgeR [ 27 , 48 ], and DESeq2 [ 26 ] (the “ Materials and methods ” section). The simulations were designed to assess TDEseq in terms of type I error control and temporal gene detection power under various parameter settings, including the number of time points (i.e., 4, 5, or 6), the number of cells for each sample in each time point (i.e., 50, 100, or 200; three replicates/samples for each time point), the expected UMI counts for each cell of scRNA-seq data (i.e., 7.0 as low, 9.7 as medium, and 13.8 as high), the effect size of temporal expression changes (i.e., 0.1 as low, 0.4 as medium, and 0.7 as high), and the sample-level unwanted technical variations (i.e., batch effects; 0 as no batch effects, 0.04 as medium, and 0.12 as high).

To do so, we considered a baseline simulation scenario: the number of time points as 5; the number of cells in each sample as 100; the expected UMI counts for each cell as 9.7; the batch effect size as 0.04; the time point-specific effect size as 0.4; all cells were measured by 10,000 genes, in which 1,000 genes were randomly assigned one of four possible temporal patterns (i.e., growth, recession, peak, and trough; Additional file 2 : Fig. S1A-S1D) in power simulations. With the baseline parameter settings, we varied one parameter at a time to examine whether the gene was temporally expressed over multiple time points. Notably, the expected UMI counts under baseline settings were estimated from the lung adenocarcinoma progression scRNA-seq data [ 14 ] (the “ Materials and methods ” section).

With the baseline parameter setup, we found that only Mixed TDEseq and DESeq2 generated well-calibrated p -values under the null simulations, whereas all other methods produced inflated or conservative p -values (Fig.  1 B). In the power simulations, Linear TDEseq and Mixed TDEseq achieved higher detection power for temporal expression genes across a range of FDR cutoffs (Fig.  1 C). Specifically, at a false discovery rate (FDR) of 5%, the detection power of Linear and Mixed TDEseq was 43.3% and 40.8%, respectively, followed by DESeq2 (38.7%), edgeR (36.4%), tradeSeq (25.9%), and ImpulseDE2 (18.4%). Furthermore, we examined the accuracy of pattern detection and found that both the Linear and Mixed versions of TDEseq outperformed ImpulseDE2 (the sole other method capable of identifying pattern-specific genes). Specifically, at an FDR of 5%, the average accuracy of pattern detection (over 10 replicates) for Mixed TDEseq was 99.0% for growth, 100% for recession, 80.4% for peak, and 43.6% for trough. In contrast, ImpulseDE2 achieved 73.1% for growth, 83.5% for recession, 39.3% for peak, and 21.3% for trough (Fig.  1 D).
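To make the notion of detection power at a given FDR cutoff concrete, the sketch below computes Benjamini-Hochberg (BH) adjusted p -values and the fraction of truly temporal genes recovered at a 5% cutoff. This is only one plausible way to compute such a quantity: the use of BH adjustment here is an assumption, and the simulations in this study may instead have used an FDR computed from the known ground truth. The function names and the toy inputs are hypothetical.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def detection_power(p_values, is_true_signal, fdr=0.05):
    """Fraction of true signal genes called significant at the FDR cutoff."""
    adjusted = bh_adjust(p_values)
    n_true = sum(1 for truth in is_true_signal if truth)
    if n_true == 0:
        return 0.0
    hits = sum(
        1 for q, truth in zip(adjusted, is_true_signal) if truth and q <= fdr
    )
    return hits / n_true


# Toy example: four genes, the first two simulated as temporally expressed.
p_vals = [0.01, 0.04, 0.03, 0.5]
truth = [True, True, False, False]
print(bh_adjust(p_vals))
print(detection_power(p_vals, truth))
```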

In addition, we systematically examined the performance of the type I error control rate under other parameter settings. Our findings indicate that Mixed TDEseq consistently produces well-calibrated p -values (Additional file 2 : Fig. S2A, S2B, S3A, S4A, and S4B) except when dealing with high UMI counts (Additional file 2 : Fig. S3B). These observations are presumably due to the presence of sample-level variations or batch effects associated with high UMI counts. On the other hand, in terms of temporal expression gene detection and averaged accuracy of pattern examination, either Mixed or Linear TDEseq displayed more powerful performance across a range of parameter settings regardless of the number of time points (Additional file 2 : Fig. S2C and S2D), the low expected UMI counts for each cell (Additional file 2 : Fig. S3C), the large number of cells per sample (Additional file 2 : Fig. S4D), and the small effect size setups (Additional file 2 : Fig. S5), as well as the accuracy of pattern examination (Additional file 2 : Fig. S6). Meanwhile, we found both pseudobulk-based methods, i.e., either edgeR or DESeq2, performed well with high UMI counts (Additional file 2 : Fig. S2D) and small number of cells per sample (Additional file 2 : Fig. S3C). These observations were probably consistent with the previous studies that DESeq2/edgeR performed well on log-normal distributed small sample size RNA-seq data [ 49 , 50 ]. Taken together, we summarized the findings on detection power at an FDR of 5% across diverse parameter settings. The results demonstrated that the power of temporal gene detection increases with a rise in the number of time points, effect size, and UMI counts. Conversely, it diminishes as the number of cells within each individual increase, along with sample-level variations (i.e., batch effects) (Additional file 2 : Fig. S7). 
Notably, the Wilcoxon test did not fare well in all power simulations, presumably due to failure to properly control the type I error rate.
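The calibration checks referred to above rely on QQ plots of p -values from null simulations. As a rough sketch of how the plotted points can be obtained, the code below pairs sorted observed p -values with the quantiles expected under a uniform distribution, on the −log10 scale. The plotting positions (i + 0.5)/n and the function name `qq_points` are choices made for this illustration, not necessarily those used to draw the figures.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) pairs of -log10 p-values for a QQ plot.

    Under a well-calibrated null, observed p-values are uniform on (0, 1),
    so the points should fall near the diagonal. P-values must be > 0.
    """
    n = len(p_values)
    observed = sorted(p_values)
    expected = [(i + 0.5) / n for i in range(n)]
    return [
        (-math.log10(e), -math.log10(o)) for e, o in zip(expected, observed)
    ]


# Toy example: p-values that sit exactly at the expected uniform quantiles.
points = qq_points([0.875, 0.125, 0.625, 0.375])
for expected_value, observed_value in points:
    print(round(expected_value, 3), round(observed_value, 3))
```

Points that rise above the diagonal indicate inflated p -values (too many small values), whereas points below it indicate conservative p -values.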

In addition, we examined the performance of TDEseq on two other temporal expression patterns: (1) a plateau in the first few time points followed by another plateau in the last few time points (referred to as the bi-plateau pattern; Additional file 2 : Fig. S1E), and (2) multiple modes in the early time points followed by stable expression in the last time points (referred to as the multi-modal pattern; Additional file 2 : Fig. S1F). Under the bi-plateau pattern, Mixed TDEseq still displayed higher power than other methods (Additional file 2 : Fig. S8A), suggesting that the shape-restricted spline function is flexible enough to capture bi-plateau patterns. In contrast, under the multi-modal pattern, all methods achieved low detection power, although edgeR and DESeq2 performed better than other methods (Additional file 2 : Fig. S8B). However, this may not be a major issue, since the multi-modal pattern may be rare in real data applications [ 25 ].

TDEseq coupled with batch removal strategy exhibits excellent performance in analyzing large heterogeneous scRNA-seq data

Intuitively, in situations with minimal or no sample-level variations (i.e., batch effects), it is reasonable to expect that trajectory-based differential expression methods (e.g., tradeSeq) would yield results comparable to those of temporal-based differential expression methods. To test this, we reduced the batch effect size to zero. As a result, we observed that both versions of TDEseq and tradeSeq generated well-calibrated p -values under the null simulations, whereas ImpulseDE2 produced overly conservative p -values and the Wilcoxon test produced inflated p -values (Additional file 2 : Fig. S9A). Moreover, both versions of TDEseq and all other approaches generated comparable results for temporal expression pattern detection (Additional file 2 : Fig. S9B). Conversely, the presence of unwanted batch effects poses substantial obstacles to detecting temporal expression changes. We therefore increased the batch effect size to 0.12 and found that Mixed TDEseq outperformed other methods in terms of temporal expression pattern detection power (Fig.  1 F). However, the p -values generated by Mixed TDEseq were not well-calibrated (Fig.  1 E).

To this end, to properly control the unwanted variables in the large batch effects scenario, we additionally carried out the batch effects correction procedure prior to performing temporal gene expression analysis. To do so, we benchmarked five existing batch removal methods that can return the corrected gene expression matrix, including MNN [ 51 ], scMerge [ 52 ], ZINB-WaVE [ 53 ], ComBat [ 54 ], and Limma [ 55 ]. As a result, with evaluation criterion iLISI score [ 56 ] (the “ Materials and methods ” section) for batch correction approaches, we found scMerge (0.49) achieved a higher alignment score than Limma (0.48), ComBat (0.48), MNN (0.11), and ZINB-WaVE (0.21; Additional file 2 : Fig. S10). Moreover, we found Mixed TDEseq coupled with scMerge (Mixed TDEseq + scMerge) performed reasonably well in terms of the type I error control (Fig.  1 E and Additional file 2 : Fig. S11A) and was more powerful in detecting temporal expression genes (Fig.  1 F and Additional file 2 : Fig. S11B), suggesting this combination is suitable for time-course scRNA-seq data with strong sample-level variations (i.e., batch effects). Taken together, TDEseq coupled with scMerge may be an ideal combination for the identification of temporal gene expression patterns when time-course scRNA-seq data involves large heterogeneous samples.

TDEseq performs well in the intertwined cells among time points

The simulations above all involve time point-specific expression. To mimic the cell differentiation scenario in which cells of the same type are intertwined among time points (Additional file 2 : Fig. S12A), we simulated additional scRNA-seq datasets (denoted as smudged data) using the Symsim R package [ 57 ] (the “ Materials and methods ” section). The pseudotime was inferred using Slingshot [ 6 ], following the recommendation of previous studies [ 31 , 32 ]. In this simulation, we first used the inferred pseudotime as input for tradeSeq and ImpulseDE2, and the time points as input for both versions of TDEseq, edgeR, and DESeq2. We observed that the performance of Linear TDEseq was comparable with that of ImpulseDE2 when a small proportion of cells were intertwined between time points (Additional file 2 : Fig. S12B). With a medium proportion (Additional file 2 : Fig. S12C) or a large proportion of intertwined cells (Additional file 2 : Fig. S12D), the pseudotime-based methods (tradeSeq and ImpulseDE2) outperformed the time point-based methods (both versions of TDEseq, edgeR, and DESeq2). Furthermore, when we used the inferred pseudotime as input for both versions of TDEseq, Linear TDEseq outperformed Mixed TDEseq and ImpulseDE2, but not tradeSeq (Additional file 2 : Fig. S12E).

In addition, we examined the temporal expression patterns detected by Linear TDEseq. We found that, even with pseudotime as input, TDEseq identified four distinct temporal expression patterns (Additional file 2 : Fig. S12F). Therefore, TDEseq is also useful for detecting temporal expression patterns when pseudotime is used as input.

TDEseq detects drug-associated temporal expression changes of time-course scRNA-seq data

We first applied TDEseq on a drug-treatment time-resolved scRNA-seq dataset (Additional file 3 : Table S1). The data were assayed by Well-TEMP-seq protocols to profile the transcriptional dynamics of colorectal cancer cells exposed to 5-AZA-CdR [ 23 ] (the “ Materials and methods ” section). This scRNA-seq dataset consists of D0, D1, D2, and D3 four time points (Fig.  2 A), and each time point contains 4,000 cells. We expected these scRNA-seq datasets to exhibit minimal individual heterogeneity across multiple time points since Well-TEMP-seq addressed the cell lines within one chip [ 23 ] (Additional file 2 : Fig. S13A). Therefore, we performed the temporal gene detection methods without batch effects correction. Since only one sample was involved in each time point, we excluded Mixed TDEseq, edgeR, and DESeq2 in this application.

figure 2

The time-resolved scRNA-seq data analysis for the HCT116 cell lines after 5-AZA-CdR treatment. A The experimental design of HCT116 cell lines treated with 5-AZA-CdR. The scRNA-seq data were assayed by Well-TEMP-seq protocols, consisting of four time points, i.e., D0, D1, D2, or D3 after treatment. B The quantile–quantile (QQ) plot shows the type I error control under the null simulations with permutation strategy. The well-calibrated p -values will be expected laid on the diagonal line. The p- values produced by Linear TDEseq (orange) and tradeSeq (green) are reasonably well-calibrated, while those from ImpulseDE2 (blue) are overly conservative. C The power comparison of temporal expression gene detection across a range of FDR cutoffs. Linear TDEseq was highlighted using solid lines, while other methods were represented by dashed lines in the plots. Linear TDEseq displays the powerful performance of temporal expression gene detection. D The heatmap demonstrates the pattern-specific temporal expression genes that were identified by Linear TDEseq. Gene expression levels were log-transformed and were standardized using z-scores for visualization. The top-ranked temporal expression genes identified by Linear TDEseq show distinct four patterns. E The Venn diagram shows the overlapping of the temporally expressed genes (FDR ≤ 0.05) identified by Linear TDEseq, tradeSeq, or ImpulseDE2. Those method-specific unique genes were enriched in the number of GO terms ( N GO , BH-adjusted p -value < 0.05). The temporal expression genes detected by Linear TDEseq were enriched with a greater number of GO terms. F The UMAP shows two temporal expression genes, i.e., DKK1 and IFITM3, which were identified by Linear TDEseq but not by tradeSeq. G The bubble plot demonstrates the significant GO terms enriched by pattern-specific temporal expression genes, which were identified by Linear TDEseq. The peak-specific temporal expression genes enriched more significant GO terms. 
The Wilcoxon test was excluded from this comparison due to its poor performance in simulations. DESeq2 and edgeR were excluded from this comparison due to only one sample at each time point. FDR denotes the false discovery rate

Next, we examined the ability of Linear TDEseq in terms of type I error control. To do so, we utilized a permutation strategy (repeated 5 times) to construct a null distribution (the “ Materials and methods ” section). Consistent with simulation studies, we found Linear TDEseq can produce well-calibrated p -values while tradeSeq produced inflated p -values and ImpulseDE2 produced overly conserved p -values (Fig.  2 B). Besides, in terms of temporally expressed gene detection, Linear TDEseq outperformed other methods across a range of FDR cutoffs (Fig.  2 C), even in pattern-specific temporal expression gene detection (Additional file 2 : Fig. S13B). For example, Linear TDEseq identified a total of 5,596 temporally expressed genes at an FDR of 5%, including 1,341 growth genes, 1,177 recession genes, 225 trough genes, and 2,853 peak genes, which displayed four distinct temporal expression patterns (Fig.  2 D). In contrast, ImpulseDE2 identified a total of 4,792 temporally expressed genes and tradeSeq detected a total of 2,672 temporally expressed genes (Additional file 3 : Table S3). Overall, besides the 2,427 common shared temporally expressed genes detected by Linear TDEseq, tradeSeq, and ImpulseDE2 methods (Fig.  2 E), a total of 559 temporally expressed genes were uniquely detected by TDEseq, which were also significantly enriched in cell cycle DNA replication (GO:0044786; BH adjusted p -value = 7.89e − e) and response to interleukin1 (GO:0070555; BH adjusted p -value = 0.031). In contrast, tradeSeq or ImpulseDE2 unique genes were not enriched in 5-AZA-CdR treatment response associated GO terms (Fig.  2 E). 
Specifically, DKK1, a tumor suppressor gene targeted by 5-AZA-CdR [ 23 , 58 ], was identified by TDEseq as a top-ranked significant temporal expression gene ( p -value < 1e − 300, FDR = 0) but was not detected by tradeSeq ( p -value = 0.82, FDR = 0.90). This is probably because, although the gene showed a clear peak pattern, its log fold change was small and therefore difficult to detect with penalized splines. In addition, the 5-AZA-CdR response gene IFITM3 [ 58 , 59 ] was also identified by TDEseq as a top-ranked significant gene ( p -value < 1e − 300, FDR = 0) but was not detected by tradeSeq ( p -value = 0.19, FDR = 0.36, Fig.  2 F).
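The permutation strategy used above to assess type I error control in real data can be sketched as follows. This is a minimal illustration that assumes the null is constructed by shuffling the time-point labels across cells, which breaks any association between expression and time; the exact procedure is given in the “Materials and methods” section and may differ. The function name, the seed, and the toy labels are hypothetical.

```python
import random


def permute_time_labels(time_labels, n_repeats=5, seed=0):
    """Return n_repeats shuffled copies of the per-cell time-point labels.

    Each copy keeps the number of cells per time point unchanged but assigns
    the labels to cells at random, so any method run on the permuted labels
    should produce approximately uniform p-values if it is well calibrated.
    """
    rng = random.Random(seed)
    permutations = []
    for _ in range(n_repeats):
        shuffled = list(time_labels)
        rng.shuffle(shuffled)
        permutations.append(shuffled)
    return permutations


# Toy example: eight cells sampled at four time points (two cells each).
labels = ["D0", "D0", "D1", "D1", "D2", "D2", "D3", "D3"]
null_label_sets = permute_time_labels(labels, n_repeats=5)
print(len(null_label_sets))
```

The p -values obtained from each permuted label set can then be pooled and compared against the uniform distribution in a QQ plot.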

Finally, we performed gene set enrichment analysis (GSEA) on the pattern-specific temporal expression genes to examine the top GO terms enriched in each gene list (the “ Materials and methods ” section). Specifically, at an FDR of 5%, a total of 1,341 growth-specific temporal expression genes were detected by Linear TDEseq. These genes were enriched in a total of 179 GO terms. Because 5-AZA-CdR treatment drives HCT116 cells into a viral mimicry state and triggers the antiviral response [ 60 ], we expected immune-associated genes to be upregulated. Indeed, the GO terms include many immune response terms, such as cell activation involved in immune response (GO: 0002263; BH-adjusted p -value = 8.90e − 7), indicating that the immune response was activated by the 5-AZA-CdR treatment. A total of 1,177 recession-specific temporal expression genes were enriched in a total of 244 GO terms, including many terms for regulation of histone methylation, such as positive regulation of histone H3-K4 methylation (GO: 0051571; BH-adjusted p -value = 0.047), implying that DNA methylation inhibition and gene expression regulation occurred after 5-AZA-CdR treatment, owing to the global DNA demethylation effects of 5-AZA-CdR [ 61 ]. A total of 2,853 peak-specific temporal expression genes were enriched in a total of 249 GO terms. For example, the ATP metabolic process pathways, particularly oxidative phosphorylation (GO: 0006119; BH-adjusted p -value = 3.30e − 7), are impacted by the increase of intracellular ROS and mitochondrial superoxide induced by 5-AZA-CdR, although this effect diminishes over time [ 62 ]. Similarly, a total of 225 trough-specific temporal expression genes were enriched in a total of 89 GO terms, with a significant portion belonging to cell cycle pathways, including mitotic nuclear division (GO: 0140014; BH-adjusted p -value = 1.06e − 6). 
These findings suggest that 5-AZA-CdR treatment may lead to the suppression of tumor cell proliferation and division [ 60 ] (Fig.  2 G).

TDEseq detects hepatic cell differentiation-associated temporal expression genes of time-course scRNA-seq data

We next applied TDEseq to a hepatoblast-to-hepatocyte transition study from the C57BL/6 and C3H embryo mice livers [ 38 ] (the “ Materials and methods ” section; Additional file 3 : Table S1). This scRNA-seq dataset consists of 7 developmental stages from 13 samples, including E10.5 (54 cells from 1 sample), E11.5 (70 cells from 2 samples), E12.5 (41 cells from 2 samples), E13.5 (65 cells from 2 samples), E14.5 (70 cells from 2 samples), E15.5 (77 cells from 2 samples), and E17.5 (70 cells from 2 samples) [ 63 ] (Fig.  3 A). Compared with the above time-resolved scRNA-seq data, this time-course scRNA-seq dataset contains multiple samples at each stage, exhibiting small individual heterogeneity across all developmental stages (Additional file 2 : Fig. S14A). Therefore, we carried out both versions of TDEseq that would be expected to be comparable in such a scenario and excluded edgeR and DESeq2 in this application due to one or two samples involved in each time point.

figure 3

The time-course scRNA-seq data analysis for mouse fetal liver development. A The experimental design of mouse fetal liver sample collection. The scRNA-seq data were assayed on the FACS-isolated cell populations, consisting of seven liver developmental stages, i.e., E10.5, E11.5, E12.5, E13.5, E14.5, E15.5, and E17.5. B The quantile–quantile (QQ) plot shows the type I error control under the permutation strategy. Well-calibrated p -values are expected to lie on the diagonal line. The p- values produced by Linear TDEseq (orange), Mixed TDEseq (plum), and tradeSeq (green) are reasonably well-calibrated, while those from ImpulseDE2 (blue) are overly conservative. C The power comparison of temporal expression gene detection across a range of FDR cutoffs. The TDEseq methods are highlighted using solid lines, while other methods are represented by dashed lines in the plots. Both versions of TDEseq display powerful performance in temporal expression gene detection. D The heatmap demonstrates the pattern-specific temporal expression genes that were identified by Linear TDEseq. Gene expression levels were log-transformed and standardized using z -scores for visualization. The top-ranked temporal expression genes identified by Linear TDEseq show four distinct patterns. E The Venn diagram shows the overlap of the temporally expressed genes (FDR ≤ 0.05) identified by Linear TDEseq, tradeSeq, or ImpulseDE2. N GO denotes the number of GO terms enriched by the method-specific unique genes (BH-adjusted p -value < 0.05). The temporal expression genes detected by Linear TDEseq were enriched in more GO terms. F The UMAP shows two temporal expression genes, i.e., Atf4 and Itgb1 , which were uniquely identified by Linear TDEseq. G The bubble plot demonstrates the significant GO terms enriched by pattern-specific temporal expression genes, which were identified by Linear TDEseq. 
The recession-specific temporal expression genes enriched more significant GO terms, whereas trough-specific temporal expression genes were not enriched in any GO terms. The Wilcoxon test was excluded from this comparison due to its poor performance in simulations. DESeq2 and edgeR were excluded from this comparison due to only one or two samples at each time point. FDR denotes the false discovery rate

We first examined the ability of TDEseq in terms of type I error control using permutation strategies (the “ Materials and methods ” section). Consistent with the simulation results, Linear TDEseq, Mixed TDEseq, and tradeSeq produced well-calibrated p -values whereas ImpulseDE2 generated overly conservative p -values (Fig.  3 B). Besides, in terms of temporally expressed gene detection, both versions of TDEseq outperformed other methods across a range of FDR cutoffs (Fig.  3 C and Additional file 2 : Fig. S14B). Specifically, Linear TDEseq identified a total of 9,975 temporally expressed genes at an FDR of 5%, including 1,266 growth genes, 7,146 recession genes, 217 trough genes, and 1,346 peak genes, which displayed four distinct temporal patterns (Fig.  3 D); Mixed TDEseq identified a total of 8,924 temporally expressed genes, including 1,242 growth genes, 6,708 recession genes, 136 trough genes, and 838 peak genes. In contrast, ImpulseDE2 detected a total of 7,737 temporally expressed genes, while tradeSeq detected a total of 7,108 temporally expressed genes (Additional file 3 : Table S4). Notably, compared with tradeSeq and ImpulseDE2, there were a total of 948 temporally expressed genes uniquely detected by Linear TDEseq at an FDR of 5% and a total of 3,517 temporally expressed genes uniquely detected by Mixed TDEseq at an FDR of 5%. Comparing the results of Linear TDEseq and Mixed TDEseq, we found that the p -values generated by the two versions were highly correlated (Spearman R  = 0.954; Additional file 2 : Fig. S14C). We further observed that some of the genes from Linear TDEseq displayed a smaller p -value than that from Mixed TDEseq. This observation was presumably due to Linear TDEseq being more sensitive to large sample-level variations across time points. 
For example, the p -value of a recession-specific gene MAPK13 generated by Linear TDEseq ( p -value = 1.0e − 300) was far smaller than that generated by Mixed TDEseq ( p -value = 1.8e − 2; Additional file 2 : Fig. S14D).
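The gene counts above depend on calling genes at an FDR of 5%. As a minimal sketch of that thresholding step, assuming the FDR is obtained with the Benjamini-Hochberg (BH) procedure (the gene names and p-values below are made-up toy values, and `bh_adjust` and `call_temporal_genes` are hypothetical helper names, not TDEseq functions):

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_temporal_genes(gene_pvalues, fdr=0.05):
    """Keep genes whose BH-adjusted p-value is at or below the FDR cutoff."""
    genes = list(gene_pvalues)
    adjusted = bh_adjust([gene_pvalues[g] for g in genes])
    return {g: q for g, q in zip(genes, adjusted) if q <= fdr}


# Toy input (assumed values, for illustration only).
toy_pvalues = {"geneA": 1e-300, "geneB": 0.018, "geneC": 0.24, "geneD": 0.04}
significant = call_temporal_genes(toy_pvalues)
print(sorted(significant))
```

In this toy input, a gene with a raw p-value of 0.04 is not called because its adjusted value exceeds 0.05, which illustrates why two methods with similar raw p-values can report different gene counts at the same FDR.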

Moreover, the temporally expressed genes detected by both versions of TDEseq but not detected by tradeSeq or ImpulseDE2 were highly related to hepatic cell differentiation (Fig.  3 E), and many of them have been validated by previous studies [ 38 , 64 , 65 ]. For example, we found that a key mouse fetal liver development regulator, Atf4 [ 65 ], which exhibits a growth pattern (Fig.  3 F), was only identified by TDEseq as a top-ranked significant temporal expression gene ( p -value < 1e − 300, FDR = 0), while not being detected by tradeSeq ( p -value = 0.240, FDR = 0.275) or ImpulseDE2 ( p -value = 0.243, FDR = 0.079). Besides, Itgb1 , which is involved in liver microstructure establishment during the embryonic process and displays the growth pattern (bi-plateau pattern; Fig.  3 F), was only identified by TDEseq as a top-ranked significant temporal expression gene ( p -value < 1e − 300, FDR = 0), while not being detected by tradeSeq ( p -value = 0.272, FDR = 0.309). The genes uniquely detected by Linear TDEseq were enriched in the liver embryo process, particularly the cell cycle process (GO:0022402; BH-adjusted p -value = 1.33e − 5) and embryo development (GO:0009790; BH-adjusted p -value = 0.0361), whereas the temporal expression genes uniquely detected by tradeSeq or ImpulseDE2 were not enriched in liver development-related gene sets (Fig.  3 E), and ImpulseDE2 wrongly detected the peak or trough pattern genes as growth pattern genes (Additional file 2 : Fig. S14E). In addition, the enrichment analysis of unique temporal expression genes from Mixed TDEseq and Linear TDEseq showed similar results (Additional file 2 : Fig. S14F).

Next, we performed GSEA on the pattern-specific temporal expression genes identified by Linear TDEseq, to examine the four pattern-specific functions during hepatic cell differentiation (the “ Materials and methods ” section). Specifically, with an FDR of 5%, a total of 1,266 growth-specific temporal expression genes were enriched in a total of 685 GO terms. Notably, these growth-related genes were significantly enriched in liver function-associated pathways, reflecting the mature process from hepatoblasts to hepatocytes. For example, almost all of these enriched terms were associated with metabolic processes, biosynthetic processes, or organic substance transport, with key functions attributed to mature hepatocytes (Fig.  3 G), particularly the fatty acid metabolic process (GO:0006631; BH-adjusted p -value = 4.99e − 37), lipid catabolic process (GO:0016042; BH-adjusted p -value = 3.34e − 26), and secondary alcohol metabolic process (GO:1902652; BH-adjusted p -value = 8.06e − 16), which are the main functions of mature hepatocytes. On the other hand, a total of 7,146 recession-specific temporal expression genes were enriched in 1,000 GO terms. Interestingly, these GO terms were related to embryo or tissue development (Fig.  3 G), such as embryonic morphogenesis (GO:0048598; BH-adjusted p -value = 1.04e − 3) and mesoderm morphogenesis (GO:0048332; BH-adjusted p -value = 4.05e − 3). This finding suggests the involvement of an embryo development process, possibly linked to organogenesis occurring at the E14.5 stage [ 66 ]. Furthermore, these genes may signify the loss of embryonic cell identity in mature hepatocytes.
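To make the notion of a gene list being "enriched in a GO term" concrete, the following is a minimal sketch of one common approach, a one-sided hypergeometric over-representation test. This is an assumption for illustration; the procedure actually used is the one described in the “ Materials and methods ” section, and the function name and toy counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_term, n_selected, n_overlap):
    """P(X >= n_overlap) when n_selected genes are drawn without replacement
    from n_universe genes, of which n_term are annotated to the GO term."""
    upper = min(n_term, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_term, k) * comb(n_universe - n_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy numbers (assumed): 20 background genes, 5 annotated to the term,
# 4 genes in the pattern-specific list, 3 of which carry the annotation.
p_value = hypergeom_enrichment_p(20, 5, 4, 3)
print(round(p_value, 4))
```

In practice one such p-value would be computed per GO term and the resulting set adjusted for multiple testing, which is what the BH-adjusted p-values quoted above reflect.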

Finally, since the scRNA-seq data showed intertwined cells among time points (Additional file 2 : Fig. S14G), we further applied TDEseq with the pseudotime as inputs. As a result, we observed both versions of TDEseq generated comparable results with tradeSeq and ImpulseDE2 for temporal expression gene detection (Additional file 2 : Fig. S14H). Besides, these genes demonstrated distinct temporal expression patterns for TDEseq (Additional file 2 : Fig. S14I).

Taken together, we found that both versions of TDEseq yield similar results in terms of type I error control and temporal expression gene detection when scRNA-seq data exhibit small individual heterogeneity across time points. Therefore, considering the computational burden for large-scale scRNA-seq data applications, we recommend Linear TDEseq in scenarios with small individual heterogeneity.

TDEseq detects the epithelial cell evolution-associated temporal expression genes of time-course scRNA-seq data

We further applied TDEseq to detect temporal expression genes altered in human metastatic lung adenocarcinoma (LUAD) [ 14 ] (the “ Materials and methods ” section). Here, we were primarily interested in the epithelial cells of this time-course scRNA-seq dataset, which involves a total of five distinct evolution stages, i.e., normal, stage I, stage II, stage III, and stage IV (Additional file 3 : Table S1). Since stage II contains a relatively small number of cells (i.e., 119 cells) compared with the other stages, we excluded this stage, resulting in a total of 3,703 cells from 11 samples in the normal stage; 5,651 cells from 8 samples in stage I; 1,500 cells from 2 samples in stage III; and 3,053 cells from 7 samples in stage IV (Fig.  4 A). We noticed that these scRNA-seq datasets contain sample-level variations across stages (iLISI = 0.10; Additional file 2 : Fig. S15A). Therefore, we applied both versions of TDEseq in this scenario.

figure 4

The time-course scRNA-seq data analysis for human metastatic LUAD. A The experimental design of human lung sample collection. The scRNA-seq data were assayed by 10X Genomics Chromium protocols, consisting of 4 LUAD evolution stages, i.e., normal, stage I (early LUAD), stage III (advanced LUAD), and stage IV (lymph node metastasis). B The quantile–quantile (QQ) plot shows the type I error control under the permutation strategy. Well-calibrated p -values are expected to lie on the diagonal line. The p- values produced by Linear TDEseq (orange) and Mixed TDEseq (plum) are reasonably well-calibrated, while those from tradeSeq (green), ImpulseDE2 (blue), edgeR (dark green), and DESeq2 (brown) are inflated. C The power comparison of temporal expression gene detection across a range of FDR cutoffs. The TDEseq methods are highlighted using solid lines, while other methods are represented by dashed lines in the plots. Both versions of TDEseq display powerful performance in temporal expression gene detection. D The heatmap demonstrates the pattern-specific temporal expression genes that were identified by Mixed TDEseq. Gene expression levels were log-transformed and standardized using z -scores for visualization. The top-ranked temporal expression genes identified by Mixed TDEseq show four distinct patterns. E The Venn diagram shows the overlap of the temporally expressed genes (FDR ≤ 0.05) in pairwise comparisons between Mixed TDEseq and tradeSeq, ImpulseDE2, DESeq2, and edgeR. N GO denotes the number of GO terms enriched by the method-specific unique genes (BH-adjusted p -value < 0.05). Many more GO terms were enriched in the Mixed TDEseq-unique temporal expression genes than in those of other methods. F The proportion of enrichment for the detected temporal expression genes. The given gene set (136 genes) was collected from the ONGene database [ 67 ]. The temporal genes from Mixed TDEseq recovered more of this gene set than those from other methods across a range of top-number cutoffs. 
G The bubble plot demonstrates the significant GO terms enriched by pattern-specific temporal expression genes, which were identified by Mixed TDEseq. The Wilcoxon test was excluded from this comparison due to its poor performance in simulations. FDR denotes the false discovery rate

We first examined the ability of the temporal gene detection methods in terms of type I error control. As we expected, when large sample-level variations were involved, Mixed TDEseq and Linear TDEseq produced well-calibrated p- values, whereas the other methods (tradeSeq, ImpulseDE2, DESeq2, and edgeR) generated inflated p- values (Fig.  4 B). In terms of both temporal expression gene detection (Fig.  4 C) and temporal expression pattern detection (Additional file 2 : Fig. S15B), both versions of TDEseq outperformed other methods across a range of FDR cutoffs. Specifically, Mixed TDEseq identified a total of 11,919 temporally expressed genes at an FDR of 5%, which displayed four distinct temporal patterns (Fig.  4 D), while Linear TDEseq detected 12,263 temporal genes, tradeSeq detected 9,562 temporal genes, ImpulseDE2 identified 8,440 temporal genes, DESeq2 detected 5,081 temporal genes, and edgeR detected 2,565 temporal genes (Additional file 3 : Table S5).
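The terms "well-calibrated", "inflated", and "conservative" can be illustrated with a short sketch. It assumes that null p-values are obtained by rerunning a method on data whose time labels have been permuted; here they are simply simulated, and the function names are hypothetical. Under a valid test, null p-values are approximately uniform, so the QQ points lie near the diagonal and about 5% of them fall below 0.05.

```python
import math
import random


def qq_points(null_pvalues, floor=1e-300):
    """Pair expected and observed -log10 p-values for a QQ plot."""
    n = len(null_pvalues)
    observed = sorted(max(p, floor) for p in null_pvalues)  # guard log10(0)
    expected = [(i + 0.5) / n for i in range(n)]
    return [(-math.log10(e), -math.log10(o)) for e, o in zip(expected, observed)]


def empirical_type1_rate(null_pvalues, alpha=0.05):
    """Fraction of null p-values at or below alpha; about alpha if calibrated."""
    return sum(p <= alpha for p in null_pvalues) / len(null_pvalues)


rng = random.Random(0)
uniform = [rng.random() for _ in range(20000)]
simulated = {
    "calibrated": uniform,                            # uniform under the null
    "inflated": [u ** 2 for u in uniform],            # too many small p-values
    "conservative": [math.sqrt(u) for u in uniform],  # too few small p-values
}
rates = {name: empirical_type1_rate(p) for name, p in simulated.items()}
print(rates)
```

An inflated method rejects far more than 5% of null genes at the 0.05 level, so the gene counts it reports at a nominal FDR cannot be taken at face value.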

To validate whether the temporal expression genes were related to epithelial cell evolution, we performed the following two lines of enrichment analyses. First, we examined the temporal expression genes detected by Mixed TDEseq but not detected by tradeSeq, ImpulseDE2, DESeq2, or edgeR, finding that many genes were associated with LUAD evolution. For example, a LUAD driver, MAP2K1 [ 68 ], was detected by Mixed TDEseq as a significant temporal expression gene that was gradually upregulated during LUAD progression ( p -value = 1.17e − 7, FDR = 0), while not detected by ImpulseDE2 ( p -value = 0.553, FDR = 0.105); besides, another LUAD driver, KRAS [ 68 ], was also detected by Mixed TDEseq as a significant temporal expression gene that was gradually upregulated during LUAD progression ( p -value = 4.05e − 12, FDR = 0), while not detected by tradeSeq ( p -value = 0.089, FDR = 0.356). Furthermore, we performed pairwise comparisons of Mixed TDEseq vs other methods. The results show that the unique temporal expression genes from Mixed TDEseq were enriched in GO terms related to LUAD progression (Fig.  4 E), such as signal transduction by p53 class mediator [ 69 ] (GO:0072331; BH-adjusted p -value = 1.09e − 4) and cellular response to hypoxia [ 70 ] (GO:0071456; BH-adjusted p -value = 4.91e − 3). Second, we curated a total of 136 LUAD-related genes from the ONGene database [ 67 ] (Additional file 3 : Table S6) to highlight the importance of temporal expression genes detected by different methods. As a result, we found that the temporal expression genes from Mixed TDEseq recovered more of these genes than those from other methods across a range of top-ranked gene cutoffs (Fig.  4 F). Notably, though Linear TDEseq detected more temporal genes, the temporal genes uniquely identified by Mixed TDEseq were enriched in more biologically meaningful GO terms (Additional file 2 : Fig. S15C), e.g., epithelial cell migration (GO:0010631; BH-adjusted p -value = 0.048).
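The comparison against a curated gene set across a range of top-ranked cutoffs can be sketched as follows. The exact definition of the proportion plotted in Fig. 4 F is not given here, so this sketch assumes it is the fraction of the reference set recovered among the top N genes of each method's ranking; the function name and gene labels are hypothetical.

```python
def top_n_hit_proportion(ranked_genes, reference_genes, cutoffs):
    """For each cutoff N, return the fraction of the reference set that
    appears among the top N genes (ranked from most to least significant)."""
    reference = set(reference_genes)
    proportions = {}
    for n in cutoffs:
        hits = sum(gene in reference for gene in ranked_genes[:n])
        proportions[n] = hits / len(reference)
    return proportions


# Toy ranking and reference set (assumed labels, for illustration only).
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
reference_set = {"g2", "g5", "g9"}
curve = top_n_hit_proportion(ranked, reference_set, cutoffs=[2, 4, 6])
print(curve)
```

Computing this curve for each method on the same reference set gives one line per method, so a method whose curve lies above the others ranks known disease-related genes closer to the top.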

Finally, we performed GSEA on the pattern-specific temporal expression genes to examine the four pattern-specific functions during LUAD progression. Specifically, with an FDR of 5%, Mixed TDEseq detected a total of 3,249 growth-specific temporal expression genes, which were enriched in 812 GO terms (Fig.  4 G). The top GO terms contained many tumor proliferation or metastasis-associated pathways, such as signal transduction by p53 class mediator [ 69 ] (GO:0072331; BH-adjusted p -value = 1.57e − 6) and regulation of canonical Wnt signaling pathway [ 71 ] (GO:0060828; BH-adjusted p -value = 2.81e − 3), suggesting the proliferation of epithelial cells towards tumor cells; Mixed TDEseq identified a total of 3,671 recession-specific temporal expression genes, which were enriched in 276 GO terms (Fig.  4 G). Those genes would be expected to be enriched in normal lung function terms as a result of the process of low-grade tumors developing into high-grade tumors. Indeed, the top GO terms contained lung development (GO:0030324; BH-adjusted p -value = 8.26e − 3) and lamellar body (GO:0042599; BH-adjusted p -value = 4.08e − 4), suggesting the proliferation process of epithelial cells towards tumor cells. TDEseq detected a total of 2,526 peak-specific temporal genes, which were enriched in a total of 244 GO terms. Notably, those peak genes were further enriched in hypoxia pathways, such as response to oxidative stress (GO:0006979; BH-adjusted p -value = 1.19e − 2), as well as epithelium migration (GO:0090132; BH-adjusted p -value = 4.56e − 2). This evidence is consistent with reports that hypoxia occurs in the intermediate stages of LUAD and promotes lymphatic metastasis [ 70 ].

Taken together, Mixed TDEseq can address time-course scRNA-seq data with relatively large sample-level variations across time points. However, one remaining concern is whether batch effect removal can improve the identification of temporal expression genes. To address this, we performed the temporal gene detection analysis using Mixed TDEseq either with or without batch correction. We found that Mixed TDEseq can generate well-calibrated p -values in both cases (Additional file 2 : Fig. S15D). In terms of temporal expression gene detection, Mixed TDEseq alone was more powerful than Mixed TDEseq with batch correction across a range of FDR cutoffs (Additional file 2 : Fig. S15E). The slightly poorer performance of Mixed TDEseq coupled with scMerge may be due to the over-correction of sample-level variations. Furthermore, we observed that the temporal expression genes uniquely identified by Mixed TDEseq were significantly enriched in LUAD-related pathways (Additional file 2 : Fig. S15F), suggesting that Mixed TDEseq handles time-course scRNA-seq data with relatively large sample-level variations well.

TDEseq detects NK cell response temporal genes of time-course scRNA-seq data

We finally applied TDEseq to detect the temporal expression changes of natural killer (NK) cells from 21 severe/critical COVID-19 patients [ 9 ] (the “ Materials and methods ” section). This time-course scRNA-seq dataset contains 19 time points (Additional file 3 : Table S1), which could be grouped into five stages (Fig.  5 A), i.e., stage I (consisting of 930 cells from 3 patients), stage II (939 cells from 4 patients), stage III (893 cells from 3 patients), stage IV (768 cells from 3 patients), and stage V (1,000 cells from 8 patients). These scRNA-seq datasets contain large sample-level variations across different stages (iLISI = 0.11; Fig.  5 B), presumably due to the large number of heterogeneous patients involved in this study. Therefore, following the results from the simulation studies, we first carried out Mixed TDEseq with or without the batch effect removal procedure using scMerge [ 52 ]. Since tradeSeq and ImpulseDE2 have built-in variables to control sample-level variations, for a fair comparison we additionally incorporated the sample indicator variables as covariates in both the tradeSeq and ImpulseDE2 models.

figure 5

The time-course scRNA-seq data analysis for the NK cell response to SARS-COV-2 infection. A The experimental design of SARS-COV-2 infection samples from PBMC. The scRNA-seq data were assayed by 10X Genomics Chromium protocols, consisting of 5 stages, i.e., stage I (4–8 days), stage II (10–13 days), stage III (19–24 days), stage IV (28–34 days), and stage V (110–123 days). B The UMAP demonstrates cell alignment from different stages. These scRNA-seq datasets display strong batch effects over heterogeneous samples (iLISI = 0.10, left panel). The cells are well-aligned after performing integrative analysis using scMerge (iLISI = 0.36, right panel). C The quantile–quantile (QQ) plot shows the type I error control under the permutation strategy. Well-calibrated p -values are expected to lie on the diagonal line. The p- values produced by Mixed TDEseq (plum), Mixed TDEseq coupled with scMerge (purple), and tradeSeq (green) are reasonably well-calibrated, while those from ImpulseDE2 (blue) are overly conservative, and those from edgeR (dark green) and DESeq2 (brown) are inflated. D The power comparison of temporal expression gene detection across a range of FDR cutoffs. The TDEseq methods are highlighted using solid lines, while other methods are represented by dashed lines in the plots. TDEseq coupled with scMerge identifies more temporal expression genes than the other methods. E The heatmap demonstrates the pattern-specific temporal expression genes that were identified by Mixed TDEseq coupled with scMerge. Gene expression levels were log-transformed and then standardized using z-scores for visualization. The top-ranked temporal expression genes identified by Mixed TDEseq coupled with scMerge show four distinct patterns. F The Venn diagram shows the overlap of the temporally expressed genes (FDR ≤ 0.05) identified by Mixed TDEseq coupled with scMerge, tradeSeq, DESeq2, and edgeR. 
ImpulseDE2 was excluded because it only identified 3 temporal DE genes. N GO denotes the number of GO terms enriched by the method-specific unique genes (BH-adjusted p -value < 0.05). The temporal expression genes detected by Mixed TDEseq coupled with scMerge were enriched in more GO terms. G The bubble plot demonstrates the significant GO terms enriched by pattern-specific temporal expression genes, which were identified by Mixed TDEseq coupled with scMerge. The Wilcoxon test was excluded from this comparison due to its poor performance in simulations. FDR denotes the false discovery rate

As we expected, in terms of type I error control, Mixed TDEseq and Mixed TDEseq coupled with scMerge (Mixed TDEseq + scMerge) can produce well-calibrated p -values (Fig.  5 C). Besides, in terms of temporal expression gene detection, Mixed TDEseq coupled with scMerge detected more temporal expression genes than Mixed TDEseq across a range of FDR cutoffs (Fig.  5 D, Additional file 3 : Table S7). In addition, with a range of FDR cutoffs Mixed TDEseq + scMerge identified more pattern-specific temporal genes (Additional file 2 : Fig. S16A), which displayed four temporal distinct patterns (Fig.  5 E). Moreover, the temporal expression genes that were uniquely detected by Mixed TDEseq + scMerge were enriched in the defense response to the virus GO term (GO:0051607) [ 72 ] (Additional file 3 : Table S8; Additional file 2 : Fig. S16B). Therefore, we performed the Mixed TDEseq + scMerge in the following analysis.

Notably, many top temporal expression genes uniquely detected by Mixed TDEseq coupled with scMerge were highly related to the NK cell response to COVID-19 infection. For example, GNLY , which is more highly expressed in healthy people than in patients with viral infections [ 73 ], was identified by Mixed TDEseq + scMerge as a significant temporal expression gene ( p -value = 2.93e − 5, FDR = 0.0; Additional file 2 : Fig. S16C), while not being detected by tradeSeq ( p -value = 0.377, FDR = 1.0), ImpulseDE2 ( p -value = 0.994, FDR = 1), DESeq2 ( p -value = 0.028, FDR = 0.051), or edgeR ( p -value = 0.015, FDR = 0.093). Another example is the ILF3 gene, which plays an important role in the establishment of the type I IFN antiviral program [ 74 ]; it was uniquely identified by Mixed TDEseq + scMerge ( p -value = 2.67e − 4, FDR = 1.16e − 3; Additional file 2 : Fig. S16C) but was not detected by tradeSeq ( p -value = 0.746, FDR = 1.0), ImpulseDE2 ( p -value = 0.952, FDR = 1.0), DESeq2 ( p -value = 0.684, FDR = 0.712), or edgeR ( p -value = 0.904, FDR = 0.892). This evidence supports that the temporal DE genes detected by Mixed TDEseq + scMerge were more specific to the biological process of the SARS-COV-2 response. Consequently, we further performed GSEA on the temporal expression genes uniquely detected by Mixed TDEseq + scMerge, which were enriched in immune response to virus infection pathways such as T cell activation (GO:0042110; BH-adjusted p -value = 0.027) and response to interleukin-12 (GO:0070671; BH-adjusted p -value = 0.029) [ 75 ] (Fig.  5 F).

Finally, we performed GSEA on the pattern-specific temporal expression genes. Specifically, with an FDR of 5%, Mixed TDEseq + scMerge detected a total of 654 growth-specific temporal expression genes, which were significantly enriched in cell cycle-associated pathways, such as mitotic G1/S transition checkpoint (GO:0044819; BH-adjusted p -value = 7.81e − 3), consistent with reports that NK cells show upregulated patterns of cell cycle and division after SARS-COV-2 infection [ 76 ]. These genes were also enriched in the cellular response to interleukin-12 (GO:0071349; BH-adjusted p -value = 1.33e − 2), in line with IL-12 promoting NK cell proliferation at the end stage of SARS-COV-2 infection [ 77 ]; Mixed TDEseq + scMerge detected a total of 809 recession-specific temporal expression genes, which were enriched in the immune response to the virus as a result of SARS-COV-2 infection. Indeed, most of the top GO terms were immune response-associated pathways, such as defense response to the virus (GO:0051607; BH-adjusted p -value = 1.76e − 29) and type I interferon signaling pathway (GO:0060337; BH-adjusted p -value = 1.81e − 28), which promotes NK cell expansion during viral infection [ 78 ]; Mixed TDEseq + scMerge detected a total of 567 peak-specific temporal expression genes, which were enriched in oxidative phosphorylation (GO:0006119; BH-adjusted p -value = 0.025) and mitochondrial gene expression (GO:0140053; BH-adjusted p -value = 4.11e − 2), consistent with reports that a long period of activation enhances effector functions in NK cells and upregulates OXPHOS [ 79 , 80 ] (Fig.  5 G).

Taken together, we found Mixed TDEseq coupled with scMerge performed effectively in mitigating substantial sample-level variations (i.e., batch effects), which were presented in time-course scRNA-seq data. However, the batch correction methods do introduce extra variations into the data. Therefore, a more comprehensive assessment of the performance of temporal gene testing is required to determine whether these variations are advantageous, particularly for sparse scRNA-seq data [ 29 ].

In this paper, we have presented TDEseq, a non-parametric statistical method designed for identifying temporal expression patterns in time-course single-cell RNA sequencing (scRNA-seq) data. By incorporating shape-constrained spline models, TDEseq typically enables the detection of four specific temporal patterns, i.e., growth, recession, peak, or trough. Two versions of TDEseq (i.e., Mixed TDEseq and Linear TDEseq ) were developed to accommodate different real data application scenarios. Specifically, Mixed TDEseq is designed for analyzing the time-course scRNA-seq data with heterogeneous samples and large sample-level variations (i.e., batch effects) across time points, such as cancer evolution, while Linear TDEseq is tailored for handling the data with small heterogeneous samples, such as cell differentiation. With extensive simulations and four real data applications, TDEseq generated well-calibrated p -values and demonstrated powerful detection of temporal expression genes, highlighting its robustness and reliability in time-course scRNA-seq data analysis.

Particularly, the statistical modeling of TDEseq is different from that of tradeSeq and ImpulseDE2. TDEseq incorporates either I -splines [ 35 ] or C -splines [ 36 ] to model temporal expression patterns, and builds upon linear additive mixed models (the “ Materials and methods ” section) to characterize the dependent cells within an individual. TDEseq was originally designed for time-course scRNA-seq studies, in which cells for each time point were not largely intertwined between adjacent time points. In contrast, tradeSeq [ 31 ] employs a generalized additive model framework to model gene expression profiles of pseudotime for different lineages, while ImpulseDE2 [ 25 ] relies on a descriptive impulse function [ 81 ] to distinguish permanently from transiently upregulated or downregulated genes over multiple time points. As a result, we observed that the p -values from tradeSeq and ImpulseDE2 under the permuted null were not well-calibrated even when sample-level variations were controlled as covariates, presumably because these methods treat cells as independent and cannot model dependent cells within an individual. We also found that the power of tradeSeq or ImpulseDE2 was lower than that of TDEseq (Additional file 2 : Fig. S7), likely because these methods are suited to detecting a distinct type of differential expression pattern along a lineage or between lineages. In addition, we also developed a linear version of TDEseq to ensure more scalable computation for large-scale scRNA-seq data with small batch effects. As a result, we found the performance of Linear TDEseq was comparable with that of Mixed TDEseq under small sample heterogeneity (Fig.  1 C and 1F). Besides, we found that two pseudo-bulk aggregation methods, DESeq2 and edgeR, performed well in scenarios with high UMI counts or a small number of cells per sample (Additional file 2 : Fig. S3D and S4C). 
However, the pseudo-bulk aggregation methods potentially generated biased inference and underpowered results because cells from the same individual are not statistically independent [ 82 ]. Finally, we found that TDEseq demonstrates powerful performance in capturing temporal expression patterns when cells are intertwined across time points, particularly in scenarios characterized by a modest proportion of intertwined cells between time points. Nevertheless, when a substantial proportion of cells are intertwined across time points, we suggest using pseudotime-based methods for detecting temporal expression genes. The extent of cell intertwining could be explored through trajectory inference or by leveraging existing biological insights.
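The role of the I -splines and C -splines mentioned above can be illustrated with a minimal sketch. It assumes the simplest case, order-1 M-splines that are piecewise constant between adjacent knots, so each I -spline (the integral of an M-spline) is a ramp from 0 to 1 and each C -spline (the integral of an I -spline) is convex. This is not the TDEseq implementation; the function names, knots, and coefficients are hypothetical.

```python
def i_spline_order1(t, left, right):
    """Integral of the order-1 M-spline 1/(right-left) on [left, right)."""
    if t <= left:
        return 0.0
    if t >= right:
        return 1.0
    return (t - left) / (right - left)


def c_spline_order1(t, left, right):
    """Integral of i_spline_order1 from `left` up to t."""
    if t <= left:
        return 0.0
    width = right - left
    if t >= right:
        return width / 2.0 + (t - right)
    return (t - left) ** 2 / (2.0 * width)


def combine(basis, knots, coefs, t):
    """Linear combination of one basis function per knot interval."""
    intervals = zip(knots, knots[1:])
    return sum(c * basis(t, a, b) for c, (a, b) in zip(coefs, intervals))


knots = [0.0, 1.0, 2.0, 3.0]   # assumed time knots
coefs = [0.5, 0.0, 2.0]        # non-negative coefficients (the constraint)
grid = [i / 10 for i in range(31)]
growth = [combine(i_spline_order1, knots, coefs, t) for t in grid]
convex = [combine(c_spline_order1, knots, coefs, t) for t in grid]
```

With non-negative coefficients, the I -spline combination can never decrease, which corresponds to a growth pattern; negating it gives a recession pattern. The C -spline combination is convex, and constraining curvature in this way is the kind of ingredient used for peak and trough shapes.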

Based on the aforementioned observations, the sample-level variation (i.e., batch effect) plays a crucial role in the identification of temporal expression genes. However, the removal of this unwanted variation in time-course scRNA-seq data poses significant challenges [ 83 , 84 ], and there are currently no efficient criteria to measure these effects. TDEseq provides one solution, as it can effectively handle such unwanted variation across multiple time points in time-course scRNA-seq studies. In contrast, tradeSeq and ImpulseDE2 directly model raw gene expression counts and incorporate covariates to control the individual heterogeneity. However, this approach leads to a significant loss in power for detecting temporal expression genes and inevitably increases the computational burden. This is one reason why TDEseq was designed to model the transformed gene expression level (e.g., log-normalized or variance-stabilizing transformation) rather than the raw counts, allowing for scRNA-seq data preprocessing prior to performing TDEseq. Furthermore, even in the presence of large individual heterogeneity in scRNA-seq data, TDEseq, when coupled with batch removal methods, offers a promising approach to identifying temporal expression genes. In this study, we evaluated five batch effect removal methods. The results indicated that scMerge displays more powerful performance than the other four methods. However, it is worth noting that the simulated scRNA-seq data may not fully mimic real time-course scRNA-seq data. Therefore, alternative batch removal methods could also be applied in TDEseq analysis, especially a well-designed batch correction method specifically tailored for time-course scRNA-seq data, which would greatly enhance the performance of TDEseq.

Finally, several potential extensions for TDEseq can enhance its capabilities. Presently, TDEseq is designed to identify four distinct patterns: growth, recession, peak, and trough. In our efforts to broaden the scope of temporal expression patterns, we explored the bi-plateau pattern. While TDEseq demonstrated powerful performance for this pattern, it faced challenges in handling multi-modal temporal expression patterns. Besides, we observed that Mixed TDEseq detected weak peak or weak trough patterns as growth or recession patterns, while Linear TDEseq detected weak peak or weak trough patterns as peak or trough patterns, which makes the temporal expression pattern difficult to determine. Resolving this would be a meaningful direction for future research. Furthermore, the parameter inference of LAMM, the underlying model of TDEseq, becomes notably challenging when the number of cells is large (number of cells > 6,000). To address this issue, it would be beneficial to incorporate more efficient algorithms that reduce the computational burden. Alternatively, a down-sampling strategy or a pseudo-cell [ 2 , 85 ] strategy could be employed for each time point when dealing with an extremely large number of cells.
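The down-sampling strategy mentioned above can be sketched as follows, assuming that cells are sampled uniformly at random within each time point up to a fixed cap. The cap, the seed, the function name, and the toy cell identifiers are all hypothetical and not taken from TDEseq.

```python
import random
from collections import defaultdict


def downsample_per_time_point(cell_to_time, max_cells, seed=0):
    """Keep at most max_cells randomly chosen cells for each time point."""
    rng = random.Random(seed)
    by_time = defaultdict(list)
    for cell, time_point in cell_to_time.items():
        by_time[time_point].append(cell)
    kept = []
    for time_point in sorted(by_time):
        cells = sorted(by_time[time_point])  # fixed order for reproducibility
        if len(cells) > max_cells:
            cells = rng.sample(cells, max_cells)
        kept.extend(cells)
    return kept


# Toy input (assumed): 10 cells at time point "t1" and 3 cells at "t2".
cells = {f"t1_cell{i}": "t1" for i in range(10)}
cells.update({f"t2_cell{i}": "t2" for i in range(3)})
kept = downsample_per_time_point(cells, max_cells=4)
print(len(kept))
```

Sampling within each time point, rather than across the whole dataset, keeps small time points intact so that no stage disappears from the analysis. In a multi-sample design, one might additionally stratify by individual, which this sketch does not do.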

Overall, TDEseq is well-suited to analyzing time-course scRNA-seq datasets. Thus, it can be flexibly deployed to investigate the important temporal expression genes and their potential roles during growth, development, or disease progression.

Conclusions

In this paper, we present TDEseq, an algorithm for identifying temporal expression genes in time-course scRNA-seq data. To detect temporal expression genes, we propose a linear additive mixed model that relies on shape-constrained splines. TDEseq accounts for the correlated nature of cells within individuals and is robust to cellular heterogeneity and sample variation. In an extensive evaluation on both synthetic and real scRNA-seq datasets, TDEseq showed superior performance over other methods such as tradeSeq, ImpulseDE2, the Wilcoxon test, DESeq2, and edgeR. Overall, TDEseq stands as a powerful tool that enables the precise identification of temporal expression genes in time-course scRNA-seq data and facilitates a deeper understanding of dynamic biological processes.

Materials and methods

Models and algorithm

As outlined in the overview, TDEseq takes the log-normalized gene expression levels across multiple time points as input and identifies temporal expression genes that display one of four possible patterns (i.e., growth, recession, peak, or trough). Specifically, we assume the transformed gene expression level \({y}_{gji}\left(t\right)\) for gene \(g\), individual \(j\), and cell \(i\) at time point \(t\) is,

where \({{\varvec{w}}}_{gji}\) is the cell-level or time-level covariate (e.g., cell size, or sequencing read depth), \({\boldsymbol{\alpha }}_{g}\) is its corresponding coefficient;  \({{\varvec{u}}}_{g}\) is a random vector to account for the variations from heterogeneous samples, i.e.,

where \({s}_{k}\left(t\right)\) is a smoothing spline basis function that characterizes the temporal gene expression pattern. The regression function is estimated as a constrained linear combination of the basis functions. In particular, the I-splines, defined as \({I}_{l}^{k}\left(t\right)={\int }_{{\xi }_{1}}^{t}{M}_{l}^{k}\left(v\right)dv\) [ 35 ], were used to characterize both growth and recession patterns, while the C-splines, defined as \({C}_{l}^{k}\left(t\right)={\int }_{{\xi }_{1}}^{t}{I}_{l}^{k}\left(v\right)dv\) [ 36 ] by integrating the I-splines, were used to characterize both peak and trough patterns. The order 1 M-splines are computed as,

The order \(k\) M-splines are computed as,

where the number of M-spline basis functions is \(k+l\); \(l\) is the number of grid points; \(k\) is the order of the splines; and the knots are defined as \(1={\xi }_{1}<\dots <{\xi }_{k+l}=T\). With the spline functions \({s}_{1}\left(t\right),\dots ,{s}_{K}\left(t\right)\), inferring the parameters of Eq. 1 for gene \(g\) translates into estimating the parameters of the equation,

where \({\mathbf{W}}_{0}\) is the linear part of the spline basis function, and \(\mathbf{S}\) is the nonlinear part of the spline basis function, an \(N\times \left(k+l\right)\) matrix with columns \({s}_{1}\left(t\right),\dots ,{s}_{k+l}\left(t\right)\); \(k\) is the number of knots of the spline functions. For the growth or recession patterns, \(\mathbf{S}\) consists of the I-spline basis functions and \({\mathbf{W}}_{0}=\left[{1}_{N}\right]\). For the peak and trough patterns, \(\mathbf{S}\) consists of the C-spline basis functions and \({\mathbf{W}}_{0}=[{1}_{N},{\varvec{t}}]\), where \({\varvec{t}}={\left({t}_{1},\dots ,{t}_{N}\right)}^{\prime}\) represents the time points. \({1}_{N}\) is an all-ones vector; \(\mathbf{W}\) is a covariate matrix; and \({\boldsymbol{\alpha }}_{g}\), \({\boldsymbol{\alpha }}_{g0}\), and \({{\varvec{\beta}}}_{g}\) are the corresponding regression coefficients. Hence, the complete likelihood \({\varvec{L}}({{\varvec{y}}}_{g}, {\varvec{t}}, {\mathbf{W}}_{0}, \mathbf{W}|{\boldsymbol{\alpha }}_{g0}, {\boldsymbol{\alpha }}_{g}, {{\varvec{\beta}}}_{g})\) of the transformed gene expression data \({{\varvec{y}}}_{g}\) for gene \(g\) at time points \({\varvec{t}}\) is:

where \({{\varvec{\mu}}}_{g}=\mathbf{E}\left({{\varvec{y}}}_{g}\right)={\mathbf{W}}_{0}{\boldsymbol{\alpha }}_{g0}+\mathbf{W}{\boldsymbol{\alpha }}_{g}+\mathbf{S}{{\varvec{\beta}}}_{g}\) is the mean and \({\mathbf{K}}_{g}=\frac{{\sigma }_{gu}^{2}}{{\sigma }_{g}^{2}}{{\varvec{\Sigma}}}_{N\times N}+{\mathbf{I}}_{N\times N}\) is the covariance matrix; \(MVN(\bullet )\) denotes the multivariate normal distribution. For any pattern constraint, we assume that the columns of \(\mathbf{S}\), \({\mathbf{W}}_{0}\), and \(\mathbf{W}\) are linearly independent and together define a closed convex cone:

The parameter estimation of TDEseq models is notoriously difficult, as it involves calculating the determinant and the inverse of \({\mathbf{K}}_{g}\). To enable scalable estimation and inference for TDEseq models, we developed an efficient inference algorithm that: 1) performs an eigen-decomposition of the block-diagonal matrix, with one all-ones block per individual (Additional file 1 : Supplementary Text), i.e., \({\mathbf{K}}_{g}=\frac{{\sigma }_{gu}^{2}}{{\sigma }_{g}^{2}}{{\varvec{\Sigma}}}_{N\times N}+{\mathbf{I}}_{N\times N}={\mathbf{U}}_{g}{\mathbf{U}}_{g}^{\prime}\); and 2) transforms the variables of Eq. 2 as \({\widetilde{{\varvec{y}}}}_{g}={\mathbf{U}}_{g}^{-1}{{\varvec{y}}}_{g}\), \({\widetilde{\mathbf{W}}}_{0}={\mathbf{U}}_{g}^{-1}{\mathbf{W}}_{0}\), \(\widetilde{\mathbf{W}}={\mathbf{U}}_{g}^{-1}\mathbf{W}\), and \(\widetilde{\mathbf{S}}={\mathbf{U}}_{g}^{-1}\mathbf{S}\), resulting in

and the following new closed convex cone:

To efficiently estimate the parameters of the ordinary linear regression model (Eq. 3) under the cone constraint (Eq. 4), we developed a cone projection algorithm following previous work [ 41 ]. With the estimated parameters, we further examined the gene expression pattern-specific parameter \({{\varvec{\beta}}}_{g}\), which follows a mixture of Beta distributions [ 86 ] (Additional file 1 : Supplementary Text). Finally, TDEseq returns p-values for the four temporal expression patterns, i.e., growth, recession, peak, and trough.
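Collecting the quantities defined in this subsection, the model and the effect of the transformation can be summarized as follows. This is a sketch reconstructed from the definitions of \({{\varvec{\mu}}}_{g}\), \({\mathbf{K}}_{g}\), and \({\mathbf{U}}_{g}\) given above; the residual terms \(e_{gji}\) and \(\boldsymbol{e}_{g}\), with variance \(\sigma_{g}^{2}\), are assumed notation rather than symbols quoted from the text.

```latex
\begin{aligned}
y_{gji}(t) &= \boldsymbol{w}_{gji}^{\prime}\boldsymbol{\alpha}_{g} + \sum_{k=1}^{K} s_{k}(t)\,\beta_{gk} + u_{gji} + e_{gji}, \\
\boldsymbol{y}_{g} &= \mathbf{W}_{0}\boldsymbol{\alpha}_{g0} + \mathbf{W}\boldsymbol{\alpha}_{g} + \mathbf{S}\boldsymbol{\beta}_{g} + \boldsymbol{u}_{g} + \boldsymbol{e}_{g}, \\
\operatorname{cov}(\boldsymbol{y}_{g}) &= \sigma_{gu}^{2}\boldsymbol{\Sigma} + \sigma_{g}^{2}\mathbf{I} = \sigma_{g}^{2}\mathbf{K}_{g} = \sigma_{g}^{2}\mathbf{U}_{g}\mathbf{U}_{g}^{\prime}, \\
\operatorname{cov}(\widetilde{\boldsymbol{y}}_{g}) &= \mathbf{U}_{g}^{-1}\left(\sigma_{g}^{2}\mathbf{U}_{g}\mathbf{U}_{g}^{\prime}\right)\left(\mathbf{U}_{g}^{-1}\right)^{\prime} = \sigma_{g}^{2}\mathbf{I}.
\end{aligned}
```

The last line shows why the transformation helps: after multiplying by \({\mathbf{U}}_{g}^{-1}\), the observations have uncorrelated errors with a common variance, so the constrained fit can be treated as an ordinary least-squares problem on the transformed variables.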

The choice of parameter knots in TDEseq models

The number of knots can significantly influence the smoothness and flexibility of the resulting curve [ 87 ]. Increasing the number of knots generally yields a more adaptable spline that is better able to capture intricate and irregular data patterns. However, excessive flexibility can lead to overfitting, in which the spline closely follows noise in the data instead of the underlying trend. Conversely, too few knots can lead to underfitting, producing an overly smooth spline that misses crucial data features. In this paper, we found that the detection of temporal genes was generally robust to the number of knots across a range of FDR cutoffs (Additional file 2 : Fig. S17). TDEseq performed well when the number of knots k equals the number of time points. Therefore, by default we set the number of knots in TDEseq to the number of time points.
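To make the role of the knots concrete, the following minimal sketch builds M-spline and I-spline bases with the time points used as knots. It assumes the standard M-spline recursion of Ramsay [ 35 ] with repeated boundary knots and approximates the I-spline integral numerically; the function names, the five time points, and the order are hypothetical choices for illustration, not the TDEseq implementation.

```python
def padded_knots(time_points, order):
    """Repeat each boundary knot so that all basis functions live on [first, last]."""
    pad = order - 1
    return [time_points[0]] * pad + list(time_points) + [time_points[-1]] * pad


def m_spline(i, order, t, knots):
    """Value at t of the i-th M-spline of the given order (Ramsay's recursion)."""
    span = knots[i + order] - knots[i]
    if span == 0:
        return 0.0  # degenerate basis function on repeated knots
    if order == 1:
        return 1.0 / span if knots[i] <= t < knots[i + 1] else 0.0
    left = (t - knots[i]) * m_spline(i, order - 1, t, knots)
    right = (knots[i + order] - t) * m_spline(i + 1, order - 1, t, knots)
    return order * (left + right) / ((order - 1) * span)


def i_spline(i, order, t, knots, steps=2000):
    """Integral of the i-th M-spline from the first knot to t (midpoint rule)."""
    lower = knots[0]
    width = (t - lower) / steps
    return width * sum(
        m_spline(i, order, lower + (j + 0.5) * width, knots) for j in range(steps)
    )


time_points = [1, 2, 3, 4, 5]   # hypothetical stages, also used as knots
order = 3                       # quadratic M-splines
knots = padded_knots(time_points, order)
n_basis = len(knots) - order    # number of basis functions for this knot sequence

# Each I-spline starts at 0 at the first time point and approaches 1 at the last.
curve = [[i_spline(i, order, t, knots) for t in time_points] for i in range(n_basis)]
```

Because every M-spline is non-negative and integrates to one, each I-spline is non-decreasing, which is why a non-negative combination of I-splines can only describe a monotone (growth-like) curve. Adding knots adds basis functions and therefore flexibility.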

Linear TDEseq

Linear TDEseq is a reduced special case of Mixed TDEseq (Eq. 1) that drops the random effect term \({u}_{gji}\). It is typically used when only one individual is involved in each stage or when individual heterogeneity is small across all stages in time-course scRNA-seq data. Specifically, we assume the log-normalized gene expression level \({y}_{gi}\left(t\right)\) for gene \(g\) and cell \(i\) at time point \(t\) can be modeled as,

As a result, parameter estimation for Eq. 5 reduces to Eqs. 3 and 4, where \({\text{cov}}\left({{\varvec{y}}}_{g}\right)={\sigma }_{g}^{2}{\mathbf{I}}_{N\times N}\). Therefore, we can directly apply the cone projection algorithm to estimate \(\left({\boldsymbol{\alpha }}_{g0},{\boldsymbol{\alpha }}_{g},{{\varvec{\beta}}}_{g}, {\sigma }_{g}^{2}\right)\) without an iterative procedure. Compared with Mixed TDEseq, Linear TDEseq is much more efficient for analyzing large-scale time-course scRNA-seq data.

Testing temporal expression patterns

Testing whether a gene shows a temporal expression pattern over all time points can be translated into testing the null hypothesis \({{\varvec{H}}}_{0}:{{\varvec{\beta}}}_{g}=0\). The statistical power of such a test depends on how well the pattern-constrained spline function fits the observed temporal expression. We therefore compute a p-value for each of the growth, recession, peak, and trough patterns (one at a time) and combine the four p-values through the Cauchy combination rule [ 45 ]. Specifically, we first convert each of the four p-values into a Cauchy statistic, then aggregate the four Cauchy statistics by summation, and finally convert the sum back to a single p-value based on the standard Cauchy distribution (Additional file 1 : Supplementary Text). After obtaining m p-values across m genes, we computed the false discovery rate (FDR) through the permutation strategy.
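The three steps of the Cauchy combination rule can be sketched in a few lines. This sketch assumes equal weights of 1/4 for the four patterns, so that the weighted sum can be referred directly to the standard Cauchy distribution; the function name and the example p-values are hypothetical.

```python
import math


def cauchy_combine(p_values, weights=None):
    """Combine p-values with the Cauchy combination rule (equal weights by default)."""
    if weights is None:
        weights = [1.0 / len(p_values)] * len(p_values)
    # Step 1 and 2: turn each p-value into a Cauchy statistic and sum them.
    # A small p-value maps to a large positive statistic.
    stat = sum(w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, p_values))
    # Step 3: upper-tail probability of the standard Cauchy distribution.
    # This conversion assumes the weights sum to 1.
    return 0.5 - math.atan(stat) / math.pi


# Hypothetical pattern-specific p-values for one gene.
pattern_p = {"growth": 0.001, "recession": 0.8, "peak": 0.04, "trough": 0.6}
combined = cauchy_combine(list(pattern_p.values()))
```

A useful sanity check is that combining four identical p-values returns that same value, and that one very small p-value dominates the combined result even when the other patterns show no signal.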

Determining one of the temporal expression patterns for each gene

A primary goal of TDEseq is to assign each gene the most suitable of the four given expression patterns (i.e., growth, recession, peak, or trough). Specifically, Linear TDEseq calculates the Akaike information criterion (AIC) [ 88 ] for gene \(g\):

where k is the number of parameters and \({L}_{gp}\) is the likelihood for gene \(g\) and pattern \(p\) . For Mixed TDEseq, the marginal AIC is not an asymptotically unbiased estimator of the AIC and favors smaller models without random effects, but conditional AIC induces a bias that can lead to the selection of any random effect not predicted to be exactly zero [ 89 ]. Therefore, we used B -statistics to determine the temporal expression patterns for gene \(g\) :

where \(SS{R}_{gp1}\) is the sum of squared residuals with parameters estimated via cone projection algorithm, and \(SS{R}_{gp0}\) is the sum of squared residuals under the null hypothesis.
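For reference, the two selection criteria described above can be written out as follows. This is a sketch under stated assumptions: the first line is the standard definition of AIC, and the second is the ratio form of the statistic commonly used in shape-restricted regression tests [ 43 ]; whether TDEseq uses exactly this normalization is assumed rather than confirmed by the text.

```latex
\begin{aligned}
\mathrm{AIC}_{gp} &= 2k - 2\ln L_{gp}, \\
B_{gp} &= \frac{SSR_{gp0} - SSR_{gp1}}{SSR_{gp0}}.
\end{aligned}
```

Under these definitions, the pattern with the smallest AIC (Linear TDEseq) or the largest \(B_{gp}\) (Mixed TDEseq) would presumably be selected. \(B_{gp}\) lies between 0 and 1 because the null fit is a special case of the constrained fit, so \(SSR_{gp1}\) cannot exceed \(SSR_{gp0}\).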

Simulations

To make our simulations as realistic as possible, we simulated time-course scRNA-seq data using the Splatter package [ 46 ], in which the parameters were inferred from real LUAD scRNA-seq data [ 14 ]. Splatter simulated scRNA-seq data by specifying the number of cells (using batchCells parameter) and the number of time points (using group.prob parameter). Specifically, in null simulations, we set the probability of the differential expression genes de.prob  = 0 and the effect of time point-specific size parameter de.facloc  = 0 to denote the non-temporal expression gene across all stages. We simulated 200 cells measured by 10,000 genes for each sample, to examine the performance of type I error control.

In power simulations, we varied the number of time points, the number of cells per sample, the time point-specific effect size de.facloc, and the expected UMI counts lib.loc (Additional file 3 : Table S2). As the baseline scenario, we set the probability of a temporal expression gene in a group de.prob = 0.3, the number of stages to 5, the time point-specific effect size parameter de.facloc = 0.4, and the expected UMI count lib.loc = 9.4. We then varied the number of time points to be 4, 5, or 6 (3 samples/replicates per time point); the number of cells per sample to be 100, 200, or 300; the time point-specific effect size parameter to be 0.1, 0.4, or 0.7; and the expected UMI counts to be 7, 9.4, or 13.7 (estimates from rat liver scRNA-seq data). Splatter also added batch effects (i.e., sample-level variations) to the simulated datasets, applied to all genes for each sample. For the batch effects, we simulated 100 cells for each batch and varied the sample-level variation (batch.facloc) to be 0, 0.04, or 0.12 to represent small, moderate, or strong batch effects. For temporal expression effect sizes, we generated time point-specific effect sizes by setting the parameter de.facloc one time point at a time. We then examined temporal expression patterns based on the effect sizes across time points, limiting our simulations to six specific temporal expression patterns: growth, recession, peak, trough, bi-plateau, and multi-modal (Additional file 2 : Fig. S1). We simulated three samples/replicates for each time point and repeated each simulation scenario 10 times.

Besides temporal expression patterns, we also generated smudged time-course scRNA-seq data using the SymSim R package [ 57 ]. Specifically, the parameter settings were as follows: the transcription rate vary = “s”; the variance of Brownian motion Sigma = 0.4; the mean rate of subsampling of transcripts alpha_mean = 0.05; the standard deviation of the rate of subsampling of transcripts alpha_sd = 0.02; the mean of sequencing depth depth_mean = 10,000; and the standard deviation of sequencing depth depth_sd = 3,000. All datasets contained 2,000 cells and 10,000 genes. To generate data in which cells were intertwined among time points, we proportionally mixed in cells from two adjacent time points for each stage. Specifically, for a given time point, we randomly sampled \({p}_{1}=\) 90% of cells from the given stage and then sampled \({p}_{2}=\) 8% and \({p}_{3}=\) 2% from two adjacent stages, respectively, as the low-proportion intertwined cell scenario. Similarly, we set \({p}_{1}=70\%, {p}_{2}=24\%\), and \({p}_{3}=6\%\) for the medium-proportion scenario and \({p}_{1}=50\%, {p}_{2}=40\%\), and \({p}_{3}=10\%\) for the high-proportion scenario.
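The mixing step for one stage can be sketched as below. The text does not say which adjacent stage receives \({p}_{2}\) and which receives \({p}_{3}\), nor how boundary stages are handled, so this sketch simply takes the two neighbor pools as explicit arguments; the function name, pool sizes, and number of cells per stage are hypothetical.

```python
import random


def mix_stage(own, neighbor_a, neighbor_b, proportions, n_cells, seed=0):
    """Sample n_cells for one stage: p1 from the stage itself, p2 and p3 from two adjacent stages."""
    p1, p2, p3 = proportions
    assert abs(p1 + p2 + p3 - 1.0) < 1e-9, "proportions must sum to 1"
    rng = random.Random(seed)
    n_a = round(n_cells * p2)
    n_b = round(n_cells * p3)
    n_own = n_cells - n_a - n_b  # remainder comes from the stage itself
    return rng.sample(own, n_own) + rng.sample(neighbor_a, n_a) + rng.sample(neighbor_b, n_b)


# The three intertwined-cell scenarios described in the text: (p1, p2, p3).
SCENARIOS = {
    "low": (0.90, 0.08, 0.02),
    "medium": (0.70, 0.24, 0.06),
    "high": (0.50, 0.40, 0.10),
}

# Hypothetical cell pools, labeled by the stage they were simulated for.
stage1 = [("stage1", i) for i in range(500)]
stage2 = [("stage2", i) for i in range(500)]
stage3 = [("stage3", i) for i in range(500)]

mixed = {
    name: mix_stage(stage2, stage1, stage3, props, n_cells=400)
    for name, props in SCENARIOS.items()
}
```

Sampling is done without replacement within each pool, so every pool must contain at least as many cells as are requested from it.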

The difference in statistical models between TDEseq and tradeSeq or ImpulseDE2

In addition to TDEseq, we also employed two other temporal expression analysis methods, tradeSeq and ImpulseDE2. tradeSeq builds on the generalized additive model (GAM) and directly models raw gene expression counts in scRNA-seq data, i.e.,

where \({y}_{gi}\) represents the raw counts for gene \(g\) and cell \(i\); \(t\) represents the time points; \({{\varvec{w}}}_{gi}\) represents the covariates; \(\sum_{k=1}^{K}{b}_{k}\left(t\right){\beta }_{gk}\) represents a linear combination of \(K\) cubic basis functions; \({N}_{i}\) denotes the total counts for cell \(i\); and \(NB\) is a negative binomial distribution. To avoid overfitting, tradeSeq employs penalized splines, which shrink \({\beta }_{gk}\) toward zero and are therefore less sensitive to temporal genes with small fold changes. Notably, tradeSeq [ 47 ] was primarily developed for detecting trajectory-based differential expression genes; however, its applicability extends beyond this setting, and it can also be applied to bulk time-course RNA-seq data analysis [ 31 ]. Therefore, we performed an analogous analysis using tradeSeq in the comparison. On the other hand, ImpulseDE2 [ 25 ] combines the impulse model [ 81 ] with a negative binomial noise model to directly model the raw counts of gene expression measurements. The impulse function is the scaled product of two sigmoid functions:

where \({h}_{0}={f}_{{\text{Impulse}}}\left(t\to -\infty \right)\), \({h}_{2}={f}_{{\text{Impulse}}}\left(t\to \infty \right)\), \({h}_{1}\) models the intermediate expression, \({t}_{1}\) and \({t}_{2}\) are the state transition times, and \(\beta\) is the slope parameter of both sigmoid functions.

The impulse function is a more restrictive model compared with spline functions, therefore limiting its power. It was originally designed to model the bulk time-course RNA-seq data. To adapt for temporal expression analysis of time-course scRNA-seq data, we modified the implementation of ImpulseDE2 following the tradeSeq paper. Both methods take a count matrix Y and a time points vector t as input and return one p -value for each gene at a time.
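The limiting behavior of the impulse function can be checked numerically. The sketch below assumes the parameterization used by ImpulseDE2 [ 25 ], in which the product of the two sigmoid terms is scaled by \(1/{h}_{1}\); the function name and the parameter values are hypothetical.

```python
import math


def impulse(t, h0, h1, h2, t1, t2, beta):
    """Impulse function: scaled product of a rising and a falling sigmoid term."""
    # First transition: moves from h0 toward h1 around t1.
    first = h0 + (h1 - h0) / (1.0 + math.exp(-beta * (t - t1)))
    # Second transition: moves from h1 toward h2 around t2.
    second = h2 + (h1 - h2) / (1.0 + math.exp(beta * (t - t2)))
    return first * second / h1


# Hypothetical gene: baseline 1, transient peak near 5, final level 2.
params = dict(h0=1.0, h1=5.0, h2=2.0, t1=2.0, t2=6.0, beta=2.0)
early = impulse(-50.0, **params)   # approaches h0
middle = impulse(4.0, **params)    # between the two transitions, close to h1
late = impulse(50.0, **params)     # approaches h2
```

With only six parameters and a single shared slope, the function can describe at most two transitions, which illustrates why it is more restrictive than a spline basis with several knots.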

Methods for comparison

We compared TDEseq with five existing methods for identifying temporal expression genes from time-course scRNA-seq data (tradeSeq and the Wilcoxon test) or bulk RNA-seq data (ImpulseDE2, edgeR, and DESeq2). For tradeSeq (version 1.4.0), we used the functions fitGAM and associationTest ( https://statomics.github.io/tradeSeq/articles/tradeSeq.html ). The number of knots parameter k in tradeSeq was chosen using 100 random genes, following the tradeSeq vignette. For ImpulseDE2 (version 0.99.10), we followed the modified implementation of ImpulseDE2 in the tradeSeq paper ( https://github.com/statOmics/tradeSeqPaper ). For DESeq2 (version 1.40.2) and edgeR (version 3.42.4), we treated time points as categorical factors and tested DE genes using a likelihood ratio test.

Permutation strategy to construct the null distribution

In real data applications, to calculate the false discovery rate (FDR), we constructed an empirical null distribution of p-values by permuting the time point variable, repeated 5 times. Afterward, we computed the FDRs as,

where \({P}_{rk}^{null}\) is the p-value for the \({k}^{th}\) gene in the \({r}^{th}\) permutation, sorted in increasing order, and \({P}_{g}^{alt}\) is the p-value under the alternative hypothesis, sorted in increasing order.
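A permutation-based FDR of this kind can be sketched as follows. The sketch assumes one commonly used empirical estimator: for each observed p-value, the average number of permuted-null p-values at or below it (per permutation) divided by the number of observed p-values at or below it. This is an illustrative form and may differ in detail from the estimator used by TDEseq; the function name and the toy p-values are hypothetical.

```python
from bisect import bisect_right


def permutation_fdr(observed_p, null_p_by_permutation):
    """Empirical FDR for each observed p-value from permutation-based null p-values."""
    n_perm = len(null_p_by_permutation)
    null_sorted = sorted(p for perm in null_p_by_permutation for p in perm)
    obs_sorted = sorted(observed_p)
    fdr = []
    for p in observed_p:
        # Expected number of false discoveries at threshold p, averaged over permutations.
        expected_false = bisect_right(null_sorted, p) / n_perm
        # Number of genes called at threshold p; at least 1 because p itself is counted.
        discoveries = bisect_right(obs_sorted, p)
        fdr.append(min(1.0, expected_false / discoveries))
    return fdr


# Toy example: 4 genes, 2 permutations of the time point labels.
observed = [0.001, 0.02, 0.5, 0.9]
null_runs = [[0.1, 0.3, 0.6, 0.95], [0.05, 0.4, 0.7, 0.8]]
fdr = permutation_fdr(observed, null_runs)
```

Sorting both sets of p-values once and using binary search keeps the cost near O(m log m) for m genes. The estimates are capped at 1 but are not forced to be monotone in the p-value.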

Functional gene set enrichment analysis

The gene set enrichment analyses of temporal expression genes were performed by the enrichGO function implemented in the clusterProfiler R package (version 3.18.1) [ 90 ]. Specifically, we used all genes as the background and set the minimal and maximal sizes of genes annotated by Gene Ontology (GO) terms for testing as 10 and 500, respectively. The significant GO terms were selected by setting BH-adjusted p -value < 0.05.

Batch effects removal evaluation

We used the LISI metric to measure cell batch distribution (iLISI) [ 56 ]. The LISI metric was designed to assess whether clusters of cells in a scRNA-seq dataset are well-mixed across categorical variables (batches). We took the median of the scores computed for all cells in the dataset and scaled it to lie between 0 and 1, where 0 and 1 denote the worst and best cell mixing, respectively.

HCT116 cell lines after 5-AZA-CdR treatment data

The scRNA-seq data were assayed by Well-TEMP-seq and contain 5-AZA-CdR-treated HCT116 cell lines after 0 days (4,000 cells), 1 day (4,000 cells), 2 days (4,000 cells), or 3 days (4,000 cells) [ 23 ]. Because Well-TEMP-seq can distinguish new RNAs from pre-existing RNAs, we used the new RNAs, which better reflect RNA dynamics, for downstream analysis. For the preprocessing of scRNA-seq data, the genes with more than 99% zero counts were filtered out, resulting in 7,314 genes and 16,000 cells for further analysis.

Mouse hepatoblast differentiation data

The scRNA-seq data were assayed by Smart-seq2 protocols [ 63 ] on isolated cells from mouse fetal livers at 7 different developmental stages [ 38 ]. Gene expression levels were measured by a total of 14,226 genes and 345 cells. In our analysis, we only considered the hepatoblast cells for temporal expression analysis. Finally, for the preprocessing of scRNA-seq data, the genes with more than 99% of zero counts were filtered out, resulting in 14,180 genes and 345 cells for further analysis.

Human metastatic LUAD progression data

The scRNA-seq data [ 14 ] were assayed by 10X Genomics Chromium protocols [ 91 ] on LUAD samples from 5 distinct developmental stages, i.e., the control stage consists of a total of 80,441 cells in 7 cell types from 21 samples; stage I consists of a total of 31,026 cells in 7 cell types from 8 samples; stage II consists of a total of 3,840 cells in 7 cell types from 1 sample, stage III consists of a total of 10,283 cells in 7 cell types from 2 samples, and stage IV (metastasis) consists of a total of 82,916 cells in 7 cell types from 26 samples. In our analysis, we only employed the epithelial cells from the control lung samples (3,703 cells), stage I tumor lung samples (5,651 cells), stage III tumor lung samples (1,500 cells), and stage IV samples (lymph node metastasis, 6,582 cells). To relieve the computational burden in practice, we utilized a down-sampling strategy to randomly select 1,000 cells for the stage that contains more than 1,000 cells. Finally, for the preprocessing of scRNA-seq data, the genes with more than 99% of zero counts were filtered out, resulting in 15,263 genes and 4,000 cells for further analysis.
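The down-sampling and gene-filtering steps used across the real datasets can be sketched as follows. The stage sizes are the epithelial cell counts quoted above; the function names, the random seed, and the toy count vectors are hypothetical, and the sketch works on plain Python lists rather than on a sparse count matrix.

```python
import random


def downsample_stage(cell_ids, max_cells=1000, seed=0):
    """Randomly keep at most max_cells cells from one stage."""
    cell_ids = list(cell_ids)
    if len(cell_ids) <= max_cells:
        return cell_ids
    return random.Random(seed).sample(cell_ids, max_cells)


def keep_gene(counts, max_zero_fraction=0.99):
    """Keep a gene unless more than max_zero_fraction of its counts are zero."""
    zeros = sum(1 for c in counts if c == 0)
    return zeros / len(counts) <= max_zero_fraction


# Epithelial cell counts per stage as reported in the text.
stage_sizes = {"control": 3703, "stage I": 5651, "stage III": 1500, "stage IV": 6582}
kept = {
    stage: downsample_stage([(stage, i) for i in range(n)])
    for stage, n in stage_sizes.items()
}
total_cells = sum(len(cells) for cells in kept.values())

# Toy genes: one expressed in 1% of cells (kept), one in 0.5% of cells (filtered out).
gene_a = [0] * 99 + [3]
gene_b = [0] * 995 + [1] * 5
```

With these stage sizes, every stage exceeds 1,000 cells, so the down-sampled dataset contains 4,000 cells, matching the number reported for the LUAD analysis.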

Human COVID-19 immune response data

The scRNA-seq data were assayed by 10X Genomics Chromium protocols on human SARS-CoV-2 infection samples from disease progression ranging from 4 to 123 days [ 9 ]. In our analysis, we only employed the NK cells from the 21 severe/critical patients and divided those patients into 5 stages according to the time point interval, i.e., stage I (4–8 days, 930 cells), stage II (10–13 days, 939 cells), stage III (19–24 days, 893 cells), stage IV (28–34 days, 768 cells), and stage V (110–123 days, 1,000 cells). Finally, for the preprocessing of scRNA-seq data, the genes with more than 99% of zero counts were filtered out, resulting in 10,699 genes and 4,530 cells for further analysis.

Availability of data and materials

All scRNA-seq datasets used in this study are publicly available on the GEO database. Specifically, mouse hepatoblast differentiation scRNA-seq data are available at GEO under accession GSE90047 [ 38 ]; human metastatic LUAD progression scRNA-seq data are available at GEO under accession GSE131907 [ 14 ]; human COVID-19 immune response scRNA-seq data are available at GEO under accession GSE158055 [ 9 ]. Well-TEMP-seq scRNA-seq data are available at GEO under accession GSE194357 [ 23 ]. TDEseq is an open-source R package that is freely available from GitHub ( https://github.com/fanyue322/TDEseq ) [ 92 ] and https://sqsun.github.io/software.html . Source code for the software release used in the paper has been placed into a DOI-assigning repository ( https://zenodo.org/records/10869078 ) [ 93 ]. The source code and scRNA-seq data for reproducing the results are publicly available at https://sqsun.github.io/software.html .

References

Qiu Q, et al. Massively parallel and time-resolved RNA sequencing in single cells with scNT-seq. Nat Methods. 2020;17(10):991–1001.

Han XP, et al. Construction of a human cell landscape at single-cell level. Nature. 2020;581(7808):303–9.

Chen W, et al. Live-seq enables temporal transcriptomic recording of single cells. Nature. 2022;608(7924):733–40.

Saelens W, et al. A comparison of single-cell trajectory inference methods. Nat Biotechnol. 2019;37(5):547–54.

Qiu X, et al. Reversed graph embedding resolves complex single-cell trajectories. Nat Methods. 2017;14(10):979–82.

Street K, et al. Slingshot: cell lineage and pseudotime inference for single-cell transcriptomics. BMC Genomics. 2018;19(1):477.

Cao JY, et al. Sci-fate characterizes the dynamics of gene expression in single cells. Nat Biotechnol. 2020;38(8):980–8.

Hu F, Warren J, Exeter DJ. Interrupted time series analysis on first cardiovascular disease hospitalization for adherence to lipid-lowering therapy. Pharmacoepidemiol Drug Saf. 2020;29(2):150–60.

Ren X, et al. COVID-19 immune features revealed by a large-scale single-cell transcriptome atlas. Cell. 2021;184(7):1895–1913.e19.

Garcia-Alonso L, et al. Single-cell roadmap of human gonadal development. Nature. 2022;607(7919):540–7.

Fan XY, et al. Single-cell transcriptome analysis reveals cell lineage specification in temporal-spatial patterns in human cortical development. Sci Adv. 2020;6(34):eaaz2978.

Zhu J, et al. Delineating the dynamic evolution from preneoplasia to invasive lung adenocarcinoma by integrating single-cell RNA sequencing and spatial transcriptomics. Exp Mol Med. 2022;54(11):2060–76.

Wang Z, et al. Deciphering cell lineage specification of human lung adenocarcinoma with single-cell RNA sequencing. Nat Commun. 2021;12(1):1–15.

Kim N, et al. Single-cell RNA sequencing demonstrates the molecular and cellular reprogramming of metastatic lung adenocarcinoma. Nat Commun. 2020;11(1):2285.

Zou ZR, et al. A single-cell transcriptomic atlas of human skin aging. Dev Cell. 2021;56(3):383–97.

Mogilenko DA, Shchukina I, Artyomov MN. Immune ageing at single-cell resolution. Nat Rev Immunol. 2022;22(8):484–98.

Allen WE, et al. Molecular and spatial signatures of mouse brain aging at single-cell resolution. Cell. 2023;186(1):194–208.

Su X, et al. Single-cell RNA-Seq analysis reveals dynamic trajectories during mouse liver development. BMC Genomics. 2017;18(1):946.

Shao L, et al. Identify differential genes and cell subclusters from time-series scRNA-seq data using scTITANS. Comput Struct Biotechnol J. 2021;19:4132–41.

Ding J, Sharon N, Bar-Joseph Z. Temporal modelling using single-cell transcriptomics. Nat Rev Genet. 2022;23(6):355–68.

Bar-Joseph Z, Gitter A, Simon I. Studying and modelling dynamic biological processes using time-series gene expression data. Nat Rev Genet. 2012;13(8):552–64.

Bar-Joseph Z. Analyzing time series gene expression data. Bioinformatics. 2004;20(16):2493–503.

Lin S, et al. Well-TEMP-seq as a microwell-based strategy for massively parallel profiling of single-cell temporal RNA dynamics. Nat Commun. 2023;14(1):1272.

Gao S, et al. Tracing the temporal-spatial transcriptome landscapes of the human fetal digestive tract using single-cell RNA-sequencing. Nat Cell Biol. 2018;20(6):721–34.

Fischer DS, Theis FJ, Yosef N. Impulse model-based differential expression analysis of time course sequencing data. Nucleic Acids Res. 2018;46(20):e119.

Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15(12):550.

Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26(1):139–40.

Lahnemann D, et al. Eleven grand challenges in single-cell data science. Genome Biol. 2020;21(1):31.

Nguyen HCT, et al. Benchmarking integration of single-cell differential expression. Nat Commun. 2023;14(1):1570.

Goldman SL, et al. The impact of heterogeneity on single-cell sequencing. Front Genet. 2019;10:8.

Van den Berge K, et al. Trajectory-based differential expression analysis for single-cell sequencing data. Nat Commun. 2020;11(1):1–13.

Song DY, Li JJ. PseudotimeDE: inference of differential gene expression along cell pseudotime with well-calibrated p-values from single-cell RNA sequencing data. Genome Biol. 2021;22(1):124.

Lange M, et al. CellRank for directed single-cell fate mapping. Nat Methods. 2022;19(2):159–70.

Setty M, et al. Characterization of cell fate probabilities in single-cell data with Palantir. Nat Biotechnol. 2019;37(4):451–60.

Ramsay JO. Monotone regression splines in action. Stat Sci. 1988;3(4):425–41.

Meyer MC. Inference using shape-restricted regression splines. Annals of Applied Statistics. 2008;2(3):1013–33.

Picelli S, et al. Full-length RNA-seq from single cells using Smart-seq2. Nat Protoc. 2014;9(1):171–81.

Yang L, et al. A single-cell transcriptomic analysis reveals precise pathways and regulatory mechanisms underlying hepatoblast differentiation. Hepatology. 2017;66(5):1387–401.

Chen C. Generalized additive mixed models. Commun Stat Theory Methods. 2000;29(5–6):1257–71.

Curry HB, Schoenberg IJ. On Pólya frequency functions IV: the fundamental spline functions and their limits. J Anal Math. 1966;17(1):71–107.

Meyer MC. A simple new algorithm for quadratic programming with applications in statistics. Commun Stat-Simul Comput. 2013;42(5):1126–39.

Liao XY, Meyer MC. Estimation and inference in mixed effect regression models using shape constraints, with application to tree height estimation. J R Stat Soc Series C-Appl Stat. 2020;69(2):353–75.

Meyer MC. A test for linear versus convex regression function using shape-restricted regression. Biometrika. 2003;90(1):223–32.

Robertson T, Wright FT, Dykstra RL. Order restricted statistical inference. New York: John Wiley; 1988.

Liu YW, et al. ACAT: a fast and powerful p value combination method for rare-variant analysis in sequencing studies. Am J Hum Genet. 2019;104(3):410–21.

Zappia L, Phipson B, Oshlack A. Splatter: simulation of single-cell RNA sequencing data. Genome Biol. 2017;18:174.

Van den Berge K, et al. Trajectory-based differential expression analysis for single-cell sequencing data. Nat Commun. 2020;11(1):1201.

Perez RK, et al. Single-cell RNA-seq reveals cell type-specific molecular and genetic associations to lupus. Science. 2022;376(6589):153–65.

Li D, et al. An evaluation of RNA-seq differential analysis methods. PLoS ONE. 2022;17(9):e0264246.

Zhou X, Lindsay H, Robinson MD. Robustly detecting differential expression in RNA sequencing data using observation weights. Nucleic Acids Res. 2014;42(11):e91–e91.

Haghverdi L, et al. Batch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors. Nat Biotechnol. 2018;36(5):421–7.

Lin Y, et al. scMerge leverages factor analysis, stable expression, and pseudoreplication to merge multiple single-cell RNA-seq datasets. Proc Natl Acad Sci. 2019;116(20):9775–84.

Risso D, et al. A general and flexible method for signal extraction from single-cell RNA-seq data. Nat Commun. 2018;9(1):284.

Johnson WE, Li C, Rabinovic A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics. 2007;8(1):118–27.

Smyth GK, Speed T. Normalization of cDNA microarray data. Methods. 2003;31(4):265–73.

Tran HTN, et al. A benchmark of batch-effect correction methods for single-cell RNA sequencing data. Genome Biol. 2020;21(1):12.

Zhang XW, Xu CL, Yosef N. Simulating multiple faceted variability in single cell RNA sequencing. Nat Commun. 2019;10(1):2611.

Kocemba KA, et al. Transcriptional silencing of the Wnt-antagonist DKK1 by promoter methylation is associated with enhanced Wnt signaling in advanced multiple myeloma. PLoS ONE. 2012;7(2):e30359.

Liu MM, et al. Vitamin C increases viral mimicry induced by 5-aza-2'-deoxycytidine. Proc Natl Acad Sci USA. 2016;113(37):10238–44.

Roulois D, et al. DNA-demethylating agents target colorectal cancer cells by inducing viral mimicry by endogenous transcripts. Cell. 2015;162(5):961–73.

Zheng YC, Feng SQ. Epigenetic modifications as therapeutic targets. Curr Drug Targets. 2020;21(11):1046–1046.

Gleneadie HJ, et al. The anti-tumour activity of DNA methylation inhibitor 5-aza-2'-deoxycytidine is enhanced by the common analgesic paracetamol through induction of oxidative stress. Cancer Lett. 2021;501:172–86.

Picelli S, et al. Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nat Methods. 2013;10(11):1096–8.

Meng Q, et al. Repression of MAP3K1 expression and JNK activity by canonical Wnt signaling. Dev Biol. 2018;440(2):129–36.

Zhao Y, et al. ATF4 plays a pivotal role in the development of functional hematopoietic stem cells in mouse fetal liver. Blood. 2015;126(21):2383–91.

Weninger WJ, et al. Phenotyping structural abnormalities in mouse embryos using high-resolution episcopic microscopy. Dis Model Mech. 2014;7(10):1143–52.

Liu YN, Sun JC, Zhao M. ONGene: a literature-based database for human oncogenes. J Genet Genomics. 2017;44(2):119–21.

Pao W, Girard N. New driver mutations in non-small-cell lung cancer. Lancet Oncol. 2011;12(2):175–80.

Huang J. Current developments of targeting the p53 signaling pathway for cancer treatment. Pharmacol Ther. 2021;220:107720.

Muz B, et al. The role of hypoxia in cancer progression, angiogenesis, metastasis, and resistance to therapy. Hypoxia (Auckl). 2015;3:83–92.

Luo J, et al. PITX2 enhances progression of lung adenocarcinoma by transcriptionally regulating WNT3A and activating Wnt/beta-catenin signaling pathway. Cancer Cell Int. 2019;19:96.

Xie Z, et al. Gene set knowledge discovery with Enrichr. Curr Protoc. 2021;1(3):e90.

Baljic R, et al. Granulysin as a novel factor for the prognosis of the clinical course of chickenpox. Epidemiol Infect. 2018;146(7):854–7.

Watson SF, Bellora N, Macias S. ILF3 contributes to the establishment of the antiviral type I interferon program. Nucleic Acids Res. 2019;48(1):116–29.

PubMed Central   Google Scholar  

Ouyang W, et al. Regulation and functions of the IL-10 family of cytokines in inflammation and disease. Annu Rev Immunol. 2011;29:71–109.

Cao X. COVID-19: immunopathology and its implications for therapy. Nat Rev Immunol. 2020;20(5):269–70.

Shemesh A, et al. Diminished cell proliferation promotes natural killer cell adaptive-like phenotype by limiting FcepsilonRIgamma expression. J Exp Med. 2022;219(11):e20220551.

Madera S, et al. Type I IFN promotes NK cell expansion during viral infection by protecting NK cells against fratricide. J Exp Med. 2016;213(2):225–33.

Osuna-Espinoza KY, Rosas-Taraco AG. Metabolism of NK cells during viral infections. Front Immunol. 2023;14:1064101.

Kumar A, et al. Enhanced oxidative phosphorylation in NKT cells is essential for their survival and function. Proc Natl Acad Sci U S A. 2019;116(15):7439–48.

Gal C, Daphne K. Timing of gene expression responses to environmental changes. J Comput Biol. 2009;16(2):279–90.

Zimmerman KD, Espeland MA, Langefeld CD. A practical solution to pseudoreplication bias in single-cell studies. Nat Commun. 2021;12(1):738.

Luecken MD, et al. Benchmarking atlas-level data integration in single-cell genomics. Nat Methods. 2022;19(1):41–50.

Guo XY, et al. Recent advances in differential expression analysis for single-cell RNAseqand spatially resolved transcriptomic studies. Brief Funct Genom. 2024;23:95–109.

You Y, et al. Modeling group heteroscedasticity in single-cell RNA-seq pseudo-bulk data. Genome Biol. 2023;24(1):107.

Nuesch PE. Order restricted statistical-inference - Robertson, T, Wright, Ft, Dykstra Rl. J Appl Econom. 1991;6(1):105–7.

Johs B, Hale JS. Dielectric function representation by B-splines. Phys Status Solidi A Appl Mater Sci. 2008;205(4):715–9.

Article   CAS   Google Scholar  

Akaike H. Information theory and an extension of the maximum likelihood principle. New York: Springer; 1998.

Book   Google Scholar  

Greven S, Kneib T. On the behaviour of marginal and conditional AIC in linear mixed models. Biometrika. 2010;97(4):773–89.

Yu GC, et al. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS. 2012;16(5):284–7.

Zheng GX, et al. Massively parallel digital transcriptional profiling of single cells. Nat Commun. 2017;8:14049.

Fan Y, Li L, Sun SQ. Powerful and accurate detection of temporal gene expression patterns from multi-sample multi-stage single cell transcriptomics data with TDEseq GitHub. 2024. https://github.com/fanyue322/TDEseq .

Fan Y, Li L, Sun SQ. Powerful and accurate detection of temporal gene expression patterns from multi-sample multi-stage single cell transcriptomics data with TDEseq. Zenodo. 2024. https://doi.org/10.5281/zenodo.10869078 .

Download references

Acknowledgements

We would like to thank Jin Ning in our research group at Xi’an Jiaotong University for pre-processing part of scRNA-seq data used in this paper.

Peer review information

Kevin Pang was the primary editor of this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Review history

The review history is available as Additional file 4 .

Funding

This study was supported by the STI2030-Major Project (Grant No. 2022ZD0208000) to SS, LL, and YF; the National Natural Science Foundation of China (Grant Nos. 82122061 and 61902319) to SS; the National Natural Science Foundation of China (Grant No. 82204173); and the Natural Science Foundation of Shaanxi Province (Grant No. 2022JQ-773) to YF.

Author information

Authors and affiliations.

Center for Single-Cell Omics and Health, School of Public Health, Xi’an Jiaotong University, Xi’an, Shaanxi, 710061, People’s Republic of China

Yue Fan, Lei Li & Shiquan Sun

Collaborative Innovation Center of Endemic Diseases and Health Promotion in Silk Road Region; NHC Key Laboratory of Environment and Endemic Diseases, Xi’an Jiaotong University, Xi’an, Shaanxi, 710061, People’s Republic of China

Key Laboratory of Environment and Genes Related to Diseases (Xi’an Jiaotong University), Ministry of Education, Xi’an, Shaanxi, 710061, People’s Republic of China

Yue Fan & Shiquan Sun

Key Laboratory for Disease Prevention and Control and Health Promotion of Shaanxi Province, Xi’an, Shaanxi, 710061, People’s Republic of China

Shiquan Sun

Contributions

SS conceived the idea of the manuscript and provided funding support. SS and YF developed the method and designed the experiments. YF implemented the software and performed simulations and real data analysis with assistance from LL. SS and YF wrote the manuscript. All authors approved the final manuscript.

Corresponding author

Correspondence to Shiquan Sun .

Ethics declarations

Ethics approval and consent to participate.

Not applicable.

Consent for publication

Competing interests.

The authors declare that they have no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Supplementary text on TDEseq modeling and inference details.

Additional file 2.

Supplementary figures on the simulation performance evaluation.

Additional file 3.

Supplementary tables on simulations and real data information, identified dynamic temporal genes for each real dataset. Additionally, the validation gene sets utilized in this research are documented.

Additional file 4.

Review history.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Fan, Y., Li, L. & Sun, S. Powerful and accurate detection of temporal gene expression patterns from multi-sample multi-stage single-cell transcriptomics data with TDEseq. Genome Biol 25 , 96 (2024). https://doi.org/10.1186/s13059-024-03237-3

Received : 22 September 2023

Accepted : 03 April 2024

Published : 15 April 2024

DOI : https://doi.org/10.1186/s13059-024-03237-3

  • Time-course scRNA-seq data
  • Temporal expression patterns
  • Non-parametric models

Genome Biology

ISSN: 1474-760X

5 example of hypothesis and conclusion

COMMENTS

  1. How to Write Hypothesis Test Conclusions (With Examples)

    A hypothesis test is used to test whether or not some hypothesis about a population parameter is true. To perform a hypothesis test in the real world, researchers obtain a random sample from the population and perform a hypothesis test on the sample data, using a null and alternative hypothesis: Null Hypothesis (H0): The sample data occurs purely from chance.

  2. How to Write a Strong Hypothesis

    Developing a hypothesis (with example) Step 1. Ask a question. Writing a hypothesis begins with a research question that you want to answer. The question should be focused, specific, and researchable within the constraints of your project. Example: Research question.

  3. How to Write a Hypothesis w/ Strong Examples

    For example, a hypothesis like "Sunflower plants need water to grow" is not falsifiable, as it is already a well-established fact. But a hypothesis regarding frequency or amount of watering does have the potential to be nullified. ... Conclusion. In conclusion, understanding and effectively formulating a solid hypothesis is what scientific ...

  4. Hypothesis Testing

    Step 5: Present your findings. The results of hypothesis testing will be presented in the results and discussion sections of your research paper, dissertation or thesis. In the results section you should give a brief summary of the data and a summary of the results of your statistical test (for example, the estimated difference between group means and associated p-value).

  5. How to Write a Research Hypothesis: Good & Bad Examples

    Another example of a directional one-tailed alternative hypothesis would be H1: Attending private classes before important exams has a positive effect on performance. Your null hypothesis would then be H0: Attending private classes before important exams has no/a negative effect on performance.

  6. Writing a Research Paper Conclusion

    Table of contents. Step 1: Restate the problem. Step 2: Sum up the paper. Step 3: Discuss the implications. Research paper conclusion examples. Frequently asked questions about research paper conclusions.

  7. How to Write a Strong Hypothesis

    Step 5: Phrase your hypothesis in three ways. To identify the variables, you can write a simple prediction in if … then form. The first part of the sentence states the independent variable and the second part states the dependent variable. If a first-year student starts attending more lectures, then their exam scores will improve.

  8. How to Write a Hypothesis 101: A Step-by-Step Guide

    In the following section, we explore both types of hypotheses with examples. Alternative Hypothesis (H1) This kind of hypothesis suggests a relationship or effect between the variables. ... In conclusion, a hypothesis plays a fundamental role in the research process. ... acting as a counterpoint to your primary hypothesis. Step 5: Review Your ...

  9. What Is a Hypothesis and How Do I Write One?

    Hypothesis Testing Examples. We know it can be hard to write a good hypothesis unless you've seen some good hypothesis examples. We've included four hypothesis examples based on some made-up experiments. Use these as templates or launch pads for coming up with your own hypotheses. Experiment #1: Students Studying Outside (Writing a Hypothesis)

  10. Research Hypothesis: Definition, Types, Examples and Quick Tips

    3. Simple hypothesis. A simple hypothesis is a statement made to reflect the relation between exactly two variables. One independent and one dependent. Consider the example, "Smoking is a prominent cause of lung cancer." The dependent variable, lung cancer, is dependent on the independent variable, smoking. 4.

  11. The Scientific Method

    CONCLUSION. The final step in the scientific method is the conclusion. This is a summary of the experiment's results, and how those results match up to your hypothesis. You have two options for your conclusions: based on your results, either: (1) YOU CAN REJECT the hypothesis, or (2) YOU CAN NOT REJECT the hypothesis. This is an important point!

  12. Hypothesis: Definition, Examples, and Types

    A hypothesis is a tentative statement about the relationship between two or more variables. It is a specific, testable prediction about what you expect to happen in a study. It is a preliminary answer to your question that helps guide the research process. Consider a study designed to examine the relationship between sleep deprivation and test ...

  13. How to Write a Conclusion for Research Papers (with Examples)

    While there is no strict rule for the length of a conclusion, it's generally advisable to keep it relatively short. A typical research paper conclusion might be around 5-10% of the paper's total length. For example, if your paper is 10 pages long, the conclusion might be roughly half a page to one page in length.

  15. Steps of the Scientific Method

    The six steps of the scientific method include: 1) asking a question about something you observe, 2) doing background research to learn what is already known about the topic, 3) constructing a hypothesis, 4) experimenting to test the hypothesis, 5) analyzing the data from the experiment and drawing conclusions, and 6) communicating the results ...

  16. 9.5 Additional Information and Full Hypothesis Test Examples

    The "1%" is the preconceived or preset α. The statistician setting up the hypothesis test selects the value of α to use before collecting the sample data. If no level of significance is given, a common standard to use is α = 0.05. When you calculate the p-value and draw the picture, the p-value is the area in the left tail, the right tail ...

  17. Using P-values to make conclusions (article)

    Onward! We use p-values to make conclusions in significance testing. More specifically, we compare the p-value to a significance level α to make conclusions about our hypotheses. If the p-value is lower than the significance level we chose, then we reject the null hypothesis H0 in favor of the alternative hypothesis Ha.

  18. How to State the Conclusion about a Hypothesis Test

    The best way to state the conclusion is to include the significance level of the test and a bit about the claim itself. For example, if the claim was the alternative that the mean score on a test was greater than 85, and your decision was to reject the null, then you could conclude: "At the 5% significance level, there is sufficient ...

  19. Scientific Method: Definition, Steps, Examples, Uses

    The seven steps of the scientific method are: Make an observation. Ask a question. Background research/Research the topic. Formulate a hypothesis. Conduct an experiment to test the hypothesis. Data record and analysis. Draw a conclusion. 1.

  20. How to Write a Thesis or Dissertation Conclusion

    Step 2: Summarize and reflect on your research. Step 3: Make future recommendations. Step 4: Emphasize your contributions to your field. Step 5: Wrap up your thesis or dissertation. Full conclusion example. Conclusion checklist. Other interesting articles. Frequently asked questions about conclusion sections.

  21. Understanding the Role of Hypotheses and Conclusions in Mathematical

    Hypothesis and conclusion. In the context of mathematics and logic, a hypothesis is a statement or proposition that is assumed to be true for the purpose of a logical argument or investigation. It is usually denoted by "H" or "P" and is the starting point for many mathematical proofs. For example, let's consider the hypothesis: "If ...

  22. 9.5: Additional Information and Full Hypothesis Test Examples

    In a hypothesis test problem, you may see words such as "the level of significance is 1%." The "1%" is the preconceived or preset α. The statistician setting up the hypothesis test selects the value of α to use before collecting the sample data. If no level of significance is given, a common standard to use is α = 0.05. When you calculate the p-value and draw the picture, the p ...

  23. Z-test vs T-test: the differences and when to use each

    Both a Z-test and a T-test validate a hypothesis. Both are parametric tests that rely on assumptions. The key difference between Z-test and T-test is in their assumptions (e.g. population variance). Key differences about the data used result in different applications.

  24. Powerful and accurate detection of temporal gene expression patterns

    Hypothesis testing. In the LAMM model mentioned above, ... This scRNA-seq dataset consists of 7 developmental stages from 13 samples, including E10.5 (54 cells from 1 sample), E11.5 (70 cells from 2 samples), E12.5 (41 cells from 2 samples), ... Conclusions. In this paper, we present an algorithm TDEseq for the identification of temporal ...

  25. Null & Alternative Hypotheses

    The null and alternative hypotheses offer competing answers to your research question. When the research question asks "Does the independent variable affect the dependent variable?": The null hypothesis (H0) answers "No, there's no effect in the population." The alternative hypothesis (Ha) answers "Yes, there is an effect in the ...

  26. Buildings

    To prove this hypothesis, researchers proposed a new type of prestressed concrete composite slab with removable rectangular steel-tube lattice girders (referred to as CDB composite slabs), whose bottom plate consists of a temporary structure composed of a prestressed concrete prefabricated plate and removable rectangular steel-tube lattice ...
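
Several of the snippets above describe the same decision rule: choose α before collecting data, compute the p-value as a tail area, reject the null hypothesis when the p-value is less than α, and state the conclusion with the significance level and the claim. A minimal Python sketch of that rule follows. The function names `z_test_p_value` and `write_conclusion` are hypothetical, chosen only for illustration, and the p-value helper assumes a one-sample z-test with a known population standard deviation, which is just one of many possible tests.

```python
from statistics import NormalDist


def z_test_p_value(sample_mean, null_mean, sigma, n, tail="right"):
    """P-value of a one-sample z-test (assumes sigma, the population SD, is known)."""
    z = (sample_mean - null_mean) / (sigma / n ** 0.5)
    cdf = NormalDist().cdf(z)  # area to the left of z under the standard normal
    if tail == "left":
        return cdf
    if tail == "right":
        return 1.0 - cdf
    if tail == "two-sided":
        return 2.0 * min(cdf, 1.0 - cdf)
    raise ValueError("tail must be 'left', 'right', or 'two-sided'")


def write_conclusion(p_value, alpha, claim):
    """Return a conclusion sentence: reject only when p_value is strictly less than alpha."""
    level = f"{alpha * 100:g}%"
    if p_value < alpha:
        return (
            f"We reject the null hypothesis at the {level} significance level. "
            f"There is sufficient evidence to support the claim that {claim}."
        )
    return (
        f"We fail to reject the null hypothesis at the {level} significance level. "
        f"There is not sufficient evidence to support the claim that {claim}."
    )


if __name__ == "__main__":
    # p = 0.002 and alpha = 0.05 are the values used in the fertilizer example.
    print(write_conclusion(0.002, 0.05, "the fertilizer increases mean plant growth"))
```

Note that the wording is "fail to reject" rather than "accept": a p-value at or above α means the data are not strong enough evidence against the null hypothesis, not that the null hypothesis has been shown to be true.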