Introduction

  • Make an observation.
  • Ask a question.
  • Form a hypothesis, or testable explanation.
  • Make a prediction based on the hypothesis.
  • Test the prediction.
  • Iterate: use the results to make new hypotheses or predictions.

Scientific method example: Failure to toast

1. Make an observation.

  • Observation: the toaster won't toast.

2. Ask a question.

  • Question: Why won't my toaster toast?

3. Propose a hypothesis.

  • Hypothesis: Maybe the outlet is broken.

4. Make predictions.

  • Prediction: If I plug the toaster into a different outlet, then it will toast the bread.

5. Test the predictions.

  • Test of prediction: Plug the toaster into a different outlet and try again.
  • If the toaster does toast, then the hypothesis is supported—likely correct.
  • If the toaster doesn't toast, then the hypothesis is not supported—likely wrong.

Logical possibility

Practical possibility

Building a body of evidence

6. Iterate.

  • Iteration time!
  • If the hypothesis was supported, we might do additional tests to confirm it, or revise it to be more specific. For instance, we might investigate why the outlet is broken.
  • If the hypothesis was not supported, we would come up with a new hypothesis. For instance, the next hypothesis might be that there's a broken wire in the toaster.
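
The toaster example can be read as a simple loop over candidate hypotheses. The following is a minimal Python sketch of that loop, not a definitive model of scientific practice. The `Hypothesis` class, the `investigate` function, and the simulated `world` are hypothetical names introduced here for illustration; they are not part of the original example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Hypothesis:
    """A testable explanation paired with a test of its prediction."""
    explanation: str
    prediction: str
    test: Callable[[], bool]  # returns True if the prediction came true


def investigate(hypotheses: List[Hypothesis]) -> Optional[Hypothesis]:
    """Test each hypothesis in turn and return the first supported one.

    A supported hypothesis is only "likely correct", so a real investigation
    would follow up with further tests instead of stopping here.
    """
    for h in hypotheses:
        if h.test():
            return h  # supported: the prediction came true
        # not supported: iterate with the next hypothesis
    return None


# Simulated world (an assumption for this sketch): the outlet works,
# but the toaster has a broken wire.
world = {"outlet_broken": False, "wire_broken": True}

candidates = [
    Hypothesis(
        explanation="The outlet is broken.",
        prediction="The toaster toasts when plugged into a different outlet.",
        test=lambda: world["outlet_broken"],
    ),
    Hypothesis(
        explanation="There is a broken wire in the toaster.",
        prediction="The toaster toasts after the wire is repaired.",
        test=lambda: world["wire_broken"],
    ),
]

supported = investigate(candidates)
print(supported.explanation if supported else "No hypothesis supported yet.")
```

In this simulated case the outlet hypothesis is not supported, so the loop moves on to the broken-wire hypothesis, mirroring the iteration step described above.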

Scientific Method

Science is an enormously successful human enterprise. The study of scientific method is the attempt to discern the activities by which that success is achieved. Among the activities often identified as characteristic of science are systematic observation and experimentation, inductive and deductive reasoning, and the formation and testing of hypotheses and theories. How these are carried out in detail can vary greatly, but characteristics like these have been looked to as a way of demarcating scientific activity from non-science, where only enterprises which employ some canonical form of scientific method or methods should be considered science (see also the entry on science and pseudo-science). Others have questioned whether there is anything like a fixed toolkit of methods which is common across science and only science. Some reject privileging one view of method as part of rejecting broader views about the nature of science, such as naturalism (Dupré 2004); some reject any restriction in principle (pluralism).

Scientific method should be distinguished from the aims and products of science, such as knowledge, predictions, or control. Methods are the means by which those goals are achieved. Scientific method should also be distinguished from meta-methodology, which includes the values and justifications behind a particular characterization of scientific method (i.e., a methodology) — values such as objectivity, reproducibility, simplicity, or past successes. Methodological rules are proposed to govern method and it is a meta-methodological question whether methods obeying those rules satisfy given values. Finally, method is distinct, to some degree, from the detailed and contextual practices through which methods are implemented. The latter might range over: specific laboratory techniques; mathematical formalisms or other specialized languages used in descriptions and reasoning; technological or other material means; ways of communicating and sharing results, whether with other scientists or with the public at large; or the conventions, habits, enforced customs, and institutional controls over how and what science is carried out.

While it is important to recognize these distinctions, their boundaries are fuzzy. Hence, accounts of method cannot be entirely divorced from their methodological and meta-methodological motivations or justifications. Moreover, each aspect plays a crucial role in identifying methods. Disputes about method have therefore played out at the detail, rule, and meta-rule levels. Changes in beliefs about the certainty or fallibility of scientific knowledge, for instance (which is a meta-methodological consideration of what we can hope for methods to deliver), have meant different emphases on deductive and inductive reasoning, or on the relative importance attached to reasoning over observation (i.e., differences over particular methods). Beliefs about the role of science in society will affect the place one gives to values in scientific method.

The issue which has shaped debates over scientific method the most in the last half century is the question of how pluralist we need to be about method. Unificationists continue to hold out for one method essential to science; nihilism is a form of radical pluralism, which considers the effectiveness of any methodological prescription to be so context sensitive as to render it not explanatory on its own. Some middle degree of pluralism regarding the methods embodied in scientific practice seems appropriate. But the details of scientific practice vary with time and place, from institution to institution, across scientists and their subjects of investigation. How significant are the variations for understanding science and its success? How much can method be abstracted from practice? This entry describes some of the attempts to characterize scientific method or methods, as well as arguments for a more context-sensitive approach to methods embedded in actual scientific practices.

  • 1. Overview and organizing themes
  • 2. Historical review: Aristotle to Mill
  • 3. Logic of method and critical responses
  • 3.1 Logical constructionism and operationalism
  • 3.2 H-D as a logic of confirmation
  • 3.3 Popper and falsificationism
  • 3.4 Meta-methodology and the end of method
  • 4. Statistical methods for hypothesis testing
  • 5.1 Creative and exploratory practices
  • 5.2 Computer methods and the ‘new ways’ of doing science
  • 6.1 “The scientific method” in science education and as seen by scientists
  • 6.2 Privileged methods and ‘gold standards’
  • 6.3 Scientific method in the court room
  • 6.4 Deviating practices
  • 7. Conclusion
  • Other Internet Resources
  • Related Entries

1. Overview and organizing themes

This entry could have been given the title Scientific Methods and gone on to fill volumes, or it could have been extremely short, consisting of a brief summary rejection of the idea that there is any such thing as a unique Scientific Method at all. Both unhappy prospects are due to the fact that scientific activity varies so much across disciplines, times, places, and scientists that any account which manages to unify it all will either consist of overwhelming descriptive detail, or trivial generalizations.

The choice of scope for the present entry is more optimistic, taking a cue from the recent movement in philosophy of science toward a greater attention to practice: to what scientists actually do. This “turn to practice” can be seen as the latest form of studies of methods in science, insofar as it represents an attempt at understanding scientific activity, but through accounts that are neither meant to be universal and unified, nor singular and narrowly descriptive. To some extent, different scientists at different times and places can be said to be using the same method even though, in practice, the details are different.

Whether the context in which methods are carried out is relevant, or to what extent, will depend largely on what one takes the aims of science to be and what one’s own aims are. For most of the history of scientific methodology the assumption has been that the most important output of science is knowledge and so the aim of methodology should be to discover those methods by which scientific knowledge is generated.

Science was seen to embody the most successful form of reasoning (but which form?) to the most certain knowledge claims (but how certain?) on the basis of systematically collected evidence (but what counts as evidence, and should the evidence of the senses take precedence, or rational insight?). Section 2 surveys some of the history, pointing to two major themes. One theme is seeking the right balance between observation and reasoning (and the attendant forms of reasoning which employ them); the other is how certain scientific knowledge is or can be.

Section 3 turns to 20th century debates on scientific method. In the second half of the 20th century the epistemic privilege of science faced several challenges and many philosophers of science abandoned the reconstruction of the logic of scientific method. Views changed significantly regarding which functions of science ought to be captured and why. For some, the success of science was better identified with social or cultural features. Historical and sociological turns in the philosophy of science were made, with a demand that greater attention be paid to the non-epistemic aspects of science, such as sociological, institutional, material, and political factors. Even outside of those movements there was an increased specialization in the philosophy of science, with more and more focus on specific fields within science. The combined upshot was very few philosophers arguing any longer for a grand unified methodology of science. Sections 3 and 4 survey the main positions on scientific method in 20th century philosophy of science, focusing on where they differ in their preference for confirmation or falsification or for waiving the idea of a special scientific method altogether.

In recent decades, attention has primarily been paid to scientific activities traditionally falling under the rubric of method, such as experimental design and general laboratory practice, the use of statistics, the construction and use of models and diagrams, interdisciplinary collaboration, and science communication. Sections 4–6 attempt to construct a map of the current domains of the study of methods in science.

As these sections illustrate, the question of method is still central to the discourse about science. Scientific method remains a topic for education, for science policy, and for scientists. It arises in the public domain where the demarcation or status of science is at issue. Some philosophers have recently returned, therefore, to the question of what it is that makes science a unique cultural product. This entry will close with some of these recent attempts at discerning and encapsulating the activities by which scientific knowledge is achieved.

2. Historical review: Aristotle to Mill

Attempting a history of scientific method compounds the vast scope of the topic. This section briefly surveys the background to modern methodological debates. What can be called the classical view goes back to antiquity, and represents a point of departure for later divergences. [ 1 ]

We begin with a point made by Laudan (1968) in his historical survey of scientific method:

Perhaps the most serious inhibition to the emergence of the history of theories of scientific method as a respectable area of study has been the tendency to conflate it with the general history of epistemology, thereby assuming that the narrative categories and classificatory pigeon-holes applied to the latter are also basic to the former. (1968: 5)

To see knowledge about the natural world as falling under knowledge more generally is an understandable conflation. Histories of theories of method would naturally employ the same narrative categories and classificatory pigeon-holes. An important theme of the history of epistemology, for example, is the unification of knowledge, a theme reflected in the question of the unification of method in science. Those who have identified differences in kinds of knowledge have often likewise identified different methods for achieving that kind of knowledge (see the entry on the unity of science).

Different views on what is known, how it is known, and what can be known are connected. Plato distinguished the realms of things into the visible and the intelligible (The Republic, 510a, in Cooper 1997). Only the latter, the Forms, could be objects of knowledge. The intelligible truths could be known with the certainty of geometry and deductive reasoning. What could be observed of the material world, however, was by definition imperfect and deceptive, not ideal. The Platonic way of knowledge therefore emphasized reasoning as a method, downplaying the importance of observation. Aristotle disagreed, locating the Forms in the natural world as the fundamental principles to be discovered through the inquiry into nature (Metaphysics Z, in Barnes 1984).

Aristotle is recognized as giving the earliest systematic treatise on the nature of scientific inquiry in the western tradition, one which embraced observation and reasoning about the natural world. In the Prior and Posterior Analytics, Aristotle reflects first on the aims and then the methods of inquiry into nature. A number of features can be found which are still considered by most to be essential to science. For Aristotle, empiricism, careful observation (but passive observation, not controlled experiment), is the starting point. The aim is not merely recording of facts, though. For Aristotle, science (epistêmê) is a body of properly arranged knowledge or learning—the empirical facts, but also their ordering and display are of crucial importance. The aims of discovery, ordering, and display of facts partly determine the methods required of successful scientific inquiry. Also determinant is the nature of the knowledge being sought, and the explanatory causes proper to that kind of knowledge (see the discussion of the four causes in the entry on Aristotle on causality).

In addition to careful observation, then, scientific method requires a logic as a system of reasoning for properly arranging, but also inferring beyond, what is known by observation. Methods of reasoning may include induction, prediction, or analogy, among others. Aristotle’s system (along with his catalogue of fallacious reasoning) was collected under the title the Organon. This title would be echoed in later works on scientific reasoning, such as Novum Organum by Francis Bacon, and Novum Organon Renovatum by William Whewell (see below). In Aristotle’s Organon reasoning is divided primarily into two forms, a rough division which persists into modern times. The division, known most commonly today as deductive versus inductive method, appears in other eras and methodologies as analysis/synthesis, non-ampliative/ampliative, or even confirmation/verification. The basic idea is that there are two “directions” to proceed in our methods of inquiry: one away from what is observed, to the more fundamental, general, and encompassing principles; the other, from the fundamental and general to instances or implications of principles.

The basic aim and method of inquiry identified here can be seen as a theme running throughout the next two millennia of reflection on the correct way to seek after knowledge: carefully observe nature and then seek rules or principles which explain or predict its operation. The Aristotelian corpus provided the framework for a commentary tradition on scientific method independent of science itself (cosmos versus physics). During the medieval period, figures such as Albertus Magnus (1206–1280), Thomas Aquinas (1225–1274), Robert Grosseteste (1175–1253), Roger Bacon (1214/1220–1292), William of Ockham (1287–1347), Andreas Vesalius (1514–1564), and Giacomo Zabarella (1533–1589) all worked to clarify the kind of knowledge obtainable by observation and induction, the source of justification of induction, and best rules for its application. [ 2 ] Many of their contributions we now think of as essential to science (see also Laudan 1968). As Aristotle and Plato had employed a framework of reasoning either “to the forms” or “away from the forms”, medieval thinkers employed directions away from the phenomena or back to the phenomena. In analysis, a phenomenon was examined to discover its basic explanatory principles; in synthesis, explanations of a phenomenon were constructed from first principles.

During the Scientific Revolution these various strands of argument, experiment, and reason were forged into a dominant epistemic authority. The 16th–18th centuries were a period of not only dramatic advance in knowledge about the operation of the natural world—advances in mechanical, medical, biological, political, and economic explanations—but also of self-awareness of the revolutionary changes taking place, and intense reflection on the source and legitimation of the method by which the advances were made. The struggle to establish the new authority included methodological moves. The Book of Nature, according to the metaphor of Galileo Galilei (1564–1642) or Francis Bacon (1561–1626), was written in the language of mathematics, of geometry and number. This motivated an emphasis on mathematical description and mechanical explanation as important aspects of scientific method. Through figures such as Henry More and Ralph Cudworth, a neo-Platonic emphasis on the importance of metaphysical reflection on nature behind appearances, particularly regarding the spiritual as a complement to the purely mechanical, remained an important methodological thread of the Scientific Revolution (see the entries on Cambridge platonists; Boyle; Henry More; Galileo).

In Novum Organum (1620), Bacon was critical of the Aristotelian method for leaping from particulars to universals too quickly. The syllogistic form of reasoning readily mixed those two types of propositions. Bacon aimed at the invention of new arts, principles, and directions. His method would be grounded in methodical collection of observations, coupled with correction of our senses (and particularly, directions for the avoidance of the Idols, as he called them, kinds of systematic errors to which naïve observers are prone). The community of scientists could then climb, by a careful, gradual and unbroken ascent, to reliable general claims.

Bacon’s method has been criticized as impractical and too inflexible for the practicing scientist. Whewell would later criticize Bacon in his System of Logic for paying too little attention to the practices of scientists. It is hard to find convincing examples of Bacon’s method being put into practice in the history of science, but there are a few who have been held up as real examples of 17th century scientific, inductive method, even if not in the rigid Baconian mold: figures such as Robert Boyle (1627–1691) and William Harvey (1578–1657) (see the entry on Bacon).

It is to Isaac Newton (1642–1727), however, that historians of science and methodologists have paid greatest attention. Given the enormous success of his Principia Mathematica and Opticks, this is understandable. The study of Newton’s method has had two main thrusts: the implicit method of the experiments and reasoning presented in the Opticks, and the explicit methodological rules given as the Rules for Philosophising (the Regulae) in Book III of the Principia. [ 3 ] Newton’s law of gravitation, the linchpin of his new cosmology, broke with explanatory conventions of natural philosophy, first for apparently proposing action at a distance, but more generally for not providing “true”, physical causes. The argument for his System of the World (Principia, Book III) was based on phenomena, not reasoned first principles. This was viewed (mainly on the continent) as insufficient for proper natural philosophy. The Regulae counter this objection, re-defining the aims of natural philosophy by re-defining the method natural philosophers should follow. (See the entry on Newton’s philosophy.)

To his list of methodological prescriptions should be added Newton’s famous phrase “hypotheses non fingo” (commonly translated as “I frame no hypotheses”). The scientist was not to invent systems but infer explanations from observations, as Bacon had advocated. This would come to be known as inductivism. In the century after Newton, significant clarifications of the Newtonian method were made. Colin Maclaurin (1698–1746), for instance, reconstructed the essential structure of the method as having complementary analysis and synthesis phases, one proceeding away from the phenomena in generalization, the other from the general propositions to derive explanations of new phenomena. Denis Diderot (1713–1784) and editors of the Encyclopédie did much to consolidate and popularize Newtonianism, as did Francesco Algarotti (1712–1764). The emphasis was often the same, as much on the character of the scientist as on their process, a character which is still commonly assumed. The scientist is humble in the face of nature, not beholden to dogma, obeys only his eyes, and follows the truth wherever it leads. It was certainly Voltaire (1694–1778) and du Chatelet (1706–1749) who were most influential in propagating the latter vision of the scientist and their craft, with Newton as hero. Scientific method became a revolutionary force of the Enlightenment. (See also the entries on Newton, Leibniz, Descartes, Boyle, Hume, enlightenment, as well as Shank 2008 for a historical overview.)

Not all 18th century reflections on scientific method were so celebratory. Famous also are George Berkeley’s (1685–1753) attack on the mathematics of the new science, as well as the over-emphasis of Newtonians on observation; and David Hume’s (1711–1776) undermining of the warrant offered for scientific claims by inductive justification (see the entries on: George Berkeley; David Hume; Hume’s Newtonianism and Anti-Newtonianism). Hume’s problem of induction motivated Immanuel Kant (1724–1804) to seek new foundations for empirical method, though as an epistemic reconstruction, not as any set of practical guidelines for scientists. Both Hume and Kant influenced the methodological reflections of the next century, such as the debate between Mill and Whewell over the certainty of inductive inferences in science.

The debate between John Stuart Mill (1806–1873) and William Whewell (1794–1866) has become the canonical methodological debate of the 19th century. Although often characterized as a debate between inductivism and hypothetico-deductivism, the role of the two methods on each side is actually more complex. On the hypothetico-deductive account, scientists work to come up with hypotheses from which true observational consequences can be deduced—hence, hypothetico-deductive. Because Whewell emphasizes both hypotheses and deduction in his account of method, he can be seen as a convenient foil to the inductivism of Mill. However, equally if not more important to Whewell’s portrayal of scientific method is what he calls the “fundamental antithesis”. Knowledge is a product of the objective (what we see in the world around us) and subjective (the contributions of our mind to how we perceive and understand what we experience, which he called the Fundamental Ideas). Both elements are essential according to Whewell, and he was therefore critical of Kant for too much focus on the subjective, and John Locke (1632–1704) and Mill for too much focus on the senses. Whewell’s fundamental ideas can be discipline relative. An idea can be fundamental even if it is necessary for knowledge only within a given scientific discipline (e.g., chemical affinity for chemistry). This distinguishes fundamental ideas from the forms and categories of intuition of Kant. (See the entry on Whewell.)

Clarifying fundamental ideas would therefore be an essential part of scientific method and scientific progress. Whewell called this process “Discoverer’s Induction”. It was induction, following Bacon or Newton, but Whewell sought to revive Bacon’s account by emphasising the role of ideas in the clear and careful formulation of inductive hypotheses. Whewell’s induction is not merely the collecting of objective facts. The subjective plays a role through what Whewell calls the Colligation of Facts, a creative act of the scientist, the invention of a theory. A theory is then confirmed by testing, where more facts are brought under the theory, called the Consilience of Inductions. Whewell felt that this was the method by which the true laws of nature could be discovered: clarification of fundamental concepts, clever invention of explanations, and careful testing. Mill, in his critique of Whewell, and others who have cast Whewell as a fore-runner of the hypothetico-deductivist view, seem to have under-estimated the importance of this discovery phase in Whewell’s understanding of method (Snyder 1997a,b, 1999). Down-playing the discovery phase would come to characterize methodology of the early 20th century (see section 3).

Mill, in his System of Logic, put forward a narrower view of induction as the essence of scientific method. For Mill, induction is the search first for regularities among events. Among those regularities, some will continue to hold for further observations, eventually gaining the status of laws. One can also look for regularities among the laws discovered in a domain, i.e., for a law of laws. Which “law of laws” will hold is time and discipline dependent and open to revision. One example is the Law of Universal Causation, and Mill put forward specific methods for identifying causes—now commonly known as Mill’s methods. These five methods look for circumstances which are common among the phenomena of interest, those which are absent when the phenomena are, or those for which both vary together. Mill’s methods are still seen as capturing basic intuitions about experimental methods for finding the relevant explanatory factors (System of Logic (1843), see the entry on Mill). The methods advocated by Whewell and Mill, in the end, look similar. Both involve inductive generalization to covering laws. They differ dramatically, however, with respect to the necessity of the knowledge arrived at; that is, at the meta-methodological level (see the entries on Whewell and Mill).
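
The three patterns just described (circumstances common to all cases, circumstances absent when the phenomenon is absent, and quantities that vary together) can be illustrated with a short Python sketch. This is only a rough illustration under stated assumptions: the function names and the dinner-party data are hypothetical, each case is simplified to a set of circumstance labels, and the use of a Pearson correlation to stand in for "varying together" is a modern convenience chosen here, not Mill's own formulation. Only three of the five methods are sketched.

```python
from typing import List, Set


def method_of_agreement(cases_with_effect: List[Set[str]]) -> Set[str]:
    """Circumstances common to every case in which the phenomenon occurs."""
    if not cases_with_effect:
        return set()
    return set.intersection(*cases_with_effect)


def method_of_difference(case_with_effect: Set[str],
                         case_without_effect: Set[str]) -> Set[str]:
    """Circumstances present when the phenomenon occurs, absent when it does not."""
    return case_with_effect - case_without_effect


def concomitant_variation(factor: List[float], effect: List[float]) -> float:
    """Pearson correlation as a rough measure of two quantities varying together.

    Assumes both lists have the same length and neither is constant.
    """
    n = len(factor)
    mean_f = sum(factor) / n
    mean_e = sum(effect) / n
    cov = sum((f - mean_f) * (e - mean_e) for f, e in zip(factor, effect))
    var_f = sum((f - mean_f) ** 2 for f in factor)
    var_e = sum((e - mean_e) ** 2 for e in effect)
    return cov / (var_f * var_e) ** 0.5


# Hypothetical data: what each diner who fell ill had eaten.
sick_diners = [
    {"salad", "soup", "fish"},
    {"salad", "bread", "fish"},
    {"salad", "soup", "bread"},
]
healthy_diner = {"soup", "bread"}

common = method_of_agreement(sick_diners)
differing = method_of_difference(sick_diners[2], healthy_diner)
r = concomitant_variation([0, 1, 2, 3], [0, 2, 4, 6])  # portions vs. severity

print(common, differing, round(r, 3))
```

In this toy data set all three checks point at the same circumstance, which is the kind of convergence that makes a candidate cause worth testing further; none of them establishes causation on its own.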

3. Logic of method and critical responses

The quantum and relativistic revolutions in physics in the early 20th century had a profound effect on methodology. Conceptual foundations of both theories were taken to show the defeasibility of even the most seemingly secure intuitions about space, time and bodies. Certainty of knowledge about the natural world was therefore recognized as unattainable. Instead a renewed empiricism was sought which rendered science fallible but still rationally justifiable.

Analyses of the reasoning of scientists emerged, according to which the aspects of scientific method which were of primary importance were the means of testing and confirming of theories. A distinction in methodology was made between the contexts of discovery and justification. The distinction could be used as a wedge between the particularities of where and how theories or hypotheses are arrived at, on the one hand, and, on the other, the underlying reasoning scientists use (whether or not they are aware of it) when assessing theories and judging their adequacy on the basis of the available evidence. By and large, for most of the 20th century, philosophy of science focused on the second context, although philosophers differed on whether to focus on confirmation or refutation as well as on the many details of how confirmation or refutation could or could not be brought about. By the mid-20th century these attempts at defining the method of justification and the context distinction itself came under pressure. During the same period, philosophy of science developed rapidly, and from section 4 this entry will therefore shift from a primarily historical treatment of the scientific method towards a primarily thematic one.

3.1 Logical constructionism and operationalism

Advances in logic and probability held out promise of the possibility of elaborate reconstructions of scientific theories and empirical method, the best example being Rudolf Carnap’s The Logical Structure of the World (1928). Carnap attempted to show that a scientific theory could be reconstructed as a formal axiomatic system—that is, a logic. That system could refer to the world because some of its basic sentences could be interpreted as observations or operations which one could perform to test them. The rest of the theoretical system, including sentences using theoretical or unobservable terms (like electron or force) would then either be meaningful because they could be reduced to observations, or they had purely logical meanings (called analytic, like mathematical identities). This has been referred to as the verifiability criterion of meaning. According to the criterion, any statement not either analytic or verifiable was strictly meaningless. Although the view was endorsed by Carnap in 1928, he would later come to see it as too restrictive (Carnap 1956). Another familiar version of this idea is the operationalism of Percy Williams Bridgman. In The Logic of Modern Physics (1927) Bridgman asserted that every physical concept could be defined in terms of the operations one would perform to verify the application of that concept. Making good on the operationalisation of a concept even as simple as length, however, can easily become enormously complex (for measuring very small lengths, for instance) or impractical (measuring large distances like light years).

Carl Hempel’s (1950, 1951) criticisms of the verifiability criterion of meaning had enormous influence. He pointed out that universal generalizations, such as most scientific laws, were not strictly meaningful on the criterion. Verifiability and operationalism both seemed too restrictive to capture standard scientific aims and practice. The tenuous connection between these reconstructions and actual scientific practice was criticized in another way. In both approaches, scientific methods are instead recast in methodological roles. Measurements, for example, were looked to as ways of giving meanings to terms. The aim of the philosopher of science was not to understand the methods per se, but to use them to reconstruct theories, their meanings, and their relation to the world. When scientists perform these operations, however, they will not report that they are doing them to give meaning to terms in a formal axiomatic system. This disconnect between methodology and the details of actual scientific practice would seem to violate the empiricism the Logical Positivists and Bridgman were committed to. The view that methodology should correspond to practice (to some extent) has been called historicism, or intuitionism. We turn to these criticisms and responses in section 3.4. [ 4 ]

Positivism also had to contend with the recognition that a purely inductivist approach, along the lines of Bacon-Newton-Mill, was untenable. There was no pure observation, for starters. All observation was theory laden. Theory is required to make any observation, therefore not all theory can be derived from observation alone. (See the entry on theory and observation in science.) Even granting an observational basis, Hume had already pointed out that one could not deductively justify inductive conclusions without begging the question by presuming the success of the inductive method. Likewise, positivist attempts at analyzing how a generalization can be confirmed by observations of its instances were subject to a number of criticisms. Goodman (1965) and Hempel (1965) both point to paradoxes inherent in standard accounts of confirmation. Recent attempts at explaining how observations can serve to confirm a scientific theory are discussed in section 4 below.

The standard starting point for a non-inductive analysis of the logic of confirmation is known as the Hypothetico-Deductive (H-D) method. In its simplest form, a sentence of a theory which expresses some hypothesis is confirmed by its true consequences. As noted in section 2, this method had been advanced by Whewell in the 19th century, as well as Nicod (1924) and others in the 20th century. Often, Hempel’s (1966) description of the H-D method, illustrated by the case of Semmelweis’ inferential procedures in establishing the cause of childbed fever, has been presented as a key account of H-D as well as a foil for criticism of the H-D account of confirmation (see, for example, Lipton’s (2004) discussion of inference to the best explanation; also the entry on confirmation). Hempel described Semmelweis’ procedure as examining various hypotheses explaining the cause of childbed fever. Some hypotheses conflicted with observable facts and could be rejected as false immediately. Others needed to be tested experimentally by deducing which observable events should follow if the hypothesis were true (what Hempel called the test implications of the hypothesis), then conducting an experiment and observing whether or not the test implications occurred. If the experiment showed the test implication to be false, the hypothesis could be rejected. If the experiment showed the test implications to be true, however, this did not prove the hypothesis true. The confirmation of a test implication does not verify a hypothesis, though Hempel did allow that “it provides at least some support, some corroboration or confirmation for it” (Hempel 1966: 8). The degree of this support then depends on the quantity, variety and precision of the supporting evidence.

Another approach that took off from the difficulties with inductive inference was Karl Popper’s critical rationalism or falsificationism (Popper 1959, 1963). Falsification is deductive and similar to H-D in that it involves scientists deducing observational consequences from the hypothesis under test. For Popper, however, the important point was not the degree of confirmation that successful prediction offered to a hypothesis. The crucial thing was the logical asymmetry between confirmation, based on inductive inference, and falsification, which can be based on a deductive inference. (This simple opposition was later questioned, by Lakatos, among others. See the entry on historicist theories of scientific rationality. )

Popper stressed that, regardless of the amount of confirming evidence, we can never be certain that a hypothesis is true without committing the fallacy of affirming the consequent. Instead, Popper introduced the notion of corroboration as a measure for how well a theory or hypothesis has survived previous testing—but without implying that this is also a measure for the probability that it is true.
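The logical asymmetry at issue can be set out schematically. The following is only an illustrative sketch: the letters H (for a hypothesis) and I (for an observable consequence deduced from it) are notation chosen here for convenience, not notation taken from Popper's text.

```latex
\[
\frac{H \rightarrow I \qquad \neg I}{\neg H}
\quad \text{(valid: modus tollens)}
\qquad\qquad
\frac{H \rightarrow I \qquad I}{H}
\quad \text{(invalid: affirming the consequent)}
\]
```

On the left, a failed prediction deductively refutes the hypothesis. On the right, a successful prediction does not deductively establish it, since I might be true for reasons unrelated to H.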

Popper was also motivated by his doubts about the scientific status of theories like the Marxist theory of history or psycho-analysis, and so wanted to demarcate between science and pseudo-science. Popper saw this as an importantly different distinction than demarcating science from metaphysics. The latter demarcation was the primary concern of many logical empiricists. Popper used the idea of falsification to draw a line instead between pseudo and proper science. Science was science because its method involved subjecting theories to rigorous tests which offered a high probability of failing and thus refuting the theory.

A commitment to the risk of failure was important. Avoiding falsification could be done all too easily. If a consequence of a theory is inconsistent with observations, an exception can be added by introducing auxiliary hypotheses designed explicitly to save the theory, so-called ad hoc modifications. This Popper saw done in pseudo-science where ad hoc theories appeared capable of explaining anything in their field of application. In contrast, science is risky. If observations showed the predictions from a theory to be wrong, the theory would be refuted. Hence, scientific hypotheses must be falsifiable. Not only must there exist some possible observation statement which could falsify the hypothesis or theory, were it observed, (Popper called these the hypothesis’ potential falsifiers) it is crucial to the Popperian scientific method that such falsifications be sincerely attempted on a regular basis.

The more potential falsifiers of a hypothesis, the more falsifiable it would be, and the more the hypothesis claimed. Conversely, hypotheses without falsifiers claimed very little or nothing at all. Originally, Popper thought that this meant the introduction of ad hoc hypotheses only to save a theory should not be countenanced as good scientific method. These would undermine the falsifiability of a theory. However, Popper later came to recognize that the introduction of modifications (immunizations, he called them) was often an important part of scientific development. Responding to surprising or apparently falsifying observations often generated important new scientific insights. Popper’s own example was the observed motion of Uranus which originally did not agree with Newtonian predictions. The ad hoc hypothesis of an outer planet explained the disagreement and led to further falsifiable predictions. Popper sought to reconcile the view by blurring the distinction between falsifiable and not falsifiable, and speaking instead of degrees of testability (Popper 1985: 41f.).

From the 1960s on, sustained meta-methodological criticism emerged that drove philosophical focus away from scientific method. A brief look at those criticisms follows, with recommendations for further reading at the end of the entry.

Thomas Kuhn’s The Structure of Scientific Revolutions (1962) begins with a well-known shot across the bow for philosophers of science:

History, if viewed as a repository for more than anecdote or chronology, could produce a decisive transformation in the image of science by which we are now possessed. (1962: 1)

The image Kuhn thought needed transforming was the a-historical, rational reconstruction sought by many of the Logical Positivists, though Carnap and other positivists were actually quite sympathetic to Kuhn’s views. (See the entry on the Vienna Circle .) Kuhn shares with other of his contemporaries, such as Feyerabend and Lakatos, a commitment to a more empirical approach to philosophy of science. Namely, the history of science provides important data, and necessary checks, for philosophy of science, including any theory of scientific method.

The history of science reveals, according to Kuhn, that scientific development occurs in alternating phases. During normal science, the members of the scientific community adhere to the paradigm in place. Their commitment to the paradigm means a commitment to the puzzles to be solved and the acceptable ways of solving them. Confidence in the paradigm remains so long as steady progress is made in solving the shared puzzles. Method in this normal phase operates within a disciplinary matrix (Kuhn’s later concept of a paradigm) which includes standards for problem solving, and defines the range of problems to which the method should be applied. An important part of a disciplinary matrix is the set of values which provide the norms and aims for scientific method. The main values that Kuhn identifies are prediction, problem solving, simplicity, consistency, and plausibility.

An important by-product of normal science is the accumulation of puzzles which cannot be solved with resources of the current paradigm. Once accumulation of these anomalies has reached some critical mass, it can trigger a communal shift to a new paradigm and a new phase of normal science. Importantly, the values that provide the norms and aims for scientific method may have transformed in the meantime. Method may therefore be relative to discipline, time or place.

Feyerabend also identified the aims of science as progress, but argued that any methodological prescription would only stifle that progress (Feyerabend 1988). His arguments are grounded in re-examining accepted “myths” about the history of science. Heroes of science, like Galileo, are shown to be just as reliant on rhetoric and persuasion as they are on reason and demonstration. Others, like Aristotle, are shown to be far more reasonable and far-reaching in their outlooks than they are given credit for. As a consequence, the only rule that could provide what he took to be sufficient freedom was the vacuous “anything goes”. More generally, even the methodological restriction that science is the best way to pursue knowledge, and to increase knowledge, is too restrictive. Feyerabend suggested instead that science might, in fact, be a threat to a free society, because it and its myth had become so dominant (Feyerabend 1978).

An even more fundamental kind of criticism was offered by several sociologists of science from the 1970s onwards who rejected the methodology of providing philosophical accounts for the rational development of science and sociological accounts of the irrational mistakes. Instead, they adhered to a symmetry thesis on which any causal explanation of how scientific knowledge is established needs to be symmetrical in explaining truth and falsity, rationality and irrationality, success and mistakes, by the same causal factors (see, e.g., Barnes and Bloor 1982, Bloor 1991). Movements in the Sociology of Science, like the Strong Programme, or in the social dimensions and causes of knowledge more generally led to extended and close examination of detailed case studies in contemporary science and its history. (See the entries on the social dimensions of scientific knowledge and social epistemology .) Well-known examinations by Latour and Woolgar (1979/1986), Knorr-Cetina (1981), Pickering (1984), Shapin and Schaffer (1985) seem to bear out that it was social ideologies (on a macro-scale) or individual interactions and circumstances (on a micro-scale) which were the primary causal factors in determining which beliefs gained the status of scientific knowledge. As they saw it therefore, explanatory appeals to scientific method were not empirically grounded.

A late, and largely unexpected, criticism of scientific method came from within science itself. Beginning in the early 2000s, a number of scientists attempting to replicate the results of published experiments could not do so. There may be a close conceptual connection between reproducibility and method. For example, if reproducibility means that the same scientific methods ought to produce the same result, and all scientific results ought to be reproducible, then whatever it takes to reproduce a scientific result ought to be called scientific method. Space limits us to the observation that, insofar as reproducibility is a desired outcome of proper scientific method, it is not strictly a part of scientific method. (See the entry on reproducibility of scientific results.)

By the close of the 20th century the search for the scientific method was flagging. Nola and Sankey (2000b) could introduce their volume on method by remarking that “For some, the whole idea of a theory of scientific method is yester-year’s debate …”.

Despite the many difficulties that philosophers encountered in trying to provide a clear methodology of confirmation (or refutation), important progress has still been made on understanding how observation can provide evidence for a given theory. Work in statistics has been crucial for understanding how theories can be tested empirically, and in recent decades a huge literature has developed that attempts to recast confirmation in Bayesian terms. Here these developments can be covered only briefly, and we refer to the entry on confirmation for further details and references.

Statistics has come to play an increasingly important role in the methodology of the experimental sciences from the 19th century onwards. At that time, statistics and probability theory took on a methodological role as an analysis of inductive inference, and attempts to ground the rationality of induction in the axioms of probability theory have continued throughout the 20th century and into the present. Developments in the theory of statistics itself, meanwhile, have had a direct and immense influence on the experimental method, including methods for measuring the uncertainty of observations such as the Method of Least Squares developed by Legendre and Gauss in the early 19th century, criteria for the rejection of outliers proposed by Peirce by the mid-19th century, and the significance tests developed by Gosset (a.k.a. “Student”), Fisher, Neyman & Pearson and others in the 1920s and 1930s (see, e.g., Swijtink 1987 for a brief historical overview; and also the entry on C.S. Peirce).

These developments within statistics then in turn led to a reflective discussion among both statisticians and philosophers of science on how to perceive the process of hypothesis testing: whether it was a rigorous statistical inference that could provide a numerical expression of the degree of confidence in the tested hypothesis, or if it should be seen as a decision between different courses of actions that also involved a value component. This led to a major controversy between Fisher on the one side and Neyman and Pearson on the other (see especially Fisher 1955, Neyman 1956 and Pearson 1955, and for analyses of the controversy, e.g., Howie 2002, Marks 2000, Lenhard 2006). On Fisher’s view, hypothesis testing was a methodology for when to accept or reject a statistical hypothesis, namely that a hypothesis should be rejected by evidence if this evidence would be unlikely relative to other possible outcomes, given the hypothesis were true. In contrast, on Neyman and Pearson’s view, the consequence of error also had to play a role when deciding between hypotheses. Introducing the distinction between the error of rejecting a true hypothesis (type I error) and accepting a false hypothesis (type II error), they argued that it depends on the consequences of the error to decide whether it is more important to avoid rejecting a true hypothesis or accepting a false one. Hence, Fisher aimed for a theory of inductive inference that enabled a numerical expression of confidence in a hypothesis. To him, the important point was the search for truth, not utility. In contrast, the Neyman-Pearson approach provided a strategy of inductive behaviour for deciding between different courses of action. Here, the important point was not whether a hypothesis was true, but whether one should act as if it was.
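The trade-off between the two kinds of error can be made concrete with a small calculation. The sketch below is purely illustrative and is not drawn from Fisher or from Neyman and Pearson: it assumes a hypothetical experiment of n independent trials, a null hypothesis with success probability p_null, a single alternative with success probability p_alt, and a rule that rejects the null whenever the number of successes reaches a chosen threshold. All function and parameter names are assumptions made for this example.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def error_rates(n, p_null, p_alt, threshold):
    """Error rates of the rule 'reject the null if successes >= threshold'.

    Type I error: rejecting the null although it is true.
    Type II error: retaining the null although the alternative is true.
    """
    type_i = sum(binom_pmf(k, n, p_null) for k in range(threshold, n + 1))
    type_ii = sum(binom_pmf(k, n, p_alt) for k in range(0, threshold))
    return type_i, type_ii


if __name__ == "__main__":
    # Hypothetical setting: 20 trials, null p = 0.5, alternative p = 0.8.
    for threshold in (12, 14, 16):
        type_i, type_ii = error_rates(20, 0.5, 0.8, threshold)
        print(f"threshold={threshold}: type I={type_i:.4f}, type II={type_ii:.4f}")
```

Raising the threshold lowers the type I error rate and raises the type II error rate. Which threshold to adopt is therefore not settled by the data alone, which is the point at which, on the Neyman-Pearson view, the consequences of each error enter.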

Similar discussions are found in the philosophical literature. On the one side, Churchman (1948) and Rudner (1953) argued that because scientific hypotheses can never be completely verified, a complete analysis of the methods of scientific inference includes ethical judgments in which the scientists must decide whether the evidence is sufficiently strong or that the probability is sufficiently high to warrant the acceptance of the hypothesis, which again will depend on the importance of making a mistake in accepting or rejecting the hypothesis. Others, such as Jeffrey (1956) and Levi (1960) disagreed and instead defended a value-neutral view of science on which scientists should bracket their attitudes, preferences, temperament, and values when assessing the correctness of their inferences. For more details on this value-free ideal in the philosophy of science and its historical development, see Douglas (2009) and Howard (2003). For a broad set of case studies examining the role of values in science, see e.g. Elliott & Richards 2017.

In recent decades, philosophical discussions of the evaluation of probabilistic hypotheses by statistical inference have largely focused on Bayesianism, which understands probability as a measure of a person’s degree of belief in an event, given the available information, and frequentism, which instead understands probability as a long-run frequency of a repeatable event. Hence, for Bayesians probabilities refer to a state of knowledge, whereas for frequentists probabilities refer to frequencies of events (see, e.g., Sober 2008, chapter 1 for a detailed introduction to Bayesianism and frequentism as well as to likelihoodism). Bayesianism aims at providing a quantifiable, algorithmic representation of belief revision, where belief revision is a function of prior beliefs (i.e., background knowledge) and incoming evidence. Bayesianism employs a rule based on Bayes’ theorem, a theorem of the probability calculus which relates conditional probabilities. The probability that a particular hypothesis is true is interpreted as a degree of belief, or credence, of the scientist. There will also be a probability and a degree of belief that a hypothesis will be true conditional on a piece of evidence (an observation, say) being true. Bayesianism prescribes that it is rational for the scientist to update their belief in the hypothesis to that conditional probability should it turn out that the evidence is, in fact, observed (see, e.g., Sprenger & Hartmann 2019 for a comprehensive treatment of Bayesian philosophy of science). Originating in the work of Neyman and Pearson, frequentism aims at providing the tools for reducing long-run error rates, such as the error-statistical approach developed by Mayo (1996) that focuses on how experimenters can avoid both type I and type II errors by building up a repertoire of procedures that detect errors if and only if they are present.
Both Bayesianism and frequentism have developed over time, they are interpreted in different ways by their various proponents, and their relations to previous criticism of attempts at defining scientific method are seen differently by proponents and critics. The literature, surveys, reviews and criticism in this area are vast and the reader is referred to the entries on Bayesian epistemology and confirmation.
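The Bayesian updating rule described above can be illustrated with a minimal sketch. It assumes the simplest case of a single hypothesis and its negation, and the function names and numerical values are hypothetical choices made for illustration only. The posterior credence is computed from Bayes' theorem as the prior credence times the probability of the evidence if the hypothesis is true, divided by the total probability of the evidence.

```python
def posterior(prior, likelihood_if_true, likelihood_if_false):
    """Credence in a hypothesis after observing one piece of evidence.

    prior: credence in the hypothesis before the observation.
    likelihood_if_true: probability of the evidence if the hypothesis is true.
    likelihood_if_false: probability of the evidence if the hypothesis is false.
    """
    # Total probability of the evidence under both possibilities.
    evidence = likelihood_if_true * prior + likelihood_if_false * (1 - prior)
    if evidence == 0:
        raise ValueError("evidence has probability zero under both hypotheses")
    return likelihood_if_true * prior / evidence


def update_sequence(prior, observations):
    """Update repeatedly; each observation is a pair of likelihoods."""
    credence = prior
    for likelihood_if_true, likelihood_if_false in observations:
        credence = posterior(credence, likelihood_if_true, likelihood_if_false)
    return credence


if __name__ == "__main__":
    # Hypothetical numbers: the evidence is four times as probable if the
    # hypothesis is true as it is if the hypothesis is false.
    print(posterior(0.5, 0.8, 0.2))
    print(update_sequence(0.5, [(0.8, 0.2), (0.8, 0.2)]))
```

Evidence that is equally probable whether or not the hypothesis is true leaves the credence unchanged, while repeated evidence favouring the hypothesis moves the credence progressively closer to 1 without reaching it.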

5. Method in Practice

Attention to scientific practice, as we have seen, is not itself new. However, the turn to practice in the philosophy of science of late can be seen as a correction to the pessimism with respect to method in philosophy of science in later parts of the 20th century, and as an attempted reconciliation between sociological and rationalist explanations of scientific knowledge. Much of this work sees method as detailed and context specific problem-solving procedures, and methodological analyses to be at the same time descriptive, critical and advisory (see Nickles 1987 for an exposition of this view). The following section contains a survey of some of the practice focuses. In this section we turn fully to topics rather than chronology.

A problem with the distinction between the contexts of discovery and justification that figured so prominently in philosophy of science in the first half of the 20th century (see section 2) is that no such distinction can be clearly seen in scientific activity (see Arabatzis 2006). Thus, in recent decades, it has been recognized that the study of conceptual innovation and change should not be confined to psychology and sociology of science, because these are also important aspects of scientific practice which philosophy of science should address (see also the entry on scientific discovery). Looking for the practices that drive conceptual innovation has led philosophers to examine both the reasoning practices of scientists and the wide realm of experimental practices that are not directed narrowly at testing hypotheses, that is, exploratory experimentation.

Examining the reasoning practices of historical and contemporary scientists, Nersessian (2008) has argued that new scientific concepts are constructed as solutions to specific problems by systematic reasoning, and that analogy, visual representation and thought-experimentation are among the important reasoning practices employed. These ubiquitous forms of reasoning are reliable, but also fallible, methods of conceptual development and change. On her account, model-based reasoning consists of cycles of construction, simulation, evaluation and adaptation of models that serve as interim interpretations of the target problem to be solved. Often, this process will lead to modifications or extensions, and a new cycle of simulation and evaluation. However, Nersessian also emphasizes that

creative model-based reasoning cannot be applied as a simple recipe, is not always productive of solutions, and even its most exemplary usages can lead to incorrect solutions. (Nersessian 2008: 11)

Thus, while on the one hand she agrees with many previous philosophers that there is no logic of discovery, discoveries can derive from reasoned processes, such that a large and integral part of scientific practice is

the creation of concepts through which to comprehend, structure, and communicate about physical phenomena …. (Nersessian 1987: 11)

Similarly, work on heuristics for discovery and theory construction by scholars such as Darden (1991) and Bechtel & Richardson (1993) presents science as problem solving and investigates scientific problem solving as a special case of problem-solving in general. Drawing largely on cases from the biological sciences, much of their focus has been on reasoning strategies for the generation, evaluation, and revision of mechanistic explanations of complex systems.

Addressing another aspect of the context distinction, namely the traditional view that the primary role of experiments is to test theoretical hypotheses according to the H-D model, other philosophers of science have argued for additional roles that experiments can play. The notion of exploratory experimentation was introduced to describe experiments driven by the desire to obtain empirical regularities and to develop concepts and classifications in which these regularities can be described (Steinle 1997, 2002; Burian 1997; Waters 2007). However, the difference between theory driven experimentation and exploratory experimentation should not be seen as a sharp distinction. Theory driven experiments are not always directed at testing hypotheses, but may also be directed at various kinds of fact-gathering, such as determining numerical parameters. Vice versa, exploratory experiments are usually informed by theory in various ways and are therefore not theory-free. Instead, in exploratory experiments phenomena are investigated without first limiting the possible outcomes of the experiment on the basis of extant theory about the phenomena.

The development of high throughput instrumentation in molecular biology and neighbouring fields has given rise to a special type of exploratory experimentation that collects and analyses very large amounts of data, and these new ‘omics’ disciplines are often said to represent a break with the ideal of hypothesis-driven science (Burian 2007; Elliott 2007; Waters 2007; O’Malley 2007) and instead described as data-driven research (Leonelli 2012; Strasser 2012) or as a special kind of “convenience experimentation” in which many experiments are done simply because they are extraordinarily convenient to perform (Krohs 2012).

5.2 Computer methods and ‘new ways’ of doing science

The field of omics just described is possible because of the ability of computers to process, in a reasonable amount of time, the huge quantities of data required. Computers allow for more elaborate experimentation (higher speed, better filtering, more variables, sophisticated coordination and control), but also, through modelling and simulations, might constitute a form of experimentation themselves. Here, too, we can pose a version of the general question of method versus practice: does the practice of using computers fundamentally change scientific method, or merely provide a more efficient means of implementing standard methods?

Because computers can be used to automate measurements, quantifications, calculations, and statistical analyses where, for practical reasons, these operations cannot be otherwise carried out, many of the steps involved in reaching a conclusion on the basis of an experiment are now made inside a “black box”, without the direct involvement or awareness of a human. This has epistemological implications, regarding what we can know, and how we can know it. To have confidence in the results, computer methods are therefore subjected to tests of verification and validation.

The distinction between verification and validation is easiest to characterize in the case of computer simulations. In a typical computer simulation scenario computers are used to numerically integrate differential equations for which no analytic solution is available. The equations are part of the model the scientist uses to represent a phenomenon or system under investigation. Verifying a computer simulation means checking that the equations of the model are being correctly approximated. Validating a simulation means checking that the equations of the model are adequate for the inferences one wants to make on the basis of that model.
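The difference between the two checks can be sketched in code. The example below is a deliberately simplified illustration under stated assumptions: the model is an exponential decay equation (chosen because, unlike the typical case just described, it has a known analytic solution that can serve as a benchmark), the numerical scheme is the forward Euler method, and the "observed" value is a hypothetical number invented for the example rather than real data. All names are illustrative.

```python
import math


def euler_decay(y0, k, t_end, steps):
    """Numerically integrate dy/dt = -k * y with the forward Euler method."""
    dt = t_end / steps
    y = y0
    for _ in range(steps):
        y += dt * (-k * y)
    return y


def verification_error(y0, k, t_end, steps):
    """Verification: is the model equation being approximated correctly?

    Compares the numerical result with the exact solution of the same
    equation, y(t) = y0 * exp(-k * t).
    """
    exact = y0 * math.exp(-k * t_end)
    return abs(euler_decay(y0, k, t_end, steps) - exact)


def validation_error(model_value, observed_value):
    """Validation: is the model equation adequate to the target system?

    Compares the model output with an observation of the system itself.
    """
    return abs(model_value - observed_value)


if __name__ == "__main__":
    hypothetical_observation = 0.30  # invented value, not real data
    for steps in (10, 100, 1000):
        simulated = euler_decay(1.0, 1.0, 1.0, steps)
        print(
            steps,
            verification_error(1.0, 1.0, 1.0, steps),
            validation_error(simulated, hypothetical_observation),
        )
```

Refining the step size shrinks the verification error, but it leaves the validation error essentially where it was: a more accurate solution of the equations does nothing to show that they are the right equations for the system.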

A number of issues related to computer simulations have been raised. The identification of validity and verification as the testing methods has been criticized. Oreskes et al. (1994) raise concerns that “validation”, because it suggests deductive inference, might lead to over-confidence in the results of simulations. The distinction itself is probably too clean, since actual practice in the testing of simulations mixes and moves back and forth between the two (Weissart 1997; Parker 2008a; Winsberg 2010). Computer simulations do seem to have a non-inductive character, given that the principles by which they operate are built in by the programmers, and any results of the simulation follow from those in-built principles in such a way that those results could, in principle, be deduced from the program code and its inputs. The status of simulations as experiments has therefore been examined (Kaufmann and Smarr 1993; Humphreys 1995; Hughes 1999; Norton and Suppe 2001). This literature considers the epistemology of these experiments: what we can learn by simulation, and also the kinds of justifications which can be given in applying that knowledge to the “real” world (Mayo 1996; Parker 2008b). As pointed out, part of the advantage of computer simulation derives from the fact that huge numbers of calculations can be carried out without requiring direct observation by the experimenter/simulator. At the same time, many of these calculations are approximations to the calculations which would be performed first-hand in an ideal situation. Both factors introduce uncertainties into the inferences drawn from what is observed in the simulation.

For many of the reasons described above, computer simulations do not seem to belong clearly to either the experimental or theoretical domain. Rather, they seem to crucially involve aspects of both. This has led some authors, such as Fox Keller (2003: 200), to argue that we ought to consider computer simulation a “qualitatively different way of doing science”. The literature in general tends to follow Kaufmann and Smarr (1993) in referring to computer simulation as a “third way” for scientific methodology (theoretical reasoning and experimental practice are the first two ways). It should also be noted that the debates around these issues have tended to focus on the form of computer simulation typical in the physical sciences, where models are based on dynamical equations. Other forms of simulation might not have the same problems, or have problems of their own (see the entry on computer simulations in science).

In recent years, the rapid development of machine learning techniques has prompted some scholars to suggest that the scientific method has become “obsolete” (Anderson 2008, Carrol and Goodstein 2009). This has resulted in an intense debate on the relative merit of data-driven and hypothesis-driven research (for samples, see e.g. Mazzocchi 2015 or Succi and Coveney 2018). For a detailed treatment of this topic, we refer to the entry scientific research and big data .

6. Discourse on scientific method

Despite philosophical disagreements, the idea of the scientific method still figures prominently in contemporary discourse on many different topics, both within science and in society at large. Often, reference to scientific method is used in ways that convey either the legend of a single, universal method characteristic of all science, or grants to a particular method or set of methods privilege as a special ‘gold standard’, often with reference to particular philosophers to vindicate the claims. Discourse on scientific method also typically arises when there is a need to distinguish between science and other activities, or for justifying the special status conveyed to science. In these areas, the philosophical attempts at identifying a set of methods characteristic for scientific endeavors are closely related to the philosophy of science’s classical problem of demarcation (see the entry on science and pseudo-science ) and to the philosophical analysis of the social dimension of scientific knowledge and the role of science in democratic society.

One of the settings in which the legend of a single, universal scientific method has been particularly strong is science education (see, e.g., Bauer 1992; McComas 1996; Wivagg & Allchin 2002). [5] Often, ‘the scientific method’ is presented in textbooks and educational web pages as a fixed four or five step procedure starting from observations and description of a phenomenon and progressing over formulation of a hypothesis which explains the phenomenon, designing and conducting experiments to test the hypothesis, analyzing the results, and ending with drawing a conclusion. Such references to a universal scientific method can be found in educational material at all levels of science education (Blachowicz 2009), and numerous studies have shown that the idea of a general and universal scientific method often forms part of both students’ and teachers’ conception of science (see, e.g., Aikenhead 1987; Osborne et al. 2003). In response, it has been argued that science education needs to focus more on teaching about the nature of science, although views have differed on whether this is best done through student-led investigations, contemporary cases, or historical cases (Allchin, Andersen & Nielsen 2014).

Although occasionally phrased with reference to the H-D method, important historical roots of the legend in science education of a single, universal scientific method are the American philosopher and psychologist Dewey’s account of inquiry in How We Think (1910) and the British mathematician Karl Pearson’s account of science in Grammar of Science (1892). On Dewey’s account, inquiry is divided into the five steps of

(i) a felt difficulty, (ii) its location and definition, (iii) suggestion of a possible solution, (iv) development by reasoning of the bearing of the suggestions, (v) further observation and experiment leading to its acceptance or rejection. (Dewey 1910: 72)

Similarly, on Pearson’s account, scientific investigations start with measurement of data and observation of their correlation and sequence from which scientific laws can be discovered with the aid of creative imagination. These laws have to be subject to criticism, and their final acceptance will have equal validity for “all normally constituted minds”. Both Dewey’s and Pearson’s accounts should be seen as generalized abstractions of inquiry and not restricted to the realm of science, although both Dewey and Pearson referred to their respective accounts as ‘the scientific method’.

Occasionally, scientists make sweeping statements about a simple and distinct scientific method, as exemplified by Feynman’s simplified version of a conjectures and refutations method presented, for example, in the last of his 1964 Cornell Messenger lectures. [ 6 ] However, just as often scientists have come to the same conclusion as recent philosophy of science that there is not any unique, easily described scientific method. For example, the physicist and Nobel Laureate Weinberg described in the paper “The Methods of Science … And Those By Which We Live” (1995) how

The fact that the standards of scientific success shift with time does not only make the philosophy of science difficult; it also raises problems for the public understanding of science. We do not have a fixed scientific method to rally around and defend. (1995: 8)

Interview studies with scientists on their conception of method show that scientists often find it hard to figure out whether available evidence confirms their hypothesis, and that there are no direct translations between general ideas about method and specific strategies to guide how research is conducted (Hangel & Schickore 2017; Schickore & Hangel 2019).

Reference to the scientific method has also often been used to argue for the scientific nature or special status of a particular activity. Philosophical positions that argue for a simple and unique scientific method as a criterion of demarcation, such as Popperian falsification, have often attracted practitioners who felt that they had a need to defend their domain of practice. For example, references to conjectures and refutation as the scientific method are abundant in much of the literature on complementary and alternative medicine (CAM)—alongside the competing position that CAM, as an alternative to conventional biomedicine, needs to develop its own methodology different from that of science.

Also within mainstream science, reference to the scientific method is used in arguments regarding the internal hierarchy of disciplines and domains. A frequently seen argument is that research based on the H-D method is superior to research based on induction from observations because in deductive inferences the conclusion follows necessarily from the premises. (See, e.g., Parascandola 1998 for an analysis of how this argument has been made to downgrade epidemiology compared to the laboratory sciences.) Similarly, based on an examination of the practices of major funding institutions such as the National Institutes of Health (NIH), the National Science Foundation (NSF) and the Biotechnology and Biological Sciences Research Council (BBSRC) in the UK, O’Malley et al. (2009) have argued that funding agencies seem to have a tendency to adhere to the view that the primary activity of science is to test hypotheses, while descriptive and exploratory research is seen as merely preparatory activities that are valuable only insofar as they fuel hypothesis-driven research.

In some areas of science, scholarly publications are structured in a way that may convey the impression of a neat and linear process of inquiry from stating a question, devising the methods by which to answer it, collecting the data, to drawing a conclusion from the analysis of data. For example, the codified format of publications in most biomedical journals known as the IMRAD format (Introduction, Methods, Results, and Discussion) is explicitly described by the journal editors as “not an arbitrary publication format but rather a direct reflection of the process of scientific discovery” (see the so-called “Vancouver Recommendations”, ICMJE 2013: 11). However, scientific publications do not in general reflect the process by which the reported scientific results were produced. For example, under the provocative title “Is the scientific paper a fraud?”, Medawar argued that scientific papers generally misrepresent how the results have been produced (Medawar 1963/1996). Similar views have been advanced by philosophers, historians and sociologists of science (Gilbert 1976; Holmes 1987; Knorr-Cetina 1981; Schickore 2008; Suppe 1998) who have argued that scientists’ experimental practices are messy and often do not follow any recognizable pattern. Publications of research results, they argue, are retrospective reconstructions of these activities that often do not preserve the temporal order or the logic of these activities, but are instead often constructed in order to screen off potential criticism (see Schickore 2008 for a review of this work).

Philosophical positions on the scientific method have also made it into the courtroom, especially in the US, where judges have drawn on philosophy of science in deciding when to confer special status to scientific expert testimony. A key case is Daubert v. Merrell Dow Pharmaceuticals (92–102, 509 U.S. 579, 1993). In this case, the Supreme Court argued in its 1993 ruling that trial judges must ensure that expert testimony is reliable, and that in doing this the court must look at the expert’s methodology to determine whether the proffered evidence is actually scientific knowledge. Further, referring to works of Popper and Hempel, the court stated that

ordinarily, a key question to be answered in determining whether a theory or technique is scientific knowledge … is whether it can be (and has been) tested. (Justice Blackmun, Daubert v. Merrell Dow Pharmaceuticals; see Other Internet Resources for a link to the opinion)

But as argued by Haack (2005a,b, 2010) and by Foster & Huber (1999), by equating the question of whether a piece of testimony is reliable with the question of whether it is scientific as indicated by a special methodology, the court was producing an inconsistent mixture of Popper’s and Hempel’s philosophies, and this has led to considerable confusion in subsequent case rulings that drew on the Daubert case (see Haack 2010 for a detailed exposition).

The difficulties around identifying the methods of science are also reflected in the difficulties of identifying scientific misconduct in the form of improper application of the method or methods of science. One of the first and most influential attempts at defining misconduct in science was the US definition from 1989 that defined misconduct as

fabrication, falsification, plagiarism, or other practices that seriously deviate from those that are commonly accepted within the scientific community. (Code of Federal Regulations, part 50, subpart A., August 8, 1989, italics added)

However, the “other practices that seriously deviate” clause was heavily criticized because it could be used to suppress creative or novel science. For example, the National Academy of Sciences stated in its report Responsible Science (1992) that it

wishes to discourage the possibility that a misconduct complaint could be lodged against scientists based solely on their use of novel or unorthodox research methods. (NAS: 27)

This clause was therefore later removed from the definition. For an entry into the key philosophical literature on conduct in science, see Shamoo & Resnik (2009).

The question of the source of the success of science has been at the core of philosophy since the beginning of modern science. If viewed as a matter of epistemology more generally, scientific method is a part of the entire history of philosophy. Over that time, science and whatever methods its practitioners may employ have changed dramatically. Today, many philosophers have taken up the banners of pluralism or of practice to focus on what are, in effect, fine-grained and contextually limited examinations of scientific method. Others hope to shift perspectives in order to provide a renewed general account of what characterizes the activity we call science.

One such perspective has been offered recently by Hoyningen-Huene (2008, 2013), who argues from the history of philosophy of science that after three lengthy phases of characterizing science by its method, we are now in a phase where the belief in the existence of a positive scientific method has eroded and what has been left to characterize science is only its fallibility. First was a phase from Plato and Aristotle up until the 17th century where the specificity of scientific knowledge was seen in its absolute certainty established by proof from evident axioms; next was a phase up to the mid-19th century in which the means to establish the certainty of scientific knowledge had been generalized to include inductive procedures as well. In the third phase, which lasted until the last decades of the 20th century, it was recognized that empirical knowledge was fallible, but it was still granted a special status due to its distinctive mode of production. But now in the fourth phase, according to Hoyningen-Huene, historical and philosophical studies have shown how “scientific methods with the characteristics as posited in the second and third phase do not exist” (2008: 168) and there is no longer any consensus among philosophers and historians of science about the nature of science. For Hoyningen-Huene, this is too negative a stance, and he therefore urges that the question about the nature of science be asked anew. His own answer to this question is that “scientific knowledge differs from other kinds of knowledge, especially everyday knowledge, primarily by being more systematic” (Hoyningen-Huene 2013: 14). Systematicity can have several different dimensions: among them are more systematic descriptions, explanations, predictions, defense of knowledge claims, epistemic connectedness, ideal of completeness, knowledge generation, representation of knowledge and critical discourse. Hence, what characterizes science is the greater care in excluding possible alternative explanations, the more detailed elaboration with respect to data on which predictions are based, the greater care in detecting and eliminating sources of error, the more articulate connections to other pieces of knowledge, etc. On this position, what characterizes science is not that the methods employed are unique to science, but that the methods are more carefully employed.

Another, similar approach has been offered by Haack (2003). She sets off, similar to Hoyningen-Huene, from a dissatisfaction with the recent clash between what she calls Old Deferentialism and New Cynicism. The Old Deferentialist position is that science progressed inductively by accumulating true theories confirmed by empirical evidence or deductively by testing conjectures against basic statements, while the New Cynics’ position is that science has no epistemic authority and no uniquely rational method and is merely politics. Haack insists that contrary to the views of the New Cynics, there are objective epistemic standards, and there is something epistemologically special about science, even though the Old Deferentialists pictured this in a wrong way. Instead, she offers a new Critical Commonsensist account on which standards of good, strong, supportive evidence and well-conducted, honest, thorough and imaginative inquiry are not exclusive to the sciences, but the standards by which we judge all inquirers. In this sense, science does not differ in kind from other kinds of inquiry, but it may differ in the degree to which it requires broad and detailed background knowledge and a familiarity with a technical vocabulary that only specialists may possess.

  • Aikenhead, G.S., 1987, “High-school graduates’ beliefs about science-technology-society. III. Characteristics and limitations of scientific knowledge”, Science Education , 71(4): 459–487.
  • Allchin, D., H.M. Andersen and K. Nielsen, 2014, “Complementary Approaches to Teaching Nature of Science: Integrating Student Inquiry, Historical Cases, and Contemporary Cases in Classroom Practice”, Science Education , 98: 461–486.
  • Anderson, C., 2008, “The end of theory: The data deluge makes the scientific method obsolete”, Wired magazine , 16(7): 16–07
  • Arabatzis, T., 2006, “On the inextricability of the context of discovery and the context of justification”, in Revisiting Discovery and Justification , J. Schickore and F. Steinle (eds.), Dordrecht: Springer, pp. 215–230.
  • Barnes, J. (ed.), 1984, The Complete Works of Aristotle, Vols I and II , Princeton: Princeton University Press.
  • Barnes, B. and D. Bloor, 1982, “Relativism, Rationalism, and the Sociology of Knowledge”, in Rationality and Relativism , M. Hollis and S. Lukes (eds.), Cambridge: MIT Press, pp. 1–20.
  • Bauer, H.H., 1992, Scientific Literacy and the Myth of the Scientific Method , Urbana: University of Illinois Press.
  • Bechtel, W. and R.C. Richardson, 1993, Discovering complexity , Princeton, NJ: Princeton University Press.
  • Berkeley, G., 1734, The Analyst in De Motu and The Analyst: A Modern Edition with Introductions and Commentary , D. Jesseph (trans. and ed.), Dordrecht: Kluwer Academic Publishers, 1992.
  • Blachowicz, J., 2009, “How science textbooks treat scientific method: A philosopher’s perspective”, The British Journal for the Philosophy of Science , 60(2): 303–344.
  • Bloor, D., 1991, Knowledge and Social Imagery, Chicago: University of Chicago Press, 2nd edition.
  • Boyle, R., 1682, New experiments physico-mechanical, touching the air , Printed by Miles Flesher for Richard Davis, bookseller in Oxford.
  • Bridgman, P.W., 1927, The Logic of Modern Physics , New York: Macmillan.
  • –––, 1956, “The Methodological Character of Theoretical Concepts”, in The Foundations of Science and the Concepts of Science and Psychology , Herbert Feigl and Michael Scriven (eds.), Minnesota: University of Minneapolis Press, pp. 38–76.
  • Burian, R., 1997, “Exploratory Experimentation and the Role of Histochemical Techniques in the Work of Jean Brachet, 1938–1952”, History and Philosophy of the Life Sciences , 19(1): 27–45.
  • –––, 2007, “On microRNA and the need for exploratory experimentation in post-genomic molecular biology”, History and Philosophy of the Life Sciences , 29(3): 285–311.
  • Carnap, R., 1928, Der logische Aufbau der Welt , Berlin: Bernary, transl. by R.A. George, The Logical Structure of the World , Berkeley: University of California Press, 1967.
  • –––, 1956, “The methodological character of theoretical concepts”, Minnesota studies in the philosophy of science , 1: 38–76.
  • Carrol, S., and D. Goodstein, 2009, “Defining the scientific method”, Nature Methods , 6: 237.
  • Churchman, C.W., 1948, “Science, Pragmatics, Induction”, Philosophy of Science , 15(3): 249–268.
  • Cooper, J. (ed.), 1997, Plato: Complete Works , Indianapolis: Hackett.
  • Darden, L., 1991, Theory Change in Science: Strategies from Mendelian Genetics , Oxford: Oxford University Press
  • Dewey, J., 1910, How we think , New York: Dover Publications (reprinted 1997).
  • Douglas, H., 2009, Science, Policy, and the Value-Free Ideal , Pittsburgh: University of Pittsburgh Press.
  • Dupré, J., 2004, “Miracle of Monism”, in Naturalism in Question, Mario De Caro and David Macarthur (eds.), Cambridge, MA: Harvard University Press, pp. 36–58.
  • Elliott, K.C., 2007, “Varieties of exploratory experimentation in nanotoxicology”, History and Philosophy of the Life Sciences , 29(3): 311–334.
  • Elliott, K. C., and T. Richards (eds.), 2017, Exploring inductive risk: Case studies of values in science , Oxford: Oxford University Press.
  • Falcon, Andrea, 2005, Aristotle and the science of nature: Unity without uniformity , Cambridge: Cambridge University Press.
  • Feyerabend, P., 1978, Science in a Free Society , London: New Left Books
  • –––, 1988, Against Method, London: Verso, 2nd edition.
  • Fisher, R.A., 1955, “Statistical Methods and Scientific Induction”, Journal of The Royal Statistical Society. Series B (Methodological) , 17(1): 69–78.
  • Foster, K. and P.W. Huber, 1999, Judging Science. Scientific Knowledge and the Federal Courts , Cambridge: MIT Press.
  • Fox Keller, E., 2003, “Models, Simulation, and ‘computer experiments’”, in The Philosophy of Scientific Experimentation , H. Radder (ed.), Pittsburgh: Pittsburgh University Press, 198–215.
  • Gilbert, G., 1976, “The transformation of research findings into scientific knowledge”, Social Studies of Science , 6: 281–306.
  • Gimbel, S., 2011, Exploring the Scientific Method , Chicago: University of Chicago Press.
  • Goodman, N., 1965, Fact, Fiction, and Forecast, Indianapolis: Bobbs-Merrill.
  • Haack, S., 1995, “Science is neither sacred nor a confidence trick”, Foundations of Science , 1(3): 323–335.
  • –––, 2003, Defending science—within reason , Amherst: Prometheus.
  • –––, 2005a, “Disentangling Daubert: an epistemological study in theory and practice”, Journal of Philosophy, Science and Law, 5. doi:10.5840/jpsl2005513
  • –––, 2005b, “Trial and error: The Supreme Court’s philosophy of science”, American Journal of Public Health , 95: S66-S73.
  • –––, 2010, “Federal Philosophy of Science: A Deconstruction-and a Reconstruction”, NYUJL & Liberty , 5: 394.
  • Hangel, N. and J. Schickore, 2017, “Scientists’ conceptions of good research practice”, Perspectives on Science , 25(6): 766–791
  • Harper, W.L., 2011, Isaac Newton’s Scientific Method: Turning Data into Evidence about Gravity and Cosmology , Oxford: Oxford University Press.
  • Hempel, C., 1950, “Problems and Changes in the Empiricist Criterion of Meaning”, Revue Internationale de Philosophie , 41(11): 41–63.
  • –––, 1951, “The Concept of Cognitive Significance: A Reconsideration”, Proceedings of the American Academy of Arts and Sciences , 80(1): 61–77.
  • –––, 1965, Aspects of scientific explanation and other essays in the philosophy of science , New York–London: Free Press.
  • –––, 1966, Philosophy of Natural Science , Englewood Cliffs: Prentice-Hall.
  • Holmes, F.L., 1987, “Scientific writing and scientific discovery”, Isis , 78(2): 220–235.
  • Howard, D., 2003, “Two left turns make a right: On the curious political career of North American philosophy of science at midcentury”, in Logical Empiricism in North America , G.L. Hardcastle & A.W. Richardson (eds.), Minneapolis: University of Minnesota Press, pp. 25–93.
  • Hoyningen-Huene, P., 2008, “Systematicity: The nature of science”, Philosophia , 36(2): 167–180.
  • –––, 2013, Systematicity. The Nature of Science , Oxford: Oxford University Press.
  • Howie, D., 2002, Interpreting probability: Controversies and developments in the early twentieth century , Cambridge: Cambridge University Press.
  • Hughes, R., 1999, “The Ising Model, Computer Simulation, and Universal Physics”, in Models as Mediators , M. Morgan and M. Morrison (eds.), Cambridge: Cambridge University Press, pp. 97–145
  • Hume, D., 1739, A Treatise of Human Nature , D. Fate Norton and M.J. Norton (eds.), Oxford: Oxford University Press, 2000.
  • Humphreys, P., 1995, “Computational science and scientific method”, Minds and Machines , 5(1): 499–512.
  • ICMJE, 2013, “Recommendations for the Conduct, Reporting, Editing, and Publication of Scholarly Work in Medical Journals”, International Committee of Medical Journal Editors, available online, accessed August 13, 2014.
  • Jeffrey, R.C., 1956, “Valuation and Acceptance of Scientific Hypotheses”, Philosophy of Science , 23(3): 237–246.
  • Kaufmann, W.J., and L.L. Smarr, 1993, Supercomputing and the Transformation of Science , New York: Scientific American Library.
  • Knorr-Cetina, K., 1981, The Manufacture of Knowledge , Oxford: Pergamon Press.
  • Krohs, U., 2012, “Convenience experimentation”, Studies in History and Philosophy of Biological and Biomedical Sciences, 43: 52–57.
  • Kuhn, T.S., 1962, The Structure of Scientific Revolutions , Chicago: University of Chicago Press
  • Latour, B. and S. Woolgar, 1986, Laboratory Life: The Construction of Scientific Facts, Princeton: Princeton University Press, 2nd edition.
  • Laudan, L., 1968, “Theories of scientific method from Plato to Mach”, History of Science , 7(1): 1–63.
  • Lenhard, J., 2006, “Models and statistical inference: The controversy between Fisher and Neyman-Pearson”, The British Journal for the Philosophy of Science , 57(1): 69–91.
  • Leonelli, S., 2012, “Making Sense of Data-Driven Research in the Biological and the Biomedical Sciences”, Studies in the History and Philosophy of the Biological and Biomedical Sciences , 43(1): 1–3.
  • Levi, I., 1960, “Must the scientist make value judgments?”, Philosophy of Science , 57(11): 345–357
  • Lindley, D., 1991, Theory Change in Science: Strategies from Mendelian Genetics , Oxford: Oxford University Press.
  • Lipton, P., 2004, Inference to the Best Explanation, London: Routledge, 2nd edition.
  • Marks, H.M., 2000, The progress of experiment: science and therapeutic reform in the United States, 1900–1990 , Cambridge: Cambridge University Press.
  • Mazzochi, F., 2015, “Could Big Data be the end of theory in science?”, EMBO reports , 16: 1250–1255.
  • Mayo, D.G., 1996, Error and the Growth of Experimental Knowledge , Chicago: University of Chicago Press.
  • McComas, W.F., 1996, “Ten myths of science: Reexamining what we think we know about the nature of science”, School Science and Mathematics , 96(1): 10–16.
  • Medawar, P.B., 1963/1996, “Is the scientific paper a fraud”, in The Strange Case of the Spotted Mouse and Other Classic Essays on Science , Oxford: Oxford University Press, 33–39.
  • Mill, J.S., 1963, Collected Works of John Stuart Mill , J. M. Robson (ed.), Toronto: University of Toronto Press
  • NAS, 1992, Responsible Science: Ensuring the integrity of the research process , Washington DC: National Academy Press.
  • Nersessian, N.J., 1987, “A cognitive-historical approach to meaning in scientific theories”, in The process of science , N. Nersessian (ed.), Berlin: Springer, pp. 161–177.
  • –––, 2008, Creating Scientific Concepts , Cambridge: MIT Press.
  • Newton, I., 1726, Philosophiae naturalis Principia Mathematica (3rd edition), in The Principia: Mathematical Principles of Natural Philosophy: A New Translation, I.B. Cohen and A. Whitman (trans.), Berkeley: University of California Press, 1999.
  • –––, 1704, Opticks or A Treatise of the Reflections, Refractions, Inflections & Colors of Light , New York: Dover Publications, 1952.
  • Neyman, J., 1956, “Note on an Article by Sir Ronald Fisher”, Journal of the Royal Statistical Society. Series B (Methodological) , 18: 288–294.
  • Nickles, T., 1987, “Methodology, heuristics, and rationality”, in Rational changes in science: Essays on Scientific Reasoning , J.C. Pitt (ed.), Berlin: Springer, pp. 103–132.
  • Nicod, J., 1924, Le problème logique de l’induction , Paris: Alcan. (Engl. transl. “The Logical Problem of Induction”, in Foundations of Geometry and Induction , London: Routledge, 2000.)
  • Nola, R. and H. Sankey, 2000a, “A selective survey of theories of scientific method”, in Nola and Sankey 2000b: 1–65.
  • –––, 2000b, After Popper, Kuhn and Feyerabend. Recent Issues in Theories of Scientific Method , London: Springer.
  • –––, 2007, Theories of Scientific Method , Stocksfield: Acumen.
  • Norton, S., and F. Suppe, 2001, “Why atmospheric modeling is good science”, in Changing the Atmosphere: Expert Knowledge and Environmental Governance , C. Miller and P. Edwards (eds.), Cambridge, MA: MIT Press, 88–133.
  • O’Malley, M., 2007, “Exploratory experimentation and scientific practice: Metagenomics and the proteorhodopsin case”, History and Philosophy of the Life Sciences , 29(3): 337–360.
  • O’Malley, M., C. Haufe, K. Elliot, and R. Burian, 2009, “Philosophies of Funding”, Cell , 138: 611–615.
  • Oreskes, N., K. Shrader-Frechette, and K. Belitz, 1994, “Verification, Validation and Confirmation of Numerical Models in the Earth Sciences”, Science , 263(5147): 641–646.
  • Osborne, J., S. Simon, and S. Collins, 2003, “Attitudes towards science: a review of the literature and its implications”, International Journal of Science Education , 25(9): 1049–1079.
  • Parascandola, M., 1998, “Epidemiology—2nd-Rate Science”, Public Health Reports, 113(4): 312–320.
  • Parker, W., 2008a, “Franklin, Holmes and the Epistemology of Computer Simulation”, International Studies in the Philosophy of Science , 22(2): 165–83.
  • –––, 2008b, “Computer Simulation through an Error-Statistical Lens”, Synthese , 163(3): 371–84.
  • Pearson, K. 1892, The Grammar of Science , London: J.M. Dents and Sons, 1951
  • Pearson, E.S., 1955, “Statistical Concepts in Their Relation to Reality”, Journal of the Royal Statistical Society , B, 17: 204–207.
  • Pickering, A., 1984, Constructing Quarks: A Sociological History of Particle Physics , Edinburgh: Edinburgh University Press.
  • Popper, K.R., 1959, The Logic of Scientific Discovery , London: Routledge, 2002
  • –––, 1963, Conjectures and Refutations , London: Routledge, 2002.
  • –––, 1985, Unended Quest: An Intellectual Autobiography, La Salle: Open Court Publishing Co.
  • Rudner, R., 1953, “The Scientist Qua Scientist Making Value Judgments”, Philosophy of Science , 20(1): 1–6.
  • Rudolph, J.L., 2005, “Epistemology for the masses: The origin of ‘The Scientific Method’ in American Schools”, History of Education Quarterly , 45(3): 341–376
  • Schickore, J., 2008, “Doing science, writing science”, Philosophy of Science , 75: 323–343.
  • Schickore, J. and N. Hangel, 2019, “‘It might be this, it should be that…’ uncertainty and doubt in day-to-day science practice”, European Journal for Philosophy of Science , 9(2): 31. doi:10.1007/s13194-019-0253-9
  • Shamoo, A.E. and D.B. Resnik, 2009, Responsible Conduct of Research , Oxford: Oxford University Press.
  • Shank, J.B., 2008, The Newton Wars and the Beginning of the French Enlightenment , Chicago: The University of Chicago Press.
  • Shapin, S. and S. Schaffer, 1985, Leviathan and the air-pump , Princeton: Princeton University Press.
  • Smith, G.E., 2002, “The Methodology of the Principia”, in The Cambridge Companion to Newton , I.B. Cohen and G.E. Smith (eds.), Cambridge: Cambridge University Press, 138–173.
  • Snyder, L.J., 1997a, “Discoverers’ Induction”, Philosophy of Science , 64: 580–604.
  • –––, 1997b, “The Mill-Whewell Debate: Much Ado About Induction”, Perspectives on Science , 5: 159–198.
  • –––, 1999, “Renovating the Novum Organum: Bacon, Whewell and Induction”, Studies in History and Philosophy of Science , 30: 531–557.
  • Sober, E., 2008, Evidence and Evolution. The logic behind the science , Cambridge: Cambridge University Press
  • Sprenger, J. and S. Hartmann, 2019, Bayesian philosophy of science , Oxford: Oxford University Press.
  • Steinle, F., 1997, “Entering New Fields: Exploratory Uses of Experimentation”, Philosophy of Science (Proceedings), 64: S65–S74.
  • –––, 2002, “Experiments in History and Philosophy of Science”, Perspectives on Science , 10(4): 408–432.
  • Strasser, B.J., 2012, “Data-driven sciences: From wonder cabinets to electronic databases”, Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences , 43(1): 85–87.
  • Succi, S. and P.V. Coveney, 2018, “Big data: the end of the scientific method?”, Philosophical Transactions of the Royal Society A , 377: 20180145. doi:10.1098/rsta.2018.0145
  • Suppe, F., 1998, “The Structure of a Scientific Paper”, Philosophy of Science , 65(3): 381–405.
  • Swijtink, Z.G., 1987, “The objectification of observation: Measurement and statistical methods in the nineteenth century”, in The probabilistic revolution. Ideas in History, Vol. 1 , L. Kruger (ed.), Cambridge MA: MIT Press, pp. 261–285.
  • Waters, C.K., 2007, “The nature and context of exploratory experimentation: An introduction to three case studies of exploratory research”, History and Philosophy of the Life Sciences , 29(3): 275–284.
  • Weinberg, S., 1995, “The methods of science… and those by which we live”, Academic Questions , 8(2): 7–13.
  • Weissert, T., 1997, The Genesis of Simulation in Dynamics: Pursuing the Fermi-Pasta-Ulam Problem , New York: Springer Verlag.
  • Harvey, W., 1628, Exercitatio Anatomica de Motu Cordis et Sanguinis in Animalibus, in On the Motion of the Heart and Blood in Animals, R. Willis (trans.), Buffalo: Prometheus Books, 1993.
  • Winsberg, E., 2010, Science in the Age of Computer Simulation , Chicago: University of Chicago Press.
  • Wivagg, D. & D. Allchin, 2002, “The Dogma of the Scientific Method”, The American Biology Teacher , 64(9): 645–646
  • Blackmun opinion , in Daubert v. Merrell Dow Pharmaceuticals (92–102), 509 U.S. 579 (1993).
  • Scientific Method at philpapers. Darrell Rowbottom (ed.).
  • Recent Articles | Scientific Method | The Scientist Magazine

al-Kindi | Albert the Great [= Albertus magnus] | Aquinas, Thomas | Arabic and Islamic Philosophy, disciplines in: natural philosophy and natural science | Arabic and Islamic Philosophy, historical and methodological topics in: Greek sources | Arabic and Islamic Philosophy, historical and methodological topics in: influence of Arabic and Islamic Philosophy on the Latin West | Aristotle | Bacon, Francis | Bacon, Roger | Berkeley, George | biology: experiment in | Boyle, Robert | Cambridge Platonists | confirmation | Descartes, René | Enlightenment | epistemology | epistemology: Bayesian | epistemology: social | Feyerabend, Paul | Galileo Galilei | Grosseteste, Robert | Hempel, Carl | Hume, David | Hume, David: Newtonianism and Anti-Newtonianism | induction: problem of | Kant, Immanuel | Kuhn, Thomas | Leibniz, Gottfried Wilhelm | Locke, John | Mill, John Stuart | More, Henry | Neurath, Otto | Newton, Isaac | Newton, Isaac: philosophy | Ockham [Occam], William | operationalism | Peirce, Charles Sanders | Plato | Popper, Karl | rationality: historicist theories of | Reichenbach, Hans | reproducibility, scientific | Schlick, Moritz | science: and pseudo-science | science: theory and observation in | science: unity of | scientific discovery | scientific knowledge: social dimensions of | simulations in science | skepticism: medieval | space and time: absolute and relational space and motion, post-Newtonian theories | Vienna Circle | Whewell, William | Zabarella, Giacomo

Copyright © 2021 by Brian Hepburn <brian.hepburn@wichita.edu> and Hanne Andersen <hanne.andersen@ind.ku.dk>

The Stanford Encyclopedia of Philosophy is copyright © 2023 by The Metaphysics Research Lab , Department of Philosophy, Stanford University

Library of Congress Catalog Data: ISSN 1095-5054

What Is a Hypothesis? (Science)

A hypothesis (plural hypotheses) is a proposed explanation for an observation. The definition depends on the subject.

In science, a hypothesis is part of the scientific method. It is a prediction or explanation that is tested by an experiment. Observations and experiments may disprove a scientific hypothesis, but can never entirely prove one.

In the study of logic, a hypothesis is an if-then proposition, typically written in the form "If X, then Y."

In common usage, a hypothesis is simply a proposed explanation or prediction, which may or may not be tested.

Writing a Hypothesis

Most scientific hypotheses are proposed in the if-then format because it's easy to design an experiment to see whether or not a cause-and-effect relationship exists between the independent variable and the dependent variable. The hypothesis is written as a prediction of the outcome of the experiment.

Null Hypothesis and Alternative Hypothesis

Statistically, it's easier to test the precise claim that two variables are unrelated than to establish directly that they are connected. So, scientists often propose the null hypothesis. The null hypothesis assumes changing the independent variable will have no effect on the dependent variable.

In contrast, the alternative hypothesis suggests changing the independent variable will have an effect on the dependent variable. Designing an experiment to test this hypothesis can be trickier because there are many ways to state an alternative hypothesis.

For example, consider a possible relationship between getting a good night's sleep and getting good grades. The null hypothesis might be stated: "The number of hours of sleep students get is unrelated to their grades" or "There is no correlation between hours of sleep and grades."

An experiment to test this hypothesis might involve collecting data: recording the average hours of sleep and the grades for each student. If students who get eight hours of sleep generally do better than students who get four hours or 10 hours of sleep, the hypothesis might be rejected.

But the alternative hypothesis is harder to propose and test. The most general statement would be: "The amount of sleep students get affects their grades." The hypothesis might also be stated as "If you get more sleep, your grades will improve" or "Students who get nine hours of sleep have better grades than those who get more or less sleep."

In an experiment, you can collect the same data, but the statistical analysis is less likely to give you a high confidence limit.

Usually, a scientist starts out with the null hypothesis. From there, it may be possible to propose and test an alternative hypothesis, to narrow down the relationship between the variables.
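The null hypothesis about sleep and grades described above can be checked with a simple permutation test. The sketch below is only an illustration: the `hours` and `grades` lists are made-up numbers, not real measurements, and the function names `pearson_r` and `permutation_p_value` are hypothetical helpers written for this example. The idea is to shuffle the grades many times, which simulates a world where the null hypothesis is true, and count how often a shuffled dataset shows a correlation at least as strong as the observed one.

```python
import random
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(xs, ys, trials=10_000, seed=0):
    """Estimate how often shuffled data correlates as strongly as the real data.

    Shuffling ys breaks any real link between xs and ys, so each shuffle
    is one draw from a world in which the null hypothesis holds.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (hits + 1) / (trials + 1)


# Made-up illustrative data: average hours of sleep and grade per student.
hours = [4, 5, 6, 6, 7, 7, 8, 8, 9, 10]
grades = [62, 66, 71, 69, 78, 80, 85, 88, 84, 79]

r = pearson_r(hours, grades)
p = permutation_p_value(hours, grades)
print(f"r = {r:.3f}, permutation p-value = {p:.4f}")
```

A small p-value would lead to rejecting the null hypothesis of "no correlation"; it would not, by itself, establish which of the more specific alternative hypotheses is correct.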

Example of a Hypothesis

Examples of a hypothesis include:

  • If you drop a rock and a feather, (then) they will fall at the same rate.
  • Plants need sunlight in order to live. (if sunlight, then life)
  • Eating sugar gives you energy. (if sugar, then energy)

Gregg Henriques Ph.D.

Evolutionary Psychology

The Conceptual Unification of Psychology

A unified framework will benefit the field.

Posted February 29, 2012

Have you ever looked across the field of psychology, and noted with some dismay all the various approaches taken in the discipline? Have you ever found yourself wondering if psychology was really a coherent discipline, and if so, why was it so hard to clearly define? Or have you ever considered whether all those approaches could someday be unified, such that the key insights from the various branches and paradigms could be coherently connected into one grand metapsychology?

Questions like these awakened in me a deep intellectual curiosity that ultimately culminated in the development of the "unified theory." Trained as a clinical psychologist, I was fortunate in that early in my graduate education I gained a rich exposure to the psychotherapy integration movement, which is the idea that the best of the best approaches to psychotherapy should somehow be integrated. This led me to many important realizations about psychotherapy, including: a) many of the "single" schools were defined against one another both conceptually and politically; b) no single school had the depth and breadth in both the humanistic and scientific domains to offer a comprehensive solution; and c) much overlap between the schools becomes apparent as one becomes proficient in their language and concepts. However, despite these problems, there were significant difficulties in achieving a coherent integrative view.

First, the competing schools clearly had different (although often implicit) moral emphases. Second, if one considers, as I do, psychotherapy to be the application of psychological principles in the service of promoting human well-being, then it follows that the disorganization of psychological science seriously hampers, if not completely prevents, the development of a coherent, general approach to psychotherapy.

Although now obvious with the benefit of hindsight, I essentially backed into this second point. I was looking for basic, core conceptual commonalities that cut across the various perspectives in psychotherapy and started to explore a broad array of literatures. Fortunately, evolutionary psychology was just beginning to make a major impact on the field, and in it I found a major piece of the puzzle. All the major perspectives were grounded in an evolutionary perspective; thus, it could provide a shared point of departure from which to view each of the competing paradigms.

The Development of the Justification Hypothesis

Learning about evolutionary theory set the stage for what I consider to this day to be a major theoretical breakthrough: an idea I came to call the Justification Hypothesis. So what, exactly, is the Justification Hypothesis? Technically, it is the idea that the evolution of language created the adaptive problem of social justification and this adaptive problem shaped the design features of the human self-consciousness system. In more general terms, and more importantly for this context, it means that we can think about the organization of human reflective thought and human culture in terms of justification systems.

Although it would take years to develop into a formal proposal, the proverbial "flash" of insight came on a drive home after completing a psychological evaluation on a woman hospitalized following a suicide attempt. In her late thirties, she was diagnosed with a double depression and an avoidant personality disorder. A woman with an above average intellect, she had graduated from high school, worked as a teacher's aide and lived in almost complete isolation on the brink of poverty. In a reasonably familiar story line, her father was an authoritarian, verbally abusive alcoholic who dominated her timid, submissive mother. He would also be physically and violently abusive to her older brother, who was much more defiant of his power. She distinctly remembered several episodes of her father beating her brother, while yelling at him that he needed to be more like his obedient sister. Perhaps the most salient feature of this patient's character structure was her complete sense of inadequacy. She viewed herself as totally incompetent in almost every conceivable way and expressed an extreme dependency on the guidance of others. In presenting the case to my supervisor and classmates, I argued that the network of self-deprecating beliefs served an obvious function, given her developmental history. Namely, the beliefs she had about herself had justified submission and deference in a context where any form of defiance was severely punished. It was the first time I explicitly used the concept of justification to describe how language-based beliefs about self and others were functionally organized.

I arrived home about half an hour late following the discussion about the patient and found myself explaining to my wife that traffic was particularly bad. Traffic had been bad, but the reality also was that it only accounted for about ten minutes of my tardiness. I had left work twenty minutes later than anticipated because I was eagerly discussing the patient's dynamics with my fellow students. In a moment of heightened self-reflection, I became acutely aware that this reason for my tardiness was much less emphasized as I explained my actions to my wife. My mind had effortlessly accessed the traffic reason and had just as effortlessly suppressed the reason that was significantly less justifiable, at least as far as my wife was concerned at the moment. It was upon reflecting on my own justifications and how they were selected that the broad generalization dawned on me. The patient was not the only individual whose "justification system" for why she was the way she was could be understood as arising out of her developmental history and social context.

I came to see processes of justification as being ubiquitous in human affairs. Arguments, debates, moral dictates, rationalizations, and excuses, as well as many of the more core beliefs about the self, all involve the process of explaining why one's claims, thoughts, or actions are warranted. In virtually every form of social exchange, from blogging to warfare to politics to family struggles to science, humans are constantly justifying their behaviors to themselves and to others. Moreover, it was not only that one sees the process of justification everywhere one looks in human affairs that made the idea so intriguing. It became clear upon reflection that the process is a uniquely human phenomenon. Other animals communicate, struggle for dominance, and form alliances. But they don't justify why they do what they do. Indeed, if I had to boil the uniqueness of human nature down to one word, it would be justification. We are the justifying animal.

The JH became an obsession for me because the idea seemed to cut across many different areas of thought. It was obviously congruent with basic insights from a psychodynamic perspective. It was also clearly consistent with many of the foremost concerns of the humanists. For example, Rogers's argument that much psychopathology can be understood as a split between the social self and the true self could be easily understood through the lens of the JH. Consider how a judgmental, powerful other might force particular justifications in a manner that produces intrapsychic rifts between how a person "really" feels and how they must say they feel. The JH is also directly consistent with cognitive psychotherapy, which can be readily interpreted as a systematic approach to identifying and testing one's justification system. But the idea also pulled in psychological science. Cognitive dissonance, the self-serving bias, human reasoning biases, and the "interpreter function" of the left hemisphere were all readily accounted for by the formulation of the JH. The JH also seamlessly incorporated insights from those who emphasize cultural levels of analysis.

The Tree of Knowledge System: The Second Key Insight

By clearly delineating the dimension of human behavior from the behavior of other animals, a fascinating new formulation began to emerge, which I called the Tree of Knowledge System and depicted in the following graphic.

[Figure: the Tree of Knowledge System]

The ToK System offers a vision of emergent evolution as consisting of one level of pure information (Energy) and four levels or dimensions of complexity (Matter, Life, Mind, and Culture) that correspond to the behavior of four classes of objects (material objects, organisms, animals, and humans), and four classes of science (physical, biological, psychological, and social).

Another key element of the system is that each of the four dimensions is associated with a theoretical joint point that provides the causal explanatory framework for its emergence. As explained in the prior post, the modern evolutionary synthesis is the theoretical merger of Darwin's theory of natural selection and genetics, and provides for the conceptual unification of biology. Biology is a unified discipline precisely because it has a clear, well-established definition (the science of Life), an agreed upon subject matter (organisms), and a theoretical system that provides the causal explanatory framework for its emergence (natural selection operating on genetic combinations across the generations). It is this crisp conceptual organization that leaves scientifically minded psychologists with feelings of bio-envy.

If the modern evolutionary synthesis represents the Matter-to-Life joint point, what about the Life-to-Mind and Mind-to-Culture joint-points? Here is where the unified theory does its best work. It shows that Skinner's ideas can be combined with cognitive neuroscience to provide the framework for the Life-to-Mind joint-point. This idea is called Behavioral Investment Theory. And the Justification Hypothesis connects Freud's key observations with modern social and cognitive psychology to provide the framework for the Mind-to-Culture joint-point. Together, these two theoretical joint-points "box in" psychology and provide a unified theoretical framework for the field.

Not only does the system provide a way to theoretically integrate perspectives that have been very disparate, it also provides a powerful new tool in carving out the proper conception of the field at large. Consider that even a preliminary analysis mapping the ToK System onto the varying conceptions of the discipline suggests that the idea of what psychology is about has historically spanned two fundamentally separate problems: (1) the problem of animal behavior in general (Mind on the ToK System), and (2) the problem of human behavior at the individual level (Culture). Through the meta-level view afforded by the ToK System, we can now see that previous efforts to define the field have failed in part because they have attempted to force one solution onto a problem that consists of two fundamentally distinct dimensions.

I teach my students that the science of psychology should be divided into two large scientific domains of (1) basic psychology and (2) human psychology. Basic psychology is defined as the science of Mind (mental behavior) and corresponds to the behavior of animals. Human psychology is considered to be a unique subset of psychological formalism that deals with human behavior at the level of the individual. Because human behavior is immersed in the larger socio-cultural context (dimension four in the ToK System), human psychology is considered a hybrid discipline that merges the pure science of psychology with the social sciences. The crisp boundary system that I am proposing is in contrast to others who have conceived of the science of psychology as existing in a vague, amorphous space between biology and the social sciences.

The "critic" in the post that started this series claimed that psychology could not be conceptually unified because there are too many political forces that pull it apart. However, what many psychologists are now beginning to see is that with the right map, we can, in fact, rise above the political forces, and move toward a more coherent, accurate, integrative, and healthy vision of what the field is all about.

Gregg Henriques, Ph.D., is a professor of psychology at James Madison University.

Chemistry LibreTexts

1.6: Hypothesis, Theories, and Laws

  Learning Objectives

  • Describe the difference between hypothesis and theory as scientific terms.
  • Describe the difference between a theory and scientific law.

Although many have taken science classes throughout the course of their studies, people often have incorrect or misleading ideas about some of the most important and basic principles in science. Most students have heard of hypotheses, theories, and laws, but what do these terms really mean? Prior to reading this section, consider what you have learned about these terms before. What do these terms mean to you? What do you read that contradicts or supports what you thought?

What is a Fact?

A fact is a basic statement established by experiment or observation. All facts are true under the specific conditions of the observation.

What is a Hypothesis?

One of the most common terms used in science classes is a "hypothesis". The word can have many different definitions, depending on the context in which it is being used:

  • An educated guess: a scientific hypothesis provides a suggested solution based on evidence.
  • Prediction: if you have ever carried out a science experiment, you probably made this type of hypothesis when you predicted the outcome of your experiment.
  • Tentative or proposed explanation: hypotheses can be suggestions about why something is observed. In order for it to be scientific, however, a scientist must be able to test the explanation to see if it works and if it is able to correctly predict what will happen in a situation. For example, "if my hypothesis is correct, we should see ___ result when we perform ___ test."
A hypothesis is very tentative; it can be easily changed.

What is a Theory?

The United States National Academy of Sciences describes what a theory is as follows:

"Some scientific explanations are so well established that no new evidence is likely to alter them. The explanation becomes a scientific theory. In everyday language a theory means a hunch or speculation. Not so in science. In science, the word theory refers to a comprehensive explanation of an important feature of nature supported by facts gathered over time. Theories also allow scientists to make predictions about as yet unobserved phenomena."

"A scientific theory is a well-substantiated explanation of some aspect of the natural world, based on a body of facts that have been repeatedly confirmed through observation and experimentation. Such fact-supported theories are not "guesses" but reliable accounts of the real world. The theory of biological evolution is more than "just a theory." It is as factual an explanation of the universe as the atomic theory of matter (stating that everything is made of atoms) or the germ theory of disease (which states that many diseases are caused by germs). Our understanding of gravity is still a work in progress. But the phenomenon of gravity, like evolution, is an accepted fact."

Note some key features of theories that are important to understand from this description:

  • Theories are explanations of natural phenomena. They aren't predictions (although we may use theories to make predictions). They are explanations as to why we observe something.
  • Theories aren't likely to change. They have a large amount of support and are able to satisfactorily explain numerous observations. Theories can, indeed, be facts. Theories can change, but it is a long and difficult process. In order for a theory to change, there must be many observations or pieces of evidence that the theory cannot explain.
  • Theories are not guesses. The phrase "just a theory" has no room in science. To be a scientific theory carries a lot of weight; it is not just one person's idea about something.
Theories aren't likely to change.

What is a Law?

Scientific laws are similar to scientific theories in that they are principles that can be used to predict the behavior of the natural world. Both scientific laws and scientific theories are typically well-supported by observations and/or experimental evidence. Usually scientific laws refer to rules for how nature will behave under certain conditions, frequently written as an equation. Scientific theories are more overarching explanations of how nature works and why it exhibits certain characteristics. As a comparison, theories explain why we observe what we do and laws describe what happens.

For example, around the year 1800, Jacques Charles and other scientists were working with gases to, among other reasons, improve the design of the hot air balloon. These scientists found, after many, many tests, that certain patterns existed in the observations on gas behavior. If the temperature of a gas is increased at constant pressure, the volume of the gas increases. This is known as a natural law. A law is a relationship that exists between variables in a group of data. Laws describe the patterns we see in large amounts of data, but do not describe why the patterns exist.
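To make "frequently written as an equation" concrete, the gas pattern described above is usually written as V1 / T1 = V2 / T2, with temperature measured in kelvin and pressure held constant. The short sketch below applies that relationship; the function name `charles_law_volume` and the sample numbers are assumptions chosen for illustration, not values from the text.

```python
def charles_law_volume(v1, t1_kelvin, t2_kelvin):
    """Predict the new volume of a gas at constant pressure.

    Uses V1 / T1 = V2 / T2, so V2 = V1 * T2 / T1.
    Temperatures must be absolute (kelvin), not Celsius.
    """
    if t1_kelvin <= 0 or t2_kelvin <= 0:
        raise ValueError("temperatures must be positive kelvin values")
    return v1 * t2_kelvin / t1_kelvin


# Hypothetical example: 2.0 L of gas warmed from 300 K to 450 K.
new_volume = charles_law_volume(2.0, 300.0, 450.0)
print(new_volume)  # 3.0
```

Note that the function describes what happens to the volume; it says nothing about why gases behave this way, which is the job of a theory.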

What is a Belief?

A belief is a statement that is not scientifically provable. Beliefs may or may not be incorrect; they just are outside the realm of science to explore.

Laws vs. Theories

A common misconception is that scientific theories are rudimentary ideas that will eventually graduate into scientific laws when enough data and evidence have accumulated. A theory does not change into a scientific law with the accumulation of new or better evidence. Remember, theories are explanations and laws are patterns we see in large amounts of data, frequently written as an equation. A theory will always remain a theory; a law will always remain a law.

Video 1: What’s the difference between a scientific law and theory?

  • A hypothesis is a tentative explanation that can be tested by further investigation.
  • A theory is a well-supported explanation of observations.
  • A scientific law is a statement that summarizes the relationship between variables.
  • An experiment is a controlled method of testing a hypothesis.

Contributions & Attributions

Marisa Alviar-Agnew (Sacramento City College)

Henry Agnew (UC Davis)

How to Write a Strong Hypothesis | Steps & Examples

Published on May 6, 2022 by Shona McCombes. Revised on November 20, 2023.

A hypothesis is a statement that can be tested by scientific research. If you want to test a relationship between two or more variables, you need to write hypotheses before you start your experiment or data collection.

Example: Hypothesis

Daily apple consumption leads to fewer doctor’s visits.

Table of contents

  • What is a hypothesis?
  • Developing a hypothesis (with example)
  • Hypothesis examples
  • Other interesting articles
  • Frequently asked questions about writing hypotheses

A hypothesis states your predictions about what your research will find. It is a tentative answer to your research question that has not yet been tested. For some research projects, you might have to write several hypotheses that address different aspects of your research question.

A hypothesis is not just a guess – it should be based on existing theories and knowledge. It also has to be testable, which means you can support or refute it through scientific research methods (such as experiments, observations and statistical analysis of data).

Variables in hypotheses

Hypotheses propose a relationship between two or more types of variables.

  • An independent variable is something the researcher changes or controls.
  • A dependent variable is something the researcher observes and measures.

If there are any control variables, extraneous variables, or confounding variables, be sure to jot those down as you go to minimize the chances that research bias will affect your results.

For example, in a hypothesis that exposure to the sun affects happiness, the independent variable is exposure to the sun (the assumed cause). The dependent variable is the level of happiness (the assumed effect).

Step 1. Ask a question

Writing a hypothesis begins with a research question that you want to answer. The question should be focused, specific, and researchable within the constraints of your project.

Step 2. Do some preliminary research

Your initial answer to the question should be based on what is already known about the topic. Look for theories and previous studies to help you form educated assumptions about what your research will find.

At this stage, you might construct a conceptual framework to ensure that you’re embarking on a relevant topic . This can also help you identify which variables you will study and what you think the relationships are between them. Sometimes, you’ll have to operationalize more complex constructs.

Step 3. Formulate your hypothesis

Now you should have some idea of what you expect to find. Write your initial answer to the question in a clear, concise sentence.

Step 4. Refine your hypothesis

You need to make sure your hypothesis is specific and testable. There are various ways of phrasing a hypothesis, but all the terms you use should have clear definitions, and the hypothesis should contain:

  • The relevant variables
  • The specific group being studied
  • The predicted outcome of the experiment or analysis

Step 5. Phrase your hypothesis in three ways

To identify the variables, you can write a simple prediction in if…then form. The first part of the sentence states the independent variable and the second part states the dependent variable.

In academic research, hypotheses are more commonly phrased in terms of correlations or effects, where you directly state the predicted relationship between variables.

If you are comparing two groups, the hypothesis can state what difference you expect to find between them.
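The three phrasings described above (if…then, correlation or effect, and group difference) can be sketched as templates that all draw on the same variables. This is only an illustration: the `Hypothesis` class and its method names are hypothetical, and the wording of each template is one possible phrasing rather than a required format. The sample values reuse the lecture attendance scenario from this article.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    group: str        # the specific group being studied
    independent: str  # the variable the researcher changes or compares
    dependent: str    # the variable the researcher observes and measures

    def if_then(self):
        """Simple prediction: independent variable first, dependent second."""
        return (f"If {self.group} increase their {self.independent}, "
                f"then their {self.dependent} will increase.")

    def correlation(self):
        """Direct statement of the predicted relationship."""
        return (f"Among {self.group}, {self.independent} is positively "
                f"correlated with {self.dependent}.")

    def group_difference(self):
        """Expected difference between two groups."""
        return (f"{self.group.capitalize()} with a higher {self.independent} "
                f"have higher {self.dependent} than those with a lower "
                f"{self.independent}.")


h = Hypothesis(
    group="first-year students",
    independent="number of lectures attended",
    dependent="final exam scores",
)
for phrasing in (h.if_then(), h.correlation(), h.group_difference()):
    print(phrasing)
```

Keeping the group and both variables as explicit fields is a quick check that the hypothesis names everything Step 4 asks for before any wording is chosen.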

Step 6. Write a null hypothesis

If your research involves statistical hypothesis testing, you will also have to write a null hypothesis. The null hypothesis is the default position that there is no association between the variables. The null hypothesis is written as H0, while the alternative hypothesis is H1 or Ha.

  • H0: The number of lectures attended by first-year students has no effect on their final exam scores.
  • H1: The number of lectures attended by first-year students has a positive effect on their final exam scores.

Null and alternative hypotheses are used in statistical hypothesis testing. The null hypothesis of a test always predicts no effect or no relationship between variables, while the alternative hypothesis states your research prediction of an effect or relationship.

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses, by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.
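As a minimal sketch of "calculating how likely it is that a pattern could have arisen by chance," consider a hypothetical case that is not from this article: a coin lands heads 9 times in 10 flips, and the null hypothesis is that the coin is fair. The function below, whose name `binomial_tail_p_value` is an assumption made for this example, adds up the exact probabilities of every outcome at least that extreme under the null hypothesis.

```python
from math import comb


def binomial_tail_p_value(successes, trials, p_null=0.5):
    """One-sided p-value: P(X >= successes) if X ~ Binomial(trials, p_null).

    Sums the probability of the observed count and of every more
    extreme count, assuming the null hypothesis is true.
    """
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical observation: 9 heads in 10 flips of a supposedly fair coin.
p_value = binomial_tail_p_value(9, 10)
print(p_value)  # 0.0107421875, i.e. 11/1024
```

A result this unlikely under the null hypothesis would usually count as evidence against a fair coin, although the cutoff for "unlikely enough" is a convention chosen by the researcher, not something the calculation supplies.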

Cite this Scribbr article

McCombes, S. (2023, November 20). How to Write a Strong Hypothesis | Steps & Examples. Scribbr. Retrieved April 15, 2024, from https://www.scribbr.com/methodology/hypothesis/

Definition of hypothesis

Did you know?

The Difference Between Hypothesis and Theory

A hypothesis is an assumption, an idea that is proposed for the sake of argument so that it can be tested to see if it might be true.

In the scientific method, the hypothesis is constructed before any applicable research has been done, apart from a basic background review. You ask a question, read up on what has been studied before, and then form a hypothesis.

A hypothesis is usually tentative; it's an assumption or suggestion made strictly for the objective of being tested.

A theory, in contrast, is a principle that has been formed as an attempt to explain things that have already been substantiated by data. It is used in the names of a number of principles accepted in the scientific community, such as the Big Bang Theory. Because of the rigors of experimentation and control, it is understood to be more likely to be true than a hypothesis is.

In non-scientific use, however, hypothesis and theory are often used interchangeably to mean simply an idea, speculation, or hunch, with theory being the more common choice.

Since this casual use does away with the distinctions upheld by the scientific community, hypothesis and theory are prone to being wrongly interpreted even when they are encountered in scientific contexts—or at least, contexts that allude to scientific study without making the critical distinction that scientists employ when weighing hypotheses and theories.

The most common occurrence is when theory is interpreted—and sometimes even gleefully seized upon—to mean something having less truth value than other scientific principles. (The word law applies to principles so firmly established that they are almost never questioned, such as the law of gravity.)

This mistake is one of projection: since we use theory in general to mean something lightly speculated, then it's implied that scientists must be talking about the same level of uncertainty when they use theory to refer to their well-tested and reasoned principles.

The distinction has come to the forefront particularly on occasions when the content of science curricula in schools has been challenged, notably when a school board in Georgia put stickers on textbooks stating that evolution was "a theory, not a fact, regarding the origin of living things." As Kenneth R. Miller, a cell biologist at Brown University, has said, a theory "doesn’t mean a hunch or a guess. A theory is a system of explanations that ties together a whole bunch of facts. It not only explains those facts, but predicts what you ought to find from other observations and experiments."

While theories are never completely infallible, they form the basis of scientific reasoning because, as Miller said, "to the best of our ability, we’ve tested them, and they’ve held up."

  • proposition
  • supposition

hypothesis, theory, law mean a formula derived by inference from scientific data that explains a principle operating in nature.

hypothesis implies insufficient evidence to provide more than a tentative explanation.

theory implies a greater range of evidence and greater likelihood of truth.

law implies a statement of order and relation in nature that has been found to be invariable under the same conditions.


Word History

Greek, from hypotithenai to put under, suppose, from hypo- + tithenai to put — more at do

1641, in the meaning defined at sense 1a

Phrases Containing hypothesis

  • Whorfian hypothesis
  • null hypothesis
  • nebular hypothesis
  • counter-hypothesis
  • planetesimal hypothesis


Cite this Entry

“Hypothesis.” Merriam-Webster.com Dictionary, Merriam-Webster, https://www.merriam-webster.com/dictionary/hypothesis. Accessed 16 Apr. 2024.



A unified mechanism for intron and exon definition and back-splicing

1. Department of Biochemistry and Molecular Genetics, University of Colorado Denver Anschutz Medical Campus, Aurora, CO 80045, USA

Shiheng Liu

2. Department of Microbiology, Immunology, and Molecular Genetics, UCLA, Los Angeles, CA 90095, USA

3. Electron Imaging Center for Nanomachines University of California, Los Angeles (UCLA), Los Angeles, CA 90095, USA

Lingdi Zhang

Aaron Issaian, Ryan C. Hill, Sara Espinosa, Yanxiang Cui, Kalli Kappel

4. Biophysics Program, Stanford University, Stanford, CA USA

5. Department of Biochemistry and Department of Physics, Stanford University, Stanford, CA USA

Kirk C. Hansen

Z. Hong Zhou

6. RNA Bioscience Initiative, School of Medicine, University of Colorado Denver Anschutz Medical Campus, Aurora, CO 80045, USA


The molecular mechanisms of exon definition and back-splicing are fundamental unanswered questions in pre-mRNA splicing. Here we report cryoEM structures of the yeast E complex assembled on introns, providing the first view of the earliest event in the splicing cycle that commits pre-mRNAs to splicing. The E complex architecture suggests that the same spliceosome can assemble across an exon, which either remodels to span an intron for canonical linear splicing (typically on short exons) or catalyzes back-splicing generating circRNA (on long exons). The model is supported by our experiments demonstrating that E complex assembled on the yeast EFM5 or HMRA1 middle exon can be chased into circRNA when the exon is sufficiently long. This simple model unifies intron definition, exon definition, and back-splicing through the same spliceosome in all eukaryotes and should inspire experiments in many other systems to understand the mechanism and regulation of these processes.

The spliceosome sequentially forms the E, A, Pre-B, B, Bact, B*, C, C*, P, and ILS complexes through the splicing cycle. CryoEM structures of all but one of the S. cerevisiae (yeast) spliceosomal complexes 1 , 2 have provided valuable information on later stages of the splicing cycle. There is, however, a lack of structural and mechanistic understanding of E complex formation, the earliest event that initiates the splicing cycle. Thus, how the splicing machinery accurately defines introns and exons remains a fundamental unanswered question. In yeast, which typically contains small introns and large exons, intron definition, where the spliceosome initially recognizes and assembles across an intron, seems to dominate 3 . On the other hand, exon definition 4 prevails in vertebrates, where small exons and large introns are prevalent. In the exon definition model, the spliceosome recognizes and assembles across an exon first. However, in order to splice out introns, it was assumed that the exon definition complex (EDC) needs to be remodeled to a cross-intron complex. Support for the exon definition model is largely circumstantial, and biochemical and structural analyses of the exon definition process are limited. Although the EDC seems to be similar to the intron definition complex (IDC) in composition 5 , 6 , we do not know whether the two complexes differ in their structural organization or how an EDC remodels to span an intron.

In addition to canonical splicing, a peculiar back-splicing reaction generates a class of circular RNAs (circRNAs) observed in diverse eukaryotic species, prompting the speculation that back-splicing is also an ancient and conserved feature of the eukaryotic gene expression pathway 7 . CircRNAs play roles in the regulation of their host genes or microRNAs, in aging, and in disease processes 8 . Although canonical splicing signals and the spliceosome are needed for circRNA production 9 , the exact players and mechanism of back-splicing remain unknown.

To fill these gaps, we set out to obtain molecular details of the earliest step of the yeast splicing cycle that commits a pre-mRNA to splicing. In yeast, intron recognition is initiated by the binding of U1 snRNP on the 5’ splice site (ss) 10 - 12 , and the recognition of the branch point sequence (BPS) by the BBP and Mud2 heterodimer (the 3’ ss is not recognized until much later), forming the E complex (also referred to as the CC2 complex) 13 . Here we report the cryoEM structure of the yeast E complex assembled on either the Act1 pre-mRNA or the Ubc4 pre-mRNA. These structures and subsequent biochemical analyses reveal a unified mechanism for intron definition, exon definition and remodeling, and back-splicing-mediated circRNA biogenesis.

In vitro assembled E complex is functional

After discovering that E complex purified from yeast is too heterogeneous for structure determination, we assembled the E complex in vitro using uncapped 3xMS2-Act1 pre-mRNA (M3-Act1) and purified U1 snRNP, BBP and Mud2 proteins (referred to as the Act1 complex). The complex was purified sequentially by the MS2 tag and the CBP tag on U1A and Mud2. After RNase H cleavage of M3-Act1 into two fragments (Extended Data Fig. 1a-b), the MS2 tag still pulled down all U1 snRNP proteins, BBP, and Mud2 (Fig. 1a), confirming that U1 snRNP and BBP/Mud2 interact instead of being simply tethered through M3-Act1. In addition, the assembled Act1 complex can be chased into spliced M3-Act1 in U1 snRNA-depleted yeast extract (Fig. 1b, lane 5). Although excess Act1-3xMS2 pre-mRNA (Act1-M3) can effectively compete with free M3-Act1 for splicing (Fig. 1b, lane 2), it cannot compete with assembled Act1 complex (Fig. 1b, lane 6). These data indicate that our assembled E complex has not fallen apart significantly in the splicing extract and is functional.

Fig. 1

(a) The assembled E complex (with or without DNA oligo-directed RNase H treatment to cleave between the 5’ ss and BPS) is purified using the MS2 tag on pre-mRNA and its protein components shown. (b) Yeast splicing extract with or without U1 snRNA depletion is incubated with in vitro transcribed M3-Act1 or E complex assembled on M3-Act1 in the presence or absence of ATP or excess Act1-M3 (top gel). The splicing outcome is monitored using RT-PCR with primers located in the MS2 binding site region and exon 2 of M3-Act1. The middle and bottom gels demonstrate levels of U1 and U2 snRNA in each sample. Experiments in Fig. 1 were repeated two additional times with similar results. For all gel source data in this paper, see Supplementary Figure 1 .

5’ ss recognition is facilitated by proteins and RNA secondary structures

We determined the cryoEM structure of the Act1 complex to 3.2 Å resolution (Extended Data Fig. 2-4, Extended Data Table 1). After observing low resolutions in several key areas, we also assembled the E complex on a capped Ubc4 pre-mRNA, crosslinked the complex with BS3, and determined its structure to 3.6 Å resolution. The overall structures of the two complexes are similar (Extended Data Fig. 5a) and subsequent discussions refer to their common features unless otherwise stated.

In these structures, the 5’ ss base-pairs with the 5’ end of U1 snRNA (Fig. 2a-b), an interaction that is stabilized by the U1C and Luc7 proteins, similar to that observed in the yeast A and pre-B structures 14 , 15 . In addition, a homology model of the yeast nuclear cap binding protein (NCBP) complex can be fitted as a rigid body into the density upstream of nucleotide −9 of Ubc4 (Fig. 2c), likely binding to the pre-mRNA cap. The RRM domain of U1-70K is shifted toward NCBP in the Ubc4 complex compared to the uncapped Act1 complex (Extended Data Fig. 5a), suggesting that NCBP directly interacts with U1-70K RRM and providing a possible mechanism by which NCBP recruits U1 snRNP and facilitates splicing of cap-proximal introns 16 , 17 . In both complexes, the RRM2 domain of Nam8 is positioned to bind to the intronic region immediately downstream of nt +13 (Fig. 2c), illustrating the structural basis of Nam8’s role in facilitating 5’ ss recognition 18 .

Fig. 2

(a) The overall E complex structure. BBP/Mud2 are not modeled due to weak density, but their locations are indicated. (b) Ribbon diagrams of protein and RNA models immediately around the 5’ ss. (c) Surface representation of proteins that are in close proximity to the 5’ ss (colored), other proteins (grey), and U1 snRNA (cyan). Pre-mRNA is shown in red and nucleotide positions relative to the 5’ ss are labeled (−1 and +1 denote the last nt of the exon and the first nt of the intron, respectively). (d) Secondary structure in pre-mRNA. Left: CryoEM density map (filtered to 6 Å) of the entire E complex showing density (in red dashed box) for the pre-mRNA double helix. Middle: Electrostatic potentials of the binding surface for the pre-mRNA double helix. Right: The binding surface formed by Prp39, Prp42, and U1C is shown in ribbon diagrams. Positively charged residues on Prp39 and Prp42 that interact with this double helix are shown in sticks. (e) Splicing efficiency of the WT and mutant Act1 intron (the mutation disrupts the secondary structure in the 5’ ss to BPS region) in an Act1-Cup1 reporter plasmid, as evaluated by qRT-PCR. Dots represent three technical replicates. This experiment was repeated two additional times with similar results. (f) Surface representation of proteins that interact or possibly interact with Prp40, shown in different colors. Locations of proteins or protein domains not modeled due to weak densities are indicated by various shapes. Transparent grey areas are 8 Å low-pass filtered densities showing likely contacts between Prp40 and U1-70K. Red dashed lines represent hypothetical paths of the pre-mRNA.

A striking feature in the Act1 complex is a ~25 bp double helix on a binding surface formed by many positively charged residues on the C-terminal tail of Prp39 and the N-terminal domain of Prp42, as well as the C-terminal domain of U1C (Fig. 2d). Such double-helix density is also observed in the Pre-B complex structure, where it is tentatively modeled as part of U2 snRNA 15 . Our Act1 complex is assembled in vitro and contains no U2 snRNA (Extended Data Fig. 5b). Furthermore, no such double helix exists in the Ubc4 complex, supporting the interpretation that this helix is part of the Act1 pre-mRNA. Although we were unable to model specific nucleotides, a weak density connects this helix to the 5’ ss, suggesting that it belongs to the region downstream of the 5’ ss. The 5’ ss to BPS region (265 nt) of the Act1 intron is predicted to form long stem-like structures, while the same region in Ubc4 (58 nt) contains a much shorter possible secondary structure (Extended Data Fig. 6a), potentially explaining why a stem-like structure is observed in the Act1 but not the Ubc4 complex. Mutation of this region in the Act1-Cup1 reporter 19 that abolishes extensive secondary structures (Extended Data Fig. 6b) leads to significant pre-mRNA accumulation compared to the WT (Fig. 2e), suggesting that this secondary structure facilitates splicing. Our structures of the E and P complexes 20 therefore provide direct evidence that the intronic regions of pre-mRNA can form highly ordered secondary structures, which may help bring key intronic elements close together and whose direct interaction with proteins may also facilitate spliceosomal assembly.

The 5’ ss and BPS are bridged by intrinsically flexible Prp40

A critical event in the first step of the splicing cycle is to define the intron by bringing together the 5’ ss and BPS, a process in which the U1 snRNP protein Prp40 seems to be a key player 13 . Prp40 contains two N-terminal WW domains, a ~60-residue linker, and six C-terminal FF domains. In the region between U1-70K and Luc7 in the Ubc4 complex structure, there is a boomerang-shaped density that matches well with the crystal structures of tandem FF domains connected by long helices 21 , 22 (Fig. 2f, Extended Data Fig. 4g). (This density is not obvious in the Act1 complex, possibly because the Act1 complex is not cross-linked with BS3.) There is weak density connecting the boomerang-shaped density and U1-70K (Fig. 2f), and the C-terminal FF domains crosslink to U1-70K in the Ubc4 complex (while the N-terminal and middle FF domains crosslink to Luc7 and Snu71) (Extended Data Fig. 7a). These observations led us to assign the boomerang-shaped density as the Prp40 FF4-6 domains (although we cannot rule out the possibility that this density corresponds to other tandem FF domains such as FF3-5), which is also consistent with our observation that the Prp40 FF1-3 domains interact with Luc7 (Extended Data Fig. 7b).

Prior biochemical analyses have shown that Prp40 forms a stable dimer with Snu71 and a trimer with Snu71-Luc7 23 - 25 , and the Prp40 WW domains directly interact with the N-terminal domain of BBP 13 , 26 . BBP also forms non-exclusive interactions with both Prp40 and Mud2 13 , and BBP directly binds to the BPS of pre-mRNA 27 . In the Act1 complex structure, there is a large volume of weak density close to the pre-mRNA double helix (Extended Data Fig. 4j). The density can be best interpreted as the BBP/Mud2 dimer for three reasons: its location corresponds roughly to where U2 snRNP is in the A complex structure 14 (Extended Data Fig. 4j); crosslinking and mass spectrometry analyses indicate that BBP/Mud2 is located in this region (Extended Data Fig. 7a); and BBP/Mud2 are the only proteins left in the E complex that are large enough to fill the volume of this density. This density is not obvious in the Ubc4 complex structure, potentially because Ubc4 lacks the pre-mRNA helix that brings the BPS close to the 5’ ss. Prp40 therefore bridges both ends of the intron by interacting with U1-70K and Snu71-Luc7 of U1 snRNP through its FF domains, and interacting with BBP through its WW domains. The entire ~60-residue linker region between the WW and FF domains is predicted to be disordered (Extended Data Fig. 7c), explaining why density corresponding to BBP/Mud2 is difficult to observe.

Exon definition occurs in yeast

The E complex architecture, in particular the relative positions of the 5’ ss and the BPS where BBP binds, suggests that the same E complex can form across an exon. Instead of connecting the upstream 5’ ss to a downstream BPS through an intron (Fig. 3a), the BPS can be connected to a downstream 5’ ss through an exon (Fig. 3b). Similarly, the A complex structure 14 suggests that the same A complex could also form across exons (Fig. 3b). Modeling using the Rosetta RNP-denovo method 28 suggests that only 28 nt between the upstream BP and downstream 5’ ss are needed to span the U2 snRNP and U1 snRNP in the A complex (Extended Data Fig. 8a). The minimal distance connecting the same BP and 5’ ss is likely similar or smaller in the E complex, given the similar spatial position and smaller size of BBP/Mud2 compared to U2 snRNP (Extended Data Fig. 4j). On the other hand, adding the tri-snRNP to form the pre-B complex forces an ~30° increase in the angle between U1 snRNP and U2 SF3b 14 , 15 (Fig. 3b). A relatively short exon may hinder this conformational change and also create steric hindrance for the addition of the bulky tri-snRNP (Fig. 3b). This may signal to the spliceosome that an EDC has formed and provide an opportunity for the upstream 5’ ss to interact with the tri-snRNP to form an intron-spanning B complex (Fig. 3b). We use EDC to refer to spliceosomal complexes assembled across an exon, which can be an exon-defined E, A, or unstable pre-B complex.

Fig. 3

(a) Structures of the E, A, and pre-B complexes are shown in surface representations with U1, U2, and tri-snRNPs in different colors, illustrating the canonical assembly pathway across an intron. Pre-mRNA is shown in red with an arrow indicating the 5’ to 3’ direction. The red dashed line indicates the hypothetical path of the intron connecting the 5’ ss and downstream BPS. Vertical dashed lines are drawn to denote the orientation of U1 snRNP and U2 SF3b in the A complex. In the pre-B complex, the orientation of U1 snRNP remains the same but that of U2 SF3b is tilted about 30°. (b) The same spliceosomal E and A complexes as in (a) can assemble across an exon, but cannot form the pre-B complex on short exons due to steric hindrance. The blue dashed line indicates the hypothetical path of the exon connecting the BPS and downstream 5’ ss. (c) Same as (b), but with a long exon (green dashed line), illustrating that the EDC on long exons can catalyze back-splicing. (d) A schematic representation showing how the EDC on a long exon carries out back-splicing and generates circular RNA through the same transesterification reactions used by canonical splicing.

To test whether the E complex can form across a yeast exon, we truncated the multi-intronic DYN2 gene to contain only its middle exon and partial flanking introns (Dyn2 IEI, Extended Data Fig. 8b). Spliceosomal complexes assembled on either Dyn2 WT or IEI pre-mRNAs (using the same protocol as the Act1 complex) contain the same protein components in similar quantities, even after RNase H cleavage between the BPS and 5’ ss (Extended Data Fig. 8c-e). Furthermore, 2D classifications of negative-stain images of the Dyn2 IEI complex resemble those of the Act1 and Ubc4 complexes (Extended Data Fig. 8f). These observations support the formation of E complex across the Dyn2 middle exon in vitro.

We next asked whether exon definition occurs in vivo in yeast, by evaluating whether mutation of splice sites bordering the Dyn2 middle exon negatively affects splicing of both flanking introns, a hallmark used to support the exon definition model in vertebrates 4 . We generated a BPS mutation in intron 1 (I1-BP mutant), a 5’ ss mutation in intron 2 (I2-5’ ss mutant), and a double mutation on the DYN2 gene (Extended Data Fig. 8b). We demonstrated that the I2-5’ ss and I1-BP mutations impaired the splicing of intron 1 and intron 2, respectively (Fig. 4a). We further evaluated the splicing products of WT and each mutant using PCR and primers located in exons 1 and 3 (Fig. 4b). If Dyn2 splicing were solely governed by intron definition, we would observe retention of the intron where these mutations reside (with minimal effect on the distal intron), generating products containing a single intron (255 and 271 bp bands). On the other hand, if Dyn2 splicing were solely governed by exon definition, the mutations would lead to the retention of both introns (i.e., accumulation of the 351 bp pre-mRNA band) or exon skipping (the 152 bp band), but not to any product containing a single intron (which would indicate that the distal intron was successfully spliced). The fact that we observed both pre-mRNA accumulation and single-intron-containing products (Fig. 4b, lanes 4 and 5) suggests that both intron definition and exon definition contribute to Dyn2 splicing in vivo. We observed exon skipping for the I1-BP mutant but not the I2-5’ ss mutant, consistent with previous observations 29 . This observation differs from that in the mammalian system, where exon definition mutations lead predominantly to exon skipping, likely because intron definition also contributes to Dyn2 splicing, leading to co-transcriptional splicing of intron 1 in the I2-5’ ss mutant, which prevents exon skipping 29 . Taken together, our results demonstrate that exon definition occurs for a fraction of the Dyn2 transcripts in vivo in yeast.
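The band sizes quoted above can be checked with simple arithmetic. The following is a minimal sketch, assuming the Dyn2 intron and middle-exon lengths given in the Methods (96 nt, 80 nt, and 23 nt) and taking the 351 bp unspliced amplicon as given; the function and variable names are illustrative only, and the 175 bp fully spliced size is derived here rather than quoted from the text.

```python
# Dyn2 segment lengths in nt (from the Methods); names are illustrative.
INTRON1 = 96
INTRON2 = 80
MIDDLE_EXON = 23
PRE_MRNA_AMPLICON = 351  # unspliced RT-PCR product between exon 1 and exon 3 primers


def expected_bands(pre_mrna=PRE_MRNA_AMPLICON):
    """Return the expected RT-PCR product size (bp) for each splicing outcome."""
    return {
        "unspliced pre-mRNA": pre_mrna,
        "intron 1 removed, intron 2 retained": pre_mrna - INTRON1,
        "intron 2 removed, intron 1 retained": pre_mrna - INTRON2,
        "fully spliced mRNA": pre_mrna - INTRON1 - INTRON2,
        "exon 2 skipped": pre_mrna - INTRON1 - INTRON2 - MIDDLE_EXON,
    }


bands = expected_bands()
for outcome, size in bands.items():
    print(f"{outcome}: {size} bp")
```

The single-intron products (255 and 271 bp) and the exon-skipped product (152 bp) computed this way match the band sizes used to distinguish intron definition from exon definition.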

Fig. 4

(a) A plasmid containing the WT DYN2 gene or various mutants was transformed into a DYN2 KO strain. The splicing efficiencies of introns 1 and 2 were evaluated using qRT-PCR with primers specific for intron 1 or intron 2 (indicated by arrows in the schematics under the bar diagram), normalized to total mRNA. Dots represent three technical replicates. (b) RT-PCR of RNA extracted from yeast strains carrying the indicated plasmids, using primers located in exons 1 and 3 of Dyn2. Schematics of the splicing products and their expected sizes are shown on the right side of the gel. RT-PCR products using primers in exon 3 (bottom gel) serve as an internal quality control of the samples. Experiments in Fig. 4 were repeated two additional times with similar results.

The EDC catalyzes back-splicing on long exons

An intriguing prediction of our exon definition model is that if the exon connecting the BP and downstream 5’ ss is long enough, it will not create much steric hindrance and will allow tri-snRNP to join the pre-B complex and complete the rest of the splicing cycle ( Fig. 3c ). As a result, the 5’ ss downstream of the exon will be back-spliced to the upstream 3’ ss, generating a circRNA through the same transesterification reaction used by canonical splicing ( Fig. 3d ). Supporting this hypothesis, 7 of the 10 multi-intron genes in S. cerevisiae form circRNA products 7 .

To test this model, we purified yeast spliceosome using TAP-tagged Cef1 (a strategy used to purify and determine the cryoEM structures of multiple spliceosomal complexes) from the Prp22 H606A mutant strain defective in exon release 30 . As expected, purified spliceosomes contained spliced mRNA and lariat for yeast single-intronic gene RPP1B , as well as the unique T-branch and circRNA for multi-intronic genes EFM5 and HMRA1 ( Fig. 5a , Extended Data Fig. 9a ). These results establish that Cef1-purified spliceosome contains both canonical and back-splicing products.

Fig. 5

(a) RT-PCR of RNA isolated from spliceosome purified from the Prp22 H606A yeast strain (indicated by “S”) and PCR using yeast genomic DNA (indicated by “g”, as negative controls) for single intron gene RPP1B and multi-intronic genes EFM5 and HMRA1 demonstrate the presence of ligated exons (lane 1), lariat (lane 3), T-branches (lanes 5 and 7) and circRNA (lanes 9-10). Primer positions are indicated as arrows in the schematic diagrams below the gel. All images in Fig. 5 are RT-PCR/PCR products on agarose gel with EtBr staining. (b) RT-PCR of RNA extracted from WT or EFM5 KO strain carrying indicated plasmid, with or without RNaseR treatment, using primers shown in the schematic diagrams below the gel. Numbers 101 and 63 designate exon lengths. “mut” represents mutant. Lanes 1-7 indicate all EFM5 constructs are transcribed. (c) IEI-101-M3 (3xMS2 at the 3’ end) RNA or E complex assembled on IEI-101-M3 was incubated with splicing extract with or without U1 snRNA depletion in the absence or presence of 30-fold excess competing IEI-101 RNA. CircRNA products were monitored the same way as (b). Competing IEI-101 was modified to remove the primer binding sites so it is invisible in the RT-PCR reaction. Experiments in (a), (b), and (c) were repeated one, two, and two additional times, respectively, with similar results.

Further supporting this model (Fig. 3c-d), we showed using RT-PCR that the EFM5 IEI construct on an expression plasmid generated an RNase R-resistant circRNA corresponding to exon 2 in vivo (Fig. 5b, lane 10, Extended Data Fig. 9b). Mutating the BPS or 5’ ss or shortening exon 2 to 63 nt abolishes circRNA formation (Fig. 5b, lanes 11, 12, and 14). E complex assembled on in vitro transcribed EFM5 IEI-101-M3 (exon 2 shortened to 101 nt and 3xMS2 at the 3’ end) (Extended Data Fig. 9c) can be chased into circRNA in U1-depleted yeast extract in the presence of excess competing IEI-101 RNA (Fig. 5c). To ensure the generality of our observation, we carried out the same experiments using another yeast multi-intronic gene, HMRA1, and reached the same conclusion (Extended Data Fig. 9d-e). Taken together, these results support the conclusion that exon definition occurs in yeast across the EFM5 or HMRA1 middle exon, which catalyzes back-splicing and generates circRNA when this exon is sufficiently long.

It was previously unclear whether the EDC is the same as or different from the IDC. The architecture of the E complex makes it immediately apparent that the same complex can form across either an intron or an exon without the need for additional components or structural rearrangement, and the same can be deduced for the A complex. The structures of the E and A complexes predict a minimal BP to 5’ ss distance (28 nt for the A complex and likely a similar or smaller number for the E complex) in order for exon definition to occur. An exon that is above this minimum but still relatively short potentially makes it difficult for tri-snRNP to join the spliceosome. This leads the spliceosome to stall at the pre-B stage and fail to hand off the 5’ ss from U1 to U6, providing an opportune point for the spliceosome to remodel into an intron-spanning B complex involving the upstream 5’ ss. This model is consistent with the observation in mammalian systems that tri-snRNP is loosely associated with the EDC, and only becomes stably associated when a 5’ ss-containing RNA oligo is added and the EDC is converted to a B-like intron-spanning complex 6 . Supporting our exon definition model, we showed that intron definition and exon definition both contribute to yeast Dyn2 splicing in vivo (Fig. 4). Although yeast has few multi-intronic genes and exon definition is clearly not the driving force of splicing, our results provide proof-of-principle evidence that both intron and exon definition can occur through the same spliceosomal structure in most or all species, even on the same pre-mRNA. Whether intron or exon definition is dominant in vivo is likely determined by gene architecture (such as the length of introns or exons) and other factors (such as exonic or intronic enhancers or suppressors and their associated proteins, RNA secondary structures, transcription processivity, and nucleosome positioning).

CircRNA generated by back-splicing of exons has attracted increasing attention, but its origin and biogenesis have largely remained a mystery 8 . Although exon definition was speculated to play a role in back-splicing 31 , 32 , it is unclear which of the canonical spliceosomal components are required and what exact signals are recognized that make an exon form circRNA instead of participating in canonical splicing. Our results demonstrate that back-splicing is catalyzed by exon-definition complexes on long exons (or multiple exons) that are not remodeled to intron-spanning complexes, suggesting that circRNA is a natural byproduct of spliceosome-mediated splicing in all eukaryotic species. This model is consistent with what was envisioned by the Wilusz group based on competition between back-splicing and canonical splicing 31 and with the previous observation that long but not short exons (when flanked by the same intronic sequences) can form circRNAs in human cells 33 . Indeed, the average exon length in circRNA is 690 nt 34 , much longer than the median length of 120 nt for human exons 35 . The long exon inevitably lowers the efficiency of initial exon definition, contributing to the low frequency of back-splicing and circRNA production. Intronic complementary sequences flanking the exon and RNA-binding proteins potentially increase the efficiency of initial exon definition and facilitate circRNA production 8 . These RNA elements or proteins may also bring opposite ends of different exons close together for back-splicing, generating circRNAs containing multiple exons. It is worth noting that, strictly speaking, it is the distance between the upstream BP and downstream 5’ ss across an exon, rather than the exon length, that determines the fate of the EDC in yeast pre-mRNAs, since the 3’ ss is not recognized in early yeast spliceosomes. However, given the generally short distance between the yeast BP and 3’ ss (19 nt for EFM5 and 10 nt for HMRA1 intron 1) 36 , exon lengths ultimately play a major role in determining the outcome of EDC remodeling.
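The relationship between exon length and the BP to 5’ ss distance can be illustrated numerically. The following is a rough sketch only: it assumes that the BP to downstream 5’ ss span can be approximated as the BP to 3’ ss distance plus the exon length, and it uses the 28 nt minimum modeled for the A complex together with the two EFM5 exon 2 lengths tested above. The function and variable names are hypothetical, and no cutoff separating "short" from "long" exons is implied, since the text does not give one.

```python
MIN_BP_TO_5SS_NT = 28  # minimal span modeled for the A complex (from the text)
EFM5_BP_TO_3SS_NT = 19  # BP to 3' ss distance for EFM5 (from the text)


def bp_to_5ss_span(bp_to_3ss_nt, exon_len_nt):
    """Approximate the BP-to-downstream-5'ss distance (assumption: simple sum)."""
    return bp_to_3ss_nt + exon_len_nt


# Exon 2 length (nt) -> whether circRNA was observed in the experiments above.
efm5_constructs = {63: False, 101: True}

spans = {}
for exon_len, circ_observed in efm5_constructs.items():
    span = bp_to_5ss_span(EFM5_BP_TO_3SS_NT, exon_len)
    spans[exon_len] = span
    above_min = span >= MIN_BP_TO_5SS_NT
    print(f"exon {exon_len} nt: span ~{span} nt, "
          f"above 28 nt minimum: {above_min}, circRNA observed: {circ_observed}")
```

Under these assumptions both constructs exceed the 28 nt minimum, so the loss of circRNA with the 63 nt exon is in line with the idea that an exon can be long enough for exon definition yet too short for back-splicing.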

In summary, our E complex structure enabled us to propose a simple model that unifies intron definition, exon definition, and back-splicing, without needing a different spliceosome for each process. This model is supported by our biochemical analyses performed exclusively in yeast, which is now positioned to serve as a well-defined model system for understanding exon definition and back-splicing. This model likely holds true for all eukaryotes, although many cis or trans factors (including RNA, protein, transcription, and nucleosomes) may act as modulators to promote or suppress a particular process. In vertebrates, most exons are short, which is likely the main signal for EDC remodeling, but other factors may facilitate remodeling of EDCs assembled on long exons and lower the efficiency of back-splicing. This model should inspire experiments in many other systems to understand the mechanism and regulation of exon definition and back-splicing, some of the most fundamental unanswered questions in pre-mRNA splicing.

Yeast E complex assembly and purification

The coding regions of yeast BBP and Mud2 were amplified by PCR using genomic S. cerevisiae DNA as templates. BBP fused to an N-terminal protein A (protA) tag was inserted between a GPD promoter and a CYC1 terminator, and the resulting expression cassette was cloned into pRS414 to generate the pRS414/GPD-protA-BBP plasmid. Similarly, Mud2 with or without a C-terminal Calmodulin Binding Peptide (CBP) tag was cloned into pRS416 vectors to generate the pRS416/GPD-Mud2-CBP or pRS416/GPD-Mud2 plasmid. Six liters of BCY123 cells harboring both plasmids were grown in -URA-TRP selective media to OD600=3-4. The cells were flash-frozen in liquid nitrogen to form yeast “popcorns” and cryogenically ground using a SPEX 6870 Freezer/Mill. The frozen cell powder was thawed at room temperature and re-suspended in lysis buffer (50 mM Tris-HCl, pH 8.0, 400 mM NaCl, 0.1% NP-40, 1mM DTT) with protease inhibitor cocktails (Sigma-Aldrich) and 1 mM Benzamidine. The cell lysate was first centrifuged at 27,485 × g for 1 hr in a GSA rotor (Sorvall) and the supernatant was further centrifuged at 167,424 × g in a 45Ti rotor (Beckman) for 1.5 hr at 4°C. The supernatant was incubated with 2 ml of IgG Sepharose-6 Fast Flow resin (GE Healthcare) overnight at 4°C. The resin was first washed with IgG washing buffer (20 mM Tris-HCl, pH 8.0, 350 mM NaCl, 0.05% NP-40, 0.5 mM DTT, 1 mM Benzamidine and protease inhibitor cocktails), then with buffer containing 250 mM and 150 mM NaCl. The BBP/Mud2 dimer was released by TEV protease in TEV150 buffer (20 mM Tris-HCl, pH 8.0, 150 mM NaCl, 0.02% NP-40, 0.5 mM DTT).

The Act1 pre-mRNA used in this paper consists of a 73 nt 5′ exon, the 302 nt intron that lacks a cryptic branch point sequence, and a 167 nt 3′ exon 37 . The Ubc4 pre-mRNA consists of a 20 nt 5′ exon, a 95 nt intron, and a 32 nt 3′ exon 38 . The Dyn2 wildtype pre-mRNA consists of three exons (22 nt, 23 nt, and 35 nt in length) separated by two introns (96 nt and 80 nt in length). The Dyn2 Intron-Exon-Intron (IEI) pre-mRNA consists of intron 1 without the first 9 nt, the middle exon, and intron 2 truncated right before the branch point sequence. The EFM5 IEI-101 pre-mRNA consists of intron 1 without the first 10 nt, the middle exon shortened to 101 nt, and intron 2 truncated to 9 nt upstream of the BPS. The HMRA1 IEI-246 wild type pre-mRNA consists of intron 1 without the first 10 nt, the entire middle exon, and intron 2 truncated to 2 nt upstream of the BPS. The HMRA1 IEI-246 pre-mRNA was generated after mutating the underlined part of the last 41 nt of its middle exon from 5’-CAAAGAAATGTGGCATTACTCCACTTCAAGTAAGAGTTTGG-3’ to 5’-ACTAATGCCACTACTTTACTCCACTTCAAGTAAGAGTTTGG-3’. This modification enables us to use specific primers to detect only the exogenous but not the endogenous HMRA1 in a WT yeast strain. DNA templates for in vitro transcription were generated after the addition of three copies of MS2 stem loops to the 5′-end of the ACT1 gene or to the 3’-end of the UBC4, DYN2, EFM5, and HMRA1 genes. Pre-mRNA substrates were generated by run-off transcription from linearized plasmid DNA templates, and capped using the Vaccinia Capping System (New England Biolabs) if indicated.

To obtain the yeast complex E for structural studies, the Act1 or Ubc4 pre-mRNA substrate was bound to the MBP-MS2 fusion protein and mixed with purified U1 snRNP 23 and BBP/Mud2 dimer (or BBP/Mud2-CBP in the case of Act1), then applied to amylose resin (New England Biolabs) pre-washed with buffer G120 (20 mM HEPES, pH 7.9, 120 mM KCl, 0.01% NP-40). After 3 h incubation at 4 °C, the resin was washed and eluted with buffer G120 containing 10 mM maltose. Elutions were pooled and applied to 100 μL of calmodulin resin (Agilent) pre-washed with washing buffer (20 mM Hepes, pH7.9, 120 mM KCl, 2 mM CaCl2, 1 mM imidazole, 0.01% NP-40), and incubated for 3 hr at 4 °C. The resin was washed with washing buffer, and eluted 6 times with 100 μL eluting buffer (20 mM Hepes, pH7.9, 120 mM KCl, 2 mM EGTA) each time. The elutions containing the most concentrated E complex were used for cryoEM imaging. Crosslinked sample was prepared by treating the complex with 1 mM BS3 (Thermo Fisher) on ice for 30 min, and subsequently quenched with 50 mM Tris (pH8.0).

CryoEM sample preparation and imaging

For cryoEM sample optimization, an aliquot of 3 μl of sample (~0.2-0.5 μM) was applied onto a glow-discharged lacey carbon film-coated copper grid (400 mesh, Ted Pella). The grid was blotted with Grade 595 filter paper (Ted Pella) and flash-frozen in liquid ethane with a FEI Mark IV Vitrobot. A FEI TF20 cryoEM instrument was used to screen grids. CryoEM grids with optimal particle distribution and ice thickness were obtained by varying the gas source (air using PELCO easiGlow™, target vacuum of 0.37 mbar, target current of 15 mA; or H2/O2 using Gatan Model 950 advanced plasma system, target vacuum of 70 mTorr, target power of 50 W) and time for glow discharge, the volume of applied samples, chamber temperature and humidity, blotting time and force, as well as wait time before blotting. Our best grids for the Act1 complex were obtained with 50 s glow discharge using air and with the Vitrobot sample chamber set at 12°C temperature, 100% humidity, 2.5 s blotting time, −3 blotting force and 20 s wait time. The best grids for the Ubc4 complex were obtained with 60 s glow discharge using air and with the Vitrobot sample chamber set at 12°C temperature, 100% humidity, 3 s blotting time, 1 blotting force and 60 s wait time.

Optimized cryoEM grids were loaded into a FEI Titan Krios electron microscope with a Gatan Imaging Filter (GIF) Quantum LS device and a post-GIF K2 Summit direct electron detector. The microscope was operated at 300 kV with the GIF energy-filtering slit width set at 20 eV. Movies were acquired with Leginon 39 by electron counting in either super-resolution mode at a pixel size of 0.68 Å/pixel (Act1 complex) or counting mode at a pixel size of 1.36 Å/pixel (Ubc4 complex). A total of 40 frames were acquired in 8 seconds for each movie, giving a total dose of ~30 e⁻/Å² per movie.
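The acquisition settings above imply a per-frame exposure time and dose that can be checked with simple arithmetic. The sketch below assumes that the ~30 e⁻/Å² total dose is spread evenly over all 40 frames, which the text does not state explicitly; the function and key names are illustrative and do not come from Leginon or MotionCor2.

```python
# Sanity-check arithmetic for the movie acquisition settings quoted above.
# Assumption (not stated in the text): dose is spread evenly over all frames.

def acquisition_summary(total_dose, n_frames, exposure_s, pixel_size):
    """Return per-frame timing and dose figures.

    total_dose  -- electrons per square angstrom for the whole movie
    n_frames    -- number of frames in the movie
    exposure_s  -- total exposure time in seconds
    pixel_size  -- angstroms per pixel on the specimen scale
    """
    dose_rate = total_dose / exposure_s  # e-/A^2/s
    return {
        "frame_time_s": exposure_s / n_frames,
        "dose_per_frame": total_dose / n_frames,             # e-/A^2/frame
        "dose_rate_per_A2": dose_rate,                       # e-/A^2/s
        "dose_rate_per_pixel": dose_rate * pixel_size ** 2,  # e-/pixel/s
    }


# Values from the text: ~30 e-/A^2, 40 frames, 8 s, 1.36 A/pixel.
summary = acquisition_summary(30.0, 40, 8.0, 1.36)
for key, value in summary.items():
    print(f"{key}: {value:.3f}")
```

Under the even-dose assumption, each frame corresponds to 0.2 s of exposure and roughly 0.75 e⁻/Å².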

Drift correction for movie frames

Frames in each movie were aligned for drift correction with the GPU-accelerated program MotionCor2 40 . The first frame was skipped during drift correction due to concern of more severe drift/charging of this frame. Two averaged micrographs, one with dose weighting and the other without dose weighting, were generated for each movie after drift correction. The averaged micrographs have a calibrated pixel size of 1.36 Å on the specimen scale. The averaged micrographs without dose weighting were used only for defocus determination and the averaged micrographs with dose weighting were used for all other steps of image processing.

Structure determination for the Act1 complex

For the Act1 complex, the defocus values of the averaged micrographs, determined by CTFFIND4 41 , ranged from −1.5 to −3 μm. Initially, a total of 3,589,121 particles were automatically picked from 11,283 averaged micrographs without reference using Gautomatch ( http://www.mrc-lmb.cam.ac.uk/kzhang ). The particles were boxed out at 352 × 352 pixels and binned to 176 × 176 pixels (pixel size of 2.72 Å) before further processing with the GPU-accelerated RELION2.1. The reported U1 model (EMD-8622) was low-pass filtered to 60 Å to serve as an initial model for 3D classification. After one round of 3D classification, only the classes exhibiting features characteristic of the E complex ( e.g ., 5’ss and pre-mRNA helix binding to U1 snRNP) were kept, which contained 1,852,842 particles. Several iterations of reference-free 2D classification were subsequently performed to remove “bad” particles ( i.e. , classes with fuzzy or un-interpretable features), yielding 1,108,069 good particles. Auto-refinement of these particles by RELION yielded a map with an average resolution of 5.44 Å (“Step 1” in Extended Data Fig. 2c ).
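The relationship between box size, binning factor, and pixel size used above can be sketched as follows. This is a minimal illustration, not RELION code; the helper names are hypothetical, and the Nyquist limit is taken as twice the pixel size.

```python
# Effect of binning on box size, pixel size, and the Nyquist limit.
# Helper names are hypothetical; this is not part of RELION.

def bin_particles(box_px, pixel_size, factor):
    """Return (binned box size in pixels, binned pixel size in angstroms)."""
    if box_px % factor != 0:
        raise ValueError("box size must be divisible by the binning factor")
    return box_px // factor, pixel_size * factor


def nyquist_limit(pixel_size):
    """Finest resolvable spacing (angstroms) for a given pixel size."""
    return 2.0 * pixel_size


# Act1 particles: 352-pixel boxes at 1.36 A/pixel, binned by a factor of 2.
binned_box, binned_pixel = bin_particles(352, 1.36, 2)
print(binned_box, round(binned_pixel, 2))       # box in pixels, A/pixel
print(round(352 * 1.36, 2))                     # physical box edge in A
print(round(nyquist_limit(binned_pixel), 2))    # Nyquist limit when binned
print(round(nyquist_limit(1.36), 2))            # Nyquist limit when un-binned
```

With these numbers the Nyquist limit of the 2× binned data is 5.44 Å, the same value as the Step 1 map resolution quoted above, so maps from binned particles cannot be resolved beyond that point and un-binned particles are needed for higher resolution.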

Next, we performed two rounds of focused classification on the pre-mRNA helix region of the E complex to further eliminate those particles without the pre-mRNA helix (“Step2” in Extended Data Fig. 2c ). The first round of this focused classification generated one good class containing 390,792 particles. These particles were un-binned to 352 × 352 square pixels (pixel size of 1.36 Å) and subjected to another round of focused classification. We re-centered the particles from one best class and removed duplications based on the unique index of each particle given by RELION.

The 270,587 un-binned, unique particles (7.5% of all particles) resulting from the focused classification were subjected to a final step of 3D auto-refinement (“Step 3” in Extended Data Fig. 2c ). The two half maps from this auto-refinement step were subjected to RELION’s standard post-processing procedure. The final map of the Act1 complex has an average resolution of 3.2 Å based on RELION’s gold-standard FSC (see below).

Structure determination for the Ubc4 complex

For the Ubc4 complex, the defocus values of the averaged micrographs, determined by CTFFIND4, ranged from −1.5 to −3 μm. Initially, a total of 1,924,710 particles were automatically picked from 8,997 averaged micrographs without reference using Gautomatch. The particles were boxed out at 384 × 384 pixels and binned to 192 × 192 pixels (pixel size of 2.72 Å) before further processing with the GPU-accelerated RELION2.1. The reported U1 model (EMD-8622) was low-pass filtered to 60 Å to serve as an initial model for 3D classification. After one round of 3D classification, only the classes showing features corresponding to the E complex ( e.g. , 5’ss binding to U1 snRNP) were kept, which contained 800,735 particles. Several iterations of reference-free 2D classification were subsequently performed to remove bad particles (i.e., classes with fuzzy or un-interpretable features), yielding 756,303 good particles (“Step1” in Extended Data Fig. 3c ).

Next, we performed another two rounds of 3D classification to further improve the ratio of the intact E complex ( e.g. , Prp40, NCBP1/NCBP2, and Nam8 binding) (“Step2” in Extended Data Fig. 3c ). During each round of the 3D classification, only one class showed features corresponding to the intact E complex monomer. (Likely owing to the cross-linking reagent used for the Ubc4 complex, one class from the second round of 3D classification exhibited features characteristic of an E complex dimer.) These good particles from the final round of 3D classification were un-binned to 384 × 384 pixels (pixel size of 1.36 Å). We re-centered these particles and removed duplications based on the unique index of each particle given by RELION.

The resulting 124,825 un-binned, unique particles (6.5% of all particles) were subjected to a final step of 3D auto-refinement (“Step 3” in Extended Data Fig. 3c ). The two half maps from this auto-refinement step were subjected to RELION’s standard post-processing procedure. The final map of the Ubc4 complex has an average resolution of 3.6 Å based on RELION’s gold-standard FSC (see below).
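The particle counts reported for the two datasets can be tabulated to confirm the quoted retention percentages. The counts below are copied from the text; the stage labels and the helper name are only illustrative.

```python
# Particle bookkeeping for the Act1 and Ubc4 processing pipelines.
# Counts are taken from the text; stage labels are paraphrased.

PIPELINES = {
    "Act1": [
        ("auto-picked", 3_589_121),
        ("after 3D classification", 1_852_842),
        ("after 2D classification", 1_108_069),
        ("after first focused classification", 390_792),
        ("final unique particles", 270_587),
    ],
    "Ubc4": [
        ("auto-picked", 1_924_710),
        ("after 3D classification", 800_735),
        ("after 2D classification", 756_303),
        ("final unique particles", 124_825),
    ],
}


def retention(stages):
    """Return (label, count, percent of initially picked particles)."""
    total = stages[0][1]
    return [(label, count, 100.0 * count / total) for label, count in stages]


for name, stages in PIPELINES.items():
    print(name)
    for label, count, percent in retention(stages):
        print(f"  {label:<36} {count:>9,d}  {percent:5.1f}%")
```

The final rows reproduce the 7.5% (Act1) and 6.5% (Ubc4) figures given above.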

Resolution assessment

All resolutions reported above are based on the “gold-standard” FSC 0.143 criterion 42 . FSC curves were calculated using soft spherical masks and high-resolution noise substitution was used to correct for convolution effects of the masks on the FSC curves 43 . Prior to visualization, all maps were sharpened by applying a negative B-factor which was estimated using automated procedures 44 .
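To illustrate how a resolution value is read off an FSC curve with the 0.143 criterion, the following sketch finds the first Fourier shell at which the curve drops below the threshold and converts that shell index to ångströms. It is a simplified stand-in for what RELION's post-processing reports, not its implementation; the linear interpolation between shells and the example curve are assumptions made for illustration only.

```python
# Read a resolution estimate off an FSC curve using a fixed threshold.
# Simplified sketch; RELION's own post-processing is not reproduced here.

def resolution_from_fsc(fsc, box_px, pixel_size, threshold=0.143):
    """Return the resolution (angstroms) where the FSC first falls below
    `threshold`, or None if it never does.

    fsc[i] is the correlation in Fourier shell i (shell 0 is the origin);
    shell i corresponds to a spatial frequency of i / (box_px * pixel_size).
    """
    if not fsc or fsc[0] <= threshold:
        raise ValueError("FSC curve must start above the threshold")
    for i in range(1, len(fsc)):
        if fsc[i] < threshold:
            previous = fsc[i - 1]
            # Linear interpolation between the two neighbouring shells.
            fraction = (previous - threshold) / (previous - fsc[i])
            shell = (i - 1) + fraction
            return box_px * pixel_size / shell
    return None


# Made-up toy curve for an 8-pixel box at 1.0 A/pixel (not data from this study).
toy_curve = [1.0, 0.8, 0.343, 0.043]
print(resolution_from_fsc(toy_curve, box_px=8, pixel_size=1.0))
```

A curve that never crosses the threshold returns `None`, in which case the map would be limited by the sampling (Nyquist) rather than by the data.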

Local resolution was estimated using ResMap 45 . The overall quality of the maps for the Act1 and Ubc4 complexes is presented in Extended Data Figs. 2b-d and 3b-d, respectively. Data collection and reconstruction statistics are presented in Extended Data Table 1 .

Model building and refinement

To aid subunit assignment and model building, we took advantage of the reported U1 structure (PDB code: 5UZ5, 3.7 Å), which was fitted into the Ubc4 complex density map by UCSF CHIMERA 46 . The central regions of the Ubc4 complex have resolutions ranging from 3.0 to 4.5 Å ( Extended Data Fig. 3f ); thus protein and RNA components in these regions were rebuilt manually using COOT 47 . Briefly, for protein subunits that match well with the densities in the Ubc4 complex structure, we manually adjusted their side chain conformations and, when necessary, moved their main chains to match the density map. For protein subunits that exhibit significant main chain mismatches or have not been identified, we built atomic models de novo . To do so, sequence assignment was mainly guided by visible densities of amino acid residues with bulky side chains, such as Trp, Tyr, Phe, and Arg. Other residues including Gly and Pro also helped the assignment process. Unique patterns of sequence segments containing such residues were utilized for validation of residue assignment.

For the RNA region near 5’ss (nt −2 to +8 of pre-mRNA with respect to the exon-intron junction; nt 1-10 of the U1 snRNA), the well-defined nucleotide densities, along with the base pairs between U1 snRNA and pre-mRNA, facilitated the RNA model building process. RNA model building in these regions was performed de novo in COOT. For the central regions of U1 snRNA, the previous U1 snRNA model was adjusted for their base conformation and, when necessary, for their main chains to match the density map. The RNA components were subsequently adjusted using RCrane 48 and ERRASER 49 .

Models built for the protein and RNA subunits in these central regions include: U1-70K (aa 1-91), U1C (aa 3-197), U1A (aa 2-46, 55-125, 133-148), Prp42 (aa 1-544), Prp39 (aa 288-553, 561-627), Nam8 (aa 291-425, 432-449, 492-523), Snu56 (aa 43-170, 185-295), Snu71 (aa 1-52), Luc7 (aa 4-19, 38-140, 172-244), Sm ring; the core regions of U1 snRNA ( Extended Data Fig. 4 ), pre-mRNA (nt −2 to +8 with respect to the exon-intron junction). The long helix interacting with ZnF2 and the coiled coil domain of Luc7 was traced with poly-alanine, which likely belongs to Snu71 since deletion of the coiled coil domain of Luc7 reduces its interaction with Snu71 and Prp40 but only Snu71 has isolated long helices ( Extended Data Fig. 7B ).

Resolutions for the periphery of the Ubc4 complex were more varied, ranging from 6 Å to 25 Å, insufficient for de novo atomic modeling. The following proteins were built with homology modeling using I-TASSER server and rigidly docked into the low-pass filtered map of the 3.6 Å map using CHIMERA: RRM domain of U1-70K (aa 94-188), N-terminal region of Prp39 (aa 43-285), and RRM2 domain of Nam8 (aa 161-242). The homology model of NCBP1/NCBP2 heterodimer (NCBP1:616 aa 36-861; NCBP2: aa 19-156) was rigidly fitted into its local refinement map. In addition, the periphery region has a boomerang-shaped density which matches well with that of the crystal structures of tandem FF domains connected by long helices 21 , 22 . We assigned this density as the Prp40 FF4-6 domains, considering that DSSO crosslinking in the Ubc4 complex and mass spectrometry analyses demonstrate that the C-terminal FF domains crosslink to U1-70K ( Extended Data Fig. 7A ) and that there is weak density connecting the boomerang-shaped density and U1-70K, although we cannot rule out the possibility of this density being the other tandem FF domains such as FF3-5. The FF4 (aa 355-413, from I-TASSER), FF5 (aa 427-488, from I-TASSER) and FF6 (aa 491-552, PDB: 2KFD) domains were rigidly docked into the low-pass filtered map using CHIMERA, and manually connected using COOT with the long helix between FF domains ( Fig. 2F ).

Except for nucleotides 27-33 at the tip of Stem-Loop 1 and the last three nucleotides 566-568, the entire U1 snRNA is now modeled with DRRAFTER 50 . The estimated mean RMSD accuracies for the DRRAFTER models are: 0.4 Å for residues 39-41, 4.3 Å for residues 97-103, 0.7 Å for residues 175-177, 3.5 Å for residues 202-236, 3.0 Å for residues 289-294, and 4.9 Å for residues 325-516. The median structures of the best ten scoring models are shown in Fig. 2A . Using the low-pass filtered map, we could also manually trace the main chain for nt −9 to −3 and nt +9 to +13 of pre-mRNA. Combined with the previous atomic model, 23 nucleotides of the pre-mRNA were manually built, of which the upstream could directly insert into NCBP1/NCBP2 heterodimer and the downstream could interact with the RRM2 domain of Nam8.

The model of the pre-mRNA helix and the putative localization of the BBP/Mud2 binding region were based on the 3.2 Å resolution structure of the Act1 complex. The modeling procedure is similar to that used for modeling the Ubc4 complex except for the following differences. Firstly, we could observe the density for a ~25 bp RNA double helix with clear major and minor grooves on a binding surface formed by the C-terminal tail of Prp39, the N-terminal domain of Prp42, and the C-terminal domain of U1C. Such double helix density was also observed in the pre-B complex structure and was tentatively modeled as part of U2 snRNA. Since our Act1 complex was assembled from in vitro transcribed Act1 pre-mRNA, purified U1 snRNP, BBP, and Mud2 proteins, there is no U2 snRNA present in our sample ( Extended Data Fig. 5B ). Although some bases can be separated in the density map of this double helix, we were unable to model specific nucleotides. Nonetheless, there is weak density connecting it to the 5’ ss, suggesting that this double helix belongs to the pre-mRNA region downstream of the 5’ ss. Secondly, there is a large volume of weak density close to the pre-mRNA double helix ( Extended Data Fig. 4I ). The density can be best interpreted as the BBP/Mud2 dimer bound to pre-mRNA, given that its location corresponds roughly to where U2 snRNP is in the A complex structure ( Extended Data Fig. 4I ).

The above models were refined using PHENIX in real space 51 with secondary structure and geometry restraints. Refinement statistics of the E complex are summarized in Extended Data Table 1 . These models were also evaluated based on MolProbity scores 52 and Ramachandran plots ( Extended Data Table 1 ). Model/map FSC validation is shown in Extended Data Figs. 2g and 3g. Representative densities for the proteins and RNA are shown in Extended Data Fig. 4 . All structure-related images in this paper were generated using UCSF CHIMERA 46 and CHIMERAX 53 .

To determine the minimum number of nucleotides needed to connect an upstream BPS to a downstream 5’ SS in the A complex 14 , we modeled connection lengths ranging from 21 to 30 nucleotides between nucleotides 74 and −1 of pre-mRNA (chain I in PDB ID 6g90, nucleotide 70 is BP and +1 is 5’ ss) using the Rosetta RNP-denovo method with full-atom refinement 28 , 50 . Nucleotides 75-78 of pre-mRNA were excised from the structure and all other nucleotides were kept fixed. 21-30 uridines were modeled de novo to connect nucleotide 74 to nucleotide −1. During the initial low-resolution stages of the modeling, score terms rewarding favorable RNA-protein interactions, RNA base pairing, and compact RNA structures were turned off. Score terms that penalize clashes within the RNA and between the RNA and protein were included during this stage. During the final all-atom refinement, the complete all-atom RNA-protein score function was used. The weight on the score term penalizing chainbreaks was increased to 50.0 and models with a chainbreak score less than 0.5 were considered fully connected. The top scoring model (at least 250 models were built for each connection length) by full Rosetta score was used as a representative model for each connection length. We found that the representative model for 22-nucleotide connection length is fully connected and has similar total Rosetta score as models for longer connection lengths, indicating 22 nucleotides are sufficient for connecting nucleotide 74 to −1 without highly unfavorable interactions such as clashes.
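The selection logic described above (take the top-scoring model for each connection length, then ask whether it is fully connected) can be sketched as follows. This is not Rosetta code: the data structure, the function name, and the toy scores are hypothetical, and the sketch assumes that a lower total Rosetta score is better. It applies only the chainbreak criterion and does not reproduce the comparison of total scores across lengths.

```python
# Pick the shortest linker length whose best-scoring model is fully connected.
# Hypothetical sketch; not part of Rosetta or DRRAFTER.

def shortest_connected_length(models, chainbreak_cutoff=0.5):
    """models maps connection length -> list of (total_score, chainbreak_score).

    For each length the representative model is the one with the lowest
    total score (assumed convention: lower is better). The function returns
    the smallest length whose representative has a chainbreak score below
    the cutoff, or None if no length qualifies.
    """
    for length in sorted(models):
        representative = min(models[length], key=lambda model: model[0])
        if representative[1] < chainbreak_cutoff:
            return length
    return None


# Made-up toy numbers for illustration only; these are not results from the paper.
toy_models = {
    3: [(-10.0, 4.2), (-8.5, 0.1)],   # best-scoring model has a chain break
    4: [(-12.0, 0.2), (-9.0, 3.0)],   # best-scoring model is connected
    5: [(-12.5, 0.1)],
}
print(shortest_connected_length(toy_models))
```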

Crosslinking and mass spectrometry

Purified yeast spliceosome E complex was crosslinked with 10 mM DSSO (disuccinimidyl sulfoxide) for 45 min at 4 °C, and the reaction was quenched by adding ammonium bicarbonate to a final concentration of 50 mM. Crosslinked complex was proteolytically digested according to the FASP (filter-aided sample preparation) protocol as previously described 54 . Briefly, ~100 μg of crosslinked sample was reduced, alkylated, and digested at 1:50 with sequencing grade trypsin (Promega) by incubating at 37 °C for 18 hours. Peptides were eluted and acidified to 0.1 % formic acid. Enrichment of crosslinked peptides was performed by using strong cation exchange chromatography (SCX) with a Dionex UltiMate 3000 system (Thermo Fisher Scientific). A Proteomix SCX-NP1.7 column (4.6 mm inner diameter, 150 mm length, Sepax Technologies) was used. Briefly, peptides were separated using the following gradient: 0 % B (0 – 3.5 min), 0 – 22.5 % B (3.5 – 18.5 min), 22.5 – 50 % B (18.5 – 21.5 min), 50 – 100 % B (21.5 – 23 min), 100 % B (23 – 25.5 min) with solvent A (10 mM KH2PO4, 25 % acetonitrile, pH 3.00) and solvent B (10 mM KH2PO4, 25 % acetonitrile, 500 mM KCl, pH 3.00) at a flow rate of 0.7 ml/min. Fractions were collected every minute. Fractions 6-26 were pooled into groups of three and desalted using StageTips for subsequent LC-MS/MS analysis.
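The SCX gradient above is piecewise linear, so the solvent composition at any time point can be computed by interpolating between the listed breakpoints. The sketch below does this and converts % B to the corresponding KCl concentration, assuming ideal mixing of solvents A (0 mM KCl) and B (500 mM KCl); the function names are illustrative.

```python
# Piecewise-linear SCX gradient: (time in min, % solvent B) breakpoints
# taken from the text.
GRADIENT = [
    (0.0, 0.0),
    (3.5, 0.0),
    (18.5, 22.5),
    (21.5, 50.0),
    (23.0, 100.0),
    (25.5, 100.0),
]


def percent_b(time_min):
    """Return % B at a given time by linear interpolation."""
    if not GRADIENT[0][0] <= time_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= time_min <= t1:
            return b0 + (b1 - b0) * (time_min - t0) / (t1 - t0)


def kcl_mM(time_min, kcl_in_b_mM=500.0):
    """KCl concentration delivered at a given time, assuming ideal mixing."""
    return percent_b(time_min) / 100.0 * kcl_in_b_mM


for t in (2.0, 11.0, 20.0, 22.25, 24.0):
    print(f"{t:5.2f} min: {percent_b(t):6.2f} % B, {kcl_mM(t):6.1f} mM KCl")
```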

Crosslinked peptides were then analyzed by nano-UHPLC-MS/MS (Easy-nLC1200, Orbitrap Fusion™ Lumos™ Tribrid™, Thermo Fisher Scientific). 14 μl of sample was directly loaded onto an in-house packed 100 μm i.d. × 250 mm fused silica column packed with CORTECS C18 resin (2.7 μm, spherical solid-core). Samples were run at 400 nL/min over a 90 min linear gradient from 4-32% acetonitrile with 0.1% formic acid. The mass spectrometer was operated in positive ion mode with two sequential experiments per duty cycle. For crosslinked peptide identification, MS1 scans were run in the orbitrap from 375-1500 m/z at 60,000 resolution. MS2 was performed on the top 4 ions from each precursor scan, which were fragmented at a CID collision energy of 22%. MS3 was triggered by the targeted mass difference of 31.9721 Da represented by the cleavage of the DSSO sulfoxide bond, and was performed with a stepped HCD collision energy of 33% +/−3. For linear peptide identification, a second precursor scan was performed at 120,000 resolution in a scan range of 350-1000 m/z. Stoichiometric sampling of ions for MS2 fragmentation was capped at 2 seconds and performed at an HCD collision energy of 30% in the orbitrap. Data acquisition was performed using Xcalibur (version 4.1) software.

Instrument raw files were directly loaded into Proteome Discoverer 2.2 and were searched against twenty-two proteins making up the E Complex of U1snRNP from S. cerevisiae of the Swiss-prot database (update 2018_08_08) using the XlinkX plugin. Search parameters included carbamidomethylation-C as a fixed modification, oxidation-M, DSSO-K, DSSO/amidated-K, and DSSO/hydrolysed-K as variable modifications, allowing for two missed cleavages. MS2_MS3 was set for crosslink detection against DSSO. Precursor mass tolerance was set to 10 ppm, with MS/MS mass tolerance set to 20 ppm. Results were manually validated and visualized using xVis 55 .
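The ppm tolerances quoted above translate into absolute mass windows that scale with the mass being matched. The following sketch shows the standard ppm calculation; it is not XlinkX or Proteome Discoverer code, and the function names and example masses are illustrative.

```python
# Parts-per-million mass error and tolerance check.
# Illustrative helpers; not part of Proteome Discoverer or XlinkX.

PRECURSOR_TOL_PPM = 10.0   # precursor tolerance from the text
FRAGMENT_TOL_PPM = 20.0    # MS/MS tolerance from the text


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def within_tolerance(observed, theoretical, tol_ppm):
    """True if the observed mass lies within +/- tol_ppm of the theoretical mass."""
    return abs(ppm_error(observed, theoretical)) <= tol_ppm


def window_da(theoretical, tol_ppm):
    """Half-width of the tolerance window in daltons."""
    return theoretical * tol_ppm / 1e6


# Example masses chosen only to illustrate the arithmetic.
print(window_da(1000.0, PRECURSOR_TOL_PPM))                    # half-width in Da
print(within_tolerance(1000.009, 1000.0, PRECURSOR_TOL_PPM))
print(within_tolerance(1000.011, 1000.0, PRECURSOR_TOL_PPM))
```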

Oligo-directed RNase H digestion of pre-mRNA in purified E complex

Purified E complex with Act1 or Dyn2 IEI pre-mRNA as substrate was incubated with RNase H (New England Biolabs) in the presence or absence of 5 μM DNA oligo, at 25 °C for 30 min. The complex was then bound to amylose resin pre-washed with buffer G120. The resin was washed with buffer G120 and eluted in buffer G120 containing 10 mM maltose. The eluted samples were analyzed on SDS-PAGE and stained with Coomassie to visualize the proteins. For RNA detection, the samples were digested with 1 μg/μL proteinase K and separated on 7M urea denaturing polyacrylamide gel and stained with EtBr, or on native agarose gel and stained with SYBR gold (Life Technologies). DNA oligo used for digestion of Act1 is 5’-AAAATAAACGATGACACAG-3’, and for Dyn2 is 5’-TCATGGAAGAAAACCTCAC-3’.

Chase experiment with assembled E complex in U1-depleted yeast extract

BSY82 (GAL-U1) yeast strain 56 obtained from Dr. Michael Rosbash’s lab was maintained in YEP media containing 2% galactose. For U1 snRNA depletion cultures, log phase cells growing in 2% galactose were diluted into media containing 2% glucose to an OD600 of 0.03 and grown for 17 h to an OD600 of 2.5. Yeast splicing extracts were prepared from 2 liters of yeast cells cultured in media containing either galactose or glucose. Splicing reactions were carried out at 23°C for 15 min in a 25 μL reaction containing 2.5 nM M3-Act1 pre-mRNA alone or purified E complex, with or without a 50-fold molar excess of Act1-M3 (Act1 pre-mRNA containing a 54 nt 3’ exon fused with three copies of MS2 stem loops at the 3′-end), 40% yeast extract, and splicing buffer (60 mM potassium phosphate, pH 7.4, 3% PEG 8000, 2.5 mM MgCl2, 2 mM ATP). RNAs were then phenol/chloroform extracted and precipitated with 2.5 volumes of ethanol. After DNase I (Roche) treatment, first strand cDNAs were synthesized from 1 μg of RNA using ProtoScript II reverse transcriptase with reverse primers specific to M3-Act1, U1 snRNA, or U2 snRNA. PCR was performed using cDNA transcribed from 25 ng of RNA as template and the following primers: MS2 Forward 5’-TCCGATATCCGTACACCATC-3’; Act1 exon 2 Reverse 5’-TGATACCTTGGTGTCTTGGTCT-3’; yeast U1 Forward 5’-AAACATGCGCTTCCAATAGT-3’, Reverse 5’-TATGTGTGTGTGACCAAGGAG-3’ 57 ; and yeast U2 Forward 5’-AACTGAAATGACCTCAATGAGGCTC-3’, Reverse 5’- AGACCTGACATTAGCGGAAAACAAC-3’. The products were analyzed on 3% low melting point (LMP) agarose gel stained with EtBr.
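The amounts implied by the chase reaction conditions follow from unit arithmetic (nM × μL gives fmol). The sketch below works through them, assuming the 50-fold excess is relative to the 2.5 nM labeled substrate and that 40% extract is by volume; the variable and function names are illustrative.

```python
# Amounts implied by the 25 uL chase reaction described above.
# Assumptions: 50-fold excess is relative to the 2.5 nM substrate;
# "40% yeast extract" is by volume.

def femtomoles(conc_nM, volume_uL):
    """nM x uL = fmol (1e-9 mol/L x 1e-6 L = 1e-15 mol)."""
    return conc_nM * volume_uL


REACTION_UL = 25.0
SUBSTRATE_NM = 2.5
COMPETITOR_FOLD = 50
EXTRACT_FRACTION = 0.40

substrate_fmol = femtomoles(SUBSTRATE_NM, REACTION_UL)
competitor_fmol = substrate_fmol * COMPETITOR_FOLD
extract_uL = REACTION_UL * EXTRACT_FRACTION

print(f"substrate:  {substrate_fmol} fmol")
print(f"competitor: {competitor_fmol / 1000} pmol")
print(f"extract:    {extract_uL} uL")
```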

The same experiment was also performed using EFM5 IEI-101-M3 or HMRA1 IEI-246-M3 pre-mRNA and E complex assembled on both pre-mRNAs. A second EFM5 IEI-101 pre-mRNA was designed to remove the primer binding sites so it is invisible in the RT-PCR reaction and was in vitro transcribed to be used as competing RNA. For HMRA1, wild type IEI-246 pre-mRNA was used as competing RNA. Primers used to detect circRNA formed specifically from EFM5 IEI-101 are: IEI-101 cir Forward 5’- CCTTGAAGAATTCAAAAGAGAGGATAG −3’ and cir Reverse 5’- CGAGGGCATTAGCAGAAAG −3’. Primers used to detect circRNA formed specifically from HMRA1 IEI-246 are: IEI-62 cir Forward 5’-TCCACTTCAAGTAAGAGTTTGG-3’ and cir Reverse 5’- GAGTAAAGTAGTGGCATTAGTCA-3’.

Analyzing the role of Act1 intronic secondary structure in splicing

We mutated multiple stretches of sequences in the 5’ ss to BPS region of Act1 so that it no longer forms extensive secondary structures. The final sequence of this region (mutated nucleotides underlined) is: GTATGTTCTAGCGCTTGCACCATCCCATTTAACTGTAAAAAAAAATGCACGGTCCCAATTGCTCGAGAGATTTCTCTTTTACAAAAAAATACTATTAAAAAAAGAGAAAAAACCTCCTATATTGACTGATCTGTAATAACCACGATATTATTGGAATAAATAGGGGCTAAAAATTTGGAAAAAAAGAAAAACTGAAATATTTTCGTGATAAGTGATAGTGATAAAAAAAAATTATTTGCTACTGTGTCTCATGTACTAA. We synthesized this mutant region through Genscript and replaced the 5’ ss to BPS region of the Act1-Cup1 reporter plasmid 19 with this sequence to generate the mutant reporter plasmid pMA6.

The WT and mutant Act1-Cup1 reporter plasmids were transformed into yeast strain BY4741, and grown in synthetic complete (SC) -Leu medium until OD600=1.0. RNA was extracted from 5 ml of culture, treated with DNase I (Roche), and reverse transcribed using the ProtoScript II First Strand cDNA Synthesis Kit (New England Biolabs) and random primers. qPCR was performed using the iTaq™ Universal SYBR® Green Supermix (Biorad) with cDNA transcribed from 10 ng of RNA as template. The primers for detecting pre-mRNAs are located in the Act1 intron and Cup1: Act intron Forward 5’-TTATTTGCTACTGTGTCTCATG-3’; YAC7 Reverse 5’- GCATTGGCACTCATGACCTT-3’. The primers for detecting total reporter mRNA are located in Act1 exon 2 and Cup1: ActEx2 Forward 5’- GTTCTGGTATGTGTAAAGCC-3’; CUP1end Reverse 5’-CCAGAGCAGCATGATTTCTT-3’.

Co-purification assay in yeast

The coding regions of yeast Prp40, Snu71, and Luc7 full length or truncations were amplified by PCR using genomic S. cerevisiae DNA as templates, and ligated into pRS414, pRS416 and pRS317 vectors. The final plasmids constructed are: pRS414/GPD-protA-Prp40 (full-length, FF1-3 and FF4-6), pRS317/GPD-Prp40, pRS414/GPD-protA-Snu71, pRS416/GPD-CBP-Luc7 or Luc7ΔCC (Luc7 coiled coil domain (residues 123-190) deletion). Yeast BCY123 cells were transformed with different combinations of the plasmids and selected on appropriate selective media. Clones from the transformation were cultured in 50 ml of liquid selective media to OD600=3-4. Cells were harvested and lysed in lysis buffer (50 mM Tris-HCl, pH 8.0, 150 mM NaCl, 0.05% NP40, 1 mM DTT, 2.5 mM CaCl2, 1.5 mM MgCl2) using the bead-beating method. The lysates were incubated with IgG resin for 3 hr at 4°C. The resins were washed with the lysis buffer. The proteins were cleaved off IgG resin using TEV protease in buffer (20 mM Tris-HCl, pH 8.0, 150 mM NaCl, 0.01% NP40, 0.5 mM DTT). The proteins were separated on SDS-PAGE and transferred to a nitrocellulose membrane. Western blot was performed using an anti-CBP tag antibody (GenScript, A00635).

Dyn2 splicing analyses

The DYN2 gene was PCR amplified from S. cerevisiae genomic DNA along with the 5′-UTR (356 bp) and 3′-UTR (305 bp) and cloned into the pRS415 vector. The BPS of the first intron was mutated from “TACTAAC” to “TAGTACC” and the 5’SS of the second intron from “GT” to “CG” separately or together, to generate the I1-BP, I2-5’SS, and double mutant. The wild type and mutant plasmids were transformed into a DYN2 deletion yeast strain (Open Biosystem), and transformants were selected on SC -Leu plates. Cells were grown in SC -Leu media to OD600 of ~1.0. Total RNA was isolated from 10 ml of cells using a hot-phenol extraction method and dissolved in 100 μl of diethylpyrocarbonate (DEPC)-treated water. A total of 1 μg of RNA was treated with DNase I and reverse transcribed into cDNA. qPCRs were performed using the following primers: E1 Forward 5’-CCAAAATGAGCGATGAAAATAAGAG-3’ and I1 Reverse 5’-TCATGGAAGAAAACCTCACTC-3’ to detect exon 1-intron 1 product; I1 Forward 5’-TATGTCAGTTAATCTCAGTCACAAT-3’ and E2 Reverse 5’-TATGTCAGACGCCTTAACAATAG-3’ to detect intron 1-exon 2 product; E2 Forward 5’-CTATTGTTAAGGCGTCTGACATA-3’ and I2 Reverse 5’-GGTCTAAGTTTTCTCCTTGTTAG −3’ to detect exon 2-intron 2 product; I2 Forward 5’-CATGTTTTGTGTGTGTACATTTG-3’ and E3 Reverse 5’-CAGGTATTGCCGTATTTGAC-3’ to detect intron 2-exon 3 product; E3 Forward 5’-CGACAAGCTGAAAGAGGATA-3’ and E3 reverse to detect exon 3.

CircRNA detection from purified spliceosome

Spliceosome was purified from three liters of yeast cells carrying the Prp22 H060A mutant to enrich the post-catalytic complex 20 and thereby increase our chance of detecting branching and ligation products before their release. RNA from the purified spliceosomal complex was purified and reverse transcribed into cDNA. PCR was performed to detect the presence of T-branches and circRNAs from yeast multi-intronic genes EFM5 (YGR001C) and HMRA1 (YCR097W), using circle-specific primers 7 and the following primers for T-branches: EFM5 I1 Forward 5’- TTTTCAACACAGTAACGTAGAATTAC-3’, I2 Reverse 5’- AACAGTTAGTAAGATGAAAAGATACTGG-3’; HMRA1 I1 Forward 5’-GTATGTTTTCATTTCAAGGATAG-3’, I2 Reverse 5’-TGTTAGTATAGGATATATTTAAGTTTGA-3’. PCR products were analyzed on 3.5% LMP agarose gel stained with EtBr, and cloned into pMiniT vectors (New England Biolabs) for sequencing.

CircRNA detection from EFM5 and HMRA1 IEI constructs in vivo

The region from +47 to +630 of EFM5 gene (containing EFM5 partial intron 1, exon 2, and partial intron 2) was PCR amplified from S. cerevisiae genomic DNA and was inserted between a GPD promoter and a CYC1 terminator, and the resulting expression cassette was cloned into a pRS317 vector, generating the EFM5 IEI plasmid. The BPS of the first intron was mutated from “TACTAACTAAC” to “TAGTACCTACC” (mutations underlined) 58 and the 5’SS of the second intron from “GT” to “CG” separately, to generate the BP and 5’SS mutant. The IEI-101 truncation was generated by fusing the first 49 nt to the last 52 nt of the middle exon. The IEI-63 truncation was engineered on IEI-101 to shorten the middle exon to 63 nt in length, of which the sequence is: 5’-GACACTTTCTGCCCTTGAAGAATTCAAAAGAGAGGATAGATTGTTAATTGACCCAAACAAAGT-3’.

The EFM5 IEI plasmid and the empty pRS317 vector were transformed into an EFM5 deletion yeast strain (Open Biosystems). Cells were grown in SC -Lys media to an OD600 of ~1.0. Total RNA was isolated and reverse transcribed into cDNA. For RNase R treatment, 1 μg of total RNA was incubated at 37 °C for 30 min with 5 U/μg of RNase R (Epicentre Technologies) and used directly for reverse transcription without further purification. PCR was performed using specific primers (7) to detect circRNA formed from exon 2 of EFM5: EFM5 cir Forward 5’-GAGAGGATAGATTGTTAATTGACCC-3’ and EFM5 cir Reverse 5’-CTTTTGAATTCTTCAAGGGCA-3’. The primer pair used to detect unspliced EFM5 pre-mRNA is: EFM5 I1 Forward 5’-TTTTCAACACAGTAACGTAGAATTAC-3’, I2 Reverse 5’-GAGTAGGGATATGTTTATGATATACATAC-3’. The products were analyzed on a 3% LMP agarose gel stained with EtBr.
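The circRNA primers can be placed on the 63-nt IEI-63 middle exon with a few lines of string matching. This is a minimal sketch and not the authors' analysis; the variable names and the `reverse_complement` helper are assumptions. It confirms the stated exon length and checks that the reverse primer's annealing site starts upstream of the forward primer, so the two primers point away from each other on a linear exon and can only yield a product across a back-splice junction.

```python
# Minimal sketch: locate the EFM5 circRNA primers on the IEI-63 middle exon.
# Sequences are copied from the text; all names here are hypothetical.

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))

IEI63_EXON = "GACACTTTCTGCCCTTGAAGAATTCAAAAGAGAGGATAGATTGTTAATTGACCCAAACAAAGT"
CIR_FORWARD = "GAGAGGATAGATTGTTAATTGACCC"
CIR_REVERSE = "CTTTTGAATTCTTCAAGGGCA"

exon_length = len(IEI63_EXON)

# 0-based start of the forward primer on the sense strand.
fwd_start = IEI63_EXON.find(CIR_FORWARD)

# The reverse primer anneals where its reverse complement occurs on the sense strand.
rev_site = reverse_complement(CIR_REVERSE)
rev_start = IEI63_EXON.find(rev_site)

# Divergent layout: the reverse primer's site begins upstream of the forward primer.
divergent = fwd_start >= 0 and rev_start >= 0 and rev_start < fwd_start

print(exon_length, fwd_start, rev_start, divergent)
```

The same check could be repeated for any other exon and primer pair by swapping in the relevant sequences.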

The region from +116 to +439 of the HMRA1 gene (containing exon 2 flanked by partial intron 1 and intron 2) was PCR amplified from S. cerevisiae genomic DNA and cloned into the pRS317 vector in the same way. The IEI-62 truncation was engineered to shorten the middle exon to 62 nt in length, with the sequence 5’-TTTATAATGGAAAGTAATTTGACTAATGCCACTACTTTACTCCACTTCAAGTAAGAGTTTGG-3’. The primers used to detect circRNA are the IEI-62 cir Forward and Reverse primers, which are the same as those used for IEI-246. The primer pair used to detect unspliced HMRA1 pre-mRNA is: HMRA1 I1 Forward 5’-CCAAGAACTTAGTTTCGACTCTAGATTTCAAGGATAGCCTTTGAATC-3’, I2 Reverse 5’-AACTAATTACATGATGGGCCCGGATATATTTAAGTTTGATTCTCATATTACATAC-3’.

Extended Data

Extended Data Figure 1.

(a) A schematic representation of the Act1 pre-mRNA tagged with three MS2-binding sites (M3-Act1) used for E complex assembly and purification. Boxes represent exon 1 (E1) and truncated exon 2 (E2). The 5’ ss (GU) and BPS (UACUAAC) are also shown. The red line represents the DNA oligo complementary to a region 5 nt upstream of the BPS for the RNase H cleavage experiment. (b) RNA components of the assembled E complex (with or without DNA oligo and RNase H treatment) after proteinase K digestion are shown on a denaturing urea gel or native agarose gel. These results demonstrate that RNase H treatment cleaved M3-Act1 into two fragments. Note that the sizes of RNA on the native gel do not match their linear length, possibly due to the existence of secondary structures. This experiment was repeated two additional times with similar results.

Extended Data Figure 2.

(a) A representative drift-corrected cryoEM micrograph (out of a total of 11,283 images) of the E complex assembled on the Act1 pre-mRNA. A representative particle is shown in a white dotted circle. (b) Representative 2D class averages of the Act1 complex obtained in RELION. This experiment was repeated one additional time with similar results. (c) Data processing workflow. For processing above the red dashed line, the particle images were binned to a pixel size of 2.72 Å. The rest of the processing was performed with a pixel size of 1.36 Å. The masks used in data processing are outlined with red solid lines. Please refer to Methods for more details. (d) Angular distribution for all particles used for the final 3.2 Å map of the Act1 complex. (e) FSC as a function of spatial frequency demonstrating the resolution for the final reconstruction of the Act1 complex. (f) Resmap local resolution estimation. (g) FSC coefficients as a function of spatial frequency between model and cryoEM density maps. The generally similar appearance of the FSC curves obtained with half maps with (red) and without (blue) model refinement indicates that the refinement of the atomic coordinates did not suffer from severe over-fitting.
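The two pixel sizes quoted in panel (c) can be related with a small calculation. The sketch below is an illustration, not part of the authors' workflow, and the names are assumptions. It derives the binning factor implied by the two pixel sizes and uses the standard sampling rule that the finest recoverable resolution (the Nyquist limit) is twice the pixel size.

```python
# Minimal sketch: binning factor and Nyquist limits for the quoted pixel sizes.
# Pixel sizes and map resolution are taken from the caption; names are hypothetical.
import math

BINNED_PIXEL_A = 2.72    # Å per pixel used for early processing
UNBINNED_PIXEL_A = 1.36  # Å per pixel used for the rest of processing
FINAL_MAP_RES_A = 3.2    # Å, reported resolution of the Act1 map

def nyquist_limit(pixel_size_a: float) -> float:
    """Finest resolution (in Å) that a given pixel size can sample."""
    return 2.0 * pixel_size_a

binning_factor = BINNED_PIXEL_A / UNBINNED_PIXEL_A
binned_limit = nyquist_limit(BINNED_PIXEL_A)
unbinned_limit = nyquist_limit(UNBINNED_PIXEL_A)

# A resolution is reachable only if it is no finer than the Nyquist limit.
reachable_unbinned = FINAL_MAP_RES_A >= unbinned_limit
reachable_binned = FINAL_MAP_RES_A >= binned_limit

print(binning_factor, binned_limit, unbinned_limit)
print(reachable_unbinned, reachable_binned)
```

Under these assumptions the binned data could not by themselves support a 3.2 Å map, which fits with the later steps being run at the smaller pixel size.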

Extended Data Figure 3.

(a) A representative drift-corrected cryoEM micrograph (out of a total of 8,997 micrographs) of the E complex assembled on the Ubc4 pre-mRNA. A representative particle is shown in a white dotted circle. (b) Representative 2D class averages of the Ubc4 complex obtained in RELION. This experiment was repeated one additional time with similar results. (c) Data processing workflow. For processing above the red dashed line, the particle images were binned to a pixel size of 2.72 Å. The rest of the processing was performed with a pixel size of 1.36 Å. The masks used in data processing are outlined with red solid lines. Please refer to Methods for more details. (d) Angular distribution for all particles used for the final 3.6 Å map of the Ubc4 complex. (e) FSC as a function of spatial frequency demonstrating the resolution for the final reconstruction of the Ubc4 complex. (f) Resmap local resolution estimation. (g) FSC coefficients as a function of spatial frequency between model and cryoEM density maps. The generally similar appearance of the FSC curves obtained with half maps with (red) and without (blue) model refinement indicates that the refinement of the atomic coordinates did not suffer from severe over-fitting.

Extended Data Figure 4.

Panels (a-i) are densities for the Ubc4 complex and (j) is density for the Act1 complex. The cryoEM density maps are shown for (a) selected regions of U1 snRNA; (b) C-terminal region of Prp39; (c) N-terminal domain of Snu71; (d) the pre-mRNA and U1 snRNA duplex; (e) U1C ZnF domain; (f) Luc7 ZnF2 domain; (g) the tandem FF domains of Prp40 (known structures of tandem FF domains from CA150 are also shown with the characteristic boomerang-shape); (h) the RRM2 domain of Nam8; (i) NCBP1 and NCBP2; (j) the weak density in the Act1 complex that is assigned as the putative BBP/Mud2 heterodimer. The A complex is also shown, with U1 snRNP in the same orientation as the Act1 complex and U2 snRNP located in similar positions as the BBP/Mud2 heterodimer with respect to U1 snRNP. The map of Act1 complex was low-pass filtered to 40 Å.

Extended Data Figure 5.

(a) Comparison of the ribbon models of the Act1 complex, the Ubc4 complex, and U1 snRNP from other previously determined structures (the U1 snRNP, A, and pre-B complex). Shaded labels indicate protein or RNA components that differ between the Act1 and Ubc4 complexes. These components and the RRM2 domain of Nam8 are also absent from previously determined structures. Note that U1-70K is shifted towards NCBP2 in the Ubc4 complex. (b) Purified E complex does not contain U2 snRNA. A native polyacrylamide gel shows the solution hybridization (78) result of total cellular RNA or RNA from purified E complex hybridized with fluorescent probes specific for U1 and U2 snRNAs. This experiment was repeated one additional time with similar results.

Extended Data Figure 6.

(a) Secondary structures predicted by RNAstructure 6.0 ( https://rna.urmc.rochester.edu/RNAstructureWeb/ ). (b) Sequence between the 5’ ss and BPS (underlined) of Act1. Red nucleotides were mutated to A (other than the one A which was mutated to G) in the mutant Act1 to disrupt predicted secondary structures.

Extended Data Figure 7.

(a) DSSO crosslinking and mass spectrometry analyses of the Ubc4 complex. Each blue line indicates a crosslink observed between a pair of Lys residues. Note that BBP/Mud2 are crosslinked to Luc7, Prp40, Snu56, and Snu71. (b) Co-purification assays probing the interaction between Snu71 (or Prp40) and Luc7. Various combinations of protein A-TEV-Prp40, protein A-TEV-Snu71, and CBP-tagged Luc7 or Luc7ΔCC [coiled coil domain (residues 123-190) of Luc7 deleted] were co-overexpressed in yeast (only Snu71 is protein A tagged in the Snu71+Prp40 lanes), purified using IgG resin, eluted through TEV cleavage, analyzed on SDS-PAGE, and visualized using both Western blot with an anti-CBP antibody to detect Luc7 (top) and Ponceau S stain to show Snu71 or Prp40 (middle). Western blot using the same anti-CBP antibody was used to demonstrate Luc7 expression levels in cell lysates (bottom). The faint band around 26 kD in all lanes is TEV. This experiment was repeated one additional time with similar results. (c) The linker (residues 73-131) between the WW and FF domains of Prp40 is predicted to be disordered by the program MetaDisorderMD2 (79).

Extended Data Figure 8.

(a) The minimal length of RNA needed to connect the upstream BP and downstream 5’ ss in the A complex is modeled using the Rosetta RNP-denovo method. The A complex (PDB ID 6g90) is shown in grey. The pre-mRNA is shown in green. The upstream BP and downstream 5’ ss are shown in purple space-filling models. 28 nucleotides are sufficient to connect the upstream BP and downstream 5’ ss (not including the BP and 5’ ss themselves) without any chain breaks or clashes. (b) Schematics of the Dyn2 pre-mRNA WT and mutants (mutated nucleotides shown in red), IEI, and untagged IEI used for the EDC assembly and in vivo exon definition experiments. Stem-loops represent the MS2 binding sites, and the red line represents the DNA oligo used for RNase H cleavage. (c) SDS-PAGE shows protein components of complexes assembled on WT and IEI substrates (lanes 1-2), on WT in the presence of competing untagged IEI (lane 3), and on IEI after RNase H treatment in the absence and presence of the DNA oligo (lanes 4-5). This experiment was repeated one additional time with similar results. (d) RNA components of the same complexes as in lanes 4-5 of (c), confirming that RNase H treatment with the oligo indeed cleaves the pre-mRNA. The smaller cleaved fragment (61 nucleotides) is difficult to see because EtBr stains short single-stranded RNA inefficiently. This experiment was repeated two additional times with similar results. (e) Mass spectrometry analyses of spliceosome assembled on the Dyn2 IEI and WT pre-mRNA indicate that the two complexes have the same components in similar quantities, with the exception of NCBP1 and NCBP2, which are absent from the IEI complex. (f) 2D classification of negative-stain TEM images of the E complex assembled on Dyn2 IEI pre-mRNA. This experiment was repeated one additional time with similar results.

Extended Data Figure 9.

(a) Sanger sequencing confirmed that the PCR products in Figure 5A were derived from T-branches and circRNAs of EFM5 and HMRA1. “/” shows where two ends of exon 2 are ligated. “∣” shows where the 5’ ss of intron 2 is ligated to the BP of intron 1. The 5’ ss and BPS are shown in bold. The BPS contains deletions (shown as -) due to errors caused by reverse transcriptase reading through the branch. (b) RT-PCR was carried out on RNA extracted from WT yeast cells with or without RNase R treatment using primers indicated in the schematic diagrams below the gel, indicating that RNase R treatment eliminates linear RNAs. This experiment was repeated four additional times with similar results. (c) Protein and RNA components of E complex assembled on EFM5 IEI-101-M3 pre-mRNA. (d) RT-PCR of RNA extracted from the BY4742 yeast strain carrying the indicated HMRA1 plasmids, with or without RNase R treatment, using primers shown in the schematic diagrams below the gel. Numbers 246 and 62 designate exon lengths. Lanes 1-3 indicate that all constructs were transcribed (the endogenous HMRA1 pre-mRNA level is too low to be detected, as indicated in lane 3). The HMRA1 middle exon was slightly modified to create a circRNA primer binding site so that only the modified exogenous ( e.g. , IEI-246 in lane 5) but not WT HMRA1 circRNA (IEI-246 WT in lane 4) can be detected. (e) IEI-246-M3 (3xMS2 at the 3’ end) RNA or E complex assembled on IEI-246-M3 was incubated with WT or U1-depleted yeast extract in the absence or presence of 30-fold excess competing IEI-246 WT RNA. CircRNA products were monitored using RT-PCR in the same way as in (d). Experiments in (c) - (e) were repeated one additional time with similar results.

Extended Data Table 1.

Cryo-EM data collection, refinement and validation statistics.

Supplementary Material

Acknowledgements.

This work was supported by NIH grants GM126157 and GM130673 (R.Z.); GM071940 and AI094386 (Z.H.Z.); and GM122579, GM121487, and CA219847 (R.D.). S.E. is a Howard Hughes Medical Institute Gilliam Fellow. K.K. was supported by an NSF GRFP award and a Stanford Graduate Fellowship. We acknowledge the use of instruments at the Electron Imaging Center for Nanomachines (supported by UCLA and by grants from the NIH (1S10OD018111, 1U24GM116792) and NSF (DBI-1338135 and DMR-1548924)) as well as the CU Anschutz School of Medicine Cryo-EM and proteomics core facilities (partially supported by the School of Medicine and the University of Colorado Cancer Center Support Grant P30CA046934). Molecular graphics and analyses were performed with the UCSF Chimera and ChimeraX, developed by the Resource for Biocomputing, Visualization, and Informatics at the University of California, San Francisco, with support from NIGMS P41-GM103311 (Chimera, ChimeraX) and NIH R01-GM129325 (ChimeraX). We also thank M. Ares, D. Black, and D. Brow for comments on early versions of the manuscript.

Reprints and permissions information is available at www.nature.com/reprints .

The authors declare no competing interests.

Data Availability

The coordinate files have been deposited in the Protein Data Bank (6N7P for the Ubc4 complex and 6N7R for the Act1 complex). The cryoEM maps have been deposited in the Electron Microscopy Data Bank (EMD-0360 for the Ubc4 complex and EMD-0361 for the Act1 complex).

Unified Field Theory

We have learned about various field theories in previous sessions. A field theory describes how particles behave when they are placed at different points in a field. Electric and magnetic field theories, for example, can be used to study the electric and magnetic fields of a unit positive charge, and field theory can also be used to analyze electromagnetic waves. In this session, let us learn about an essential theory known as unified field theory.

Combination of Standard Model and Gravity

The strong force, the weak force, the gravitational force, and the electromagnetic force are known as the fundamental forces. These forces are mediated by fields, which, in the Standard Model of particle physics, result from the exchange of gauge bosons. Specifically, the four fundamental interactions to be unified are:

Strong interaction: the interaction responsible for holding quarks together to form hadrons, and holding neutrons and also protons together to form atomic nuclei. The exchange particle that mediates this force is the gluon.

Electromagnetic interaction: the familiar interaction that acts on electrically charged particles. The photon is the exchange particle for this force.

Weak interaction: a short-range interaction responsible for some forms of radioactivity that acts on electrons, neutrinos, and quarks. It is mediated by the W and Z bosons.

Gravitational interaction: a long-range attractive interaction that acts on all particles. The postulated exchange particle has been named the graviton.

Modern unified field theory attempts to bring these four forces and matter together into a single framework.

The concept of unified field theory belongs to particle physics. It aims to describe all fundamental forces, as well as the relationships between elementary particles, in a single theoretical framework. Unified field theory is also referred to as the Theory of Everything (ToE). It seeks to explain the nature and behaviour of all matter and energy. In such a theory, fundamental forces like gravity and electromagnetism would emerge as different aspects of a single fundamental field.

As we know, forces can be described in terms of fields that mediate interactions between separate objects. Forces are not transmitted directly between interacting objects; instead, they are carried by intermediary entities called fields.

The aim of developing a unified field theory has led to the progress of theoretical Physics. Let us now see how unified field theory was developed.

Evolution of Unified Field Theory

In the 19th century, the physicist James Clerk Maxwell developed the first field theory, his theory of electromagnetism. Later, in the 20th century, Albert Einstein developed the theory of general relativity. Along with other scientists, Einstein attempted to develop a unified field theory in which gravity and electromagnetism would emerge as different aspects of a single fundamental field.

Quantum field theory combines the theories of special relativity, classical field theory, and quantum mechanics. This theory treats particles (quanta) as excited states of their underlying quantum fields, which are more fundamental than the particles. The theory of relativity explains the nature and behaviour of all phenomena on the macroscopic level (things that are visible to the naked eye); a quantum theory explains the nature and behaviour of all phenomena on the microscopic (atomic and subatomic) level. The four fundamental forces of nature are electromagnetic interaction, strong interaction, weak interaction, and gravitational interaction.

In the course of the 1960s and 70s, physicists found that matter is made of fundamental particles called quarks and leptons. Quarks are bound within larger particles such as protons and neutrons by the short-range strong force, which overwhelms electromagnetism at subnuclear distances.

During the 1970s, a quantum field theory of the strong force, known as quantum chromodynamics (QCD), was developed. In quantum chromodynamics, quarks interact through the exchange of particles known as gluons. Physicists then investigated whether the strong force can be unified with the electroweak force in a grand unified theory (GUT). A grand unified theory would still not include gravity, and no such theory has been experimentally confirmed so far.

Equation of Unified Field Theory

The relationship between a changing magnetic flux and the electric field it induces is given by:

In the integral form, the above equation can be written as:

The equation for magnetic flux is:
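The three equations referred to above did not survive on this page. Assuming they are the standard statements of Faraday's law of induction (an assumption, since the original images are unavailable), the differential form, the integral form, and the definition of magnetic flux would read as follows, where E is the electric field, B the magnetic field, S a surface bounded by the closed curve ∂S, and Φ_B the magnetic flux through S:

```latex
% Faraday's law, differential form
\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}

% Faraday's law, integral form
\oint_{\partial S} \mathbf{E} \cdot d\boldsymbol{\ell} = -\frac{d\Phi_B}{dt}

% Magnetic flux through the surface S
\Phi_B = \int_{S} \mathbf{B} \cdot d\mathbf{A}
```

The integral form follows from the differential form by integrating over S and applying Stokes' theorem, with S held fixed in time.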

Unified field theory was aimed at unifying the laws of cosmos given as;

Einstein’s field equations are written in terms of tensors.

Frequently Asked Questions – FAQs

What are the fundamental forces?

Fundamental forces are the strong force, the weak force, the gravitational force, and the electromagnetic force.

Define gravity.

Gravity is the fundamental force that is considered as the universal force of attraction acting between all matter.

Who developed the first field theory?

The first field theory was developed by Physicist James Clerk Maxwell.

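State true or false: The grand unified theory includes the concept of gravity.

False. A grand unified theory combines the strong force with the electroweak force but does not include gravity.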

Who developed the theory of general relativity?

The theory of general relativity was developed by Albert Einstein.

IMAGES

  1. Research Hypothesis: Definition, Types, Examples and Quick Tips

  2. Best Example of How to Write a Hypothesis 2024

  3. 🏷️ Formulation of hypothesis in research. How to Write a Strong

  4. SOLUTION: How to write research hypothesis

  5. Hypothesis

  6. What is a Hypothesis

VIDEO

  1. Intro to hypothesis, Types functions

  2. How To Formulate The Hypothesis/What is Hypothesis?

  3. Hypothesis Definition

  4. What Is A Hypothesis?

  5. Hypothesis Formulation

  6. Hypothesis

COMMENTS

  1. Depressive Disorders: Toward a Unified Hypothesis

    Our scientific understanding of psychiatric syndromes, including the phenomena of depression, has been hampered because of: (i) the use of metapsychological concepts that are difficult to test; (ii) methodological and linguistic barriers that prevent communication among psychoanalysts, behaviorists, experimental psychologists, and psychiatrists ...

  2. The Unified Theory in a Nutshell

    The Unified Approach argues the former is much more likely if we can link our large scale justification systems together and coordinate the change in a manner consistent with the human condition ...

  3. The scientific method (article)

    The scientific method. At the core of biology and other sciences lies a problem-solving approach called the scientific method. The scientific method has five basic steps, plus one feedback step: Make an observation. Ask a question. Form a hypothesis, or testable explanation. Make a prediction based on the hypothesis.

  4. The Unity of Science

    A classic reference to this compositional type of account is Oppenheim and Putnam's "The Unity of Science as a Working Hypothesis" (Oppenheim and Putnam 1958; Oppenheim and Hempel had worked in the 1930s on taxonomy and typology, a question of broad intellectual, social and political relevance in Germany at the time).

  5. Scientific Method

    Science is an enormously successful human enterprise. The study of scientific method is the attempt to discern the activities by which that success is achieved. Among the activities often identified as characteristic of science are systematic observation and experimentation, inductive and deductive reasoning, and the formation and testing of ...

  6. What Is a Hypothesis? The Scientific Method

    A hypothesis (plural hypotheses) is a proposed explanation for an observation. The definition depends on the subject. In science, a hypothesis is part of the scientific method. It is a prediction or explanation that is tested by an experiment. Observations and experiments may disprove a scientific hypothesis, but can never entirely prove one.

  7. The Conceptual Unification of Psychology

    Biology is a unified discipline precisely because it has a clear, well-established definition (the science of Life), an agreed upon subject matter (organisms), and a theoretical system that ...

  8. 1.6: Hypothesis, Theories, and Laws

    Marisa Alviar-Agnew ( Sacramento City College) Henry Agnew (UC Davis) 1.6: Hypothesis, Theories, and Laws is shared under a CK-12 license and was authored, remixed, and/or curated by Marisa Alviar-Agnew & Henry Agnew. Although many have taken science classes throughout the course of their studies, people often have incorrect or misleading ideas ...

  9. How to Write a Strong Hypothesis

    5. Phrase your hypothesis in three ways. To identify the variables, you can write a simple prediction in if…then form. The first part of the sentence states the independent variable and the second part states the dependent variable. If a first-year student starts attending more lectures, then their exam scores will improve.

  10. Unified Theories

    Unified Theories. The quest for unification has been a perennial theme of modern physics, although it dates back many millennia. The belief that all physical phenomena can be reduced to simple elements and explained by a small number of natural laws is the central tenet of physics, indeed of all science. One of the first unifying scientific principles was the atomic hypothesis, beautifully ...

  11. Gaia hypothesis

    The Gaia hypothesis posits that the Earth is a self-regulating complex system involving the biosphere, the atmosphere, the hydrospheres and the pedosphere, tightly coupled as an evolving system. The hypothesis contends that this system as a whole, called Gaia, seeks a physical and chemical environment optimal for contemporary life.

  12. How to Write a Hypothesis w/ Strong Examples

    What is a Hypothesis / Definition. A hypothesis is like a bet: you size things up and tell your mates exactly what you think is going to happen with respect to X, Y, Z. It can also be like an explanation for a phenomenon, or a logical prediction of a possible causal correlation among multiple factors. In science—or, really, in any field, a ...

  13. The Unified Airway Hypothesis: Evidence From Specific Intervention With

    Introduction. The unified airway hypothesis is an established concept that the upper and lower airways form a single unified organ that is interconnected and interrelated by several physiologically important shared traits. 1, 2, 3 These shared traits include immunology and pathophysiology, epidemiology, and clinical characteristics. 4, 5 The main basis of the unified airway hypothesis is that ...

  14. Hypothesis Definition & Meaning

    hypothesis: [noun] an assumption or concession made for the sake of argument. an interpretation of a practical situation or condition taken as the ground for action.

  15. Theory of everything

    A theory of everything (TOE), final theory, ultimate theory, unified field theory or master theory is a hypothetical, singular, all-encompassing, coherent theoretical framework of physics that fully explains and links together all aspects of the universe. Finding a theory of everything is one of the major unsolved problems in physics. Over the past few centuries, two theoretical frameworks ...

  16. Unified science

    unified science, in the philosophy of logical positivism, a doctrine holding that all sciences share the same language, laws, and method or at least one or two of these features. A unity-of-science movement arose in the Vienna Circle, a group of scientists and philosophers that met regularly in Vienna in the 1920s and '30s and was associated in particular with Rudolf Carnap and Otto Neurath.

  17. Hypothesis Definition & Meaning

    Hypothesis definition, a proposition, or set of propositions, set forth as an explanation for the occurrence of some specified group of phenomena, either asserted merely as a provisional conjecture to guide investigation (working hypothesis ) or accepted as highly probable in the light of established facts. See more.

  18. On the History of Unified Field Theories

    In a very Cartesian spirit, Tonnelat (Tonnelat 1955, p. 5) gives a definition of a unified field theory as "a theory joining the gravitational and the electromagnetic field into one single hyperfield whose equations represent the conditions imposed on the geometrical structure of the universe." No material source terms are taken into account. ...

  19. Chronic Rhino-Sinusitis and Asthma: Concept of Unified Airway Disease

    The unified airway disease (UAD) hypothesis proposes that upper and lower airway diseases are both manifestations of a single inflammatory process within the respiratory tract. Synonyms of UAD include allergic rhino bronchitis and combined allergic rhino-sinusitis and asthma. ... In 2003, a definition of chronic rhino-sinusitis was introduced ...

  20. Unified field theory

    Unified field theory. In physics, a unified field theory (UFT) is a type of field theory that allows all that is usually thought of as fundamental forces and elementary particles to be written in terms of a pair of physical and virtual fields. According to modern discoveries in physics, forces are not transmitted directly between interacting ...

  21. A unified mechanism for intron and exon definition and back-splicing

    A unified model for intron definition, exon definition, and back-splicing. (a) Structures of the E, A, and pre-B complexes are shown in surface representations with U1, U2, and tri-snRNPs in different colors, illustrating the canonical assembly pathway across an intron. Pre-mRNA is shown in red with an arrow indicating the 5' to 3' direction.

  22. Unified neutral theory of biodiversity

    The unified neutral theory of biodiversity and biogeography (here "Unified Theory" or "UNTB") is a theory and the title of a monograph by ecologist Stephen P. Hubbell. It aims to explain the diversity and relative abundance of species in ecological communities. Like other neutral theories of ecology, Hubbell assumes that the differences between members of an ecological community of trophically ...

  23. Unified Field Theory

    Unified field theory is also referred to as Theory of Everything (ToE). It focuses on explaining nature as well as the behaviour of matter and energy that exists. This theory made an attempt to explain that fundamental forces like gravity and electromagnetism would emerge as different aspects of a single fundamental field.