Mike’s Primer on Writing Scientific Proposals and Papers
Students regularly ask me a lot of questions about how to write for science. Below I have summarized my usual answers. I cover my approaches to developing good scientific questions and hypotheses, logically organizing proposals and papers, and (briefly) writing in a style appropriate to scientific communications. I don’t claim any of this is comprehensively correct, and as soon as a rule is stated an exception presents itself. There is no one right way to write, but here are my suggestions.
Science is inherently question-driven, so start there. What’s your question?
Scientific questions for ecological research (i.e., those trying to better understand natural systems or the effects of management on them) can be described as:
1. Is something happening?
2. What is happening?
Description of a pattern
3. How is something happening?
Identification of a mechanism leading to a pattern
4. Why is something happening?
Evolutionary/ecological causation for a mechanism leading to a pattern
Not all scientific questions are created equal. The questions above are organized from top to bottom, from easiest to answer but least interesting and useful, to hardest to answer but most interesting and useful.
Question 1 should be used rarely, if ever. It is the question asked from complete ignorance, e.g., do foxes have home ranges? Possible answer: yes. So what? Question 1 isn’t very interesting or useful because A) the vast majority of the time we already know the answer, and B) the answer “yes” helps us to understand little about what we are studying.
Question 2 is better because its answer provides a description that invites further questions, e.g., what do home ranges of foxes look like? This is still a relatively weak question, though, because a description of something doesn’t help us to understand it much better than knowing it exists. It is, nonetheless, the foundation for asking more interesting and useful questions. Possible answer: foxes have home ranges that average 2 ha in size and contain equal amounts of forest and grassland. Interesting… but why? The vast majority of published studies in ecology ask question 2, implicitly or explicitly. Asking the question again is therefore unlikely to add new insights.
Question 3 is where things start to get really interesting because we are asking how causal mechanisms produce the answers to questions 1 and 2, e.g., how do foxes choose their home ranges? Possible answer: foxes choose their home ranges to provide access to prey, to reduce competition with other foxes, and to protect their pups. Importantly, knowledge produced by answering question 3 is the minimum needed to manage wild populations effectively; answers to questions 1 and 2 tell us nothing about what can be manipulated to achieve management goals.
Question 4 is the brass ring of ecological research because we go beyond description of causal mechanisms and actually explain them, e.g., why do foxes choose the home ranges they do? Possible answer: foxes choose home ranges that maximize the energy gained from food resources over the energy lost to obtaining and protecting resources, and to producing and protecting offspring. Answering question 4 well maximizes the information available for effective conservation.
Finally, logistical and financial challenges increase from question 1 to 4. The realities of funding, manpower, geography, and timelines may ultimately limit the questions we can ask. It is thus important to match the question to the goals of the study. A study that can realistically only answer question 2 should not be used to answer questions 3 or 4. If answering question 2 does not meet the goals of a study, then either A) redesign the study creatively so it can answer the needed question within current constraints, B) increase funding, manpower, or geographic and temporal scopes so that the study answers the needed question under the current design, or C) forgo the study.
Science begins with asking good questions, but the rest of it is about providing good answers to them. In ecological science, this is accomplished by testing biological hypotheses. A hypothesis is a plausible answer to a research question. This is different from a statistical hypothesis, which is a numerical pattern that would be predicted for a biological hypothesis if it were true. For example, the biological hypothesis “growth of fox populations is density dependent” would produce the statistical hypothesis (i.e., prediction) that “growth rate for a fox population will decline as the population grows.”
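The fox example can be made concrete with a small sketch. The Python below uses hypothetical annual counts (invented for illustration, not real data) and assumes growth rate is measured as ln(N[t+1] / N[t]); the function names are also illustrative. Under those assumptions, the biological hypothesis of density dependence predicts a negative slope when growth rate is regressed on population size.

```python
import math

# Hypothetical annual counts for one fox population (invented for illustration).
counts = [50, 80, 118, 160, 196, 222, 238, 246]


def growth_pairs(counts):
    """Pair each year's population size N_t with its growth rate ln(N_t+1 / N_t)."""
    return [(n_now, math.log(n_next / n_now))
            for n_now, n_next in zip(counts, counts[1:])]


def ols_slope(pairs):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


pairs = growth_pairs(counts)
slope = ols_slope(pairs)

# Statistical hypothesis (prediction): growth rate declines as the population
# grows, i.e., the slope of growth rate on population size is negative.
print(f"slope of growth rate on population size: {slope:.5f}")
print("pattern consistent with density dependence:", slope < 0)
```

A negative slope is only the predicted pattern. A real test would also need an estimate of uncertainty around the slope, and alternative hypotheses that predict different patterns.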
For any good ecological question (i.e., questions 3 or 4, above) the number of plausible hypotheses is technically infinite. The number of hypotheses that are interesting or useful, however, is thankfully much smaller. The art of designing scientific research is figuring out what those are and creating a study to test them. Identifying useful and interesting hypotheses is really difficult, requiring a lot of thought about your questions, answers to those questions that best meet study objectives, and data limitations. That thought process can be depicted as:
After identifying a question, potential hypotheses are drawn from empirical precedent, existing theory, and completely new ideas you might have. The best hypotheses do not repeat the past but build on it. Unless you discern fundamental flaws in empirical precedent or existing theory, testing hypotheses that are already convincingly supported or refuted isn’t worth doing.
Combing empirical precedent can help generate new hypotheses as long as you don’t get swamped in the sheer volume of what is available. Good hypotheses rarely suggest themselves from a comprehensive reading of the primary literature. Teasing those hypotheses out requires a fundamental understanding of key concepts, which is difficult to acquire directly from the primary literature because foundational theory for most studies is either implicit or referenced briefly. First reading a good book that summarizes the fundamental concepts important to your research objectives can improve your ability to synthesize the thinking found in the primary literature.
Multiple hypothesized answers to a single scientific question are presented in the diagram. This reflects reality because more than 1 plausible answer to any question always exists, and a smart approach to research because testing a single hypothesis may yield results of limited interest (consider a study where data do not support a lone hypothesis compared to one where data support one hypothesis over others). Importantly, testing multiple hypotheses avoids the pitfall of designing research to affirm a pet hypothesis, one of the greatest temptations in science. Attempting to falsify a pet hypothesis along with other plausible alternatives improves the rigor, credibility, and contribution of the study.
Data are the anchor to reality for any empirical study. Strong inference in ecological research is obtained by comparing patterns in collected data to those predicted by hypotheses. Testing some hypotheses may require data that are not reasonably obtainable given real-world constraints, suggesting a mismatch that can doom a study if it is not avoided. Considering interactions between questions, hypotheses, and data is therefore critical when designing research.
Note that the gray line in the diagram above indicates the process is reversible, allowing full consideration of the interdependencies between questions, hypotheses, and data. A series of questions that illustrate both directions of this process could look like this:
- What is your question?
- What plausible answers to your question do you hypothesize?
- What predictions do your hypotheses suggest?
- What data would be needed to compare to those predictions?
- Are those data reasonably obtainable given funding, time, and manpower available?
- If not, what patterns could reasonably obtainable data reveal?
- What variations of your hypotheses would predict such patterns?
- To what questions do such hypotheses represent plausible answers?
This process ensures that questions, hypotheses, and data all match. Note that this exercise is about identifying the kind of data that can be used to statistically distinguish hypotheses, not selecting data a priori that will be consistent with (i.e., “prove”) particular hypotheses. Going through several iterations of the process before identifying good questions and hypotheses is a realistic expectation.
Proposals market your research, to your committee or to a funding agency. As such, they must present your proposed research clearly and compellingly. Great ideas are not enough if you can’t sell them. Selling great research ideas effectively is largely about organization.
A proposal needs to walk the reader from the conceptual context for your research to how you will test hypotheses within that context. The flow from concepts to methods needs to be clear and logical throughout the proposal. After reading your proposal, the reader needs to understand your research questions, why you are asking them, why your hypothesized answers are credible, that your methods offer a strong likelihood for meeting your objectives, and finally that your research will produce new and interesting results. All of this must be accomplished in a very limited amount of space.
Organization to achieve this end can be envisioned as a funnel that progressively leads the reader from the general to the specific:
From top to bottom:
1. All ecological research takes place under a conceptual umbrella. For example, a survival study takes place under the conceptual umbrella of demography. It is hard to understand why a survival study should be done without knowing something about its ecological importance. You do not want to get too big or too obvious in this but get to the point: what is the immediate context needed to understand why you are doing your research? Importantly, this material needs to go beyond your study species. Example: a variety of factors can limit growth of animal populations.
2. Most ecological concepts have already been addressed through research of one form or another. A summary of this work can A) familiarize the reader with current ideas relevant to your research, and B) ultimately make clear how your research will add to it. Again, this presentation should not be specific to your study species. Example: studies A, B … have shown how factors X, Y… have/have not limited a variety of populations.
3. Orient the reader to your study system (i.e., your study species, community, its relevant ecological context, etc.) and explain how the concepts you have laid out lead to interesting questions for it. Here is where you justify the purpose for your study and it should be convincing. Avoid a justification of “no one has addressed these concepts for this species in this location.” Good research will yield inferences beyond your study species and location; understanding the conceptual umbrella for your work will help make that happen. Example: fox population X is declining but the causes are unknown.
4. State your research questions, having walked the reader from the general ideas behind your work to an application of those ideas to your study system. When you state your questions, the reader should understand what led up to them and that they are important and need to be answered. Example: What factors are causing the decline of fox population X?
5. Present the hypotheses you have chosen that represent interesting and plausible answers to your questions. Like the questions, they should have been well justified before this point. For example: Decline of fox population X could be due to loss of habitat, increased predation, declining prey populations… etc.
6. State predictions of ecological patterns you would expect to see for each hypothesis, based on the assumption it is true. Example: the hypothesis that prey populations are most important to survival of foxes would predict that survival will be highest in areas with the most prey. Note that the prediction is in the form of a numerical relationship, i.e., a statistical hypothesis. Although hypotheses and predictions are shown separately in the diagram, the most effective presentation usually pairs each hypothesis with its prediction.
7. State the methods you will use to collect the data for your hypothesis tests. Example: I will obtain historical data on abundance of fox population X, changes in landcover, prey density, predator density … etc.
8. State the analytical methods you will use to test your hypotheses. Every analysis needs to tie explicitly to a hypothesis test, in the same order hypotheses were presented in the introduction. A common temptation is to present methodological steps sequentially, which can be confusing because connections to hypothesis tests become unclear. Avoid presenting any results in this section. Example: I will use multiple linear regression to evaluate the correlations between abundance of fox population X and percent availability of suitable habitat, density of prey, density of predators… etc.
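The regression example in item 8 can be sketched in code. This is a minimal illustration under stated assumptions, not a recommended analysis: the data are synthetic and generated from assumed coefficients (`TRUE_COEFS`), the helper names (`solve`, `fit_ols`) are hypothetical, and the fit is plain least squares via the normal equations. The point is the organization: each coefficient is reported against the hypothesis it tests, in the order the hypotheses were presented.

```python
# Synthetic site-level data (invented for illustration, not real measurements).
percent_habitat = [60, 55, 48, 40, 35, 30, 22, 15]
prey_density = [8.0, 9.5, 6.0, 7.5, 4.0, 5.5, 3.0, 2.5]
predator_density = [1.0, 2.5, 2.0, 4.0, 3.0, 5.0, 3.5, 4.5]

# Assumed effects, used only to generate the synthetic response variable.
TRUE_COEFS = [20.0, 1.5, 4.0, -3.0]  # intercept, habitat, prey, predators
abundance = [
    TRUE_COEFS[0] + TRUE_COEFS[1] * h + TRUE_COEFS[2] * p + TRUE_COEFS[3] * d
    for h, p, d in zip(percent_habitat, prey_density, predator_density)
]

# Hypotheses in the order they were presented, one predictor per hypothesis.
hypotheses = ["habitat loss", "declining prey", "increased predation"]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(predictor_rows, y):
    """Return [intercept, b1, b2, ...] from the normal equations X'X b = X'y."""
    rows = [[1.0] + list(r) for r in predictor_rows]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


predictor_rows = list(zip(percent_habitat, prey_density, predator_density))
coefs = fit_ols(predictor_rows, abundance)

# Report each estimate next to the hypothesis it tests, in presentation order.
for name, estimate in zip(hypotheses, coefs[1:]):
    print(f"hypothesis: {name:20s} coefficient: {estimate:+.3f}")
```

Because the response was built from the assumed coefficients without noise, the fit simply recovers them. With field data, the estimates and their uncertainty are what you would evaluate against each prediction.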
If you have done a good job with your funnel then organizing a paper is surprisingly easy. The funnel for your proposal provides the foundation for the first half of your paper, sometimes with only modest changes needed. To add your results and discussion just flip the funnel, i.e., visualize your organization as an hourglass:
The results section should be organized identically to your methods section, according to your research questions and hypothesis tests, under the same subheadings.
Writing a discussion is very difficult after being immersed in data collection and analyses. Even experienced scientists can struggle to broaden their perspective enough to interpret their results well, at least at first. A sure sign that you might have such research myopia is that your results seem self-evident and writing a discussion that doesn’t just repeat your results is difficult. Results are never self-evident to people unfamiliar with your research; you need to put yourself in their shoes when you interpret your results, which isn’t easy. Allowing other researchers that do not share your research focus to read and critique your writing can help a lot.
The bottom half of the hourglass:
9. Give a broad overview of the data you collected, e.g., the number of animals you captured, their sexes and ages, number of locations per animal, etc.
10. State the results of your statistical tests. These should be very concise and factual. Do not interpret results here; save that for the discussion.
11. Your statistical results represent tests of your hypotheses, either supporting them or refuting them. Those results have implications for the biological arguments you used to justify your hypotheses; here is where you discuss them. If your results do not support a hypothesis, what does it mean for the biology behind it? What does it mean if your results do support it? Interpret each hypothesis test in the same order as hypotheses, methods, and results are presented throughout the paper. Conclude with a synthesis on the biological implications of all of your hypothesis tests combined: how do the results of your hypothesis tests answer your research questions? Example: My hypothesis that habitat loss has resulted in the decline of fox population X was supported. Foxes in population X are likely limited by habitat availability, but other possible causes cannot be ruled out. My hypothesis that predation of foxes has produced the decline was falsified, suggesting... etc.
12. Scale up from the animals you sampled to your study system. What biological inferences can you draw for your ecological system based on your hypothesis tests, e.g., how might predation affect prey populations in your region or in similar ecosystems? A note of caution: your interpretations now extend beyond your data set and are hypothetical. You need to present them as hypotheses, not conclusions. Example: My results suggest that decline of elk populations in the northern Rocky Mountains may be driven more by loss of habitat than predation… etc.
13. Scale up further and discuss how inferences produced from your hypothesis tests integrate with other research addressing questions similar to yours. How are your results consistent with other work, how are they different? What explanations can you offer to explain this? How does a synthesis of your findings with those of previous studies provide new insights? Be sure to include work on species other than yours. Take care to also state these explanations as hypotheses. Example: Previous work has shown that elk populations are strongly affected by predation. My work demonstrates that this may not be the case where… etc.
14. Your final scaling up will be a discussion of how everything prior to this point addresses the fundamental concepts motivating your work. How did those concepts help explain what you found, how did they fail? What explanations can you offer? Do your results call these concepts into question or add to them? Example: Theory has long posited compensatory and additive effects of food limitation and predation on growth of prey populations. My results suggest… etc.
15. All research is based on assumptions because it evaluates a simplified version of reality (e.g., assuming a landcover class represents the habitat needed by an animal). Violations of those assumptions can bias results. Present a concise, forthright discussion of each important assumption you made, what the effects of its violation on your results could be, and your assessment of the likelihood of a violation. A discussion of standard assumptions associated with statistical tests is not usually needed unless their validity is debatable for your analyses.
Many wildlife research papers conclude with a management implications section. This is difficult for many researchers to write, especially if they don’t work closely with managers. The management implications section should not rehash topics or conclusions from the discussion. It should present a succinct series of statements that can help managers make decisions. There is little value in the suggestion that your work illuminates important ecology that managers should consider. Tell them how they should consider it. If you can’t, then your results have no management implications. Discussing your results with a manager prior to writing this section is a good idea. Finally, avoid any appearance of condescension when you advise managers based on your research (i.e., never use the word “should”). Appearing to talk down to managers is the surest way to see that they won’t use your work.
Even with funnels and hourglasses in mind, staring at a blank screen to start your writing is a bad idea. It leads to rambling, stream-of-consciousness writing that produces long and confusing papers. Outlining can reduce this problem considerably, if it is done well. Before drafting an outline, it is important to understand that the way we were all taught to write in English classes (i.e., using successive arguments that lead to a conclusion) is ineffective for scientific writing. Journalists call the traditional approach “burying the lead.” Journalists are taught to state the conclusion first then use subsequent material to expound on it (because editors trim printed articles from the bottom up to fit available space). This approach lends itself to effective scientific writing as well.
One way to outline accordingly is to plan on: A) having a topic paragraph that starts each section of a paper that summarizes the section’s content, followed by paragraphs that explain it, and B) having a topic sentence that starts each paragraph, followed by sentences that explain it. An effective outline can then be simply writing the topic sentence for each paragraph. You should be able to get all of the general information you need to understand your research and its findings by reading such an outline. If something is missing then you can add a topic sentence for a new paragraph. If topic sentences seem redundant you can eliminate a paragraph. For each topic sentence you can then outline the sentences that will explain it. As you add these sentences you will find paragraphs can become large and complex, suggesting the need to split them out into >1 topic, or they become short and uninformative, suggesting the need to collapse multiple topics into one paragraph.
Learning to write in a style appropriate to scientific communications is a discipline to be mastered. Nobody is born with the ability to do it well; everyone learns it over time and through dogged struggle. Many books have been written to explain writing styles appropriate to science. I think the following 3 are pretty good.
Schimel, J. 2012. Writing Science: how to write papers that get published and proposals that get funded. Oxford University Press, New York, USA.
Greene, A. E. 2013. Writing science in plain English. University of Chicago Press, Chicago, IL, USA.
… and of course the venerable…
Strunk, W., Jr., and E. B. White. 2000. The elements of style, 4th edition. Allyn and Bacon, Boston, USA.
I won’t attempt to expound on these, except to emphasize several style issues I encounter regularly.
Be concise. Most of us are unnecessarily verbose when we write. Good writing maximizes the density of meaning per word. Accordingly, words should be selected to have the fewest syllables, sentences should contain the fewest words, and paragraphs should contain the fewest sentences needed to convey clear meaning. Initially writing what comes naturally is acceptable, as long as it is followed by ruthless removal of everything extraneous (which invariably is more than we think it is).
These are true. No one gets it right the first time, or the second, rarely the third…
Avoid jargon. This is difficult because science is so laden with technical terms, or terms that are commonly used but whose meaning can vary among readers (e.g., habitat). Jargon unnecessarily impedes clear communication and nearly all of it can be communicated in plain English. Ambiguous terms should be defined and technical terms should be limited to results and methods sections.
Use first-person, active voice throughout the paper (e.g., I collected data) instead of third-person, passive voice (e.g., data were collected).
Avoid abbreviations. They can interrupt the flow of reading because the reader has to slow down to recall what the abbreviation means, particularly if it requires backtracking to where the abbreviation was first defined. Abbreviations may be substituted sometimes for frequently used, complex terms (e.g., NDVI for normalized difference vegetation index).
Avoid using nouns as adjectives, particularly in strings. For example, “The university has a community relations improvement program” should be “The university has a program for improving relations with the community.” This may seem counter to the need to be concise, but such phrasing is much easier to read and therefore worth the extra words.
Remember: your job as a writer is to make the reader’s job as easy as possible. Learn to identify and remove anything in your writing that increases the workload of the reader.