You are here: Books --> Writing for science journals --> Errata and additions to 2014 edition
This page contains errata (plus corrections) and additions to the 2014 edition of the book:
Miscellaneous information
Chapter 5: Using your word processor
Chapter 6: Structure and format
Chapter 9: Methods and materials
Chapter 11: Discussion and Conclusions
Chapter 13: References and citations
Chapter 14: Experimental design and statistics
Chapter 15: Numbers and variables
Chapter 18: Online supplemental material
Chapter 19: English difficulties
Chapter 21: Preparing for peer review
Thoughts that don't fit well into the existing chapters, but that may nonetheless be important:
The future of journal manuscripts: We are living in an era when even a proven approach to communication, the traditional journal manuscript, may be about to change radically. Dorothy Bishop has some intriguing thoughts in "Will traditional science journals disappear?" I particularly like the idea of peer-reviewing research plans before the research is conducted. We should be doing this already, but since many of us don't, perhaps it's time for journals to get involved in this aspect of science.
MSc and PhD thesis copyright and originality issues: One popular form of thesis requires a graduate student to publish several articles in a peer-reviewed journal, then combine them into a single document to create the thesis. Although this approach has many advantages, it may not be legal: the published journal articles are protected by the journal's copyright, and cannot be republished without the publisher's permission. In addition, because the work has been previously published, it is possible that the resulting thesis will not be considered "original work", and thus may not be eligible for publication as a thesis. (As a result, some universities do not allow this approach.) If you are considering whether to write this kind of thesis, (i) confirm that it is permitted under your university's rules and regulations by contacting the university's intellectual property office and the person in your graduate school who is responsible for approving theses, and obtain their approval in writing; and (ii) obtain written permission from each journal to republish your papers in your thesis. Store these permission letters somewhere safe in case you need them in the future.
Finding author guidelines: It's not always easy to find a journal's author guidelines. Google is often your best bet: use "instructions to authors" or "author guidelines" as the search term and you'll usually find the correct information in the first few hits.
Learning (or helping your students learn) how to read journal papers: Dr. Jennifer Raff provides an excellent guide for non-scientists that may also teach you a few things. Specifically, if you can answer all the questions she raises and perform the exercises she recommends, you can ensure that your paper is fair, complete, logical, and persuasive.
Programming and models: In graduate school, one important skill you learned was the ability to learn new skills. In modern science, computer programming is one of those new skills. Unfortunately, programming is itself a discipline that requires expertise, and if you haven't spent several years earning a degree in this subject, you're not an expert. The best solution is to work with an expert to develop any software you will need to support your research; the second-best solution is to have such an expert review the program code that you've written to ensure that it is correct, sufficiently accurate for your needs, and computationally efficient. Once you have designed robust software, design equally robust test datasets that you can use to validate the software's output; remember to include "edge cases" (extreme or unusual values) to ensure that your software handles those exceptions correctly. Errors that enter the literature can have important consequences, including endangering human lives and misleading future researchers for years or decades before the problem is detected.
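To illustrate the idea of validating software against a test dataset that includes edge cases, here is a minimal sketch in Python. The analysis function and its test values are invented for illustration; the point is that each expected result is calculated independently, and that zero and near-zero inputs are tested deliberately rather than by accident:

```python
import math

def relative_change(old, new):
    """Hypothetical analysis function: percent change from old to new."""
    if old == 0:
        return math.inf if new > 0 else (-math.inf if new < 0 else 0.0)
    return 100.0 * (new - old) / old

# Validation dataset: typical values plus edge cases (zero and near-zero
# denominators), each with an independently calculated expected result.
cases = [
    (10.0, 15.0, 50.0),     # typical increase
    (10.0, 10.0, 0.0),      # no change
    (10.0, 5.0, -50.0),     # decrease
    (0.0, 5.0, math.inf),   # edge case: division by zero
    (1e-12, 1.0, 1e14),     # edge case: near-zero denominator
]

for old, new, expected in cases:
    result = relative_change(old, new)
    assert math.isclose(result, expected) or result == expected, (old, new)
```

If your real analysis code fails one of these assertions, you have found a bug before it could enter the literature.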
Unique identification of each researcher: In a world as large as ours, many people share the same name. How can we know which person we are looking for, such as when we want to contact a research colleague to obtain information? The goal of the ORCID organization is to provide a way to uniquely identify every researcher, and this is sufficiently important that some journals now strongly recommend that authors obtain this identification. It only takes a few minutes to sign up, and registration is currently free.
Sharing your research through collaboration networks: STM, an association of academic and professional publishers, has proposed a list of "Voluntary Principles on Article Sharing on Scholarly Collaboration Networks". It will be worthwhile monitoring this initiative to see how it evolves over time.
Reading a journal manuscript is a learned skill: In case you needed another reason to make an effort to simplify your writing, remember that students and researchers early in their career need to be taught how to read journal papers.
When you speak to the public, remember that they don't use the same words you use: Tom Gauld's clever cartoon reminds us that we need to modify our language when we speak to the public.
It's OK to feel stupid: In "The importance of stupidity in scientific research", Martin Schwartz reminds us that the most interesting science and research happens when we remember that we don't understand everything, and that this is why we do research. After all, there's no point researching things we already understand.
Researcher identification numbers: The digital object identifier (DOI) system is intended to provide a unique identifier to every publication. However, it's also important to be able to uniquely identify individual researchers (e.g., to support literature searches, to evaluate research productivity based on the number of publications). There are various ways to achieve this, but one of the most common that I see in my work is the Open Researcher and Contributor ID (ORCID) system. Registration is quick and easy, and there is no cost, so I recommend that you apply to obtain an ID number.
Reporting guidelines for research: The EQUATOR network (Enhancing the QUAlity and Transparency Of health Research) offers guidelines for reporting 421 types of research, including randomised trials, observational studies, and qualitative research.
Publish and perish? Dr. Iva Cheung recently completed her PhD degree, and had a few thoughts about "publish and perish" and just how dysfunctional and broken academic publishing can be.
Converting PDF files into HTML: If you need to convert a journal manuscript in PDF format into HTML format, try the "Paper to HTML converter" software. Although this tool is impressive, note that if the original word processor document is available, most word processors can save their contents in HTML format that's good enough for most purposes. However, the word-processor route will miss features such as automatically created links to headings, which may matter, particularly if you need to convert many documents.
Backups: Most researchers understand the importance of backing up their data, but may forget about the importance of backing up their experimental materials. If you've spent several years and thousands of dollars developing something unique, such as a genetically important organism or a tissue culture, you should protect that investment too, whether against fires and floods or against careless janitors.
The high cost of reformatting: One of the least enjoyable tasks when we write for journals is the need to reformat the manuscript for every journal, since journal editors seem to love inventing cumbersome formats just to make their mark on the journal. Reformatting wastes millions of dollars annually that would be better spent on research. What can be done about this waste? Work with your colleagues and professional associations to pressure journal publishers to accept free-format submissions (i.e., any format that is clear and consistent)—or better still, to develop a single format for all journals that emphasizes both clarity of communication and ease of formatting.
The ethics of small errors: It sometimes seems that small errors in a manuscript are only a small problem that you can ignore. But if you consider the number of people who will use your descriptions of methodology, your data, or your calculated results in their own studies, you'll understand why it's essential to solve these problems. A problem that takes the reader only 5 minutes to solve becomes much more significant if 1000 people will read your paper and each must spend 5 minutes solving that problem. The consequences are worse if they do not notice the problem. The ethical solution is to carefully scrutinize your manuscripts so that nobody else will ever encounter these problems. This is particularly important for simple calculations such as differences and averages. Authors often make mistakes in these calculations, and the errors can affect every subsequent reader of your paper. Most readers will not take the time to search the literature to learn whether you or a reader has identified and reported a problem.
Ethics of mistakes that invalidate some or all of your data: The always enlightening Randall Munroe describes ethical and unethical solutions you can choose when a data error occurs.
Speed of publication: Another good reason to choose a journal is its speed of publication. This represents a combination of how long peer reviews take (some journals require quick reviews, others don't) and how long it takes for your manuscript to be published after it's been accepted. For graduate students, it's sometimes necessary to publish one or more papers before you graduate, and during evaluations for employment opportunities or tenure, it can be important to have as many publications as possible to improve your chances. If you're working in a highly competitive area of research, it can also be important to get your work into print before any of your competitors. In such cases, you must balance the need to publish in the most prestigious or appropriate journals with the need to publish your research quickly. Also, be careful not to sacrifice quality control in an effort to publish quickly: there's no point publishing your paper quickly if it must be subsequently withdrawn or if people will not cite it in the future because of serious errors. There are also ethical consequences to rushing something into print before it's ready to be published.
What if your paper straddles two disciplines or two different audiences? One option is to present parts of the same information in two journals, but revised heavily to account for the different needs of the different audiences. For example, a large and complex study may be interesting from a theoretical perspective, in which case you can publish the theoretical aspects in a journal that focuses on theory or basic science. If it also has important implications for practice, you can write about the practical aspects in an applied-science journal that is read mostly by practitioners. This often happens in modeling studies: the model development is of interest to other modelers, and requires a full paper just to describe the model, but use of the model (not its development) is of interest to practitioners. If you want to try this approach, get permission first. Clearly explain your proposal to the editors of both journals in a query letter before you begin writing the paper. Use wording such as the following: "I want to publish a paper on the methodological and mathematical details in a paper for [name of journal] but with your permission, I would like to submit an article to [name of the editor's journal] that explains how these details apply in a case study of using that model." (Reverse the wording for the editor of the other journal.) Then explain why you believe the two papers will be distinct contributions to the research literature: "I will differentiate between the two papers as follows [summary of the differences]." So long as each paper contains significantly different information and meets the needs of different audiences, and so long as you obtain permission from both editors, this is a legitimate approach.
A new way to assess journals: The Center for Open Science has published New Measure Rates Quality of Research Journals’ Policies to Promote Transparency and Reproducibility. This article proposes a new "TOP Factor" that scores journals based on their transparency and the reproducibility of the research they publish. Worth a look!
Additional criteria to consider when evaluating a journal: When you evaluate whether to submit your paper to a journal, determine whether it complies with the Principles of Transparency and Best Practice in Scholarly Publishing. Most reputable journals clearly indicate on their Web site or in their instructions to authors how they meet these criteria.
Predatory journals: Some journals exist solely to take money from researchers without providing any benefits, such as rigorous peer review. These are often called "predatory" journals. Because their publishers are highly motivated to fool you, it can be hard to identify such journals. One good step is to consider whether they meet the criteria for Principles of Transparency and Best Practice in Scholarly Publishing. Grudniewicz et al. (2019) propose the following definition: "Predatory journals and publishers are entities that prioritize self-interest at the expense of scholarship and are characterized by false or misleading information, deviation from best editorial and publication practices, a lack of transparency, and/or the use of aggressive and indiscriminate solicitation practices."
More about how to avoid predatory journals: Nature offers warnings about how "Predatory Journals Entrap Unsuspecting Scientists". In particular, they provide useful links to resources such as Think Check Submit that will help you evaluate journals more critically.
Outline in English as a second language, or in your native language? If your English skills are weak, it may be more efficient to create your outline in your native language and then translate the result into English; many writers (including me) find it easier to think and write in their native language. If you are highly skilled with English, this approach may be less efficient because it adds a separate translation step that can take almost as long as the initial writing. Try both methods and see which works best for you.
More thoughts on dividing your results into two or more papers: In some cases, adding relevant data to a paper would require additional trips to the field to collect that data. In lab research, it may require weeks or months of work to obtain or create a new sample that will provide that data. This may be impractical for financial or logistical reasons, and in that case, it would be perfectly legitimate to collect the missing data in future research, leading to the publication of two papers on essentially the same research. However, sometimes the missing data is easily available, as in the case of meteorological, geological, astrophysical, or other datasets that are readily available online. In that case, a separate study solely to obtain the missing data (thus, publication of a second paper that incorporates the "new" data) is difficult to justify, since the data is (by definition) easily available.
Start with the journal, not the idea? Every journal has a specific focus (e.g., experimental rather than theoretical results), and this focus determines both what you will write about and the data you will require to support that description. If you want to publish in a journal with specialized needs, design your research to provide the right kind of data and enough of that data to meet those needs. This will greatly reduce the risk that reviewers will send you back to the lab or the field to obtain more data. It will also mean that you can develop an outline specifically designed to support the journal's focus. Although it is possible to write a generic article that would be somewhat suitable for any journal, such articles generally require considerable revision to meet the needs of the journal. You'll save much writing and revision time if you start with an outline optimized for the journal that will review your paper.
Entering accented characters: Allen Wyatt's WordTips offers additional ways to enter accented characters.
Word processor efficiency: Even if you like using alternatives to Microsoft Word, such as LaTeX, Word may prove to be more efficient (Knauff and Nejasmic 2014, "An Efficiency Comparison of Document Preparation Systems Used in Academic Research and Development").
Other fictional detectives: Because this book is also intended to help international authors, I have provided the names of some key detectives from other cultures. For China, Judge Di Renjie is an important historical figure who has also inspired several novels based on his life. For India, Feluda and Byomkesh Bakshi are good examples. If you have other suggestions, please send them to me!
Telling stories really works: Hillier et al., in their paper "Narrative Style Influences Citation Frequency in Climate Change Science", note that better storytellers seem to have their papers read and cited more often. Unfortunately, the authors do not appear to have controlled for the actual quality of the science, which is likely to be at least as important as the writing style in determining whether an article will be read or cited.
Another way to think about how to structure your paper: PLOS describes "ten simple rules for structuring papers".
Choosing titles and keywords that increase your paper's findability: Wiley has provided guidelines that increase the likelihood that readers will find your paper in a Web search by sites such as Google.
An interesting study of the power of Abstracts to intrigue readers: Statistics notwithstanding, I think the message of this article is that it's more important to write interestingly and well than concisely.
Keyword suggestions: If you're having trouble thinking of keywords, try describing your study in one sentence. If you then break that sentence into its component words or phrases, many of these will be suitable choices. For example, if your description is "we developed a shared-cost econometric model", then model, model development, cost sharing, and econometric are all potential keywords.
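The "break the sentence into its component words or phrases" step can be sketched mechanically. Here is a minimal Python illustration; the stop-word list is an invented, deliberately tiny assumption (a real analysis would use a fuller list or a curated thesaurus, as noted below):

```python
# Candidate keywords from a one-sentence study description: keep the
# content-bearing words and phrases, drop common function words.
STOP_WORDS = {"we", "a", "an", "the", "developed", "of", "for", "in", "and"}

def candidate_keywords(description):
    words = [w.strip(".,").lower() for w in description.split()]
    return [w for w in words if w not in STOP_WORDS]

print(candidate_keywords("We developed a shared-cost econometric model"))
# → ['shared-cost', 'econometric', 'model']
```

The surviving words and phrases are starting points, not final keywords; combine them into phrases such as "model development" or "cost sharing" by hand.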
Curated keyword lists: Some fields are actively developing lists of standardized keywords to make searching easier. In addition to the suggestions in my book, psychology researchers should consider the PsycInfo thesaurus of index terms and medical researchers should consider the PubMed medical subject headings index.
Contents of the "Highlights" section: Many journals ask authors to include a series of bullet points called "highlights", which are usually a list of four or five short points, each with a maximum length of 85 characters (including spaces). What should you include in this list? Consider this as being similar to an extremely short version of the Abstract, in which you describe the scientific problem you studied, explain why it's important, note the methods you used to study the problem, report your most important results, and state the implications of those results.
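Because the limits are mechanical (number of bullets, characters per bullet), you can check a draft Highlights list automatically. A minimal sketch follows; the limits shown are the typical values mentioned above, and the example bullets are invented, so confirm the actual limits in your target journal's guidelines:

```python
# Check a draft Highlights list against typical journal limits:
# at most 5 bullets, at most 85 characters each (including spaces).
MAX_BULLETS = 5
MAX_CHARS = 85

def check_highlights(bullets):
    problems = []
    if len(bullets) > MAX_BULLETS:
        problems.append(f"too many bullets: {len(bullets)} > {MAX_BULLETS}")
    for i, b in enumerate(bullets, start=1):
        if len(b) > MAX_CHARS:
            problems.append(f"bullet {i} is {len(b)} characters (limit {MAX_CHARS})")
    return problems

draft = [
    "Aerosol transmission dominated indoor spread in our case study",
    "Improved ventilation halved measured particle concentrations",
]
print(check_highlights(draft))  # → [] (both bullets are within the limits)
```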
Making your research protocols available to other researchers: Making research methods more easily available is a great way to improve your own research and that of others. Nature has created a "protocol exchange" that accomplishes this purpose. The goal is to create an open repository of methods; the exchange may evolve into a way to provide a standardized description of your research methods without having to repeat them in the Methods section of your paper.
Past tense versus present: Verb tenses are complex, and some aspects of their use are somewhat subjective; that is, you have options. For example, the present tense should be used for things that are true at the time of reading and that will likely remain true in the future. This is why we use words such as "Figure 1 shows": it shows the thing you're describing now, while you're reading the paper, and will always show that thing each time you return to the paper. Results are a bit trickier: because the research and analysis occurred in the past, it's legitimate to use the past tense to describe them: "Our statistics showed". In the future, with more knowledge, we may change our mind and decide that the results show something else entirely. But it's nonetheless true that at the time you are reading the paper, in the present, the results are still correct. Thus, you could instead write "our statistics show". To some extent this choice of tenses is a matter of personal preference, so journals or reviewers may require you to use one or the other approach.
Deciding which results to report: In any large or complex study, you are likely to obtain more results than you can present within the available space. You can decide which ones to report by defining a quantitative criterion and only discussing results that meet that criterion. The most obvious criterion is whether a result supports or contradicts one of your hypotheses. Statistical significance is a good starting criterion: emphasize significant results that support a hypothesis and non-significant results that either fail to support or that contradict a hypothesis. If there are still too many results, choose a defensible quantitative criterion for which of the remaining results you should focus on. For example, only discuss the three largest or most important results, or focus on results that show the same pattern in one group of treatments but different results in another group. These similarities and differences are important.
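The advantage of an explicit quantitative criterion is that you can apply it mechanically and defend it to reviewers. Here is a sketch of the two-stage filter described above (significance first, then the three largest effects); the result names, p-values, and effect sizes are invented for illustration:

```python
# Apply an explicit reporting criterion: keep statistically significant
# results, then report only the three largest effects among them.
results = [
    {"name": "treatment A vs control", "p": 0.003, "effect": 1.8},
    {"name": "treatment B vs control", "p": 0.048, "effect": 0.9},
    {"name": "treatment C vs control", "p": 0.30,  "effect": 0.2},
    {"name": "site effect",            "p": 0.01,  "effect": 2.5},
    {"name": "year effect",            "p": 0.04,  "effect": 0.4},
]

significant = [r for r in results if r["p"] < 0.05]
to_report = sorted(significant, key=lambda r: abs(r["effect"]), reverse=True)[:3]
print([r["name"] for r in to_report])
# → ['site effect', 'treatment A vs control', 'treatment B vs control']
```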
Avoiding bias when interpreting your results: Writing in Nature, Regina Nuzzo ("How scientists fool themselves – and how they can stop") provides some important insights into how to interpret your data correctly—by avoiding many of the common errors that human thought is vulnerable to.
A more objective way to identify outliers: There are many subjective ways to eliminate outliers, but the problem with them is that they're subjective. As a result, they represent guesswork rather than a defensible choice. Fortunately, there are statistical tests you can use to identify outliers. For normally (Gaussian) distributed data, you can use the Grubbs test, Chauvenet's criterion, Dixon's Q test, or Peirce's criterion.
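As an illustration, Chauvenet's criterion is simple enough to sketch with only the Python standard library: flag a point as an outlier when the expected number of equally extreme values in a sample of this size falls below 0.5. The data values below are invented; for real analyses, use a vetted statistical package:

```python
import math
import statistics

def chauvenet_outliers(data):
    """Flag outliers by Chauvenet's criterion (assumes normal data)."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)
    outliers = []
    for x in data:
        z = abs(x - mean) / sd
        # two-sided tail probability of a deviation at least this large
        p = math.erfc(z / math.sqrt(2))
        if len(data) * p < 0.5:  # expected count of such extreme values
            outliers.append(x)
    return outliers

print(chauvenet_outliers([9.8, 10.1, 10.0, 10.2, 9.9, 10.0, 15.0]))  # → [15.0]
```

Note that the criterion, like the other tests listed above, assumes normality; applying it to strongly skewed data can flag legitimate values.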
Differences between Conclusions and the Abstract: One difference between the Conclusions section and the Abstract is that the Conclusions should explicitly refer back to the hypotheses and research objectives you defined in the Introduction. There are two others: (1) Because the section focuses on your research in the present study, you should not include literature citations because citations, by definition, refer to previous research. (2) You should focus on overall principles that result from your research rather than repeating numerical results that you have already described at least twice (in the Results section and the Discussion).
Remember the limitations of any models you develop: Bret Devereaux, historian, notes that "A simulation is not a real-world experiment, but rather a thought-experiment." All models of the real world rely on one or more assumptions, usually chosen to simplify a very complex reality. Never forget those limitations when you discuss your results and the implications of your model.
Always challenge your assumptions: During most of the first year of the Covid-19 pandemic, public health officials relied on outdated and incorrect assumptions to model the spread of the virus. In summary, they believed that Covid-19 was spread primarily or exclusively by droplet transmission, and that maintaining a distance of 2 m and washing your hands would protect us. This was based on an outdated and incorrect definition of the maximum particle size for aerosols. Thousands of people died before health authorities were willing to accept the possibility of aerosol transmission; even today, more than a year after the pandemic began, authorities continue to recommend distancing and hand hygiene rather than improving ventilation of enclosed spaces (to remove aerosols) and wearing an effective mask (to prevent inhalation of virus-bearing aerosols). Always try to identify and challenge your assumptions; the older the assumption, the more likely it is no longer valid or has been modified to account for new knowledge. This is particularly important if the consequences of an incorrect assumption are serious.
Directions for future research: When you describe the limitations of your study, be aware that if it's relatively easy to solve the problem you've described, reviewers may ask you to perform that analysis. For example, if you suggest that future research should add more environmental variables in your analysis, and that data is easily available, you'll probably be asked to do that now rather than in a future paper. Your paper will be stronger if you fix the problem before you submit your paper for review. Otherwise, only mention simple fixes if you're willing to perform that analysis if the reviewer asks you to do so.
Humor about citations: Jorge Cham, researcher and cartoonist, has provided a series of funny cartoons about literature citations. They also have much truth in them. They start on 11 September 2015 and continue for several cartoons.
Goals of literature citations: Buehl (2016), added below in the bibliography, suggests several possible reasons to cite a paper. Here are my suggestions based on his categories: to describe methodology, provide affirmation (e.g., to support an assumption), provide theoretical or empirical support for a concept, provide a contrasting result (particularly if the contrast reveals an important underlying process that differentiates between two situations), negate something (e.g., disprove an assumption or explain contradictory evidence), and persuade the reader (e.g., to provide additional evidence to supplement your own data).
Advice on when and how to cite references: Springer provides useful guidelines for literature citations.
Why you should always read the original article before you cite it: Authors sometimes make errors when they cite another author's article, particularly when they only read the Abstract rather than the whole article. These mistakes enter the literature, and may adversely affect future authors for decades. Another reason to read the original is that the original article doesn't always exist.
JSTOR permanent links to content: JSTOR is an online resource that provides access to millions of journal articles, books, and "primary sources". Recently, to fight the problem of disappearing links, it added a service of permanent (stable) links to its online content.
Citation graphs promoted by the Open Citations Initiative: The Initiative for Open Citations is developing a way to trace the history of research through citation databases. These graphics show networks of citations, and are therefore called "citation graphs" or "citation networks".
Learning to perform literature searches: The University of Nottingham provides a useful introduction to searching the PubMed database.
JabRef reference manager: JabRef is an alternative for reference management, particularly if you prefer to use the LaTeX format.
Classifying your literature search results: When you begin reviewing your literature search results, consider dividing the references into useful categories, such as "include" (cite the reference in your paper) and "exclude" (references you will not cite, along with the reason). Particularly for large literature reviews, it may be useful to provide the list of excluded references to the journal's reviewers so they will understand why you didn't cite a seemingly relevant reference in your study. If they accept your rationale, then you won't have to explain this exclusion when you respond to review comments.
Is a citation really necessary? Obvious statements of fact don't require a literature citation. For example, if you state that rice is an important crop for an Asian nation's food security, no reference is required. However, if you want to report rice production statistics, you'll need to cite the source of those statistics. In contrast, where a supposed fact is not universally accepted (i.e., there is disagreement among researchers), provide a citation to support your choice of an interpretation. When you express a personal opinion that may not be widely accepted, provide support for that opinion based on your data, the literature, or both, particularly if the point is subtle or not obvious.
Choosing and evaluating literature: Historian Bret Devereaux offers some interesting insights into how historians evaluate the literature in their field: First, they define a research question; next, they select literature that will help to answer the question; third, they evaluate the literature from several perspectives; and finally, they interpret it in light of the question. Though scientists work somewhat differently, the same principles apply.
List of journal name abbreviations: The ISSN Organization offers a list of ISO-standard abbreviations for journal names.
Eliminating hypothesis testing? The journal Basic and Applied Social Psychology has decided to eliminate the reporting of levels of statistical significance. For some of my thoughts on this subject, see my blog.
More thoughts on the significance of statistical significance: Cosma Shalizi, professor of statistics at Carnegie Mellon University, has some interesting thoughts about why Any P-Value Distinguishable from Zero is Insufficiently Informative.
Learning the R statistical software: I found a good article that describes this popular software and resources for learning it.
Calculating the required sample size: For a clear discussion of how to calculate the sample size required for statistical significance, see the following article: Kadam, P.; Bhalerao, S. 2010. Sample size calculation. International Journal of Ayurveda Research 1(1): 55-57.
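As a worked illustration, the common textbook formula for the sample size per group when comparing two means is n = 2(z_alpha/2 + z_beta)^2 * sigma^2 / d^2, where sigma is the expected standard deviation and d is the smallest difference you want to detect. The sketch below uses the usual constants for a two-sided alpha of 0.05 and 80% power; the example values are invented:

```python
import math

Z_ALPHA_2 = 1.96   # two-sided significance level alpha = 0.05
Z_BETA = 0.84      # statistical power = 0.80

def sample_size_per_group(sigma, d):
    """Sample size per group to detect a difference d between two means."""
    n = 2 * (Z_ALPHA_2 + Z_BETA) ** 2 * sigma ** 2 / d ** 2
    return math.ceil(n)  # round up: you can't recruit a fraction of a subject

# e.g., detecting a difference of 5 units when the SD is about 10:
print(sample_size_per_group(sigma=10, d=5))  # → 63
```

Note how sharply the required n grows as the detectable difference shrinks: halving d quadruples the sample size.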
Maximizing research impact by maximizing the number of papers: It is most common to produce a single paper from each study, but the larger the study, the less efficient this is in terms of maximizing that study's impact. Impact can be assessed in many ways (e.g., citations of your research by other authors), but one important way is to publish as many papers as possible from each research project. This should not be done by artificially dividing what should be a single well-integrated paper into multiple fragmentary papers. But when you design your research, look for ways to produce different papers from different aspects of the research. For example, if you develop and validate a new method at the start of a study, a description of that method could become the subject of a methodological paper for a journal that prefers to publish methodology papers; data collected using the validated method could then become a second paper for a journal that prefers to publish research results.
Data mining carries a risk of statistical errors: In the article "Researchers look to add statistical safeguards to data analysis and visualization software", Kevin Stacey describes a new tool that can reduce the risk of false discoveries due to the "multiple hypothesis testing error".
The importance of visualization: Very different datasets can produce the same summary statistics (mean, SD, etc.), and you can't always detect this problem just by examining those statistics. As Justin Matejka and George Fitzmaurice note in their article "Same stats, different graphs", graphing the data can reveal the real meaning of the summary statistics.
Panel data: If you are studying multidimensional data obtained for both individuals and points in time, the resulting data set is called "panel data". A "time series" is a subset of panel data in which a single individual is tracked over time. "Cross-sectional data" is a subset of panel data in which multiple individuals are examined at a single point in time. Combining the two approaches (tracking multiple individuals over multiple points in time) produces panel data. See the Wikipedia article for more details.
Before basing research on published data, consider validating that data: In "One in five Materials Chemistry papers may be wrong", the authors report a recent study that suggests a significant number of studies may have reported data that diverges greatly from the real values, whether due to measurement errors or natural variability. Thus, before you design your future research based on someone else's data, consider validating that data.
The importance of framing a clear hypothesis: The hypothesis you define before you begin your research is important, because it constrains your analyses and the methods you will use to obtain data for those analyses. Those constraints help you to produce results that test the hypothesis rather than providing irrelevant or less relevant data. In that way, the hypothesis also helps restrict the scope of your study so that you don't pursue interesting but unrelated questions. After writing your hypothesis, carefully examine it to ensure that it provides a single overall question, or a small group of closely related questions. Then, for each question, think about what method will provide the best answers to that question. Ideally, choose two or more methods that provide mutual confirmation (i.e., triangulation).
The value of P values: The American Statistical Association has published the policy statement "The ASA's Statement on p-Values: Context, Process, and Purpose" to provide a reminder of the purpose and limitations of P values. A special issue of the journal The American Statistician contains 43 papers that together strongly urge a paradigm shift in which scientists stop relying on P = 0.05 as the only measure of significance for a paper. This number has always been an arbitrary choice, and has no basis in the laws of nature, so it's important to consider what statistical significance really means: it is not an absolute measure of the truth of a statement, but rather is a way to describe the relative confidence we have in the correctness of two or more measurements or calculations. That is, we should be more confident in a result with a very low P value than in a result with a very high P value, but this does not necessarily mean that the former result is correct and the latter result is wrong. Note that until the scientific community as a whole (or the specific journal that will review your manuscript) accepts this new philosophy, there will be a transitional period in which P = 0.05 remains the standard criterion. If you want to use the new approach, ask the journal editor for permission.
More about P values: For a simple and concise overview of the problems with P values, see: Denworth, L. 2019. A significant problem. Scientific American October:62-67.
The reproducibility problem: Recently, there has been much discussion of the difficulty of replicating experiments. In "Open is not enough", a group of physics researchers describe some of the techniques used in physics research to make data widely available, and practices that make it easier for other researchers to replicate your research.
Problems with appropriate use of meta-analysis: Writing in The Lancet, Bishal Gyawali discusses "Meta-analyses and RCTs in oncology—what is the right balance?"
Thoughts on starting the design process: There are many ways to design an experiment, including starting with a previous design that you have used successfully. (In that case, carefully consider whether that design is appropriate for the new research questions that you want to answer, or must be modified to account for the new context.) One useful thought exercise is to start by listing all factors that could affect the variables that you plan to measure. You may want to hold some of these factors constant so that you can focus more closely on other factors. For example, if you're studying plants, light and water are obvious factors that will affect their survival and growth. If you don't want to study the effects of drought, ensure that the plants are well watered so that you can ignore the effects of water and focus on the response to light. Next, narrow this list of factors to the most important factors that you can measure with the available resources (personnel, equipment, time, and money). Look for ways to control the other factors so that they have the smallest possible influence on your results. One key purpose of the literature review that will eventually appear in the Introduction to your paper is to help you understand what factors you must account for, and how other researchers have accounted for them. Learn from their successes as well as their failures.
Calculations of "effect size" to support a meta-analysis: When you review the literature to support a meta-analysis, consider converting the data into a measure of effect size, such as the "standardized mean difference" or Cohen's d, to facilitate comparisons between studies. The Campbell Collaboration offers an online effect size calculator you may find useful. Software such as Covidence can help with data extraction, abstraction, and management to support reviews.
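To make the idea of a standardized mean difference concrete, here is a minimal Python sketch of Cohen's d using the pooled standard deviation (the function name and the sample data are invented for this example):

```python
from statistics import mean, stdev

def cohens_d(group1, group2):
    """Standardized mean difference (Cohen's d) using the pooled SD."""
    n1, n2 = len(group1), len(group2)
    s1, s2 = stdev(group1), stdev(group2)
    # Pooled standard deviation (weighted by degrees of freedom).
    pooled = (((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)) ** 0.5
    return (mean(group1) - mean(group2)) / pooled

# Hypothetical measurements from a treated and a control group.
treated = [23.1, 25.4, 24.8, 26.0, 24.2]
control = [21.0, 22.3, 20.8, 22.9, 21.5]
print(round(cohens_d(treated, control), 2))
```

Because d expresses the difference between groups in standard-deviation units, values from studies that used different measurement scales can be compared directly.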
Statistical power: The statistical power of a test represents the probability that your data will reject the null hypothesis when the alternative hypothesis is true. Various methods exist for improving the power of your experimental design.
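To make the concept concrete, here is a minimal Python sketch of an approximate power calculation for a two-sided comparison of two group means; the normal approximation, function name, and example numbers are my own choices for illustration:

```python
from statistics import NormalDist

def power_two_means(n_per_group, sd, diff, alpha=0.05):
    """Approximate power of a two-sided, two-sample test for a
    difference between means, using the normal approximation."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    # Standardized effect divided by the standard error of the difference.
    noncentrality = abs(diff) / (sd * (2 / n_per_group) ** 0.5)
    return z.cdf(noncentrality - z_alpha)

# Power rises as the sample size per group increases.
for n in (20, 40, 80):
    print(n, round(power_two_means(n, sd=10, diff=5), 2))
```

Running this shows why underpowered studies are risky: with the same effect size and variability, doubling the sample size can move a test from likely failure to likely success at detecting a real effect.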
Evaluating the quality of evidence: Although designed specifically for healthcare research, the GRADE framework provides important insights into evaluating the quality of evidence that should be applicable to other fields.
Blinded experiments: Though blinding is a good thing in human studies, because it eliminates certain biases that arise from the perceptions of the participants and the researchers who interact with them, it is not without costs: in Fool's Gold: Why Blinded Trials Are Not Always Best, the British Medical Journal describes some of the lesser-known problems with blinded experiments.
Citing other papers to explain your results: In the Discussion of most papers, you will find authors who cannot explain some of their results based on the data they collected in the present study. They therefore cite previous research that provides a potential explanation. When this happens with your own research, the need to cite another researcher is a strong sign that you should measure that factor yourself in future research to confirm your explanation.
Avoiding common problems with statistics: Dyke, G., 1997. How to avoid bad statistics. Field Crops Research 51, 165-187.
"Are" as a unit of area measurement: Although the "are" (plural, ares; commonly abbreviated as "a") is a legitimate unit of measurement, equivalent to 100 m², it is almost never used as a unit of measurement in the West. Instead, convert all values to values per "hectare" (ha, equal to 100 ares), which is equivalent to 10 000 m². Depending on the size of the area being described, m² or km² may be more convenient units.
Italics for Greek letters: Although my explanation of why I don't italicize Greek letters for variables is logical (i.e., the purpose of italics is to communicate "this character is not being used as an English letter", but Greek letters are clearly not English letters), it's clearly not consistent with the broader statement that italics should be used for all variables. This is why some authorities prefer to italicize all Greek letters that are used for variables. Others only italicize lower-case Greek letters, which makes no sense whatsoever, since these are usually the forms that are self-evidently not English letters; in contrast, upper-case Greek letters (which these authorities don't italicize) are the ones that most often resemble English characters and would therefore benefit most from italics. As is often the case, style decisions represent tradition rather than logic, and you must choose which logic makes the most sense to you. Of course, if a journal that you're writing for follows a specific style, you should always follow that style, no matter how illogical. There are more important things to argue with journals about.
Difference between a power function and exponential function: Although both forms of equation involve exponents (powers), the forms of the two equations are very different. If m, a, and b are numerical coefficients and x is the independent variable:
power function: y = mx^b
exponential function: y = ma^(bx)
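The practical difference shows up when you transform the data: a power function is linear on a log-log scale, whereas an exponential function is linear on a semi-log scale. The following Python sketch (with illustrative coefficients of my own choosing) confirms this numerically:

```python
from math import log

# Illustrative coefficients (my own choices, not from the text).
m, a, b = 2.0, 3.0, 0.5

def power_fn(x):        # y = m * x**b
    return m * x ** b

def exponential_fn(x):  # y = m * a**(b * x)
    return m * a ** (b * x)

xs = [1, 2, 4, 8]
# For the power function, log(y) is linear in log(x), with slope b ...
slopes_power = [
    (log(power_fn(x2)) - log(power_fn(x1))) / (log(x2) - log(x1))
    for x1, x2 in zip(xs, xs[1:])
]
# ... whereas for the exponential function, log(y) is linear in x,
# with slope b * log(a).
slopes_exp = [
    (log(exponential_fn(x2)) - log(exponential_fn(x1))) / (x2 - x1)
    for x1, x2 in zip(xs, xs[1:])
]
print(slopes_power)  # every slope equals b = 0.5
print(slopes_exp)    # every slope equals b * log(a)
```

This is also a quick diagnostic for your own data: plot it both ways, and whichever transformation produces a straight line suggests which model is appropriate.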
An online archive of different types of data graphics: The Data Visualization Project has provided access to its online collection of a wide variety of ways to visualize your data. If the familiar chart types that you've been using can't communicate your data—or help you to explore and understand it efficiently—consulting this list may provide a better solution.
Manipulating graph axes to magnify small responses: The comic XKCD, written by physicist Randall Munroe, provides a good example of adjusting your graph axes to conceal the fact that your data really doesn't show much of a response. Don't do this. Your readers are smart enough to detect such tricks, and they won't trust anything you say after they see such obvious fakery.
Colorblindness: A surprising number of readers have some form of colorblindness. To ensure that your graphics will still be usable to these readers, consult the Coblis Color Blindness Simulator.
Accessibility of figures: Many of our readers have physical or other disabilities (e.g., impaired vision) that make it difficult to use the graphics (figures) we produce. Moritz Gießmann has created the "Accessibility Cheatsheet" to summarize how to make your visual information easier for these readers to access. This is particularly important as more and more information moves from print to online publication.
Guidelines for creating figures: Wiley offers a good, comprehensive list of guidelines for creating figures.
Useful tool for extracting data from published graphs: If you need to extract accurate values of the underlying data from a graph, such as when you are performing a meta-analysis, try the free WebPlotDigitizer software. It is much faster and more accurate than visually interpolating or extrapolating to estimate values.
Use logarithmic graphs with caution: As Dr. Sally Le Page notes, you should almost never use a logarithmic or semi-logarithmic graph to communicate with the general public. Because most non-scientists don't understand the meaning of logarithms, they can badly misinterpret the meaning of such graphs. In particular, it's easy to mistakenly assume that small differences along a logarithmic axis represent small numeric differences, and that often isn't the case. I have seen many scientists make this kind of mistake when they interpreted their own graphs, so if you feel that you need to use such graphs, carefully confirm that your interpretation is correct.
Choosing visually different greyscale values: When you design a graph so that different bars or symbols have different colors, it's helpful to know what different color intensities look like. The following illustration shows a greyscale chart, divided into 10% intensity intervals, that will save you time when you choose colors that need to be visually distinct:
The same chart can be used to choose initial intensities for colors, although some experimentation will still be required because of the different visual characteristics of each color.
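If you want to generate evenly spaced grey values yourself rather than reading them from a chart, the following Python sketch (the function name is my own) produces hexadecimal color codes in 10% steps from white to black:

```python
def greyscale_steps(n_steps=11):
    """Hex color codes for greys from white (0% black) to black
    (100% black), evenly spaced (11 steps = 10% intervals)."""
    codes = []
    for i in range(n_steps):
        level = round(255 * (1 - i / (n_steps - 1)))  # 255 = white, 0 = black
        codes.append(f"#{level:02x}{level:02x}{level:02x}")
    return codes

print(greyscale_steps())
```

These codes can be pasted directly into most graphing software; for visually distinct bars, pick values at least two or three steps apart.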
Sonification: Sonification is the auditory equivalent of visualization: it uses sound instead of images to represent data. In addition to the data contained in the sound itself, sound can be used to communicate additional evidence about visual information. For example, in a study of movement by the nematode Caenorhabditis elegans, researchers discovered that synchronizing sound with the nematode's movements revealed important information (e.g., temporary stoppages of movement) that was missed when they relied only on visual images.
English tutorial: Springer provides some useful guidelines on writing in English.
Arable vs. cultivated: "Arable" means that land is potentially suitable for cultivation, even if it is not currently being cultivated, whereas "cultivated" means that the land is currently being used to grow crops, even if it is not truly suitable for this purpose.
Concave vs. convex: "Concave" most often means that the open end (the "cave") is facing upwards (v), whereas "convex" means that the pointy end is facing upwards (^). However, for clarity, it is better to state the direction explicitly. Examples: "concave up" (v), "concave down" (^), "convex up" (^), "convex down" (v).
Effective vs. efficient: An effective solution works, though it may be expensive or not very fast. An efficient solution produces a high output per unit of input, but the actual output may be small if the input is also small. In mathematical terms, let O = output and I = input. An effective solution has high O; an efficient solution has high O/I.
Estimates: "Estimation" is a process; "estimate" is the result of that process.
Insignificant vs. non-significant: "Insignificant" refers to a very low magnitude, as in a number so small that it has no practical meaning. "Non-significant" refers to not statistically significant. A number can be statistically significant, but insignificant.
Measures vs. measurements: A "measure" (noun form) means a plan or course of action; a "measurement" is an attempt to quantify something. Thus, you can "take measures such as calibration of instruments to ensure the accuracy of your measurements".
Guidelines for English punctuation: The Punctuation Guide provides some simple and useful advice on American-style punctuation. Most of the advice will also be valid for other dialects of English.
Normalize vs. standardize: Use "normalize" when you mean that you used a mathematical transformation to transform a dataset that is not normally distributed so that it has a normal distribution; this allows you to perform certain statistical tests that require normally distributed data. Use "standardize" when you are converting a dataset so that parameters with different units of measurement all have the same range (usually from 0 to 1); this allows you to compare the proportional changes in parameters that could not ordinarily be compared.
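The distinction can be illustrated with a short Python sketch. Note that the log transform shown here is only one of several normalizing transformations, and min-max scaling is only one standardization scheme; the function names are my own:

```python
from math import log

def log_normalize(data):
    """One common normalizing transformation: take logs so that
    right-skewed (multiplicative) data becomes closer to normal."""
    return [log(x) for x in data]

def min_max_standardize(data):
    """Rescale values to the range 0-1 so that parameters with
    different units of measurement can be compared."""
    lo, hi = min(data), max(data)
    return [(x - lo) / (hi - lo) for x in data]

skewed = [1, 2, 4, 8, 16]  # right-skewed, multiplicative data
print(log_normalize(skewed))  # evenly spaced after the transform
print(min_max_standardize([10, 20, 30, 40]))  # spans 0.0 to 1.0
```

The two operations serve different goals: normalizing changes the shape of the distribution to satisfy a statistical test's assumptions, whereas standardizing changes only the scale so that unlike parameters become comparable.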
Psychological terminology: In Frontiers in Psychology, Scott Lilienfield et al. provide a list of 50 terms to avoid or use carefully because of confusion over their meaning.
Relatively: This word can generally be replaced by a shorter word that contains the same meaning of "relative to" some standard. For example, "higher" means the same thing as "relatively high", "larger" means the same thing as "relatively large", and so on.
Reproducible vs. replicable: These two words have increasingly begun to be used as synonyms, but as Mark Liberman notes in Language Log, they should have clearly distinct meanings. Reproducible most often means that another researcher, using the same dataset you used, will achieve the same analytical results; replicable most often means that if another researcher repeats your experiment under nearly identical conditions, they will obtain the same dataset. If they then analyze that dataset and obtain the same results that you obtained, the experiment was also reproducible. Unfortunately, this useful distinction is being lost in the literature, and once that happens, it's not possible to restore the words to their original meanings. Given that there is now considerable confusion over this terminology, I recommend that you both use the correct word and clearly indicate its meaning to ensure that readers know how you're using it.
Resulted in [noun that describes the result of a verb]: In general, you can delete "resulted in" and simply change the noun into a verb form. For example, "resulted in an increase" = "increased".
Stochastic vs. random: A truly random process cannot be predicted; that unpredictability (the lack of a pattern) is part of the definition of random. However, randomness may have boundaries, as in the case of an electron's position within an orbital: the actual position is impossible to predict, but will always be within that orbital. "Stochastic" is sometimes used to refer to processes that appear random, but that have some underlying rule that constrains their values (e.g., the rule that constrains the electron within its orbital), making their values predictable to some extent. Outside of specialized disciplines such as mathematics, the two terms are used interchangeably.
Techniques vs. technologies: use "technologies" to describe machines and software; use "techniques" (alternatively, "methods" or "approaches") if you mean how something is done. That method may or may not use technologies.
Terminology definitions: Elsevier has introduced a new resource, ScienceDirect Topics, that provides definitions of the common—and less common—terminology used by scientists. The service is currently free.
Hyphenation problems: Although hyphens are necessary for clarity in compound adjectives (two or more words that act together to modify or clarify a noun), they can make text harder to understand if overused. In addition, researchers recently discovered that hyphens in titles may reduce citations of the paper. It seems likely that this is a relatively simple bug in the search and linking algorithms used by indexing sites such as Scopus and Web of Science, but until the problem is fixed, it's probably best if you simplify hyphenated titles to eliminate the hyphens. For example, change "age-adjusted mortality" to "mortality adjusted for age".
Free book on writing science in English: Matthew Stevens offers his book Writing Science in English: a Guide for Japanese Scientists (free)
Paraphrasing: It's important to note that simply changing the word order and using synonyms to replace another author's original words is generally not sufficient. You must express the author's thoughts in your own words, and the best way to do this is to make an effort to understand how their thoughts relate to your own work. Focusing on that relationship will generally let you find a way to communicate the key points in your own words because the focus shifts from their study to your study.
Citing your previous work: When you are describing research similar to what you have done previously, you should clearly cite that previous work and clarify how your new paper builds on (or replicates, if replication was necessary) that previous work. This is particularly important when you are repeating previously published data to support your analysis in the present paper.
Focus on the tangible message rather than its abstract basis: In The Sense of Style, psycholinguist Steven Pinker emphasizes the importance of focusing on the tangible message rather than the abstract basis for that message. For example, we should use phrases such as "weight increased with increasing food consumption" rather than "there was a strong positive correlation between weight and food consumption". Similarly, we should emphasize the meaning rather than the variable used to arrive at that meaning: the "depth below the soil surface", not the "sand burial depth treatment", or "as the NaCl concentration increased", not "as the NaCl treatment increased". The meaning is the same in both cases, but you communicate it more directly.
Some thoughts on how to write better, more interesting papers: Writing in the London School of Economics and Political Science blog, Lewis Spurgin offers some thoughts on Science and the English Language—Lessons from George Orwell.
More thoughts on how to write a clear and interesting paper: Cormac McCarthy offers good advice on how to write a good science paper.
Submission letters: Not all journals require a submission (cover) letter, but if the journal's author guidelines require this letter, ensure that you have included all the information that they require.
Responding to a review comment: An important strategy is to start your response by acknowledging the truth behind a comment. (Even if the comment is incorrect, or incorrect in the context of your study, it usually has some truth behind it.) If the comment has some merit, explain how you have described that truth or (if you have not done so) describe what information you will add to achieve the reviewer's goal. However, reviewers sometimes completely miss the point of your paper; see my next point for suggestions on how to gently point this out.
When a reviewer completely misses the point of your paper: Reviewers are human, and sometimes have preoccupations that are not relevant to your paper. For example, a reviewer once tried to reject a paper by one of my authors because he had not discussed the possibility of using a neural network in his paper. But that approach was clearly not relevant to the research problem. In such cases, resist the temptation to attack the reviewer (e.g., to accuse them of being stupid). Instead, concisely and politely explain why that approach, though potentially interesting, is beyond the scope of your paper.
Peer review is less effective than it should be: Particularly when research addresses an urgent topic, such as responses to a pandemic, it's essential to increase the quality of peer review to prevent the publication of misinformation and disinformation. But the conventional peer review process is failing to accomplish that goal. Besançon et al. (2022) have some thoughts on how to solve that problem in PLOS Biology. One thing they don't mention: the wisdom of asking at least one colleague to rigorously review your paper before you submit it. This greatly reduces the risk of publishing something that will embarrass you for many years.
Open Journal Systems: open-source software for managing a peer-reviewed journal (with some modification, you should be able to use this software to track the review and revision status of your own manuscripts or of manuscripts written by a large research group).
Beware Microsoft Excel: Although Excel is powerful software for performing calculations, its history shows some questionable design decisions that have caused significant problems for some researchers. Most recently, its automatic formatting function has been implicated in introducing a large number of errors in gene names: "Gene name errors are widespread in the scientific literature". If you've used Excel to produce publishable data, it would be wise to double-check your data to ensure that no errors are present. If you insist on using Excel anyway, be very careful how you import data: instead of double-clicking or opening a file containing comma-separated values, use the text import wizard or the PowerQuery tool.
Artificial intelligence tools for writers: On 29 August 2022, Alla Katsnelson published "Poor English skills? New AIs help researchers to write better" on the Nature Web site (DOI: https://doi.org/10.1038/d41586-022-02767-9). These tools will not replace strong English skills, a colleague's peer review, or a good editor, but they will help you spot problems with some aspects of your writing. It's important to note that you should not use these tools if you don't understand English grammar well enough to separate good advice from bad. Artificial intelligence is still a work in progress, and doesn't always provide good advice.
I've been writing a series of articles on scientific writing for a Japanese client. I'll eventually integrate these articles into the second edition of this book. In the meantime, you can read them for free on my Web site.
Alex Reinhart's Statistics Done Wrong is now available as a paperback book.
Baykoucheva, S. 2015. Managing Scientific Information and Research Data. 1st ed. Chandos Publishing, 162 p.: a concise overview of scientific ethics, data management, data sharing, and other issues in the modern scientific publishing environment.
Buehl, J. 2016. Assembling arguments: multimodal rhetoric and scientific discourse. University of South Carolina Press, 281 p.
Gross, A.G.; Buehl, J. 2016. Science and the Internet: Communicating Knowledge in a Digital Age. Baywood Publishing, 300 p.: a collection of essays on how scientific communication is changing in response to the Internet.
Grudniewicz, A., Moher, D., Cobey, K.D. and 32 co-authors. 2019. Predatory journals: no definition, no defence. Nature 576(12 Dec.): 210-212.
Hempel, S. 2020. Conducting Your Literature Review. American Psychological Association, Washington, DC. 145 p.
Runté, R. 2017. Writing strategies for theses and dissertations in social sciences, humanities, education, and business.
©2004–2024 Geoffrey Hart. All rights reserved.