Manuscript Prep
Another Benefit of the Project Folder
Project folders are also set up to help integrate figures, statistical results, and other outputs into the manuscript writing process. By keeping all relevant files in the same project folder and using the controller script to weave the different components together, it should be easier to write (and later update) a manuscript draft with the correct results. For example, careful use of inline code saves you from manually updating statistical results in tables and text throughout the draft whenever something changes, which saves time and helps avoid confusion down the road. And because figures are placed using the files in the Output directory, the manuscript should always pull from up-to-date visualizations.
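As a minimal sketch of the inline-code idea, the example below stores two results once and shows (in a comment) how the manuscript text could reference them. The model, the object names `fit`, `slope`, and `r_squared`, and the use of R's built-in `cars` dataset are all assumptions made for illustration; in a real project these values would come from the analysis scripts.

```r
# Illustrative only: `cars` ships with R and stands in for project data
# that would normally be produced by the analysis scripts.
fit <- lm(dist ~ speed, data = cars)

# Store the values the text needs, rounded for reporting.
slope <- round(unname(coef(fit)["speed"]), 2)
r_squared <- round(summary(fit)$r.squared, 2)

# In the manuscript .Rmd, the stored values are referenced with inline code:
#   Stopping distance increased with speed (slope = `r slope`, R2 = `r r_squared`).
# If the data or the model changes, the knitted text updates on the next knit.
```

Keeping the rounding in one place means every mention of a statistic in the text and tables reports the same value.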
By default, this workflow produces a nicely formatted PDF and Word document when the .Rmd file is knit (via the controller script). This differs from the report .Rmd file, which by default outputs an HTML file and a special markdown file. The main difference is that code chunks do not appear in the PDF or Word documents, while they do in the HTML and markdown outputs. To ensure the correct file paths, the manuscript markdown file must be knit using the render() code in the controller script.
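For orientation, a render call of this kind might look something like the sketch below. This is not the controller script's actual code: the file name is hypothetical, and the choice to set `knit_root_dir` to the project root is an assumption about how the file paths are kept consistent.

```r
# Hypothetical file name and location; substitute the actual manuscript file.
manuscript_file <- file.path("Manuscript", "LeadAuthor_et_al_2022.Rmd")

# Only attempt the render when rmarkdown is installed and the file exists.
if (requireNamespace("rmarkdown", quietly = TRUE) && file.exists(manuscript_file)) {
  rmarkdown::render(
    input = manuscript_file,
    output_format = "all",   # build every format listed in the YAML header
    knit_root_dir = getwd()  # resolve paths in chunks relative to the project root
  )
}
```

The sketch assumes the R session's working directory is the project folder, as it would be when the project is opened in RStudio.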
To make this all work, there should be a directory named “Manuscript” within each project folder. The focal point of this directory is the .Rmd file: this markdown file is where you will actually write the manuscript. When the document is knit, the output formatting is taken from a template .docx file for the Word document and a .tex file for the PDF. The in-text citations and reference section are handled by a .bst file and a .bib file. For more on the citations, see the Citations and References section below.
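To show how these files could be tied together, here is a sketch of a YAML header for the manuscript .Rmd file. All file names are hypothetical placeholders, and the exact fields used by the project template may differ; in particular, how `biblio-style` is picked up depends on the .tex template.

```yaml
# Hypothetical file names; substitute the files in the Manuscript directory.
output:
  word_document:
    reference_docx: template.docx   # styles for the Word output
  pdf_document:
    template: template.tex          # layout for the PDF output
    citation_package: natbib        # lets LaTeX use a .bst style
bibliography: references.bib        # the reference database
biblio-style: journal_style         # the .bst file name, without the extension
```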
Getting Started
In order to create PDF documents from .Rmd files, you will need some version of LaTeX installed on your computer. A surefire way to check whether you already have it is to try to knit a .Rmd file to a PDF. You can make a new .Rmd file and just knit that; there is no need to spend time customizing it. If it works, you have LaTeX installed. If it doesn’t, you probably don’t. You can follow the instructions from R to install LaTeX, but this download is typically very large. LaTeX can also be installed using TinyTeX. Once LaTeX is installed, you can also check for information in RStudio’s Global Options (under the Tools menu), as discussed here. While developing the current project folder, I’ve been using pdfLaTeX, but XeLaTeX should work just as well. Note that if you installed LaTeX using TinyTeX, the first knit of a document or template might take several minutes; it should run faster from then on.
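A quicker, if less definitive, check than a test knit is to ask R whether it can find a LaTeX engine. The sketch below uses base R's `Sys.which()` and, only if the tinytex package is already installed, points to `tinytex::install_tinytex()`. The object names are illustrative, and a missing path does not strictly prove that LaTeX is absent, so the test knit remains the more reliable check.

```r
# Base R check: the path to pdflatex, or an empty string if none is on the PATH.
latex_path <- Sys.which("pdflatex")
has_latex <- nzchar(latex_path)

if (!has_latex && requireNamespace("tinytex", quietly = TRUE)) {
  message("No pdflatex found; tinytex::install_tinytex() would install TinyTeX.")
  # tinytex::install_tinytex()  # uncomment to install (a one-time download)
}
```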
Before you start writing the first draft, rename the .Rmd file to reflect the manuscript. Typically we format the file name as it would appear in an in-text citation (e.g., LeadAuthor_et_al_2022, Author1_and_Author2_2022, etc.). If you’ve got multiple files with the same name (meaning multiple papers per project, nice job!), you can append letters to the name (LeadAuthor_et_al_2022a, LeadAuthor_et_al_2022b, etc.) or add a little context (LeadAuthor_et_al_2022_plasticity, LeadAuthor_et_al_2022_latitudinal, etc.).
Keep the use of code within the manuscript .Rmd file to a minimum. All major data analysis and figure generation should be performed in other scripts within the project folder. When placing figures, for example, it will be substantially faster to use the image files output by the report .Rmd file than to create the figures from code. Given the number of times you’ll likely knit a manuscript during the writing and revision process, this saved time adds up. It is therefore in your best interest to ensure that the report .Rmd file (or whichever script outputs the image files) produces publication-ready figures whenever possible.
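Placing a saved figure can be as short as the sketch below, which would sit inside a code chunk in the manuscript .Rmd file. The file name and the Figures subfolder are hypothetical; only the Output directory comes from the project folder described here, and the path assumes chunks are evaluated from the project root.

```r
# Hypothetical figure path inside the Output directory.
figure_file <- file.path("Output", "Figures", "figure_1.png")

# Place the saved image rather than re-running the plotting code.
if (requireNamespace("knitr", quietly = TRUE) && file.exists(figure_file)) {
  knitr::include_graphics(figure_file)
}
```

Because the chunk only reads an existing image, knitting stays fast no matter how expensive the original plot was to produce.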
Citations and References
The use of a reference manager is strongly recommended. RStudio has a built-in citation function, but the ability to store and organize references outside of the project is also useful. Zotero is an open-source reference manager that integrates nicely with RStudio and markdown documents; it is the program I use and recommend. There’s a useful write-up of how citations work in .Rmd files here. Note that you’ll need RStudio version 1.4 or later to access these features.
Keep in mind that citations are contained within square brackets rather than parentheses, and should be separated by semicolons rather than commas, with each citekey preceded by an @ symbol (for example, [@citekey1; @citekey2]). This will produce a citation in the format of (Author1 et al., YEAR; Author2 et al., YEAR). For citations like “… as shown by Author1 and Author2 (YEAR)…”, the square brackets are omitted. I’ve found that citekeys (the short text fragments used to indicate which paper should be cited where) should not contain special characters; otherwise, the document will not knit correctly. These include characters like ö, á, è, etc., even though they are common in author names. In those cases, the citekey should be manually edited in the text box that pops up when inserting a new citation. The authors’ names will still be displayed properly in the reference section; only the citekey is affected.
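If a knit fails and a citekey is the suspected cause, a small helper like the sketch below can flag keys that contain characters outside plain ASCII. The function name `has_special_chars` and the example keys are hypothetical; this is just one way to run the check, not part of the workflow itself.

```r
# Returns TRUE for each key that contains a character outside plain ASCII.
has_special_chars <- function(keys) {
  vapply(
    keys,
    function(key) any(utf8ToInt(key) > 127L),
    logical(1),
    USE.NAMES = FALSE
  )
}

# Hypothetical citekeys; "\u00f6" is the escape for an o with an umlaut.
keys <- c("smith_2020", "m\u00f6ller_2019")
flagged <- keys[has_special_chars(keys)]
```

Any key that ends up in `flagged` is a candidate for manual editing when the citation is inserted.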
Getting Feedback
The workflow for writing the manuscript draft will be similar to that used during the Project Development phase, at least while putting together the initial versions. Changes are made on a working branch and reviewed via Pull Request (PR). The PR interface works well for providing comments, but not for the sort of line editing that is common during the manuscript writing process. We’ll use the PR to make general comments, but specific edits might also be made directly on the .Rmd file. These edits will be made on your working branch, so to review them you’ll just have to remember to pull the updates onto your local machine. Once you’re ready for the changes to be reviewed again, include “Ready for Review” in your commit message. We’ll likely repeat this PR-comments-edits-pull cycle several times before accepting and merging the manuscript draft into the main branch.
If comments are required from co-authors who do not use GitHub, it is the responsibility of the lead author to ensure they are sent an up-to-date Word document (or even better, a Google Doc) and that their comments are properly integrated into the GitHub repo.
Submissions
As you approach the submission stage (the manuscript draft is complete and just being polished), it’s time to start thinking about how the data and scripts will be shared with people outside our group. One easy approach is to archive the GitHub repo with Zenodo. This allows you to designate a Version of Record, which has an associated DOI. See here for more details.
Once the new repo DOI is included in a Data Availability statement in the manuscript, it’s ready for submission. The .Rmd file is knit, and the resulting PDF and Word documents are prepared for submission to a preprint server and a journal, respectively.
During peer review, a manuscript will be read and evaluated by several reviewers who will give (varying amounts of) feedback. Once you get these reviews, you can start a new working branch for a second version of the manuscript and begin making the recommended changes and addressing the reviewers’ comments. Here again the benefits of using a project folder are evident: depending on the feedback, you might have to make changes to the analyses or figures, which will be automatically incorporated into the manuscript (at least where file paths and inline code are used).
Post-publication
The generally static nature of published work in the current body of scientific literature creates many issues. The GitHub repository for each project is a “live” entity, however, and can overcome some of these issues by serving as a central source for corrections, updates, and expansions of the manuscript after publication. This approach, outlined in this paper, integrates seamlessly with our workflow.