Converting a Word docx file to a draft R Markdown file

Many of us have been using MS Word as a word processor for decades now.  What, then, is an R Markdown document?  An R Markdown document is written in markdown (a fancy way of saying that it is all in plain text), and chunks of R code can be embedded in it.  Once written, the file can be rendered into many formats including HTML, MS Word, and PDF.  So, why would someone like me choose to convert an MS Word file to an R Markdown file?  Isn’t MS Word enough to meet my needs?  I have two good reasons to convert the MS Word files of my open-resource Numerical Methods course to R Markdown files.

  1. On conversion from the MS Equation Editor 3.0 to the currently available MS equation editor for .docx files, the equations from my old .doc documents were getting displayed in a compact inline form.  Using the display option of an equation should have helped, but some equations refused to get properly justified, tabbing was becoming a guessing game, and using a custom Word style was not helping.  Sometimes, equations would not show with letters italicized, and italicizing a single equation would change the whole document to an italic font.  Ctrl+Z would help in un-italicizing the document, but that was not foolproof either as it would sometimes mess up the tabs.
  2. The second reason was that I was embedding PDF files in a frame in an online adaptive platform lesson, and even with a 12-point size in the original document, the font of the PDF files would show up too small (see Figure 1).  Yes, one can use a bigger font size in the Word file, but this may not be suitable for use in, say, a printed textbook.  Maintaining different versions with different font sizes is not a recommended practice in today’s world.  A user could alternatively use the magnification option of the PDF file menu, but that creates horizontal scroll bars in the frame.  Also, a user could download the PDF file and open it in Acrobat Reader, but that is an inconvenience imposed on them.  One could also simply embed an .htm version of the Word file, but such file content was getting rendered all over the place as my documents included equations, both in inline and display modes, sketches made with Word, tables imported from Excel, plots obtained from MATLAB output, etc.

Figure 1: Embedded PDF file shows up with a small font

So the answer was simply to take R Markdown for a spin.  Since our documents are not simply text, it is not a cut-and-paste job with some light editing.  We turned to pandoc for this.  What pandoc is can be summarized by their slogan – “If you need to convert files from one markup format into another, pandoc is your swiss-army knife”.  Pandoc is free software released under the GPL.  The full manual for pandoc is also available.

Here are the steps for doing the conversion on a Windows 10 machine.  The conversion has to be done at the command prompt, as I did not see an online converter that goes beyond text and styles; that is, the online converters do not convert equations, images, etc.

      1. Download pandoc on your PC from
      2. Install pandoc as an administrator.
      3. Check if pandoc is installed.  Go to the search box in your taskbar and enter “cmd” without the quotes.  Run as administrator.  You will get the cmd prompt.  At the prompt, enter “pandoc --version” without the quotes (note the two hyphens before “version”).
      4. Go to the command prompt by entering “cmd” without the quotes in the search box in your taskbar.
      5. Go to the directory where the .docx file is stored.  You can do this by using the cd and cd.. commands.  See here for a short guide.
      6. Once in the directory, do the following.  Let’s suppose the name of the file is “Chapter01NumericalMethods.docx”.   Type the following at the command prompt.
          • pandoc --extract-media ./Chapter01NumericalMethodsMedia "Chapter01NumericalMethods.docx" -t markdown -o ""
          • The above format extracts the media files as well and puts them in a media directory ./Chapter01NumericalMethodsMedia.  Some files may be of the .wmf format.  These can be opened in MS Paint and saved in an acceptable format such as .png.
          • I always use quotes for file names to avoid errors one gets with spaces in filenames, etc.
          • It is good practice to use a different name for the output markdown file as one may later be converting the markdown files to different formats including PDF, HTML, Word, etc.  Note that the output markdown file has an .md suffix, as pandoc does not have an .rmd output option.
      7. Rename the .md file as an .rmd file.
      8. Open the .rmd file in RStudio to edit.

The above is a recipe for just one file.  I gather that if one has many .docx files, one could write a script to do this in batch mode.
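
As a rough illustration of such a batch script, here is a minimal Python sketch, not something I have used on the course files.  It assumes pandoc is on the PATH and reuses only the pandoc options from step 6.  The function names, the "Media" folder suffix taken from the example above, and the "_draft" suffix for the output file are all hypothetical choices made for illustration.  By default it only prints the commands it would run.

```python
import shutil
import subprocess
from pathlib import Path


def build_pandoc_command(docx_path: Path) -> list[str]:
    """Build the pandoc call for one .docx file, mirroring the recipe in step 6."""
    stem = docx_path.stem
    # Assumption: one media folder per document, named <stem>Media as in the example.
    media_dir = docx_path.with_name(stem + "Media")
    # Assumption: "_draft" is a made-up suffix so the output name differs from the source.
    md_path = docx_path.with_name(stem + "_draft.md")
    return [
        "pandoc",
        "--extract-media", str(media_dir),
        str(docx_path),
        "-t", "markdown",
        "-o", str(md_path),
    ]


def convert_folder(folder: Path, dry_run: bool = True) -> list[Path]:
    """Convert every .docx file in a folder and return the resulting .rmd paths.

    With dry_run=True the commands are only printed, so nothing is written.
    """
    if not dry_run and shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on the PATH")

    rmd_files = []
    for docx_path in sorted(folder.glob("*.docx")):
        # Skip the temporary lock files Word creates while a document is open.
        if docx_path.name.startswith("~$"):
            continue
        command = build_pandoc_command(docx_path)
        md_path = Path(command[-1])
        rmd_path = md_path.with_suffix(".rmd")
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)  # stop at the first failed conversion
            md_path.replace(rmd_path)            # step 7: rename .md to .rmd
        rmd_files.append(rmd_path)
    return rmd_files
```

Calling convert_folder(Path("."), dry_run=False) from the directory holding the .docx files would then run the conversions for real.  Any .wmf images in the media folders would still need to be converted by hand as described above.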

We will discuss some tricks to lightly edit the .rmd file in the next blog.  Stay tuned on the journey of this R Markdown newbie.  If you know a better way to do this, please let me know – autarkaw at


This post is brought to you by

Canvas quiz times for accommodating students with disabilities

Before the pandemic, university students would go to a student-accessibility-services office to take their scheduled examinations and get their needs accommodated.  These accommodations include additional time, a separate room to read aloud, and quiet environments. During the pandemic, accommodations for online examinations are generally monitored by the instructor, provided they only involve giving extra time.

I have made a mistake or two while hand calculating the assigned due time or the additional time that needs to be input in a learning management system such as CANVAS. To minimize such mistakes, I made an Excel file, and it has worked well so far. In this blog, I share the Excel file with you.

Use the spreadsheet as you see fit. I have protected the cells in the Excel file so that they do not get changed inadvertently – you can always unprotect the Excel sheet (go to Review->Unprotect in the Excel menu) and make modifications to suit your needs.

The inputs are

  1. starting time of the test or when you want the students to have access to the test
  2. time in minutes you want students to be working on the test
  3. time in minutes given to students for uploading files, if any (needed for submitting handwritten free responses, for instance) and
  4. time over and above the length of the test to create a window in which the test is available.

The instructions for entering the above inputs are as follows.

  1. Enter the starting times in Row 10 (Columns H thru N): e.g., 2:00 PM. Mind the space between 2:00 and PM.
  2. Put the time in minutes for the test in cell H11
  3. Put the time in minutes for uploading of a file, if any, in cell H12. Enter zero if no uploading of a file is needed.
  4. Put the extra time in minutes in cell H13 for creating a window for the test.

Let’s take an example. I want to give a test that is 40 minutes long and requires students to handwrite free responses to posed questions. They will be given an additional 10 minutes to make a PDF file of the free responses and upload the file. I want the test to start at 11:40 AM for all students but make it due at 12:45 PM, and hence give a window of 65 minutes within which to complete the test. So the extra time given to create the window is 15 minutes (65-40-10=15). Based on the example, I would enter 11:40 AM in Row 10 (Columns H thru N), 40 in cell H11, 10 in cell H12, and 15 in cell H13.
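
The arithmetic in this example can also be checked with a few lines of code. The Python sketch below is only an illustration and is not a copy of the formulas in the Excel file. The function name and the extra_percent parameter are hypothetical, and it assumes that the extra-time percentage applies to the working time only, not to the upload time or the window.

```python
from datetime import datetime, timedelta


def quiz_times(start, test_min, upload_min, window_min, extra_percent=0):
    """Return the time limit, the extra minutes, and the due time for one student.

    start is a clock time such as "11:40 AM" (mind the space before AM/PM).
    extra_percent is a hypothetical input: 50 means 50% more working time.
    """
    start_time = datetime.strptime(start, "%I:%M %p")
    regular_limit = test_min + upload_min
    # Assumption: only the working time is scaled; round up to whole minutes
    # using integer arithmetic to avoid floating-point surprises.
    scaled_test = -(-test_min * (100 + extra_percent) // 100)
    student_limit = scaled_test + upload_min
    due_time = start_time + timedelta(minutes=student_limit + window_min)
    return {
        "time_limit": student_limit,                     # minutes on the quiz timer
        "extra_minutes": student_limit - regular_limit,  # extra time for this student
        "due": due_time,                                 # when the test closes
    }


# The example from the text: 40-minute test, 10 minutes to upload, 15-minute window.
regular = quiz_times("11:40 AM", 40, 10, 15)
print(regular["time_limit"], regular["due"].strftime("%I:%M %p"))

# A student with a hypothetical 50% extra working time.
accommodated = quiz_times("11:40 AM", 40, 10, 15, extra_percent=50)
print(accommodated["extra_minutes"], accommodated["due"].strftime("%I:%M %p"))
```

With no extra time, the sketch gives a 50-minute time limit and the 12:45 PM due time of the example. With the hypothetical 50% accommodation, it gives 20 extra minutes and a due time of 1:05 PM.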

The outputs that are needed for the CANVAS LMS are shown in green. Do not forget to publish the quiz, and then go to “Moderate Quiz” to add the extra time for the accommodated students.

Two of the links below are just references to show how to add extra time and add the names of students who get extra time.



