If I render the following Quarto document to PDF, the copy-and-paste behavior depends on the application used to view the resulting document.
I'm using pdflatex here because I regularly generate 200 PDFs with render_quarto, and this runs much quicker than xelatex. In addition, xelatex gives me trouble when using a watermark with the 'background' LaTeX package.
---
title: "Untitled"
format:
  pdf:
    pdf-engine: pdflatex
---
## Quarto
```{r}
my_pipeline <-
  mtcars |>
  summary()
```
When I copy this code from the RStudio viewer and paste it, I get the following. Where does the 1 come from? How can I preserve the spacing or line breaks?
my_pipeline <-mtcars |>summary()1
When I view the PDF in MS Edge and copy and paste the code, I get the following. This does a better job of preserving the line breaks, but it drops the dash!
my_pipeline <
mtcars |>
summary()
I do not suggest SumatraPDF (which I support) as the best tool for this task, since it suffers from all the same problems as any other PDF reader, such as poorly defined fonts or the absence of mechanical data in the PDF.
Text extraction has been removed as a function because it is too variable. However, you can run better, programmable extractors on the page, though you will need to write custom command-line programs using XPDF, Balabolka or Poppler.
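As a rough illustration of the command-line route, here is a minimal sketch that wraps Poppler's `pdftotext` with its `-layout` option, which tries to keep the physical layout (indentation and line breaks) of the page. The function name `extract_layout` and the file name `Untitled.pdf` are placeholders of mine, not anything from the question, and it assumes `pdftotext` is installed and on the PATH. Whether the `-` in `<-` survives still depends on how the font is encoded inside the PDF, so treat this as a starting point rather than a guaranteed fix.

```shell
#!/bin/sh
# Hypothetical helper: print the text of a PDF to stdout, keeping the layout.
# Assumes Poppler's pdftotext is available on PATH.
extract_layout() {
  pdf="$1"
  # Fail early if the input file is missing.
  if [ ! -f "$pdf" ]; then
    echo "no such file: $pdf" >&2
    return 1
  fi
  # Fail with a distinct code if the extractor is not installed.
  if ! command -v pdftotext >/dev/null 2>&1; then
    echo "pdftotext not found; install Poppler first" >&2
    return 127
  fi
  # -layout maintains the original physical layout as far as possible;
  # the trailing "-" writes the text to stdout instead of a .txt file.
  pdftotext -layout "$pdf" -
}
```

Something like `extract_layout Untitled.pdf > code.txt` would then write the extracted text to a file you can inspect, which is easier to script across 200 PDFs than copying from a viewer by hand.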