In an earlier post, I compared using .ipynb notebooks and .py scripts for data analysis in Python. In short, each has its own merits and best use cases.
One of my recent projects requires generating and summarizing a large number of results. For generating results, I wrote flexible .py scripts (e.g. with command line arguments), which are clearly the more efficient choice.
After running those scripts, I have hundreds of tables (in CSV) and plots (in PNG) to summarize into reports. On a small scale (just a few plots), it is easy to go through the output files manually and copy-and-paste them into a document, but this becomes infeasible when there are many files.
In this case, Jupyter notebooks come in very handy. While many people run code and examine results interactively in a single notebook, we can also use a notebook purely to summarize results, without running any heavy-duty code (that part is done separately, and more conveniently, with .py scripts).
from IPython.display import display, Image, HTML

# Show a table stored as a dataframe
display(HTML(df.to_html()))

# Show a plot
display(Image(filename='some_plot.png'))
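With hundreds of output files, the display calls above are best driven by a loop. A minimal sketch using the standard library's pathlib; the helper name and directory layout are my own assumptions, not from the original post:

```python
from pathlib import Path

def collect_outputs(results_dir):
    """Return sorted lists of the CSV tables and PNG plots under results_dir."""
    results_dir = Path(results_dir)
    tables = sorted(results_dir.glob("*.csv"))
    plots = sorted(results_dir.glob("*.png"))
    return tables, plots
```

Inside the notebook, each collected file can then be displayed in turn, e.g. `display(HTML(pd.read_csv(csv_path).to_html()))` for each table and `display(Image(filename=str(png_path)))` for each plot.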
Finally, we make use of the nbconvert package to convert the Jupyter notebook into an HTML report (or other output formats). The --execute option tells nbconvert to execute all code cells before generating the output file, while the --no-input option hides the input code cells.
python3 -m nbconvert input_notebook.ipynb \
    --to html --output output_report.html \
    --execute --no-input
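When many notebooks need converting, the same command can be scripted from Python. A small sketch that only builds the argument list shown above; the helper name is mine, and actually running it assumes nbconvert is installed:

```python
import subprocess

def nbconvert_command(notebook, output, to="html", execute=True, no_input=True):
    """Build the nbconvert command line for one notebook (mirrors the shell call above)."""
    cmd = ["python3", "-m", "nbconvert", str(notebook),
           "--to", to, "--output", str(output)]
    if execute:
        cmd.append("--execute")  # run all code cells before exporting
    if no_input:
        cmd.append("--no-input")  # hide the input code cells
    return cmd

# To actually perform a conversion (requires nbconvert to be installed):
# subprocess.run(nbconvert_command("input_notebook.ipynb", "output_report.html"), check=True)
```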
Of course, the same procedure can be done in .Rmd and in