I’ve always loved keeping clean, organized notes on the papers I’m reading, and Obsidian is my tool of choice for that. However, with the growing number of research articles appearing on arXiv each day, I quickly found myself with more open tabs than actual notes…
To alleviate the issue, I’ve been experimenting with various LLMs to automatically generate a first “skim-through” research note directly in Obsidian. This resulted in a tool I’m calling PeperNoten (a pun on “pepernoten”, a typical Dutch delicacy, and “Paper Notes”, which sounds kind of similar).
While it is not perfect, and LLM outputs should of course always be taken with a grain of salt, it has so far been quite good at helping me build links and discover more papers, as well as identify core concepts and limitations I might want to focus on before digging into the full PDF.
Features
Here are the main features I’ve focused on in my experimentation:
- Verbosity Level: Four verbosity levels (1: low to 4: high). The current default is 2, which attempts to generate notes for a reader already knowledgeable about the field (e.g., it will not include extra definitions, unlike higher levels) but with a decent level of detail in the methodology section.
- Figure Extraction: Because good notes are visual and aesthetic :)
- Related work: Try to build links to existing works, and add an explicit arXiv search to retrieve the associated link.
- Gaps and Limitations: Try to identify limitations of the work, in particular when comparing it to relevant related work.
- Signal from experiments: Extract as much information as possible from experiments, not just the main table; e.g. provide a list of metrics, and a description of ablation results.
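To make the verbosity setting more concrete, here is a minimal sketch of how four levels could be mapped to prompt instructions. It is not the actual PeperNoten code: the names `VERBOSITY_INSTRUCTIONS`, `SECTIONS` and `build_prompt` are hypothetical, and only the default of 2 and the intent of that level come from the description above. The wording for levels 1, 3 and 4 is an assumption.

```python
DEFAULT_VERBOSITY = 2  # the default stated in the feature list

# Hypothetical instructions: only the intent of level 2 is described in the
# post; the wording of the other levels is an assumption for illustration.
VERBOSITY_INSTRUCTIONS = {
    1: "Write a very short note: key idea and main result only.",
    2: "Assume the reader knows the field: skip definitions, but detail the methodology.",
    3: "Define non-standard terms and detail the methodology.",
    4: "Define all key terms and explain the methodology step by step.",
}

# Hypothetical section names, loosely following the feature list.
SECTIONS = [
    "Summary",
    "Methodology",
    "Experiments",
    "Related work",
    "Gaps and limitations",
]


def build_prompt(paper_text: str, verbosity: int = DEFAULT_VERBOSITY) -> str:
    """Assemble a note-generation prompt for the given verbosity level."""
    if verbosity not in VERBOSITY_INSTRUCTIONS:
        raise ValueError(f"verbosity must be between 1 and 4, got {verbosity}")
    sections = "\n".join(f"## {name}" for name in SECTIONS)
    return (
        "Write an Obsidian research note in Markdown for the paper below.\n"
        f"{VERBOSITY_INSTRUCTIONS[verbosity]}\n"
        f"Use exactly these sections:\n{sections}\n\n"
        f"Paper:\n{paper_text}"
    )
```

Keeping the section list fixed in the prompt is one simple way to get a consistent note format across entries, whichever level is selected.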
Observations
So far I’ve had the most success with Claude-Sonnet-4.5, but Deepseek-V3 has also proved to be a quite good and more affordable alternative. Here is a list of observations from using PeperNoten over the past few weeks:
- Pros (+):
  - Very good at keeping a consistent note format across all entries
  - The limitations and gaps sections are often quite insightful, although there is a clear recency bias in the works it compares to
  - Similarly, I have discovered a few papers while browsing the related work generated by PeperNoten
  - Quite useful for analyzing results, in particular ablations, and for providing a high-level summary of the metrics used and the trends observed in experimental results
- Cons (-):
  - Sometimes prone to hallucinating papers that sound great but do not exist (although explicitly looking up related work with the arXiv API is a great way to weed them out)
  - Somewhat surprisingly, extracting relevant figures/tables is quite challenging and one of the most common failure modes
  - Similarly, while the experiments/ablation section is generally quite complete, the tables/figures extracted to illustrate it are not necessarily the most relevant ones
  - It's not so easy to find the sweet spot in terms of verbosity (e.g. wall of text versus the right amount of mathematical equations), although introducing different levels of verbosity helps with that
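The arXiv lookup mentioned above for weeding out hallucinated references can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual PeperNoten code: the helper names and the 0.9 similarity threshold are hypothetical. The only external pieces are the public arXiv API endpoint, its `ti:` title field and its Atom response format. The network call itself is left out. The URL from `build_query_url` can be fetched with, for example, `urllib.request.urlopen`, and the response body passed to `parse_entries`.

```python
import difflib
import urllib.parse
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

ARXIV_API = "http://export.arxiv.org/api/query"
ATOM = "{http://www.w3.org/2005/Atom}"  # XML namespace of the Atom feed


def build_query_url(title: str, max_results: int = 5) -> str:
    """Build an arXiv API URL that searches for `title` in the title field."""
    params = {"search_query": f'ti:"{title}"', "max_results": max_results}
    return f"{ARXIV_API}?{urllib.parse.urlencode(params)}"


def parse_entries(atom_xml: str) -> List[Tuple[str, str]]:
    """Return (title, link) pairs from an arXiv Atom response."""
    root = ET.fromstring(atom_xml)
    entries = []
    for entry in root.iter(f"{ATOM}entry"):
        # Titles may be wrapped over several lines: collapse whitespace.
        title = " ".join(entry.findtext(f"{ATOM}title", default="").split())
        link = entry.findtext(f"{ATOM}id", default="")
        entries.append((title, link))
    return entries


def best_match(
    cited_title: str,
    entries: List[Tuple[str, str]],
    threshold: float = 0.9,  # arbitrary cut-off, chosen for illustration
) -> Optional[str]:
    """Return the link of the closest title, or None if none is similar enough."""
    best_link, best_score = None, 0.0
    for title, link in entries:
        score = difflib.SequenceMatcher(
            None, cited_title.lower(), title.lower()
        ).ratio()
        if score > best_score:
            best_link, best_score = link, score
    return best_link if best_score >= threshold else None
```

A reference for which `best_match` returns `None` is a candidate hallucination and can be dropped or flagged in the note. A fuzzy comparison is used rather than exact matching because the LLM often reproduces a title with slightly different casing or punctuation.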
An example
Below is an example of PeperNoten-generated notes for our recent MoshiVis paper at CVPR 2026. In particular, I found the gaps/limitations/confusions section interesting, as a few of the observations also appeared in some comments we got during the review process :) …
Code
The code for PeperNoten is stored in the following gist on GitHub