Searchable PDFs and Coding/Note Taking Function

jonathanp · June 25, 2008, 1:28pm

I imagine that, like me, many Endnote users have libraries of articles in PDF or other formats on their hard drives. It would be wonderful if Endnote were able to search the text within these PDFs, rather than just attaching them to the library as external documents. Indeed, Endnote’s current workflow, in which PDFs are copied (with new names) to a library accessed by Endnote, creates problems for users who may be using Acrobat to add notes to PDFs, mark or highlight passages, etc. Having two copies of a PDF means that markups made in one copy do not transfer to the other. It is possible to discipline yourself to access PDFs only through Endnote, and that would be worth it if there were added benefits to using Endnote as a gateway for PDFs.

A major problem with using Endnote as a research tool to access these stored documents is that Endnote does not actually search the text of these external documents. It is possible to extract the text and paste it into Endnote, but that is time consuming, less visually attractive, and you lose the functionality of the PDF format.

What I’d like to see is a very robust note-taking and coding option for Word documents and PDFs (as others have also requested). Here’s how it would work, ideally. You keep all documents organized in the folder accessed by Endnote. Naming of these files is up to the user. When you open a PDF or Word document through Endnote, it opens in what is essentially an internal browser/viewer, which would be a function of Endnote and would look similar to the preview pane that Microsoft Outlook uses for PDF and Word attachments. PDFs and Word documents stored internally by Endnote would be searchable through Endnote’s search functions.

Here’s the most important part: keywords could be attached not just to entire articles, but to specific sections of text. The user would be able to highlight text, and then drag keywords from a picklist to the text (much like photo-tagging software, or like Atlas.ti or NVIVO). When a record is opened in Endnote, the keywords associated with that record would be easily available, and a particular keyword could be double-clicked, at which point the first instance of that keyword coding would appear, highlighted, with the option to click again for the next instance, etc. For those references in which detailed keyword coding has not been undertaken, the option to click on the keywords would simply not be present, in which case the user understands that the entire article is coded at ‘insects’ or what have you.
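To make the idea concrete, here is a minimal Python sketch of how passage-level keyword coding might be modelled. Everything in it (the `Record` and `CodedSpan` classes, the `code` and `instances` methods, the sample text) is hypothetical and only illustrates the request; it says nothing about how Endnote stores records.

```python
from dataclasses import dataclass, field


@dataclass
class CodedSpan:
    """A keyword attached to a character range of a document's text."""
    keyword: str
    start: int  # inclusive character offset
    end: int    # exclusive character offset


@dataclass
class Record:
    """Hypothetical reference record; not Endnote's actual data model."""
    title: str
    text: str
    keywords: set = field(default_factory=set)  # whole-record keywords
    spans: list = field(default_factory=list)   # passage-level codings

    def code(self, keyword, start, end):
        """Attach a keyword to text[start:end] and to the record as a whole."""
        if not 0 <= start < end <= len(self.text):
            raise ValueError("span is outside the document text")
        self.keywords.add(keyword)
        self.spans.append(CodedSpan(keyword, start, end))

    def instances(self, keyword):
        """Return the coded passages for a keyword, in document order."""
        hits = sorted((s for s in self.spans if s.keyword == keyword),
                      key=lambda s: s.start)
        return [self.text[s.start:s.end] for s in hits]


# Sample data, invented for illustration.
record = Record(title="Pollinators",
                text="Bees pollinate crops. Beetles eat leaves.")
record.code("insects", 22, 41)
record.code("insects", 0, 21)
record.keywords.add("agriculture")  # record-level only, no passage coding
```

Stepping through `record.instances("insects")` corresponds to clicking through the highlighted passages one after another. A keyword such as "agriculture" that has no coded spans returns an empty list, which matches the case where the whole article is simply coded at that keyword.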

So, 2 main points/suggestions:

1. PDFs and documents associated with Endnote should be searchable, by text or by keyword, through Endnote.
2. It should be possible to code specific sections of articles (or at least specific sections of the current ‘abstract’ and ‘notes’ and ‘research notes’) with keywords, and there should be an intuitive method for clicking through to instances which are coded at those keywords.

What do you think?

jonathanp · June 25, 2008, 5:10pm

Thanks! I overlooked that.

nmadani · June 29, 2008, 8:28pm

Thanks for posting this, Jonathan. I also manage my PDFs with EndNote. The ability to at least preview the files within EndNote and to alter the layout of the Preview Pane would already go a long way. There may be some concerns about the PDF content search function:

Where do you draw the line with the searchable files? PDF, Word, WordPerfect, OpenOffice, PowerPoint, …
To make the process efficient, these files would need to be indexed. This can be resource intensive, and it could make the system less responsive and bloat your library files with index data. There are plenty of free file indexing services available, and I bet some of them have APIs that EndNote could use to give us the search functions you mentioned.
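For readers wondering what "indexed" means here, the following is a rough Python sketch of an inverted index, the usual structure behind full-text search. It assumes the text has already been extracted from each attachment; the file names, sample text, and function names are made up for illustration and do not describe EndNote or any particular indexing service.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical extracted text, keyed by attachment file name.
documents = {
    "smith2007.pdf": "Insect pollination of field crops",
    "lee2008.doc": "Crop yields under drought",
    "chen2006.pdf": "Pollination networks and insect diversity",
}
index = build_index(documents)
```

A query is then a handful of set lookups rather than a scan of every file, which is where the speed comes from. The cost is the one raised above: the index has to be stored somewhere and rebuilt whenever an attachment changes.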

I think what would really help everyone would be a new bibliographic standard, similar to email headers, that would be saved as metadata in PDF files. All of the bibliographic reference data could thus be embedded into the PDF file (open any PDF file using Acrobat Reader and choose Properties -> Additional Metadata from the file menu to get a sense). All you’d then need to do would be to drop your PDF file into your EndNote library, and voila, your reference would be generated automatically! This would require all publishers to agree upon a standard, and any of the current reference standards would probably be adequate.
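As a rough illustration of that drop-in workflow, the Python sketch below turns a dictionary of embedded metadata into a reference record. It assumes the metadata has already been read out of the PDF by some other tool, and the field names ("Title", "Author", "Year", "Journal", "DOI") are hypothetical stand-ins for whatever standard publishers might agree on; no such agreed standard is implied.

```python
def reference_from_metadata(meta):
    """Build a reference record from embedded metadata, or None if incomplete."""
    required = ("Title", "Author", "Year")
    if any(not meta.get(key) for key in required):
        return None  # caller falls back to manual entry
    return {
        # Assumes multiple authors are separated by semicolons.
        "authors": [name.strip() for name in meta["Author"].split(";")],
        "title": meta["Title"].strip(),
        "year": int(meta["Year"]),
        "journal": meta.get("Journal", "").strip(),
        "doi": meta.get("DOI", "").strip(),
    }


# Invented metadata for a PDF that follows the hypothetical standard.
embedded = {
    "Title": "Pollination networks and insect diversity",
    "Author": "Chen, L.; Okafor, R.",
    "Year": "2006",
    "Journal": "Journal of Examples",
}
reference = reference_from_metadata(embedded)
```

A PDF that lacks the required fields yields `None`, so files from publishers that have not adopted the standard would still need to be entered by hand.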

moreje · March 30, 2009, 1:17pm

I agree… PDF management with advanced functions (search, extracting metadata from PDFs, etc.) would be very useful and would make Endnote even more powerful.

(A quick look at Papers from the Apple world should give an overview of what is possible in PDF management.)

J.

jasonr · March 30, 2009, 4:48pm

These are all things that we are working on. These features will very likely not be included in the very next version of EndNote but hopefully soon. The biggest hurdle is licensing and integrating a robust PDF development kit into EndNote. Macintosh-only applications - like “Papers” mentioned in this and other threads - have it much easier as most of this functionality is built in to the Mac OS. For cross-platform applications like EndNote, we need to rely on a third party toolkit that will work for both Windows and Macintosh.

Jason Rollins, the EndNote team

nmadani · March 30, 2009, 11:30pm

Hi Jason:

Thank you for your reply. I think a full-featured PDF management toolkit may be overkill for starters. I am positive that there are free, open-source libraries out there that could be used to at least view PDF files and read associated metadata. This would be something that could easily be incorporated into EndNote’s next version, if introducing new features is the goal … Hopefully, EndNote wasn’t written in Visual Basic.

NM

jasonr · March 31, 2009, 1:37pm

We have looked at some of the open-source PDF options and found them to be very limited. We are planning to license the full PDF development toolkit from Adobe; this should give us the most power and flexibility for future PDF-related features for EndNote.

EndNote is written mainly in C++, not Visual Basic.

Jason Rollins, the EndNote team
