OnSong Manual

Adobe PDF

PDF files are a popular option for storing and printing chord charts and lyrics sheets. You may have been using PDF files for years to catalog your digital library. The Adobe PDF file format is great for accurately representing the printed page and is portable between different computer platforms. Let's take a look at some challenges with this file format and ways we can extract text for best results.

Adobe PDF files are displayed "as-is" in OnSong. They can't be edited or formatted, and they don't work with low light mode. While these files may contain text, it is placed on the virtual page in a way that enables it to be printed, not easily understood or modified by other apps. PDF files can also consist of graphics, scanned images, or any combination of these. They can also be encrypted, which protects their contents from being extracted. Because every PDF file is different, there's no way to guarantee a perfect conversion into a text-based document.

You can extract the text of a PDF file within OnSong by opening the Song Editor and tapping the Extract Text button in the Conversion Toolbar, which appears before the on-screen keyboard is revealed. OnSong will first attempt to extract the text from the PDF file and, if no text is available, will process the file using Optical Character Recognition (OCR). The result will most likely be text, but you will need to review and tweak it into a file format that OnSong understands. If the file was encrypted, the extraction may produce garbled characters. Text cannot be extracted from these files due to the protection applied to them by the authoring software.
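
The order of operations described above can be pictured with a small Python sketch. This is only an illustration, not OnSong's code: the function name extract_text and the two callables passed into it (read_embedded_text and run_ocr) are hypothetical stand-ins for whatever actually reads the PDF and performs OCR.

```python
def extract_text(pdf_bytes, read_embedded_text, run_ocr):
    """Try the PDF's embedded text first; fall back to OCR if there is none.

    read_embedded_text and run_ocr are hypothetical callables supplied by
    the caller. Each takes the raw PDF bytes and returns a string.
    Returns a (text, source) pair, where source is "embedded" or "ocr".
    """
    text = read_embedded_text(pdf_bytes)
    # Treat whitespace-only output the same as "no text available".
    if text and text.strip():
        return text, "embedded"
    return run_ocr(pdf_bytes), "ocr"
```

Either way, the returned text still needs to be reviewed by hand, as described above.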

Here are some issues you may have with extracted PDF files:

Bad Spacing

You may find that some text is placed out of order or with poor spacing. This is because PDF files may use shortcuts that align text using multiple text fragments. OnSong works to place these text fragments in proximity to each other using frame proximity calculations, but there may still be issues that you need to correct manually.
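
To see why fragment placement can go wrong, here is a minimal Python sketch of one way to rebuild lines from positioned fragments. It is not OnSong's actual calculation. The Fragment type, the join_fragments function, and the line_tolerance and char_width values are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    text: str
    x: float      # left edge of the fragment's frame
    y: float      # vertical position, measured from the top of the page
    width: float  # width of the fragment's frame


def join_fragments(fragments, line_tolerance=3.0, char_width=6.0):
    """Group fragments into lines by vertical proximity, then join each
    line left to right, turning horizontal gaps into space characters.

    line_tolerance and char_width are assumed values in page units.
    """
    lines = []  # each entry is a list of fragments on roughly the same row
    for frag in sorted(fragments, key=lambda f: (f.y, f.x)):
        if lines and abs(lines[-1][0].y - frag.y) <= line_tolerance:
            lines[-1].append(frag)
        else:
            lines.append([frag])

    result = []
    for line in lines:
        line.sort(key=lambda f: f.x)
        text = ""
        cursor = None  # right edge of the previous fragment
        for frag in line:
            if cursor is not None:
                gap = frag.x - cursor
                text += " " * max(0, round(gap / char_width))
            text += frag.text
            cursor = frag.x + frag.width
        result.append(text)
    return result
```

Because the number of spaces is estimated from frame widths, a variable-width font or a slightly offset fragment can easily produce too many or too few spaces, which is the kind of issue you may need to fix by hand.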

Chords with Extra Spaces

Every chord chart is created differently depending on the author and the software used. For instance, the original file may have used multiple space characters to align chords above lyrics. If a variable-width font was used, this may result in the chord line containing many more spaces than the lyrics below it. Use Fix Alignment Spaces, found in the Text Tools Menu in the Menubar of the Song Editor, to bring those chords back closer to their position, and then manually adjust as needed.
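
As a rough illustration of the idea, the following Python sketch squeezes an over-long chord line so that it spans no more than the lyric line beneath it. This is not how Fix Alignment Spaces is implemented. The rescale_chord_line function and its proportional scaling are assumptions chosen only to show why a manual adjustment is usually still needed afterwards.

```python
import re


def rescale_chord_line(chord_line, lyric_line):
    """Scale chord columns down so the last chord starts within the lyric line.

    A simple proportional approach (an assumption, not OnSong's method):
    every chord keeps its relative position along the line.
    """
    chords = [(m.start(), m.group()) for m in re.finditer(r"\S+", chord_line)]
    if not chords:
        return chord_line
    last_col = chords[-1][0]
    target = max(len(lyric_line.rstrip()) - 1, 0)
    if last_col <= target:
        return chord_line  # chords already fit above the lyrics
    scale = target / last_col
    out = ""
    for col, chord in chords:
        new_col = round(col * scale)
        # Keep at least one space between neighboring chords.
        if out and new_col <= len(out):
            new_col = len(out) + 1
        out = out.ljust(new_col) + chord
    return out
```

For example, a chord line with G, C, and D at columns 0, 21, and 42 over the 23-character lyric "Amazing grace how sweet" comes back with the chords at columns 0, 11, and 22. The chords are closer to the lyrics, but not necessarily over the right syllables.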

Compressed Chords

Another problem may be chords that are too close together on the line above the lyrics. This can happen if chords were originally placed into text boxes and then aligned above the lyrics. You will need to manually align those chords over the corresponding lyrics in the Song Editor.

Garbled Characters

If you attempt to extract text from an encrypted PDF document, it may result in a screen full of garbled characters. You will need to revert the extraction or cancel out of the Song Editor and find a different way to extract the text.

Unrecognized Characters

If OnSong cannot extract the text from the document directly, it may need to submit the document to optical character recognition (OCR). This means that a computer will attempt to "read" the document visually. Depending on the quality of the PDF, this may result in the wrong character being used. For instance, a flat symbol in your document may be interpreted as a lowercase letter "b", and faded text in a scanned PDF may be read as other characters. Review the document and make manual corrections as needed in the Song Editor.

v. 2018.000