Smart extract from PDF to Excel
Posted on 6/19/17 at 2:55 pm
OK, so I have multiple PDF files (scans) whose data someone enters manually into an Excel spreadsheet, and it's time-consuming.
Is there a way to extract that data into Excel (after OCR, of course)? What kind of scripts would I need to run?
The data in the original file is "disorganized" and not in a table format, and I would need to only extract certain data.
For example, let's say the original file looks like this, and the excel looks like the file below, that would be the end product.
shite sounds impossible (or at the very least like a lot of work) to me, but I figured I'd ask.
Posted on 6/19/17 at 3:58 pm to castorinho
How clean is the OCR extraction? If it's okay and consistent, you can search through the file and categorize the data by tags or triggers.
For example, if the OCR output always has the label "Date" followed by an accurate date value, you can scan the file for "Date" and then write out the value that follows it.
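A rough sketch of that trigger idea in Python, assuming the OCR step has already produced plain text. The field labels ("Date", "Invoice", "Total") and the sample text are made up for illustration, so swap in whatever labels your scans actually contain. It writes CSV using only the standard library, which Excel opens directly.

```python
import csv
import io
import re

# Hypothetical trigger labels and value formats; adjust to match the real scans.
FIELD_PATTERNS = {
    "Date": re.compile(r"Date\s*[:\-]?\s*(\d{1,2}/\d{1,2}/\d{2,4})", re.IGNORECASE),
    "Invoice": re.compile(r"Invoice\s*(?:No\.?|#)?\s*[:\-]?\s*(\w+)", re.IGNORECASE),
    "Total": re.compile(r"Total\s*[:\-]?\s*\$?\s*([\d,]+\.\d{2})", re.IGNORECASE),
}


def extract_fields(ocr_text):
    """Return one row: the value found after each trigger, or "" if missing."""
    row = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        row[name] = match.group(1) if match else ""
    return row


def write_rows(rows, out_file):
    """Write the extracted rows as CSV with one column per field."""
    writer = csv.DictWriter(out_file, fieldnames=list(FIELD_PATTERNS))
    writer.writeheader()
    writer.writerows(rows)


# Made-up OCR output standing in for one scanned page.
sample = """ACME Supply Co.
Invoice No: A1042   Date: 6/19/17
misc scanned text here
Total: $1,250.00
"""

row = extract_fields(sample)
buffer = io.StringIO()
write_rows([row], buffer)
print(buffer.getvalue())
```

To process a folder of scans, you would call `extract_fields` once per OCR'd file and pass the collected rows to `write_rows` with a real file opened via `open("out.csv", "w", newline="")`. Blank cells in the result are a quick way to spot pages where the OCR mangled a label.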
Posted on 6/22/17 at 3:57 pm to castorinho
A-PDF Data Extractor: https://www.a-pdf.com/data-extractor/index.htm