# pdf-mass-cleanuptools v6

Clean up that MetaDataMess

## Needs:

+ `pip install pdf2image anthropic tqdm PyPDF2`
+ `sudo apt-get install poppler-utils`

Before running, set your API key: `export ANTHROPIC_API_KEY='your-api-key-here'`

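
A quick preflight check can confirm the prerequisites above before a long batch run. This is a minimal sketch, not part of `pdf_processor.py`: the function name `missing_prerequisites` is hypothetical, and it assumes that `pdftoppm` (shipped with poppler-utils and used by pdf2image to render pages) is the binary worth looking for.

```python
import importlib.util
import os
import shutil

# Import names for the pip packages listed above.
REQUIRED_MODULES = ["pdf2image", "anthropic", "tqdm", "PyPDF2"]


def missing_prerequisites(env=None):
    """Return a list of problems; an empty list means every check passed.

    Hypothetical helper for illustration, not part of pdf_processor.py.
    """
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_MODULES:
        # find_spec returns None when a top-level package is not installed.
        if importlib.util.find_spec(name) is None:
            problems.append(f"Python package not installed: {name}")
    # Assumption: pdftoppm on PATH is a good proxy for poppler-utils being installed.
    if shutil.which("pdftoppm") is None:
        problems.append("pdftoppm not found on PATH (install poppler-utils)")
    if not env.get("ANTHROPIC_API_KEY"):
        problems.append("ANTHROPIC_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

Running the script prints one line per missing prerequisite and nothing when everything is in place.
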
## Using the tool

### Basic usage

`python pdf_processor.py -i /path/to/pdfs -o /path/to/output`

### Test with a single file

`python pdf_processor.py -i /path/to/pdfs -o /path/to/output --test`

### Process files matching a pattern

`python pdf_processor.py -i /path/to/pdfs -o /path/to/output --pattern "magazine_*.pdf"`

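
To preview which files a pattern would pick up, the sketch below applies a shell-style glob to a list of file names. It assumes `--pattern` is interpreted as a glob against file names in the input directory, which the README does not confirm; `select_files` is a hypothetical helper, not a function from `pdf_processor.py`.

```python
from fnmatch import fnmatchcase


def select_files(names, pattern="*.pdf"):
    """Return the names matching a shell-style glob, sorted for a stable order.

    Hypothetical helper; assumes --pattern behaves like a case-sensitive glob.
    """
    return sorted(name for name in names if fnmatchcase(name, pattern))


names = ["magazine_02.pdf", "report.pdf", "magazine_01.pdf", "magazine_notes.txt"]
print(select_files(names, "magazine_*.pdf"))
# prints ['magazine_01.pdf', 'magazine_02.pdf']
```

Keep the quotes around the pattern on the command line; without them the shell may expand the glob before the script ever sees it.
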
### Keep temporary files for inspection

`python pdf_processor.py -i /path/to/pdfs -o /path/to/output --no-cleanup`

### With metadata

`python pdf_processor.py -i /path/to/pdfs -o /path/to/output --write-metadata`

### With metadata, and skip backups if you dare

`python pdf_processor.py -i /path/to/pdfs -o /path/to/output --write-metadata --no-backup`
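
If you do pass `--no-backup`, making your own copy first is cheap. The README does not describe what the tool's backups look like, so the sketch below is only an assumption about a reasonable manual equivalent: it copies a file to a sibling with `.bak` appended. Both `backup_file` and the `.bak` suffix are hypothetical choices, not the tool's behavior.

```python
import shutil
from pathlib import Path


def backup_file(path, suffix=".bak"):
    """Copy `path` to a sibling file with `suffix` appended; return the backup path.

    Hypothetical helper; the tool's own backup naming is not documented here.
    """
    source = Path(path)
    backup = source.with_name(source.name + suffix)
    shutil.copy2(source, backup)  # copy2 also tries to preserve timestamps
    return backup
```

For example, `backup_file("/path/to/pdfs/magazine_01.pdf")` would create `/path/to/pdfs/magazine_01.pdf.bak` next to the original.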