# pdf-mass-cleanuptools v6

Clean up that metadata mess.

## Needs:

+ pip install pdf2image anthropic tqdm PyPDF2 rich
+ sudo apt-get install poppler-utils 

Before running, set your API key: export ANTHROPIC_API_KEY='your-api-key-here'
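
The prerequisites above can be checked before a long run. This is a minimal sketch, not part of the tool: `missing_prerequisites` is a hypothetical helper, and it only assumes that the API key is read from `ANTHROPIC_API_KEY` and that pdf2image needs poppler's `pdftoppm` binary on the PATH.

```python
import os
import shutil


def missing_prerequisites(env=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready."""
    # Hypothetical helper for illustration; not shipped with pdf_processor.py.
    env = os.environ if env is None else env
    problems = []
    # The anthropic client picks up its key from this environment variable.
    if not env.get("ANTHROPIC_API_KEY"):
        problems.append("ANTHROPIC_API_KEY is not set")
    # pdf2image shells out to pdftoppm, which poppler-utils provides.
    if which("pdftoppm") is None:
        problems.append("pdftoppm not found on PATH (install poppler-utils)")
    return problems


# Print whatever is missing in the current environment (prints nothing if ready).
for problem in missing_prerequisites():
    print(problem)
```

The `env` and `which` parameters exist only so the check can be exercised without touching the real environment.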

## Using the main tool

### Basic usage
python pdf_processor.py -i /path/to/pdfs -o /path/to/output

### Test with a single file
python pdf_processor.py -i /path/to/pdfs -o /path/to/output --test

### Process only files matching a pattern
python pdf_processor.py -i /path/to/pdfs -o /path/to/output --pattern "magazine_*.pdf"
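
To preview which files a pattern would select, something like the following may help. It is a sketch under the assumption that `--pattern` is a shell-style glob matched against file names; how pdf_processor.py actually matches is not confirmed here, and `matching_names` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def matching_names(names, pattern):
    """Return the names that match a shell-style glob, in their original order."""
    # Assumption: the tool matches the pattern against bare file names.
    # fnmatchcase is case-sensitive on every platform.
    return [name for name in names if fnmatchcase(name, pattern)]


sample = ["magazine_01.pdf", "magazine_02.pdf", "notes.pdf", "magazine_03.txt"]
print(matching_names(sample, "magazine_*.pdf"))
# → ['magazine_01.pdf', 'magazine_02.pdf']
```

For a real directory, pass `os.listdir("/path/to/pdfs")` as `names`.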

### Keep temporary files for inspection
python pdf_processor.py -i /path/to/pdfs -o /path/to/output --no-cleanup

### Write metadata to the PDFs
python pdf_processor.py -i /path/to/pdfs -o /path/to/output --write-metadata

### Write metadata and skip backups, if you dare
python pdf_processor.py -i /path/to/pdfs -o /path/to/output --write-metadata --no-backup
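
If you do skip backups, it may be worth making a manual copy first. The sketch below is independent of the tool and makes no claim about how its built-in backup works; `backup_pdfs` and the directory names are hypothetical.

```python
import shutil
from pathlib import Path


def backup_pdfs(source_dir, backup_dir):
    """Copy every PDF directly inside source_dir to backup_dir; return the names."""
    # Hypothetical helper for illustration; not shipped with pdf_processor.py.
    source, target = Path(source_dir), Path(backup_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for pdf in sorted(source.glob("*.pdf")):
        # copy2 keeps file timestamps along with the contents.
        shutil.copy2(pdf, target / pdf.name)
        copied.append(pdf.name)
    return copied
```

Calling `backup_pdfs("/path/to/pdfs", "/path/to/backup")` before a `--no-backup` run leaves an untouched copy to fall back on. Subdirectories are not included.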

## Reviewing the metadata

### Just review and save changes to new JSON file
python metadata_reviewer.py results/processing_results.json
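
Before reviewing, a quick look at the shape of the results file can be useful. The layout of processing_results.json is not documented here, so this sketch assumes nothing beyond valid JSON; `summarize_results` is a hypothetical helper.

```python
import json


def summarize_results(path):
    """Report the top-level shape of a JSON file without assuming its schema."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    if isinstance(data, dict):
        # JSON object keys are always strings, so sorting them is safe.
        return {"type": "object", "entries": len(data), "keys": sorted(data)[:10]}
    if isinstance(data, list):
        return {"type": "array", "entries": len(data)}
    return {"type": type(data).__name__}
```

For example, `print(summarize_results("results/processing_results.json"))` shows whether the file is an object or an array and how many entries it holds.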

### Review and write changes back to PDFs
python metadata_reviewer.py results/processing_results.json --write

### Enable debug logging
python metadata_reviewer.py results/processing_results.json --debug