
A sophisticated document cropping and processing application that uses OpenCV for automatic mask detection and AI-powered binarization. Features include multi-file upload, real-time image editing with rotation and filtering, background processing with progress tracking, and export capabilities to PDF. Built with React, TypeScript, Canvas API, and integrates with external binarization services.
Automating document digitization and improving image quality for better OCR results and document archiving.
The frontend implements a sophisticated image processing pipeline using Canvas API for real-time manipulation. It features dynamic handle radius calculation based on image dimensions, edge dragging for precise crop adjustments, and a loupe component for detailed editing. The system uses EventSource for real-time progress updates during background binarization and implements intelligent caching for filter combinations.
The binarization model uses a modified U-Net architecture with a frozen MobileNetV2 encoder (ImageNet pre-trained) for transfer learning. Training data consists of 224x224 pixel document images with corresponding binary masks from Google Drive. The model implements sophisticated data augmentation including random horizontal/vertical flips, hue adjustments, and innovative synthetic contrast blur generation using 10 random sinusoidal frequencies to simulate realistic document degradation. Training uses Adam optimizer with learning rate 1e-3, binary crossentropy loss, early stopping (patience=10), and learning rate reduction on plateau. The system includes nondeterministic binarization during training (random thresholds 0-255) for improved generalization. For inference, an 8-way test-time augmentation wrapper applies rotations and flips, averaging predictions for robust results. Post-training calibration uses isotonic regression to convert model logits into calibrated probabilities with reliability diagrams. The final model exports to multiple formats (SavedModel, H5, TensorFlow.js) and processes images via 224x224 patches with 4-directional augmentation for handling arbitrary input sizes. GPU optimization includes TF_FORCE_GPU_ALLOW_GROWTH for memory management and batch processing for efficiency.

I'm always open to discussing new projects, creative ideas or opportunities to be part of your visions.