Scene Text Detection: From Classical Region Proposal Method to Vision-Transformer-Based Approach

https://github.com/zmlxzyh/VIT-DBNet/blob/main/report.pdf