Curved text detection
https://github.com/princewang1994/TextSnake.pytorch.git

Dataset

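The SynthText dataset is 37 GB, which is too large for this experiment.
Total-Text is usable instead. Prepare the data by following the steps the author describes.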

Training

python train_textsnake.py  example --dataset total-text --net vgg --cuda False

Running with CUDA enabled raised an error, so the command above passes --cuda False.

(TextSnake) xuehp@haomeiya009:~/git/TextSnake.pytorch$ nohup python train_textsnake.py  example --dataset total-text --net vgg --cuda False --start_iter 180 --resume save/pretrained/textsnake_vgg_180.pth   >> log.4.21.txt &


(0 / 314) - Loss: 0.5268 - tr_loss: 0.1180 - tcl_loss: 0.2552 - sin_loss: 0.0087 - cos_loss: 0.0511 - radii_loss: 0.0939
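
The training log line above can be checked with a few lines of Python. This is a minimal sketch that assumes the log format shown here stays the same; it parses the values and compares the reported Loss with the sum of the five component losses. In this sample they agree to within rounding, which suggests (but does not prove) that the total is a plain sum.

```python
import re

# One training log line, copied from the output above.
log_line = ("(0 / 314) - Loss: 0.5268 - tr_loss: 0.1180 - tcl_loss: 0.2552 "
            "- sin_loss: 0.0087 - cos_loss: 0.0511 - radii_loss: 0.0939")

# Extract every "name: value" pair from the line.
values = {name: float(value)
          for name, value in re.findall(r"(\w+): ([0-9.]+)", log_line)}

# Separate the reported total from the individual components.
total = values.pop("Loss")
component_sum = sum(values.values())

print(f"reported total = {total:.4f}")
print(f"sum of components = {component_sum:.4f}")
```

The same parsing can be applied to each line of log.4.21.txt to follow how the individual losses change during training.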

Direct inference (1)

python demo.py $EXPNAME --checkepoch 180 --img_root xuehp_test/
==========Options============
num_workers: 8
batch_size: 4
max_epoch: 200
start_epoch: 0
lr: 0.0001
cuda: True
n_disk: 15
output_dir: output
input_size: 512
max_annotation: 200
max_points: 20
use_hard: True
tr_thresh: 0.6
tcl_thresh: 0.4
post_process_expand: 0.3
post_process_merge: False
exp_name: pretrained
net: vgg
dataset: total-text
resume: None
mgpu: False
save_dir: ./save/
vis_dir: ./vis/
log_dir: ./logs/
loss: CrossEntropyLoss
input_channel: 1
pretrain: False
verbose: True
viz: False
start_iter: 0
lr_adjust: fix
stepvalues: []
weight_decay: 0.0
gamma: 0.1
momentum: 0.9
optim: SGD
display_freq: 50
viz_freq: 50
save_freq: 10
log_freq: 100
val_freq: 100
rescale: 255.0
means: [0.485, 0.456, 0.406]
stds: [0.229, 0.224, 0.225]
checkepoch: 180
img_root: xuehp_test/
device: cuda
=============End=============
Loading from ./save/pretrained/textsnake_vgg_180.pth
Start testing TextSnake.
Traceback (most recent call last):
  File "demo.py", line 110, in <module>
    main()
  File "demo.py", line 92, in main
    inference(detector, test_loader, output_dir)
  File "demo.py", line 36, in inference
    for i, (image, meta) in enumerate(test_loader):
  File "/home/xuehp/anaconda3/envs/TextSnake/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/home/xuehp/anaconda3/envs/TextSnake/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 856, in _next_data
    return self._process_data(data)
  File "/home/xuehp/anaconda3/envs/TextSnake/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/home/xuehp/anaconda3/envs/TextSnake/lib/python3.6/site-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/xuehp/anaconda3/envs/TextSnake/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/xuehp/anaconda3/envs/TextSnake/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/xuehp/anaconda3/envs/TextSnake/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/xuehp/git/TextSnake.pytorch/dataset/deploy.py", line 22, in __getitem__
    return self.get_test_data(image, image_id=image_id, image_path=image_path)
  File "/home/xuehp/git/TextSnake.pytorch/dataset/dataload.py", line 192, in get_test_data
    image, polygons = self.transform(image)
  File "/home/xuehp/git/TextSnake.pytorch/util/augmentation.py", line 339, in __call__
    return self.augmentation(image, polygons)
  File "/home/xuehp/git/TextSnake.pytorch/util/augmentation.py", line 23, in __call__
    img, pts = t(img, pts)
  File "/home/xuehp/git/TextSnake.pytorch/util/augmentation.py", line 285, in __call__
    image -= self.mean
ValueError: operands could not be broadcast together with shapes (512,512,4) (3,) (512,512,4)
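
The last line of the traceback points to the cause: the input image has 4 channels (shape 512 x 512 x 4, most likely an RGBA PNG), while means has only 3 values, so the in-place subtraction cannot broadcast. A minimal sketch of one possible fix is to drop the alpha channel before normalization. The helper to_rgb below is hypothetical and not part of the repository, and the normalization only mirrors the rescale, means and stds values printed in the options above, not necessarily the repository's exact code.

```python
import numpy as np

def to_rgb(image: np.ndarray) -> np.ndarray:
    """Return an H x W x 3 array (hypothetical helper, not from the repo)."""
    if image.ndim == 2:
        # Grayscale: replicate the single channel three times.
        return np.stack([image] * 3, axis=-1)
    if image.shape[2] == 4:
        # RGBA: drop the alpha channel.
        return image[:, :, :3]
    return image

# Values taken from the options printed by demo.py.
means = np.array([0.485, 0.456, 0.406])
stds = np.array([0.229, 0.224, 0.225])

# A dummy 4-channel image reproduces the error from the traceback.
rgba = np.zeros((512, 512, 4), dtype=np.float32)
error_message = None
try:
    rgba -= means
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# After dropping the alpha channel, normalization broadcasts correctly.
rgb = to_rgb(rgba).astype(np.float32)
normalized = (rgb / 255.0 - means) / stds
print(normalized.shape)
```

Alternatively, the test images can be converted to 3-channel RGB files before they are placed in xuehp_test/.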

Direct inference (2)

Using the pretrained model with the public test images.


Direct inference (3)

Using the pretrained model with my own images.