Curved text detection
https://github.com/princewang1994/TextSnake.pytorch.git
Datasets
The SynthText dataset is 37 GB, which is too large.
Total-Text can be used. Process the data following the steps the author describes.
Training
python train_textsnake.py example --dataset total-text --net vgg --cuda False
Enabling CUDA raised an error, hence --cuda False.
(TextSnake) xuehp@haomeiya009:~/git/TextSnake.pytorch$ nohup python train_textsnake.py example --dataset total-text --net vgg --cuda False --start_iter 180 --resume save/pretrained/textsnake_vgg_180.pth >> log.4.21.txt &
(0 / 314) - Loss: 0.5268 - tr_loss: 0.1180 - tcl_loss: 0.2552 - sin_loss: 0.0087 - cos_loss: 0.0511 - radii_loss: 0.0939
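The five component losses in the log line above add up to 0.5269, which is within rounding of the logged total of 0.5268, so the total appears to be their plain sum. That is an observation from this single line, not something confirmed against the repository code. A minimal sketch for parsing such a line and checking the sum (the helper name parse_losses is made up for illustration):

```python
import re

# One log line copied from the training output above.
LINE = ("(0 / 314) - Loss: 0.5268 - tr_loss: 0.1180 - tcl_loss: 0.2552 "
        "- sin_loss: 0.0087 - cos_loss: 0.0511 - radii_loss: 0.0939")


def parse_losses(line):
    """Extract 'name: value' pairs from one training log line (hypothetical helper)."""
    pairs = re.findall(r"(\w+): ([0-9.]+)", line)
    return {name: float(value) for name, value in pairs}


losses = parse_losses(LINE)
total = losses.pop("Loss")            # logged total
component_sum = sum(losses.values())  # tr + tcl + sin + cos + radii

# Assumption: the total is the unweighted sum of the components.
# Allow a small tolerance because every value is rounded to 4 decimals.
print(abs(total - component_sum) < 1e-3)
```

Running the same check over a whole log file would show whether the relation holds beyond this one iteration.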
Direct inference (1)
python demo.py $EXPNAME --checkepoch 180 --img_root xuehp_test/
==========Options============
num_workers: 8
batch_size: 4
max_epoch: 200
start_epoch: 0
lr: 0.0001
cuda: True
n_disk: 15
output_dir: output
input_size: 512
max_annotation: 200
max_points: 20
use_hard: True
tr_thresh: 0.6
tcl_thresh: 0.4
post_process_expand: 0.3
post_process_merge: False
exp_name: pretrained
net: vgg
dataset: total-text
resume: None
mgpu: False
save_dir: ./save/
vis_dir: ./vis/
log_dir: ./logs/
loss: CrossEntropyLoss
input_channel: 1
pretrain: False
verbose: True
viz: False
start_iter: 0
lr_adjust: fix
stepvalues: []
weight_decay: 0.0
gamma: 0.1
momentum: 0.9
optim: SGD
display_freq: 50
viz_freq: 50
save_freq: 10
log_freq: 100
val_freq: 100
rescale: 255.0
means: [0.485, 0.456, 0.406]
stds: [0.229, 0.224, 0.225]
checkepoch: 180
img_root: xuehp_test/
device: cuda
=============End=============
Loading from ./save/pretrained/textsnake_vgg_180.pth
Start testing TextSnake.
Traceback (most recent call last):
File "demo.py", line 110, in <module>
main()
File "demo.py", line 92, in main
inference(detector, test_loader, output_dir)
File "demo.py", line 36, in inference
for i, (image, meta) in enumerate(test_loader):
File "/home/xuehp/anaconda3/envs/TextSnake/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
data = self._next_data()
File "/home/xuehp/anaconda3/envs/TextSnake/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 856, in _next_data
return self._process_data(data)
File "/home/xuehp/anaconda3/envs/TextSnake/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
data.reraise()
File "/home/xuehp/anaconda3/envs/TextSnake/lib/python3.6/site-packages/torch/_utils.py", line 395, in reraise
raise self.exc_type(msg)
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/xuehp/anaconda3/envs/TextSnake/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "/home/xuehp/anaconda3/envs/TextSnake/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/xuehp/anaconda3/envs/TextSnake/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/xuehp/git/TextSnake.pytorch/dataset/deploy.py", line 22, in __getitem__
return self.get_test_data(image, image_id=image_id, image_path=image_path)
File "/home/xuehp/git/TextSnake.pytorch/dataset/dataload.py", line 192, in get_test_data
image, polygons = self.transform(image)
File "/home/xuehp/git/TextSnake.pytorch/util/augmentation.py", line 339, in __call__
return self.augmentation(image, polygons)
File "/home/xuehp/git/TextSnake.pytorch/util/augmentation.py", line 23, in __call__
img, pts = t(img, pts)
File "/home/xuehp/git/TextSnake.pytorch/util/augmentation.py", line 285, in __call__
image -= self.mean
ValueError: operands could not be broadcast together with shapes (512,512,4) (3,) (512,512,4)
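The shapes in the error message, (512,512,4) against a mean of shape (3,), suggest that at least one image in xuehp_test/ has four channels (for example a PNG with an alpha channel), while the normalization step expects three. A minimal sketch of one possible workaround, reducing every image to three channels before normalization, is shown here. The helper name to_three_channels is made up for illustration, and where it would be called in the repository's loading code is an assumption that has not been checked.

```python
import numpy as np


def to_three_channels(image: np.ndarray) -> np.ndarray:
    """Return an H x W x 3 array from a grayscale, RGB, or RGBA image array."""
    if image.ndim == 2:
        # Grayscale: repeat the single channel three times.
        return np.stack([image] * 3, axis=-1)
    if image.ndim == 3 and image.shape[2] == 4:
        # RGBA: drop the alpha channel.
        return image[:, :, :3]
    if image.ndim == 3 and image.shape[2] == 3:
        return image
    raise ValueError(f"unsupported image shape: {image.shape}")


# Reproduce the failing situation with a dummy 4-channel image.
mean = np.array([0.485, 0.456, 0.406])  # "means" from the options dump
rgba = np.zeros((512, 512, 4), dtype=np.float32)

rgb = to_three_channels(rgba)
normalized = rgb / 255.0 - mean  # (512, 512, 3) now broadcasts with (3,)
print(normalized.shape)  # (512, 512, 3)
```

If the images are loaded with Pillow, Image.open(path).convert("RGB") has a similar effect at load time. Converting the test images to 3-channel files on disk beforehand would also avoid touching the code.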
Direct inference (2)
Use the pretrained model with the public test images.