Merge remote-tracking branch 'upstream/develop' into develop

own
Bobholamovic 3 years ago
commit 7f186c4b65
  1. 33
      deploy/README.md
  2. 62
      deploy/export/README.md
  3. 59
      deploy/export/export_model.py
  4. 2
      paddlers/__init__.py
  5. 22
      paddlers/custom_models/cd/bit.py
  6. 8
      paddlers/custom_models/cd/changestar.py
  7. 4
      paddlers/custom_models/cd/dsamnet.py
  8. 4
      paddlers/custom_models/cd/dsifn.py
  9. 6
      paddlers/custom_models/cd/layers/attention.py
  10. 16
      paddlers/custom_models/cd/layers/blocks.py
  11. 6
      paddlers/custom_models/cd/snunet.py
  12. 14
      paddlers/custom_models/cd/stanet.py
  13. 1
      paddlers/deploy/__init__.py
  14. 283
      paddlers/deploy/predictor.py
  15. 11
      paddlers/tasks/base.py
  16. 28
      paddlers/tasks/changedetector.py
  17. 6
      paddlers/tasks/load_model.py
  18. 13
      paddlers/tasks/utils/infer_nets.py
  19. 4
      tutorials/train/semantic_segmentation/data/.gitignore
  20. 91
      tutorials/train/semantic_segmentation/deeplabv3p.py
  21. 54
      tutorials/train/semantic_segmentation/deeplabv3p_resnet50_multi_channel.py
  22. 58
      tutorials/train/semantic_segmentation/farseg_test.py
  23. 89
      tutorials/train/semantic_segmentation/unet.py
  24. 55
      tutorials/train/semantic_segmentation/unet_multi_channel.py

@ -0,0 +1,33 @@
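# Python Deployment
PaddleRS integrates a high-performance Python prediction interface. After installing PaddleRS, you can run prediction by following the code examples below.
## Exporting a Model for Deployment
Before deploying a model on a server, first export the model saved during training to the deployment format. For the detailed steps, see [Exporting a Model for Deployment](/deploy/export/README.md).
## Calling the Prediction Interface
* **Basic usage**
The following example calls the PaddleRS Python prediction interface. First construct a `Predictor` object, then call its `predict()` method to run prediction.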
```python
import paddlers as pdrs
# Pass the directory containing the exported model to the Predictor constructor
predictor = pdrs.deploy.Predictor('./inference_model')
# The img_file argument specifies the path of the input image
result = predictor.predict(img_file='test.jpg')
```
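* **Measuring prediction speed during prediction**
After a model is loaded, prediction on the first few images is slow because the program must initialize host memory, GPU memory, and so on at startup. Typically the prediction speed stabilizes after 20-30 images have been processed. Based on this observation, **to evaluate prediction speed, warm up the model by specifying the number of warm-up iterations `warmup_iters`**. In addition, **to obtain a more accurate estimate, specify `repeats` so that prediction is repeated that many times and the average time is computed**.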
```python
import paddlers as pdrs
predictor = pdrs.deploy.Predictor('./inference_model')
result = predictor.predict(img_file='test.jpg',
warmup_iters=100,
repeats=100)
```
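The effect of `warmup_iters` and `repeats` can be illustrated without PaddleRS. The following is a minimal sketch, not the `Predictor` implementation: `time_with_warmup` and `fake_predict` are hypothetical names, and the stand-in function simply sleeps longer on its first calls to mimic initialization cost.

```python
import time


def time_with_warmup(fn, warmup_iters=0, repeats=1):
    """Run fn warmup_iters times untimed, then average the time of `repeats` timed runs."""
    if repeats < 1:
        raise ValueError("`repeats` must be at least 1.")
    for _ in range(warmup_iters):
        fn()  # warm-up runs are discarded
    start = time.perf_counter()
    for _ in range(repeats):
        result = fn()
    elapsed = time.perf_counter() - start
    return result, elapsed / repeats


# Hypothetical stand-in for a predictor: the first three calls are slower.
calls = {"n": 0}


def fake_predict():
    calls["n"] += 1
    time.sleep(0.02 if calls["n"] <= 3 else 0.001)
    return calls["n"]


result, avg_seconds = time_with_warmup(fake_predict, warmup_iters=3, repeats=5)
print(f"average time per timed run: {avg_seconds:.4f} s")
```

Because the three slow calls happen during warm-up, they do not contribute to the reported average.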

@ -0,0 +1,62 @@
# Exporting a Model for Deployment
## Contents
* [Model Formats](#1)
* [Training Model Format](#11)
* [Deployment Model Format](#12)
* [Exporting the Deployment Model](#2)
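## <h2 id="1">Model Formats</h2>
### <h3 id="11">Training Model Format</h3>
When a model is trained with PaddleRS, the output directory mainly contains four files:
- `model.pdopt`: the state of the optimizer used during training;
- `model.pdparams`: the model weights;
- `model.yml`: the model configuration file (preprocessing parameters, model specification parameters, and so on);
- `eval_details.json`: the metrics the model achieved during validation.
Note that the training stage uses the dynamic-graph version of the model, so using the weights and configuration file in the above format directly for deployment is often inefficient. This project recommends exporting the model to a dedicated deployment format and using the static-graph version of the model at deployment time to achieve higher inference efficiency.
### <h3 id="12">Deployment Model Format</h3>
Before deploying a model on a server, the model saved during training must be exported to a dedicated format. Specifically, at deployment time a trained model is described by the following five files:
- `model.pdmodel`: the network structure of the model;
- `model.pdiparams`: the model weights;
- `model.pdiparams.info`: the names of the model weights;
- `model.yml`: the model configuration file (preprocessing parameters, model specification parameters, and so on);
- `pipeline.yml`: the pipeline configuration file.
## <h2 id="2">Exporting the Deployment Model</h2>
Use the following command to export a model in the deployment format: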
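As a quick sanity check, an exported directory can be tested for the five files listed above. This is a minimal sketch using only the standard library; `missing_deploy_files` is a hypothetical helper and not part of PaddleRS.

```python
import os
import tempfile

# File names listed above for the deployment format.
DEPLOY_FILES = (
    "model.pdmodel",
    "model.pdiparams",
    "model.pdiparams.info",
    "model.yml",
    "pipeline.yml",
)


def missing_deploy_files(model_dir):
    """Return the expected deployment files that are absent from model_dir."""
    return [name for name in DEPLOY_FILES
            if not os.path.isfile(os.path.join(model_dir, name))]


# Demonstration on a temporary directory holding only two of the five files.
with tempfile.TemporaryDirectory() as tmp_dir:
    for name in ("model.pdmodel", "model.yml"):
        open(os.path.join(tmp_dir, name), "w").close()
    missing = missing_deploy_files(tmp_dir)
    print(missing)
```

An empty list means that all five files are present.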
```commandline
python deploy/export/export_model.py --model_dir=./output/deeplabv3p/best_model/ --save_dir=./inference_model/
```
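Here, the `--model_dir` and `--save_dir` options specify the directories that hold the training-format model and the deployment-format model, respectively. In the example above, the five files `model.pdmodel`, `model.pdiparams`, `model.pdiparams.info`, `model.yml`, and `pipeline.yml` are generated in the `./inference_model/` directory.
The `deploy/export/export_model.py` script accepts three command-line options:
| Option | Description |
| ---- | ---- |
| --model_dir | Path of the training-format model to export, e.g. `./output/deeplabv3p/best_model/`. |
| --save_dir | Path where the exported deployment-format model is saved, e.g. `./inference_model/`. |
| --fixed_input_shape | Fixes the input tensor shape of the exported model. The default is None, which means the default input tensor shape of the task is used. |
When TensorRT is used for inference, the input tensor shape of the model must be fixed. In that case, specify the input shape with the `--fixed_input_shape` option in one of two forms: `[w,h]` or `[n,c,w,h]`. For example, with `--fixed_input_shape` set to `[224,224]`, the actual input tensor shape can be regarded as `[-1,3,224,224]` (-1 means any positive integer, and the number of channels defaults to 3). To also fix the batch dimension to 1 and the number of channels to 4, set the option to `[1,4,224,224]`.
Complete command example: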
```commandline
python deploy/export/export_model.py --model_dir=./output/deeplabv3p/best_model/ --save_dir=./inference_model/ --fixed_input_shape=[224,224]
```
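The two accepted forms of `--fixed_input_shape` can be sketched as follows. This follows the description above, not PaddleRS's internal handling; `expand_fixed_input_shape` is a hypothetical helper name.

```python
from ast import literal_eval


def expand_fixed_input_shape(text, default_channels=3):
    """Turn a '[w,h]' or '[n,c,w,h]' string into a 4-element shape list."""
    shape = literal_eval(text)
    if not isinstance(shape, list) or len(shape) not in (2, 4):
        raise ValueError("expected [w,h] or [n,c,w,h]")
    if len(shape) == 2:
        # Batch size is left free (-1); channels fall back to the default of 3.
        shape = [-1, default_channels] + shape
    return shape


print(expand_fixed_input_shape("[224,224]"))      # [-1, 3, 224, 224]
print(expand_fixed_input_shape("[1,4,224,224]"))  # [1, 4, 224, 224]
```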
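For the `--fixed_input_shape` option, **please note**:
- If the input shape of a classification model must be fixed at inference time, keep it identical to the input shape used during training.
- For YOLO/PPYOLO-series detection models, make sure `w` and `h` of the input image are equal and both multiples of 32. When `--fixed_input_shape` is specified, `w` and `h` of R-CNN models must also be multiples of 32.
- When specifying `[w,h]`, separate `w` and `h` with an ASCII comma (`,`). No spaces or other characters are allowed between them.
- The larger `w` and `h` are, the more time and host/GPU memory the model consumes during inference. However, if `w` and `h` are too small, model accuracy may suffer considerably.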
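The YOLO/PPYOLO constraint from the notes above (equal `w` and `h`, both multiples of 32) can be checked before export. This is a minimal sketch under that stated rule; `check_yolo_input_size` is a hypothetical helper and is not provided by PaddleRS.

```python
def check_yolo_input_size(w, h, stride=32):
    """Return a list of problems with (w, h) under the YOLO/PPYOLO rule described above."""
    problems = []
    if w != h:
        problems.append("w and h must be equal")
    for name, value in (("w", w), ("h", h)):
        if value <= 0 or value % stride != 0:
            problems.append(f"{name}={value} is not a positive multiple of {stride}")
    return problems


print(check_yolo_input_size(608, 608))  # no problems reported
print(check_yolo_input_size(600, 608))  # unequal sides, and 600 is not a multiple of 32
```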

@ -0,0 +1,59 @@
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import argparse
from ast import literal_eval

from paddlers.tasks import load_model


def get_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument('--model_dir', '-m', type=str, default=None, help='model directory path')
    parser.add_argument('--save_dir', '-s', type=str, default=None, help='path to save inference model')
    parser.add_argument('--fixed_input_shape', '-fs', type=str, default=None,
                        help="export inference model with fixed input shape: [w,h] or [n,c,w,h]")
    return parser


if __name__ == '__main__':
    parser = get_parser()
    args = parser.parse_args()

    # Get input shape
    fixed_input_shape = None
    if args.fixed_input_shape is not None:
        # Try to interpret the string as a list.
        fixed_input_shape = literal_eval(args.fixed_input_shape)
        # Check validity
        if not isinstance(fixed_input_shape, list):
            raise ValueError("fixed_input_shape should be of None or list type.")
        if len(fixed_input_shape) not in (2, 4):
            raise ValueError("fixed_input_shape contains an incorrect number of elements.")
        if fixed_input_shape[-1] <= 0 or fixed_input_shape[-2] <= 0:
            raise ValueError("the input width and height must be positive integers.")
        if len(fixed_input_shape) == 4 and fixed_input_shape[1] <= 0:
            raise ValueError("the number of input channels must be a positive integer.")

    # Set environment variables
    os.environ['PADDLEX_EXPORT_STAGE'] = 'True'
    os.environ['PADDLESEG_EXPORT_STAGE'] = 'True'

    # Load model from directory
    model = load_model(args.model_dir)

    # Do dynamic-to-static cast
    # XXX: Invoke a protected (single underscore) method outside of subclasses.
    model._export_inference_model(args.save_dir, fixed_input_shape)

@ -21,4 +21,4 @@ env_info = get_environ_info()
log_level = 2 log_level = 2
from . import tasks, datasets, transforms, utils, tools, models from . import tasks, datasets, transforms, utils, tools, models, deploy

@ -71,7 +71,7 @@ class BIT(nn.Layer):
dec_depth=8, dec_depth=8,
dec_head_dim=8, dec_head_dim=8,
**backbone_kwargs): **backbone_kwargs):
super().__init__() super(BIT, self).__init__()
# TODO: reduce hard-coded parameters # TODO: reduce hard-coded parameters
DIM = 32 DIM = 32
@ -197,7 +197,7 @@ class BIT(nn.Layer):
class Residual(nn.Layer): class Residual(nn.Layer):
def __init__(self, fn): def __init__(self, fn):
super().__init__() super(Residual, self).__init__()
self.fn = fn self.fn = fn
def forward(self, x, **kwargs): def forward(self, x, **kwargs):
@ -206,7 +206,7 @@ class Residual(nn.Layer):
class Residual2(nn.Layer): class Residual2(nn.Layer):
def __init__(self, fn): def __init__(self, fn):
super().__init__() super(Residual2, self).__init__()
self.fn = fn self.fn = fn
def forward(self, x1, x2, **kwargs): def forward(self, x1, x2, **kwargs):
@ -215,7 +215,7 @@ class Residual2(nn.Layer):
class PreNorm(nn.Layer): class PreNorm(nn.Layer):
def __init__(self, dim, fn): def __init__(self, dim, fn):
super().__init__() super(PreNorm, self).__init__()
self.norm = nn.LayerNorm(dim) self.norm = nn.LayerNorm(dim)
self.fn = fn self.fn = fn
@ -225,7 +225,7 @@ class PreNorm(nn.Layer):
class PreNorm2(nn.Layer): class PreNorm2(nn.Layer):
def __init__(self, dim, fn): def __init__(self, dim, fn):
super().__init__() super(PreNorm2, self).__init__()
self.norm = nn.LayerNorm(dim) self.norm = nn.LayerNorm(dim)
self.fn = fn self.fn = fn
@ -235,7 +235,7 @@ class PreNorm2(nn.Layer):
class FeedForward(nn.Sequential): class FeedForward(nn.Sequential):
def __init__(self, dim, hidden_dim, dropout_rate=0.): def __init__(self, dim, hidden_dim, dropout_rate=0.):
super().__init__( super(FeedForward, self).__init__(
nn.Linear(dim, hidden_dim), nn.Linear(dim, hidden_dim),
nn.GELU(), nn.GELU(),
nn.Dropout(dropout_rate), nn.Dropout(dropout_rate),
@ -249,7 +249,7 @@ class CrossAttention(nn.Layer):
head_dim=64, head_dim=64,
dropout_rate=0., dropout_rate=0.,
apply_softmax=True): apply_softmax=True):
super().__init__() super(CrossAttention, self).__init__()
inner_dim = head_dim * n_heads inner_dim = head_dim * n_heads
self.n_heads = n_heads self.n_heads = n_heads
@ -288,12 +288,12 @@ class CrossAttention(nn.Layer):
class SelfAttention(CrossAttention): class SelfAttention(CrossAttention):
def forward(self, x): def forward(self, x):
return super().forward(x, x) return super(SelfAttention, self).forward(x, x)
class TransformerEncoder(nn.Layer): class TransformerEncoder(nn.Layer):
def __init__(self, dim, depth, n_heads, head_dim, mlp_dim, dropout_rate): def __init__(self, dim, depth, n_heads, head_dim, mlp_dim, dropout_rate):
super().__init__() super(TransformerEncoder, self).__init__()
self.layers = nn.LayerList([]) self.layers = nn.LayerList([])
for _ in range(depth): for _ in range(depth):
self.layers.append( self.layers.append(
@ -322,7 +322,7 @@ class TransformerDecoder(nn.Layer):
mlp_dim, mlp_dim,
dropout_rate, dropout_rate,
apply_softmax=True): apply_softmax=True):
super().__init__() super(TransformerDecoder, self).__init__()
self.layers = nn.LayerList([]) self.layers = nn.LayerList([])
for _ in range(depth): for _ in range(depth):
self.layers.append( self.layers.append(
@ -349,7 +349,7 @@ class Backbone(nn.Layer, KaimingInitMixin):
arch='resnet18', arch='resnet18',
pretrained=True, pretrained=True,
n_stages=5): n_stages=5):
super().__init__() super(Backbone, self).__init__()
expand = 1 expand = 1
strides = (2, 1, 2, 1, 1) strides = (2, 1, 2, 1, 1)

@ -28,7 +28,7 @@ class _ChangeStarBase(nn.Layer):
def __init__(self, seg_model, num_classes, mid_channels, inner_channels, def __init__(self, seg_model, num_classes, mid_channels, inner_channels,
num_convs, scale_factor): num_convs, scale_factor):
super().__init__() super(_ChangeStarBase, self).__init__(_ChangeStarBase, self)
self.extract = seg_model self.extract = seg_model
self.detect = ChangeMixin( self.detect = ChangeMixin(
@ -63,7 +63,7 @@ class _ChangeStarBase(nn.Layer):
class ChangeMixin(nn.Layer): class ChangeMixin(nn.Layer):
def __init__(self, in_ch, out_ch, mid_ch, num_convs, scale_factor): def __init__(self, in_ch, out_ch, mid_ch, num_convs, scale_factor):
super().__init__() super(ChangeMixin, self).__init__(ChangeMixin, self)
convs = [Conv3x3(in_ch, mid_ch, norm=True, act=True)] convs = [Conv3x3(in_ch, mid_ch, norm=True, act=True)]
convs += [ convs += [
Conv3x3( Conv3x3(
@ -112,7 +112,7 @@ class ChangeStar_FarSeg(_ChangeStarBase):
# TODO: Configurable FarSeg model # TODO: Configurable FarSeg model
class _FarSegWrapper(nn.Layer): class _FarSegWrapper(nn.Layer):
def __init__(self, seg_model): def __init__(self, seg_model):
super().__init__() super(_FarSegWrapper, self).__init__()
self._seg_model = seg_model self._seg_model = seg_model
self._seg_model.cls_pred_conv = Identity() self._seg_model.cls_pred_conv = Identity()
@ -131,7 +131,7 @@ class ChangeStar_FarSeg(_ChangeStarBase):
seg_model = FarSeg(out_ch=mid_channels) seg_model = FarSeg(out_ch=mid_channels)
super().__init__( super(ChangeStar_FarSeg, self).__init__(
seg_model=_FarSegWrapper(seg_model), seg_model=_FarSegWrapper(seg_model),
num_classes=num_classes, num_classes=num_classes,
mid_channels=mid_channels, mid_channels=mid_channels,

@ -41,7 +41,7 @@ class DSAMNet(nn.Layer):
""" """
def __init__(self, in_channels, num_classes, ca_ratio=8, sa_kernel=7): def __init__(self, in_channels, num_classes, ca_ratio=8, sa_kernel=7):
super().__init__() super(DSAMNet, self).__init__()
WIDTH = 64 WIDTH = 64
@ -90,7 +90,7 @@ class DSAMNet(nn.Layer):
class DSLayer(nn.Sequential): class DSLayer(nn.Sequential):
def __init__(self, in_ch, out_ch, itm_ch, **convd_kwargs): def __init__(self, in_ch, out_ch, itm_ch, **convd_kwargs):
super().__init__( super(DSLayer, self).__init__(
nn.Conv2DTranspose( nn.Conv2DTranspose(
in_ch, itm_ch, kernel_size=3, padding=1, **convd_kwargs), in_ch, itm_ch, kernel_size=3, padding=1, **convd_kwargs),
make_norm(itm_ch), make_norm(itm_ch),

@ -41,7 +41,7 @@ class DSIFN(nn.Layer):
""" """
def __init__(self, num_classes, use_dropout=False): def __init__(self, num_classes, use_dropout=False):
super().__init__() super(DSIFN, self).__init__()
self.encoder1 = self.encoder2 = VGG16FeaturePicker() self.encoder1 = self.encoder2 = VGG16FeaturePicker()
@ -191,7 +191,7 @@ class DSIFN(nn.Layer):
class VGG16FeaturePicker(nn.Layer): class VGG16FeaturePicker(nn.Layer):
def __init__(self, indices=(3, 8, 15, 22, 29)): def __init__(self, indices=(3, 8, 15, 22, 29)):
super().__init__() super(VGG16FeaturePicker, self).__init__()
features = list(vgg16(pretrained=True).features)[:30] features = list(vgg16(pretrained=True).features)[:30]
self.features = nn.LayerList(features) self.features = nn.LayerList(features)
self.features.eval() self.features.eval()

@ -33,7 +33,7 @@ class ChannelAttention(nn.Layer):
""" """
def __init__(self, in_ch, ratio=8): def __init__(self, in_ch, ratio=8):
super().__init__() super(ChannelAttention, self).__init__()
self.avg_pool = nn.AdaptiveAvgPool2D(1) self.avg_pool = nn.AdaptiveAvgPool2D(1)
self.max_pool = nn.AdaptiveMaxPool2D(1) self.max_pool = nn.AdaptiveMaxPool2D(1)
self.fc1 = Conv1x1(in_ch, in_ch // ratio, bias=False, act=True) self.fc1 = Conv1x1(in_ch, in_ch // ratio, bias=False, act=True)
@ -59,7 +59,7 @@ class SpatialAttention(nn.Layer):
""" """
def __init__(self, kernel_size=7): def __init__(self, kernel_size=7):
super().__init__() super(SpatialAttention, self).__init__()
self.conv = BasicConv(2, 1, kernel_size, bias=False) self.conv = BasicConv(2, 1, kernel_size, bias=False)
def forward(self, x): def forward(self, x):
@ -85,7 +85,7 @@ class CBAM(nn.Layer):
""" """
def __init__(self, in_ch, ratio=8, kernel_size=7): def __init__(self, in_ch, ratio=8, kernel_size=7):
super().__init__() super(CBAM, self).__init__()
self.ca = ChannelAttention(in_ch, ratio=ratio) self.ca = ChannelAttention(in_ch, ratio=ratio)
self.sa = SpatialAttention(kernel_size=kernel_size) self.sa = SpatialAttention(kernel_size=kernel_size)

@ -51,7 +51,7 @@ class BasicConv(nn.Layer):
norm=False, norm=False,
act=False, act=False,
**kwargs): **kwargs):
super().__init__() super(BasicConv, self).__init__()
seq = [] seq = []
if kernel_size >= 2: if kernel_size >= 2:
seq.append(nn.Pad2D(kernel_size // 2, mode=pad_mode)) seq.append(nn.Pad2D(kernel_size // 2, mode=pad_mode))
@ -87,7 +87,7 @@ class Conv1x1(BasicConv):
norm=False, norm=False,
act=False, act=False,
**kwargs): **kwargs):
super().__init__( super(Conv1x1, self).__init__(
in_ch, in_ch,
out_ch, out_ch,
1, 1,
@ -107,7 +107,7 @@ class Conv3x3(BasicConv):
norm=False, norm=False,
act=False, act=False,
**kwargs): **kwargs):
super().__init__( super(Conv3x3, self).__init__(
in_ch, in_ch,
out_ch, out_ch,
3, 3,
@ -127,7 +127,7 @@ class Conv7x7(BasicConv):
norm=False, norm=False,
act=False, act=False,
**kwargs): **kwargs):
super().__init__( super(Conv7x7, self).__init__(
in_ch, in_ch,
out_ch, out_ch,
7, 7,
@ -140,12 +140,12 @@ class Conv7x7(BasicConv):
class MaxPool2x2(nn.MaxPool2D): class MaxPool2x2(nn.MaxPool2D):
def __init__(self, **kwargs): def __init__(self, **kwargs):
super().__init__(kernel_size=2, stride=(2, 2), padding=(0, 0), **kwargs) super(MaxPool2x2, self).__init__(kernel_size=2, stride=(2, 2), padding=(0, 0), **kwargs)
class MaxUnPool2x2(nn.MaxUnPool2D): class MaxUnPool2x2(nn.MaxUnPool2D):
def __init__(self, **kwargs): def __init__(self, **kwargs):
super().__init__(kernel_size=2, stride=(2, 2), padding=(0, 0), **kwargs) super(MaxUnPool2x2, self).__init__(kernel_size=2, stride=(2, 2), padding=(0, 0), **kwargs)
class ConvTransposed3x3(nn.Layer): class ConvTransposed3x3(nn.Layer):
@ -156,7 +156,7 @@ class ConvTransposed3x3(nn.Layer):
norm=False, norm=False,
act=False, act=False,
**kwargs): **kwargs):
super().__init__() super(ConvTransposed3x3, self).__init__()
seq = [] seq = []
seq.append( seq.append(
nn.Conv2DTranspose( nn.Conv2DTranspose(
@ -185,7 +185,7 @@ class Identity(nn.Layer):
"""A placeholder identity operator that accepts exactly one argument.""" """A placeholder identity operator that accepts exactly one argument."""
def __init__(self, *args, **kwargs): def __init__(self, *args, **kwargs):
super().__init__() super(Identity, self).__init__()
def forward(self, x): def forward(self, x):
return x return x

@ -39,7 +39,7 @@ class SNUNet(nn.Layer, KaimingInitMixin):
""" """
def __init__(self, in_channels, num_classes, width=32): def __init__(self, in_channels, num_classes, width=32):
super().__init__() super(SNUNet, self).__init__()
filters = (width, width * 2, width * 4, width * 8, width * 16) filters = (width, width * 2, width * 4, width * 8, width * 16)
@ -142,7 +142,7 @@ class SNUNet(nn.Layer, KaimingInitMixin):
class ConvBlockNested(nn.Layer): class ConvBlockNested(nn.Layer):
def __init__(self, in_ch, out_ch, mid_ch): def __init__(self, in_ch, out_ch, mid_ch):
super().__init__() super(ConvBlockNested, self).__init__()
self.act = nn.ReLU() self.act = nn.ReLU()
self.conv1 = nn.Conv2D(in_ch, mid_ch, kernel_size=3, padding=1) self.conv1 = nn.Conv2D(in_ch, mid_ch, kernel_size=3, padding=1)
self.bn1 = make_norm(mid_ch) self.bn1 = make_norm(mid_ch)
@ -163,7 +163,7 @@ class ConvBlockNested(nn.Layer):
class Up(nn.Layer): class Up(nn.Layer):
def __init__(self, in_ch, use_conv=False): def __init__(self, in_ch, use_conv=False):
super().__init__() super(Up, self).__init__()
if use_conv: if use_conv:
self.up = nn.Conv2DTranspose(in_ch, in_ch, 2, stride=2) self.up = nn.Conv2DTranspose(in_ch, in_ch, 2, stride=2)
else: else:

@ -46,7 +46,7 @@ class STANet(nn.Layer):
""" """
def __init__(self, in_channels, num_classes, att_type='BAM', ds_factor=1): def __init__(self, in_channels, num_classes, att_type='BAM', ds_factor=1):
super().__init__() super(STANet, self).__init__()
WIDTH = 64 WIDTH = 64
@ -94,7 +94,7 @@ def build_sta_module(in_ch, att_type, ds):
class Backbone(nn.Layer, KaimingInitMixin): class Backbone(nn.Layer, KaimingInitMixin):
def __init__(self, in_ch, arch, pretrained=True, strides=(2, 1, 2, 2, 2)): def __init__(self, in_ch, arch, pretrained=True, strides=(2, 1, 2, 2, 2)):
super().__init__() super(Backbone, self).__init__()
if arch == 'resnet18': if arch == 'resnet18':
self.resnet = resnet.resnet18( self.resnet = resnet.resnet18(
@ -148,7 +148,7 @@ class Backbone(nn.Layer, KaimingInitMixin):
class Decoder(nn.Layer, KaimingInitMixin): class Decoder(nn.Layer, KaimingInitMixin):
def __init__(self, f_ch): def __init__(self, f_ch):
super().__init__() super(Decoder, self).__init__()
self.dr1 = Conv1x1(64, 96, norm=True, act=True) self.dr1 = Conv1x1(64, 96, norm=True, act=True)
self.dr2 = Conv1x1(128, 96, norm=True, act=True) self.dr2 = Conv1x1(128, 96, norm=True, act=True)
self.dr3 = Conv1x1(256, 96, norm=True, act=True) self.dr3 = Conv1x1(256, 96, norm=True, act=True)
@ -183,7 +183,7 @@ class Decoder(nn.Layer, KaimingInitMixin):
class BAM(nn.Layer): class BAM(nn.Layer):
def __init__(self, in_ch, ds): def __init__(self, in_ch, ds):
super().__init__() super(BAM, self).__init__()
self.ds = ds self.ds = ds
self.pool = nn.AvgPool2D(self.ds) self.pool = nn.AvgPool2D(self.ds)
@ -220,7 +220,7 @@ class BAM(nn.Layer):
class PAMBlock(nn.Layer): class PAMBlock(nn.Layer):
def __init__(self, in_ch, scale=1, ds=1): def __init__(self, in_ch, scale=1, ds=1):
super().__init__() super(PAMBlock, self).__init__()
self.scale = scale self.scale = scale
self.ds = ds self.ds = ds
@ -280,7 +280,7 @@ class PAMBlock(nn.Layer):
class PAM(nn.Layer): class PAM(nn.Layer):
def __init__(self, in_ch, ds, scales=(1, 2, 4, 8)): def __init__(self, in_ch, ds, scales=(1, 2, 4, 8)):
super().__init__() super(PAM, self).__init__()
self.stages = nn.LayerList( self.stages = nn.LayerList(
[PAMBlock( [PAMBlock(
@ -296,7 +296,7 @@ class PAM(nn.Layer):
class Attention(nn.Layer): class Attention(nn.Layer):
def __init__(self, att): def __init__(self, att):
super().__init__() super(Attention, self).__init__()
self.att = att self.att = att
def forward(self, x1, x2): def forward(self, x1, x2):

@ -0,0 +1 @@
from .predictor import Predictor

@ -0,0 +1,283 @@
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os.path as osp
import numpy as np
from paddle.inference import Config
from paddle.inference import create_predictor
from paddle.inference import PrecisionType
from paddlers.tasks import load_model
from paddlers.utils import logging, Timer
class Predictor(object):
    def __init__(self,
                 model_dir,
                 use_gpu=False,
                 gpu_id=0,
                 cpu_thread_num=1,
                 use_mkl=True,
                 mkl_thread_num=4,
                 use_trt=False,
                 use_glog=False,
                 memory_optimize=True,
                 max_trt_batch_size=1,
                 trt_precision_mode='float32'):
        """
        Create a Paddle predictor.

        Args:
            model_dir: Path to the model. It must be an exported deployment or quantized model.
            use_gpu: Whether to use a GPU. Defaults to False.
            gpu_id: ID of the GPU to use. Defaults to 0.
            cpu_thread_num: Number of threads used for prediction on the CPU. Defaults to 1.
            use_mkl: Whether to use the MKL-DNN library (CPU only). Defaults to True.
            mkl_thread_num: Number of MKL-DNN threads. Defaults to 4.
            use_trt: Whether to use TensorRT. Defaults to False.
            use_glog: Whether to enable glog logging. Defaults to False.
            memory_optimize: Whether to enable memory optimization. Defaults to True.
            max_trt_batch_size: Maximum batch size configured when TensorRT is used. Defaults to 1.
            trt_precision_mode: Precision used with TensorRT, either 'float32' or 'float16'.
                Defaults to 'float32'.
        """
        self.model_dir = model_dir
        self._model = load_model(model_dir, with_net=False)

        if trt_precision_mode.lower() == 'float32':
            trt_precision_mode = PrecisionType.Float32
        elif trt_precision_mode.lower() == 'float16':
            trt_precision_mode = PrecisionType.Float16
        else:
            logging.error(
                "TensorRT precision mode {} is invalid. Supported modes are float32 and float16."
                .format(trt_precision_mode),
                exit=True)

        self.predictor = self.create_predictor(
            use_gpu=use_gpu,
            gpu_id=gpu_id,
            cpu_thread_num=cpu_thread_num,
            use_mkl=use_mkl,
            mkl_thread_num=mkl_thread_num,
            use_trt=use_trt,
            use_glog=use_glog,
            memory_optimize=memory_optimize,
            max_trt_batch_size=max_trt_batch_size,
            trt_precision_mode=trt_precision_mode)
        self.timer = Timer()

    def create_predictor(self,
                         use_gpu=True,
                         gpu_id=0,
                         cpu_thread_num=1,
                         use_mkl=True,
                         mkl_thread_num=4,
                         use_trt=False,
                         use_glog=False,
                         memory_optimize=True,
                         max_trt_batch_size=1,
                         trt_precision_mode=PrecisionType.Float32):
        config = Config(
            osp.join(self.model_dir, 'model.pdmodel'),
            osp.join(self.model_dir, 'model.pdiparams'))

        if use_gpu:
            # Set the initial GPU memory (in MB) and the device ID
            config.enable_use_gpu(200, gpu_id)
            config.switch_ir_optim(True)
            if use_trt:
                if self._model.model_type == 'segmenter':
                    logging.warning(
                        "Semantic segmentation models do not support TensorRT acceleration, "
                        "TensorRT is forcibly disabled.")
                elif 'RCNN' in self._model.__class__.__name__:
                    logging.warning(
                        "RCNN models do not support TensorRT acceleration, "
                        "TensorRT is forcibly disabled.")
                else:
                    config.enable_tensorrt_engine(
                        workspace_size=1 << 10,
                        max_batch_size=max_trt_batch_size,
                        min_subgraph_size=3,
                        precision_mode=trt_precision_mode,
                        use_static=False,
                        use_calib_mode=False)
        else:
            config.disable_gpu()
            config.set_cpu_math_library_num_threads(cpu_thread_num)
            if use_mkl:
                if self._model.__class__.__name__ == 'MaskRCNN':
                    logging.warning(
                        "MaskRCNN does not support MKL-DNN, MKL-DNN is forcibly disabled"
                    )
                else:
                    try:
                        # cache 10 different shapes for mkldnn to avoid memory leak
                        config.set_mkldnn_cache_capacity(10)
                        config.enable_mkldnn()
                        config.set_cpu_math_library_num_threads(mkl_thread_num)
                    except Exception as e:
                        logging.warning(
                            "The current environment does not support MKL-DNN, MKL-DNN is disabled."
                        )
                        pass

        if not use_glog:
            config.disable_glog_info()
        if memory_optimize:
            config.enable_memory_optim()
        config.switch_use_feed_fetch_ops(False)
        predictor = create_predictor(config)
        return predictor

    def preprocess(self, images, transforms):
        preprocessed_samples = self._model._preprocess(
            images, transforms, to_tensor=False)
        if self._model.model_type == 'classifier':
            preprocessed_samples = {'image': preprocessed_samples[0]}
        elif self._model.model_type == 'segmenter':
            preprocessed_samples = {
                'image': preprocessed_samples[0],
                'ori_shape': preprocessed_samples[1]
            }
        elif self._model.model_type == 'detector':
            pass
        elif self._model.model_type == 'changedetector':
            preprocessed_samples = {
                'image': preprocessed_samples[0],
                'image2': preprocessed_samples[1],
                'ori_shape': preprocessed_samples[2]
            }
        else:
            logging.error(
                "Invalid model type {}".format(self._model.model_type),
                exit=True)
        return preprocessed_samples

    def postprocess(self, net_outputs, topk=1, ori_shape=None, transforms=None):
        if self._model.model_type == 'classifier':
            true_topk = min(self._model.num_classes, topk)
            preds = self._model._postprocess(net_outputs[0], true_topk)
        elif self._model.model_type in ('segmenter', 'changedetector'):
            label_map, score_map = self._model._postprocess(
                net_outputs,
                batch_origin_shape=ori_shape,
                transforms=transforms.transforms)
            preds = [{
                'label_map': l,
                'score_map': s
            } for l, s in zip(label_map, score_map)]
        elif self._model.model_type == 'detector':
            net_outputs = {
                k: v
                for k, v in zip(['bbox', 'bbox_num', 'mask'], net_outputs)
            }
            preds = self._model._postprocess(net_outputs)
        else:
            logging.error(
                "Invalid model type {}.".format(self._model.model_type),
                exit=True)
        return preds

    def raw_predict(self, inputs):
        """ Run prediction on preprocessed data.

        Args:
            inputs(dict): Preprocessed data.
        """
        input_names = self.predictor.get_input_names()
        for name in input_names:
            input_tensor = self.predictor.get_input_handle(name)
            input_tensor.copy_from_cpu(inputs[name])

        self.predictor.run()
        output_names = self.predictor.get_output_names()
        net_outputs = list()
        for name in output_names:
            output_tensor = self.predictor.get_output_handle(name)
            net_outputs.append(output_tensor.copy_to_cpu())
        return net_outputs

    def _run(self, images, topk=1, transforms=None):
        self.timer.preprocess_time_s.start()
        preprocessed_input = self.preprocess(images, transforms)
        self.timer.preprocess_time_s.end(iter_num=len(images))

        self.timer.inference_time_s.start()
        net_outputs = self.raw_predict(preprocessed_input)
        self.timer.inference_time_s.end(iter_num=1)

        self.timer.postprocess_time_s.start()
        results = self.postprocess(
            net_outputs,
            topk,
            ori_shape=preprocessed_input.get('ori_shape', None),
            transforms=transforms)
        self.timer.postprocess_time_s.end(iter_num=len(images))
        return results

    def predict(self,
                img_file,
                topk=1,
                transforms=None,
                warmup_iters=0,
                repeats=1):
        """ Run prediction on images.

        Args:
            img_file(List[np.ndarray or str], str or np.ndarray):
                For scene classification, image restoration, object detection, and semantic
                segmentation tasks, this argument can be a single image path, a decoded image
                in H, W, C layout with float32 dtype in BGR order (as a NumPy ndarray), or a
                list of image paths or np.ndarray objects. For change detection tasks, it can
                be a 2-tuple of image paths giving the images of the two temporal phases, a
                2-tuple of decoded images, or a list of either kind of 2-tuple.
            topk(int): Used by scene classification models. The top-k predictions are
                returned. Defaults to 1.
            transforms (paddlex.transforms): Data preprocessing operations. Defaults to None,
                in which case the preprocessing operations saved in `model.yml` are used.
            warmup_iters (int): Number of warm-up iterations, used when evaluating the speed
                of inference and of pre- and postprocessing. If greater than 1, prediction is
                first repeated warmup_iters times before the actual prediction and its timing
                start. Defaults to 0.
            repeats (int): Number of repetitions, used when evaluating the speed of inference
                and of pre- and postprocessing. If greater than 1, prediction is run repeats
                times and the average time is reported. Defaults to 1.
        """
        if repeats < 1:
            logging.error("`repeats` must be greater than or equal to 1.", exit=True)
        if transforms is None and not hasattr(self._model, 'test_transforms'):
            raise Exception("Transforms need to be defined, now is None.")
        if transforms is None:
            transforms = self._model.test_transforms
        if isinstance(img_file, tuple) and len(img_file) != 2:
            raise ValueError(
                f"A change detection model accepts exactly two input images, but there are {len(img_file)}."
            )
        if isinstance(img_file, (str, np.ndarray, tuple)):
            images = [img_file]
        else:
            images = img_file

        for _ in range(warmup_iters):
            self._run(images=images, topk=topk, transforms=transforms)
        self.timer.reset()

        for _ in range(repeats):
            results = self._run(images=images, topk=topk, transforms=transforms)

        self.timer.repeats = repeats
        self.timer.img_num = len(images)
        self.timer.info(average=True)

        if isinstance(img_file, (str, np.ndarray)):
            results = results[0]

        return results

    def batch_predict(self, image_list, **params):
        return self.predict(img_file=image_list, **params)
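
The way `predict()` accepts either a single input, a 2-tuple for change detection, or a list can be illustrated in isolation. The following is a simplified sketch, not the method itself: `normalize_inputs` is a hypothetical name, and the `np.ndarray` case handled by `predict()` is omitted so that the example needs only the standard library.

```python
def normalize_inputs(img_file):
    """Wrap a single input in a list and validate 2-tuples used for change detection."""
    if isinstance(img_file, tuple) and len(img_file) != 2:
        raise ValueError(f"expected exactly two images, got {len(img_file)}")
    if isinstance(img_file, (str, tuple)):
        # A single path or a single (t1, t2) pair becomes a batch of size one.
        return [img_file]
    # Anything else is assumed to already be a sequence of inputs.
    return list(img_file)


print(normalize_inputs("test.jpg"))
print(normalize_inputs(("t1.png", "t2.png")))
print(normalize_inputs(["a.jpg", "b.jpg"]))
```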

@ -33,7 +33,7 @@ from paddlers.utils import (seconds_to_hms, get_single_card_bs, dict2str,
_get_shared_memory_size_in_M, EarlyStop) _get_shared_memory_size_in_M, EarlyStop)
import paddlers.utils.logging as logging import paddlers.utils.logging as logging
from .slim.prune import _pruner_eval_fn, _pruner_template_input, sensitive_prune from .slim.prune import _pruner_eval_fn, _pruner_template_input, sensitive_prune
from .utils.infer_nets import InferNet from .utils.infer_nets import InferNet, InferCDNet
class BaseModel: class BaseModel:
@ -580,13 +580,16 @@ class BaseModel:
return pipeline_info return pipeline_info
def _build_inference_net(self): def _build_inference_net(self):
infer_net = self.net if self.model_type == 'detector' else InferNet( if self.model_type == 'detector':
self.net, self.model_type) infer_net = self.net
elif self.model_type == 'changedetector':
infer_net = InferCDNet(self.net)
else:
infer_net = InferNet(self.net, self.model_type)
infer_net.eval() infer_net.eval()
return infer_net return infer_net
def _export_inference_model(self, save_dir, image_shape=None): def _export_inference_model(self, save_dir, image_shape=None):
save_dir = osp.join(save_dir, 'inference_model')
self.test_inputs = self._get_test_inputs(image_shape) self.test_inputs = self._get_test_inputs(image_shape)
infer_net = self._build_inference_net() infer_net = self._build_inference_net()

@ -98,11 +98,11 @@ class BaseChangeDetector(BaseModel):
else: else:
image_shape = [None, 3, -1, -1] image_shape = [None, 3, -1, -1]
self.fixed_input_shape = image_shape self.fixed_input_shape = image_shape
input_spec = [ return [
InputSpec( InputSpec(
shape=image_shape, name='image', dtype='float32') shape=image_shape, name='image', dtype='float32'), InputSpec(
shape=image_shape, name='image2', dtype='float32')
] ]
return input_spec
def run(self, net, inputs, mode): def run(self, net, inputs, mode):
net_out = net(inputs[0], inputs[1]) net_out = net(inputs[0], inputs[1])
@ -532,22 +532,26 @@ class BaseChangeDetector(BaseModel):
def _preprocess(self, images, transforms, to_tensor=True): def _preprocess(self, images, transforms, to_tensor=True):
arrange_transforms( arrange_transforms(
model_type=self.model_type, transforms=transforms, mode='test') model_type=self.model_type, transforms=transforms, mode='test')
batch_im = list() batch_im1, batch_im2 = list(), list()
batch_ori_shape = list() batch_ori_shape = list()
for im in images: for im1, im2 in images:
sample = {'image': im} sample = {'image_t1': im1, 'image_t2': im2}
if isinstance(sample['image'], str): if isinstance(sample['image_t1'], str) or \
isinstance(sample['image_t2'], str):
sample = ImgDecoder(to_rgb=False)(sample) sample = ImgDecoder(to_rgb=False)(sample)
ori_shape = sample['image'].shape[:2] ori_shape = sample['image'].shape[:2]
im = transforms(sample)[0] im1, im2 = transforms(sample)[:2]
batch_im.append(im) batch_im1.append(im1)
batch_im2.append(im2)
batch_ori_shape.append(ori_shape) batch_ori_shape.append(ori_shape)
if to_tensor: if to_tensor:
batch_im = paddle.to_tensor(batch_im) batch_im1 = paddle.to_tensor(batch_im1)
batch_im2 = paddle.to_tensor(batch_im2)
else: else:
batch_im = np.asarray(batch_im) batch_im1 = np.asarray(batch_im1)
batch_im2 = np.asarray(batch_im2)
return batch_im, batch_ori_shape return batch_im1, batch_im2, batch_ori_shape
@staticmethod @staticmethod
def get_transforms_shape_info(batch_ori_shape, transforms): def get_transforms_shape_info(batch_ori_shape, transforms):
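The reworked `_preprocess` above takes an iterable of image pairs and returns two parallel batches plus the original shapes. A minimal, framework-free sketch of that batching pattern follows. It uses nested lists instead of arrays, and `scale_to_unit` and `preprocess_pairs` are hypothetical names introduced for illustration, not PaddleRS functions.

```python
def scale_to_unit(image):
    """Hypothetical transform: map 8-bit pixel values to [0, 1]."""
    return [[pixel / 255 for pixel in row] for row in image]


def preprocess_pairs(image_pairs, transform):
    """Split (im1, im2) pairs into two parallel batches.

    Mirrors the structure of the diff above: one list per temporal phase,
    plus the (height, width) of each pair before any transform is applied.
    """
    batch_im1, batch_im2, batch_ori_shape = [], [], []
    for im1, im2 in image_pairs:
        # Record the original size from the first image of the pair.
        batch_ori_shape.append((len(im1), len(im1[0])))
        batch_im1.append(transform(im1))
        batch_im2.append(transform(im2))
    return batch_im1, batch_im2, batch_ori_shape


# One pair of 2x2 single-band images.
pairs = [([[0, 255], [255, 0]], [[255, 255], [0, 0]])]
batch_im1, batch_im2, shapes = preprocess_pairs(pairs, scale_to_unit)
print(batch_im1[0], batch_im2[0], shapes)
```

Keeping the two phases in separate batches lets the network be called as `net(batch_im1, batch_im2)`, which matches the two-argument `run` shown earlier in this diff.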

@ -61,12 +61,6 @@ def load_model(model_dir, **params):
model_info = yaml.load(f.read(), Loader=yaml.Loader) model_info = yaml.load(f.read(), Loader=yaml.Loader)
f.close() f.close()
version = model_info['version']
if int(version.split('.')[0]) < 2:
raise Exception(
'Current version is {}, a model trained by PaddleRS={} cannot be load.'.
format(paddlers.__version__, version))
status = model_info['status'] status = model_info['status']
with_net = params.get('with_net', True) with_net = params.get('with_net', True)
if not with_net: if not with_net:

@ -43,3 +43,16 @@ class InferNet(paddle.nn.Layer):
outputs = self.postprocessor(net_outputs) outputs = self.postprocessor(net_outputs)
return outputs return outputs
class InferCDNet(paddle.nn.Layer):
def __init__(self, net):
super(InferCDNet, self).__init__()
self.net = net
self.postprocessor = PostProcessor('changedetector')
def forward(self, x1, x2):
net_outputs = self.net(x1, x2)
outputs = self.postprocessor(net_outputs)
return outputs
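`InferCDNet` wraps a two-input network and passes its outputs through a postprocessor. The sketch below reproduces that wrapper pattern in plain Python so it can run without Paddle. `TwoInputInferWrapper`, `toy_net`, and `argmax_postprocess` are hypothetical stand-ins, and the argmax step is only an assumption about what a change-detection postprocessor might do.

```python
class TwoInputInferWrapper:
    """Plain-Python analogue of a two-input inference wrapper."""

    def __init__(self, net, postprocessor):
        self.net = net
        self.postprocessor = postprocessor

    def __call__(self, x1, x2):
        # Run the network on both temporal phases, then postprocess.
        net_outputs = self.net(x1, x2)
        return self.postprocessor(net_outputs)


def toy_net(x1, x2):
    """Hypothetical network: per-pixel [no-change score, change score]."""
    return [[1.0 - abs(a - b), abs(a - b)] for a, b in zip(x1, x2)]


def argmax_postprocess(scores):
    """Assumed postprocessing: pick the class with the highest score."""
    return [max(range(len(s)), key=s.__getitem__) for s in scores]


wrapper = TwoInputInferWrapper(toy_net, argmax_postprocess)
labels = wrapper([0.1, 0.9, 0.5], [0.1, 0.0, 0.4])
print(labels)
```

Folding the postprocessing into the wrapper means an exported model can return final predictions directly, so the caller does not need to reimplement that step.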

@ -0,0 +1,4 @@
*.zip
*.tar.gz
rsseg/
optic/

@ -0,0 +1,91 @@
#!/usr/bin/env python
# Example script for training the DeepLab V3+ image segmentation model
# Before running this script, make sure PaddleRS is correctly installed
import paddlers as pdrs
from paddlers import transforms as T
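# Directory for downloaded files
DOWNLOAD_DIR = './data/rsseg/'
# Directory where the dataset is stored
DATA_DIR = './data/rsseg/remote_sensing_seg/'
# Path to the training set `file_list` file
TRAIN_FILE_LIST_PATH = './data/rsseg/remote_sensing_seg/train.txt'
# Path to the validation set `file_list` file
EVAL_FILE_LIST_PATH = './data/rsseg/remote_sensing_seg/val.txt'
# Path to the dataset class information file
LABEL_LIST_PATH = './data/rsseg/remote_sensing_seg/labels.txt'
# Experiment directory, where output model weights and results are saved
EXP_DIR = './output/deeplabv3p/'
# Number of image bands
NUM_BANDS = 10
# Download and extract the multispectral land parcel classification dataset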
seg_dataset = 'https://paddleseg.bj.bcebos.com/dataset/remote_sensing_seg.zip'
pdrs.utils.download_and_decompress(seg_dataset, path=DOWNLOAD_DIR)
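# Define the data transforms used for training and validation (data augmentation, preprocessing, etc.)
# Use Compose to combine multiple transforms. The transforms in a Compose are executed sequentially, in order
# API reference: https://github.com/PaddleCV-SIG/PaddleRS/blob/develop/docs/apis/transforms.md
train_transforms = T.Compose([
# Resize the image to 512x512
T.Resize(target_size=512),
# Apply a random horizontal flip with 50% probability
T.RandomHorizontalFlip(prob=0.5),
# Normalize the data to [-1,1]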
T.Normalize(
mean=[0.5] * NUM_BANDS, std=[0.5] * NUM_BANDS),
])
eval_transforms = T.Compose([
T.Resize(target_size=512),
# Data normalization must be identical in the validation and training stages
T.Normalize(
mean=[0.5] * NUM_BANDS, std=[0.5] * NUM_BANDS),
])
# Build the datasets used for training and validation, respectively
train_dataset = pdrs.datasets.SegDataset(
data_dir=DATA_DIR,
file_list=TRAIN_FILE_LIST_PATH,
label_list=LABEL_LIST_PATH,
transforms=train_transforms,
num_workers=0,
shuffle=True)
eval_dataset = pdrs.datasets.SegDataset(
data_dir=DATA_DIR,
file_list=EVAL_FILE_LIST_PATH,
label_list=LABEL_LIST_PATH,
transforms=eval_transforms,
num_workers=0,
shuffle=False)
# Build the DeepLab V3+ model with ResNet-50 as the backbone
# For currently supported models, see: https://github.com/PaddleCV-SIG/PaddleRS/blob/develop/docs/apis/model_zoo.md
# For model input parameters, see: https://github.com/PaddleCV-SIG/PaddleRS/blob/develop/paddlers/tasks/segmenter.py
model = pdrs.tasks.DeepLabV3P(
input_channel=NUM_BANDS,
num_classes=len(train_dataset.labels),
backbone='ResNet50_vd')
# Run model training
model.train(
num_epochs=10,
train_dataset=train_dataset,
train_batch_size=4,
eval_dataset=eval_dataset,
save_interval_epochs=5,
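# Number of iterations between log records
log_interval_steps=50,
save_dir=EXP_DIR,
# Initial learning rate
learning_rate=0.01,
# Whether to use early stopping, which ends training once accuracy stops improving
early_stop=False,
# Whether to enable VisualDL logging
use_vdl=True,
# Checkpoint to resume training from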
resume_checkpoint=None)
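The script above comments that `T.Normalize` with `mean=[0.5] * NUM_BANDS` and `std=[0.5] * NUM_BANDS` maps the data to [-1,1]. The arithmetic can be checked with a small plain-Python sketch. It assumes pixel values have already been scaled to [0, 1] before the mean and standard deviation are applied; how `T.Normalize` rescales raw values is not shown in this diff, and `normalize_pixel` is a hypothetical helper.

```python
NUM_BANDS = 10
mean = [0.5] * NUM_BANDS
std = [0.5] * NUM_BANDS


def normalize_pixel(pixel, mean, std):
    """Apply (value - mean) / std independently to each band."""
    return [(v - m) / s for v, m, s in zip(pixel, mean, std)]


# Extremes and midpoint of the assumed [0, 1] input range.
lowest = normalize_pixel([0.0] * NUM_BANDS, mean, std)
middle = normalize_pixel([0.5] * NUM_BANDS, mean, std)
highest = normalize_pixel([1.0] * NUM_BANDS, mean, std)
print(lowest[0], middle[0], highest[0])
```

Because the mean and standard deviation lists need one entry per band, their length is tied to `NUM_BANDS`, which is why the script builds them by list multiplication.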

@ -1,54 +0,0 @@
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
import paddlers as pdrs
from paddlers import transforms as T
# Download and extract the multispectral land parcel classification dataset
dataset = 'https://paddleseg.bj.bcebos.com/dataset/remote_sensing_seg.zip'
pdrs.utils.download_and_decompress(dataset, path='./data')
# Define the transforms for training and validation
channel = 10
train_transforms = T.Compose([
T.Resize(target_size=512),
T.RandomHorizontalFlip(),
T.Normalize(
mean=[0.5] * channel, std=[0.5] * channel),
])
eval_transforms = T.Compose([
T.Resize(target_size=512),
T.Normalize(
mean=[0.5] * channel, std=[0.5] * channel),
])
# Define the datasets used for training and validation
train_dataset = pdrs.datasets.SegDataset(
data_dir='./data/remote_sensing_seg',
file_list='./data/remote_sensing_seg/train.txt',
label_list='./data/remote_sensing_seg/labels.txt',
transforms=train_transforms,
num_workers=0,
shuffle=True)
eval_dataset = pdrs.datasets.SegDataset(
data_dir='./data/remote_sensing_seg',
file_list='./data/remote_sensing_seg/val.txt',
label_list='./data/remote_sensing_seg/labels.txt',
transforms=eval_transforms,
num_workers=0,
shuffle=False)
# Initialize the model and train it
# Training metrics can be viewed with VisualDL
num_classes = len(train_dataset.labels)
model = pdrs.tasks.DeepLabV3P(input_channel=channel, num_classes=num_classes, backbone='ResNet50_vd')
model.train(
num_epochs=10,
train_dataset=train_dataset,
train_batch_size=4,
eval_dataset=eval_dataset,
learning_rate=0.01,
save_dir='output/deeplabv3p_r50vd')

@ -1,58 +0,0 @@
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
import paddlers as pdrs
from paddlers import transforms as T
# Download and extract the optic disc segmentation dataset
optic_dataset = 'https://bj.bcebos.com/paddlex/datasets/optic_disc_seg.tar.gz'
pdrs.utils.download_and_decompress(optic_dataset, path='./')
# Define the transforms for training and validation
# API reference: https://github.com/PaddlePaddle/paddlers/blob/develop/docs/apis/transforms/transforms.md
train_transforms = T.Compose([
T.Resize(target_size=512),
T.RandomHorizontalFlip(),
T.Normalize(
mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
])
eval_transforms = T.Compose([
T.Resize(target_size=512),
T.Normalize(
mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
])
# Define the datasets used for training and validation
# API reference: https://github.com/PaddlePaddle/paddlers/blob/develop/docs/apis/datasets.md
train_dataset = pdrs.datasets.SegDataset(
data_dir='optic_disc_seg',
file_list='optic_disc_seg/train_list.txt',
label_list='optic_disc_seg/labels.txt',
transforms=train_transforms,
num_workers=0,
shuffle=True)
eval_dataset = pdrs.datasets.SegDataset(
data_dir='optic_disc_seg',
file_list='optic_disc_seg/val_list.txt',
label_list='optic_disc_seg/labels.txt',
transforms=eval_transforms,
num_workers=0,
shuffle=False)
# Initialize the model and train it
# Training metrics can be viewed with VisualDL; see https://github.com/PaddlePaddle/paddlers/blob/develop/docs/visualdl.md
num_classes = len(train_dataset.labels)
model = pdrs.tasks.FarSeg(num_classes=num_classes)
# API reference: https://github.com/PaddlePaddle/paddlers/blob/develop/docs/apis/models/semantic_segmentation.md
# Parameter descriptions and tuning notes: https://github.com/PaddlePaddle/paddlers/blob/develop/docs/parameters.md
model.train(
num_epochs=10,
train_dataset=train_dataset,
train_batch_size=4,
eval_dataset=eval_dataset,
learning_rate=0.01,
pretrain_weights=None,
save_dir='output/farseg')

@ -0,0 +1,89 @@
#!/usr/bin/env python
# Example script for training the UNet image segmentation model
# Before running this script, make sure PaddleRS is correctly installed
import paddlers as pdrs
from paddlers import transforms as T
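# Directory for downloaded files
DOWNLOAD_DIR = './data/rsseg/'
# Directory where the dataset is stored
DATA_DIR = './data/rsseg/remote_sensing_seg/'
# Path to the training set `file_list` file
TRAIN_FILE_LIST_PATH = './data/rsseg/remote_sensing_seg/train.txt'
# Path to the validation set `file_list` file
EVAL_FILE_LIST_PATH = './data/rsseg/remote_sensing_seg/val.txt'
# Path to the dataset class information file
LABEL_LIST_PATH = './data/rsseg/remote_sensing_seg/labels.txt'
# Experiment directory, where output model weights and results are saved
EXP_DIR = './output/unet/'
# Number of image bands
NUM_BANDS = 10
# Download and extract the multispectral land parcel classification dataset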
seg_dataset = 'https://paddleseg.bj.bcebos.com/dataset/remote_sensing_seg.zip'
pdrs.utils.download_and_decompress(seg_dataset, path=DOWNLOAD_DIR)
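# Define the data transforms used for training and validation (data augmentation, preprocessing, etc.)
# Use Compose to combine multiple transforms. The transforms in a Compose are executed sequentially, in order
# API reference: https://github.com/PaddleCV-SIG/PaddleRS/blob/develop/docs/apis/transforms.md
train_transforms = T.Compose([
# Resize the image to 512x512
T.Resize(target_size=512),
# Apply a random horizontal flip with 50% probability
T.RandomHorizontalFlip(prob=0.5),
# Normalize the data to [-1,1]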
T.Normalize(
mean=[0.5] * NUM_BANDS, std=[0.5] * NUM_BANDS),
])
eval_transforms = T.Compose([
T.Resize(target_size=512),
# Data normalization must be identical in the validation and training stages
T.Normalize(
mean=[0.5] * NUM_BANDS, std=[0.5] * NUM_BANDS),
])
# Build the datasets used for training and validation, respectively
train_dataset = pdrs.datasets.SegDataset(
data_dir=DATA_DIR,
file_list=TRAIN_FILE_LIST_PATH,
label_list=LABEL_LIST_PATH,
transforms=train_transforms,
num_workers=0,
shuffle=True)
eval_dataset = pdrs.datasets.SegDataset(
data_dir=DATA_DIR,
file_list=EVAL_FILE_LIST_PATH,
label_list=LABEL_LIST_PATH,
transforms=eval_transforms,
num_workers=0,
shuffle=False)
# Build the UNet model
# For currently supported models, see: https://github.com/PaddleCV-SIG/PaddleRS/blob/develop/docs/apis/model_zoo.md
# For model input parameters, see: https://github.com/PaddleCV-SIG/PaddleRS/blob/develop/paddlers/tasks/segmenter.py
model = pdrs.tasks.UNet(
input_channel=NUM_BANDS, num_classes=len(train_dataset.labels))
# Run model training
model.train(
num_epochs=10,
train_dataset=train_dataset,
train_batch_size=4,
eval_dataset=eval_dataset,
save_interval_epochs=5,
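# Number of iterations between log records
log_interval_steps=50,
save_dir=EXP_DIR,
# Initial learning rate
learning_rate=0.01,
# Whether to use early stopping, which ends training once accuracy stops improving
early_stop=False,
# Whether to enable VisualDL logging
use_vdl=True,
# Checkpoint to resume training from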
resume_checkpoint=None)

@ -1,55 +0,0 @@
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
import paddlers as pdrs
from paddlers import transforms as T
# Download and extract the multispectral land parcel classification dataset
dataset = 'https://paddleseg.bj.bcebos.com/dataset/remote_sensing_seg.zip'
pdrs.utils.download_and_decompress(dataset, path='./data')
# Define the transforms for training and validation
channel = 10
train_transforms = T.Compose([
T.Resize(target_size=512),
T.RandomHorizontalFlip(),
T.Normalize(
mean=[0.5] * channel, std=[0.5] * channel),
])
eval_transforms = T.Compose([
T.Resize(target_size=512),
T.Normalize(
mean=[0.5] * channel, std=[0.5] * channel),
])
# Define the datasets used for training and validation
train_dataset = pdrs.datasets.SegDataset(
data_dir='./data/remote_sensing_seg',
file_list='./data/remote_sensing_seg/train.txt',
label_list='./data/remote_sensing_seg/labels.txt',
transforms=train_transforms,
num_workers=0,
shuffle=True)
eval_dataset = pdrs.datasets.SegDataset(
data_dir='./data/remote_sensing_seg',
file_list='./data/remote_sensing_seg/val.txt',
label_list='./data/remote_sensing_seg/labels.txt',
transforms=eval_transforms,
num_workers=0,
shuffle=False)
# Initialize the model and train it
# Training metrics can be viewed with VisualDL
num_classes = len(train_dataset.labels)
model = pdrs.tasks.UNet(input_channel=channel, num_classes=num_classes)
model.train(
num_epochs=20,
train_dataset=train_dataset,
train_batch_size=4,
eval_dataset=eval_dataset,
learning_rate=0.01,
save_dir='output/unet',
use_vdl=True)