
ResNet50 Training Based on KSpeed

This topic uses ResNet50 image classification training as an example to show how KSpeed accelerates image data loading for CV workloads. The ResNet50 model is based on the implementation in NVIDIA's official open-source DeepLearningExamples repository. Using KSpeed requires a few changes to the original code, which can be applied to the ResNet50 model as a git patch; the changes are briefly described in the section "Key Modules for KSpeed Integration" at the end of this topic.

Code Preparation

  • Base code repository:

https://github.com/NVIDIA/DeepLearningExamples/commit/174b3d40bfc26f2adcf252676d38d6d5ffa7cbdc

git clone https://github.com/NVIDIA/DeepLearningExamples.git

cd DeepLearningExamples

git checkout master

git reset --hard 174b3d40bfc26f2adcf252676d38d6d5ffa7cbdc
  • Apply the KSpeed integration code

# Stay in the DeepLearningExamples directory
wget http://kspeed-release.oss-cn-beijing.aliyuncs.com/kspeed_resnet50.patch

git apply kspeed_resnet50.patch

Runtime Environment Configuration

Start the training container with the following command:

docker run -it --gpus all --name=resnet50_kspeed_test --net=host --ipc host --device=/dev/infiniband/ --ulimit memlock=-1:-1 -v /{path-to-imagenet}:/{path-to-imagenet-in-docker} -v /{path-to-DeepLearningExamples}:/{path-to-DeepLearningExamples-in-docker} eflo-registry.cn-beijing.cr.aliyuncs.com/eflo/ngc-pytorch-kspeed-22.05-py38:v2.2.0
Note

In the command above:

  • {path-to-imagenet} is the path to the imagenet dataset on the host machine;

  • {path-to-imagenet-in-docker} is the path to which the dataset is mapped inside the container;

  • {path-to-DeepLearningExamples} is the path to the model training code on the host machine;

  • {path-to-DeepLearningExamples-in-docker} is the path to which the model training code is mapped inside the container;

You must set these paths according to your own environment.

The imagenet dataset directory structure is as follows:

imagenet
├── train
│   ├── n01440764
│   │  ├── n01440764_10026.JPEG
│   │  ├── n01440764_10027.JPEG
│   │  └── ......
│   ├── n01443537
│   └── ......         
└── val                
    ├── n01440764
    │  ├── ILSVRC2012_val_00000293.JPEG
    │  ├── ILSVRC2012_val_00002138.JPEG
    │  └── ......
    ├── n01443537
    └── ......       

For instructions on obtaining the dataset, see: https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/Classification/ConvNets/resnet50v1.5

Run Model Training

# Stay in the DeepLearningExamples directory
cd ./PyTorch/Classification/ConvNets

# Single node, 8 GPUs: baseline
bash ./resnet50v1.5/training/AMP/DGXA100_resnet50_AMP_multi.sh pytorch {path-to-imagenet-in-docker}

# Single node, 8 GPUs: kspeed
bash ./resnet50v1.5/training/AMP/DGXA100_resnet50_AMP_multi.sh kspeed {path-to-imagenet-in-docker}

# Single node, 8 GPUs: dali + kspeed
bash ./resnet50v1.5/training/AMP/DGXA100_resnet50_AMP_multi.sh dali-kspeed {path-to-imagenet-in-docker}
Note
  • In the commands above, {path-to-imagenet-in-docker} is the path to the imagenet dataset inside the container; it must match the path you set when starting the container.

  • Before running the KSpeed tests, make sure the KSpeed service has been deployed.

Key Modules for KSpeed Integration

Add the kspeeddataloader module file

A new file, DeepLearningExamples/PyTorch/Classification/ConvNets/image_classification/kspeeddataloader.py, is added. It mainly implements a KSpeed-based PyTorch DataLoader and a KSpeed-based DALI DataLoader.

KSpeed-based PyTorch DataLoader

To implement the KSpeed-based PyTorch DataLoader, you only need to replace the Dataset and then combine it with PyTorch's native Sampler and DataLoader. The core code is as follows:

  • Import the kspeeddataset module

import kspeed.utils.data.kspeeddataset as KSpeedDataset
  • Replace torchvision.datasets.ImageFolder with KSpeedDataset.KSpeedImageFolder so that KSpeed's accelerated data loading can be used

train_dataset = KSpeedDataset.KSpeedImageFolder(
        traindir, None, workers, kspeed_iplist,
        "admin", "admin", transforms.Compose(transforms_list),
    )

val_dataset = KSpeedDataset.KSpeedImageFolder(
        valdir, None, workers, kspeed_iplist,
        "admin", "admin",
        transforms.Compose(
            [
                transforms.Resize(
                    image_size + crop_padding, interpolation=interpolation
                ),
                transforms.CenterCrop(image_size),
            ]
        ),
    )
  • Implement the get_kspeed_train_loader and get_kspeed_val_loader methods; see lines 16~72 and 74~128 of kspeeddataloader.py. A minimal sketch of how such a loader can be assembled is shown after this list.
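
The following is a minimal sketch, assuming the KSpeedImageFolder constructor arguments shown above (the server list kspeed_iplist and the "admin"/"admin" credentials), of how such a train loader can be assembled with PyTorch's native DistributedSampler and DataLoader. The function name, transform list, and DataLoader settings are illustrative and are not taken from kspeeddataloader.py.

# Hedged sketch only -- the real loader builders live in kspeeddataloader.py (lines 16~72 and 74~128).
import torch
import torch.distributed as dist
import torchvision.transforms as transforms
import kspeed.utils.data.kspeeddataset as KSpeedDataset


def build_kspeed_train_loader(traindir, kspeed_iplist, batch_size, workers, image_size=224):
    transforms_list = [
        transforms.RandomResizedCrop(image_size),
        transforms.RandomHorizontalFlip(),
        # ToTensor is included here only so the default collate function works;
        # the repository code may rely on its own collate function instead.
        transforms.ToTensor(),
    ]
    # Same constructor call as in the snippet above: KSpeedImageFolder replaces
    # torchvision.datasets.ImageFolder but keeps a map-style Dataset interface.
    train_dataset = KSpeedDataset.KSpeedImageFolder(
        traindir, None, workers, kspeed_iplist,
        "admin", "admin", transforms.Compose(transforms_list),
    )

    # Because the dataset behaves like a normal Dataset, PyTorch's native
    # DistributedSampler and DataLoader are reused unchanged.
    sampler = (
        torch.utils.data.distributed.DistributedSampler(train_dataset)
        if dist.is_initialized()
        else None
    )
    return torch.utils.data.DataLoader(
        train_dataset,
        batch_size=batch_size,
        shuffle=(sampler is None),
        sampler=sampler,
        num_workers=workers,
        pin_memory=True,
        drop_last=True,
    )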

KSpeed-based DALI DataLoader

To implement the KSpeed-based DALI DataLoader, you only need to change the input data source of the DALI pipeline to an external data source, KSpeedCallable. The core code is as follows:

  • KSpeedCallable

The KSpeedCallable object inherits from KSpeedDataset.KSpeedFolder. In lines 164~179 of kspeeddataloader.py (specifically line 176), it reads imagenet dataset samples via self.dataset.getBIN(path).

def __call__(self, sample_info):
    # Lazily initialize the underlying KSpeed dataset on first use.
    if self.dataset is None:
        self.load()
    if sample_info.iteration >= self.full_iters:
        raise StopIteration()
    # Reshuffle once per epoch with a deterministic, epoch-dependent seed.
    if self.last_seen_epoch != sample_info.epoch_idx:
        self.last_seen_epoch = sample_info.epoch_idx
        self.perm = np.random.default_rng(seed=42 + sample_info.epoch_idx).permutation(len(self.files))
    idx = self.perm[sample_info.idx_in_epoch + self.shard_offset]

    # Read the raw encoded image bytes from KSpeed and pair them with the label.
    path = os.path.join(self.root, self.files[idx])
    dout = self.dataset.getBIN(path)
    sample = np.frombuffer(dout, dtype=np.uint8)
    label = np.int32([self.labels[idx]])
    return sample, label
  • DALI pipeline based on KSpeedCallable

In lines 223~229 of kspeeddataloader.py, KSpeedCallable is used as the external data source of the DALI pipeline to fetch dataset samples. A hedged sketch of a complete pipeline built around this call follows the snippet below.

if kspeed:
    images, labels = fn.external_source(source=kscallable,
                        num_outputs=2,
                        batch=False, 
                        parallel=True, 
                        dtype=[types.UINT8, types.INT32], 
                        device='cpu')
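
For orientation, here is a minimal sketch of how an external_source-fed pipeline of this kind is typically completed and wrapped for PyTorch. The fn.external_source call mirrors the snippet above; the decode, crop, and normalization stages, the pipeline parameters, and the iterator wrapping are standard DALI usage and are assumptions about, not a copy of, the actual pipeline in kspeeddataloader.py.

# Hedged sketch only -- kspeeddataloader.py defines the real pipeline.
from nvidia.dali import pipeline_def, fn, types
from nvidia.dali.plugin.pytorch import DALIGenericIterator


@pipeline_def
def kspeed_dali_train_pipe(kscallable, crop=224, dali_cpu=True):
    # Each sample arrives as raw encoded JPEG bytes plus an int32 label,
    # exactly as returned by KSpeedCallable.__call__ above.
    images, labels = fn.external_source(
        source=kscallable,
        num_outputs=2,
        batch=False,
        parallel=True,
        dtype=[types.UINT8, types.INT32],
        device="cpu",
    )
    decoder_device = "cpu" if dali_cpu else "mixed"
    images = fn.decoders.image(images, device=decoder_device, output_type=types.RGB)
    images = fn.random_resized_crop(images, size=crop)
    images = fn.crop_mirror_normalize(
        images,
        dtype=types.FLOAT,
        output_layout="CHW",
        mean=[0.485 * 255, 0.456 * 255, 0.406 * 255],
        std=[0.229 * 255, 0.224 * 255, 0.225 * 255],
        mirror=fn.random.coin_flip(),
    )
    return images, labels


# Usage sketch (kscallable is a pre-built KSpeedCallable instance; parallel
# external_source needs py_num_workers > 0, and the StopIteration raised by
# the callable ends each epoch):
#   pipe = kspeed_dali_train_pipe(kscallable, batch_size=256, num_threads=8,
#                                 device_id=0, py_num_workers=4)
#   pipe.build()
#   train_loader = DALIGenericIterator(pipe, ["data", "label"])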

Add options to DATA_BACKEND_CHOICES

At line 40 of DeepLearningExamples/PyTorch/Classification/ConvNets/image_classification/dataloaders.py, change the original DATA_BACKEND_CHOICES = ["pytorch", "syntetic"] to the following:

DATA_BACKEND_CHOICES = ["pytorch", "syntetic", "kspeed", "dali-kspeed",  "dali"]
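
For context, this list defines the valid values of the data-backend command-line option in main.py, so extending it is what makes kspeed and dali-kspeed selectable from the training scripts. The snippet below is only an illustrative sketch of that argparse wiring; the default value and help text are assumptions, not copied from the repository.

import argparse

# Illustrative sketch of how the list feeds an argparse "choices" constraint.
DATA_BACKEND_CHOICES = ["pytorch", "syntetic", "kspeed", "dali-kspeed", "dali"]

parser = argparse.ArgumentParser()
parser.add_argument(
    "--data-backend",
    metavar="BACKEND",
    default="pytorch",              # assumed default; check main.py for the real one
    choices=DATA_BACKEND_CHOICES,   # "kspeed" and "dali-kspeed" are now accepted
    help="data backend: " + " | ".join(DATA_BACKEND_CHOICES),
)
args = parser.parse_args(["--data-backend", "kspeed"])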

Add args.data_backend branches

In lines 512~520 of DeepLearningExamples/PyTorch/Classification/ConvNets/main.py, add the following code to the args.data_backend branches:

elif args.data_backend == "kspeed":
    get_train_loader = get_kspeed_train_loader
    get_val_loader = get_kspeed_val_loader
elif args.data_backend == "dali":
    get_train_loader = get_dali_kspeed_train_loader(dali_cpu=True, kspeed=False)
    get_val_loader = get_dali_kspeed_val_loader(dali_cpu=True, kspeed=False)
elif args.data_backend == "dali-kspeed":
    get_train_loader = get_dali_kspeed_train_loader(dali_cpu=True)
    get_val_loader = get_dali_kspeed_val_loader(dali_cpu=True)
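
Note that, unlike the kspeed branch, the dali and dali-kspeed branches call get_dali_kspeed_train_loader / get_dali_kspeed_val_loader with arguments, which implies these are factories that return the actual loader-getter functions. The sketch below shows only that closure shape; the inner signature and body are assumptions, not the real code in kspeeddataloader.py.

# Hedged sketch of the factory pattern implied by the branches above.
def get_dali_kspeed_train_loader(dali_cpu=False, kspeed=True):
    # main.py stores the returned closure as get_train_loader and calls it
    # later with the dataset path and batch parameters (signature illustrative).
    def gdktl(data_path, image_size, batch_size, workers, **kwargs):
        # Build a KSpeedCallable-backed DALI pipeline here (or a plain DALI
        # pipeline when kspeed=False) and wrap it in a framework iterator.
        raise NotImplementedError("sketch only")
    return gdktl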