Using SOLT for instance and semantic segmentation

In this tutorial, we briefly demonstrate how to use SOLT in instance and semantic segmentation tasks.

To run this notebook, please download the training images from the Kaggle Data Science Bowl’18 page and place them into Data/ds_bowl_stage_1, so that the code below finds them under Data/ds_bowl_stage_1/stage1_train.

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import patches
import cv2
import os
import glob
import json

import solt
import solt.transforms as slt

def get_masks(img_fname):
    img_id = img_fname.split(os.path.sep)[3]
    masks_fnames = glob.glob(os.path.join('Data', 'ds_bowl_stage_1','stage1_train', img_id, 'masks', '*.png'))
    masks = []
    for msk_fname in masks_fnames:
        masks.append(cv2.imread(msk_fname, 0))
    return masks

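def vis_img_instances(img, masks):
    # Merge the binary instance masks into one label map (0 = background)
    m = np.zeros(masks[0].shape, dtype=int)
    for j, msk in enumerate(masks):
        m[msk == 255] = j + 1

    fig = plt.figure(figsize=(6, 6))
    ax = fig.add_subplot(1, 1, 1)
    ax.imshow(img)
    # Overlay the instances; background pixels of the label map stay hidden
    ax.imshow(np.ma.masked_array(m, m == 0), cmap='nipy_spectral', alpha=0.8)
    plt.show()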
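
The label-map step used for visualization can be checked on a tiny synthetic input. The sketch below is only an illustration: the helper name `masks_to_label_map` and the 4x4 toy masks are assumptions, not part of the notebook. It merges binary masks (255 = object) into one integer map and then masks out the background, which is what lets an overlay leave background pixels transparent.

```python
import numpy as np

def masks_to_label_map(masks):
    """Merge binary instance masks (255 = object) into one integer label map.

    Instance j gets label j + 1; 0 is background. If masks overlap,
    the later mask overwrites the earlier one.
    """
    label_map = np.zeros(masks[0].shape, dtype=int)
    for j, msk in enumerate(masks):
        label_map[msk == 255] = j + 1
    return label_map

# Two hypothetical 4x4 instance masks occupying opposite corners
a = np.zeros((4, 4), dtype=np.uint8)
b = np.zeros((4, 4), dtype=np.uint8)
a[0:2, 0:2] = 255
b[2:4, 2:4] = 255

label_map = masks_to_label_map([a, b])
# Hide the background so that only instance pixels would be drawn
hidden = np.ma.masked_array(label_map, label_map == 0)
```

Note that overlapping instances are not preserved in such a map, so it is suitable for visualization rather than as a training target.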

Loading the data

imgs_fnames = glob.glob(os.path.join('Data', 'ds_bowl_stage_1','stage1_train', '*', 'images', '*.png'))
fname = imgs_fnames[213]
masks = get_masks(fname)
img = cv2.imread(fname)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
vis_img_instances(img, masks)

Defining a pipeline

If we want to augment every instance mask for models such as Mask-RCNN, we can easily do it using a DataContainer:

stream = solt.Stream([
    slt.Rotate(angle_range=(-90, 90), p=1, interpolation='bicubic'),
    slt.Pad(200),
    slt.Crop(200, crop_mode='c'),
    slt.Crop(192, crop_mode='r')
])

print(stream.to_yaml())

stream:
  interpolation: null
  optimize_stack: false
  padding: null
  transforms:
  - rotate:
      angle_range:
      - -90
      - 90
      ignore_state: true
      interpolation:
      - bicubic
      - inherit
      p: 1
      padding:
      - z
      - inherit
  - pad:
      pad_to:
      - 200
      - 200
      padding:
      - z
      - inherit
  - crop:
      crop_mode: c
  - crop:
      crop_mode: r
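
As a rough sketch of the DataContainer route mentioned above: a container pairs a tuple of data items with a format string, one character per item. The code below assumes that `solt.DataContainer(data, fmt)` accepts `'I'` for an image and `'M'` for a mask; the helper names `container_format` and `to_container` are hypothetical and exist only for this illustration.

```python
def container_format(n_masks):
    """Build a format string: one image ('I') followed by n_masks masks ('M')."""
    return "I" + "M" * n_masks

def to_container(img, masks):
    """Wrap an image and its instance masks into a SOLT DataContainer.

    Assumption: solt.DataContainer takes (data_tuple, format_string).
    SOLT is imported lazily so container_format works without it.
    """
    import solt
    return solt.DataContainer((img, *masks), container_format(len(masks)))

fmt = container_format(3)  # format string for an image with three masks
```

Under these assumptions, `stream(to_container(img, masks), return_torch=False)` would apply the same randomly sampled transformation to the image and to every mask.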

Augmentation results

In the example below, we pass all the masks through the augmentation pipeline, since this is an instance segmentation problem. We also demonstrate the dict API:

for i in range(10):
    res_dc = stream({'image': img, 'masks': masks}, return_torch=False)
    img_res = res_dc.data[0]
    masks_res = res_dc.data[1:]
    vis_img_instances(img_res, masks_res)
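
For semantic segmentation, the augmented instance masks can be collapsed into a single foreground mask. The following is a minimal sketch, assuming the masks are 2D arrays in which non-zero pixels mark an object. The helpers `drop_empty_masks` and `to_semantic_mask` are hypothetical names used only here. Dropping empty masks is useful because a crop may remove an instance from the field of view entirely.

```python
import numpy as np

def drop_empty_masks(masks):
    """Keep only the masks that still contain at least one object pixel."""
    return [m for m in masks if m.any()]

def to_semantic_mask(masks, shape):
    """Union of all instance masks as a binary map (1 = foreground)."""
    semantic = np.zeros(shape, dtype=np.uint8)
    for m in masks:
        semantic[m > 0] = 1
    return semantic

# Toy 4x4 masks: two overlapping instances and one that was cropped out
a = np.zeros((4, 4), dtype=np.uint8)
b = np.zeros((4, 4), dtype=np.uint8)
c = np.zeros((4, 4), dtype=np.uint8)
a[0:2, 0:2] = 255
b[1:3, 1:3] = 255

kept = drop_empty_masks([a, b, c])
semantic = to_semantic_mask(kept, a.shape)
```

In the loop above, the same helpers could be applied to `masks_res` after each call to the stream.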