Prediction

Intuition

Raster image can have very large spatial resolution. It is impossible to input whole raster image as an input to model because of constrain about memory and time, instead approch like split raster into sub-patch and make prediction independent together then mosiac back to original size is often use. But the problem is at the boundary of each patch will lost information of adjacent patch and cause error in convolutional operation.

Figure : Example of boundary error when prediction raster image by sub-patch

Another apporch that solve this problem is use averaging sliding window technique. We implement by make a prediction by some fixed sized window. Then step the window by the number of stride , some pixel will be overlap from previous prediction. We process by accumulate the prediction and using using extra accumulator array to count the number of redundant prediction. After prediction is finised. We then divide the accumulate prediction result with accumulator number of each pixel to get average prediction. Results mitigate the error at the boundary.

Figure 1 : Initial state of output and accumulator, no prediction takes place yet.

Figure 2 : First window prediction, accumulator count the number to total prediction

Figure 3 : Sliding window move by stride number, some pixel make prediction agian, accumulate the prediction and count total number of prediction.

Figure 5 : By the end of operation , divide total accumulate prediction with the number of prediction to get average result.

Prediction

Prediction on one model

predict_model.py implement averaging sliding window technique as state before.

Right now we should have total of n candidate ensemble. predict_model.py use make prediction for each model by using raster file as input directly and produce raster output.

Input :

Sentinel 2 + 1 stack rasterimage
14 channel .tif or tiff same band arrangement as training image

Output :

1 channel output prediction, raster image (.tif)
1 channel variance of output prediction, raster image (.tif)

Usage

python3  predict_model.py  [--model_path  MODEL_DIRECTORY] [--input INPUT_FILE_PATH] [--output OUTPUT_FILE_PATH]

Note : MODEL_DIRECTORY , Specify the path in which the candidate ensemble model best_weights.pt is stored

You can define window size and stride in optional argument.

More stride will get more accurately result but increase time complexity.

More window size speed up inference time if you have enough computational resource.

optinal arguments:
  --channels num_ch         Number of input_channels , default=14
  --bilinear BL             Use bilinear upsampling,  default=True 
  --window_size WINDOW      Define size of window , default = 2096

  --stride STIRDE           Step size for window sliding  , default = 2000

Merge ensemsble prediction

To produce final output of ensemble prediction. We implement methodology state in uncertianty section.

Figure : Inference flow of the ensemble model.

Prepare all output prediction and variance of all candidate ensemble

Figure : collection of output prediction and variance of all candidate ensemble

Stack output prediction and varaince

Since output and total variance equation require pixel wise summation and average operation. If computational resource is limited, we can calculate by subset of array.

For simplicity of demonstration, AGB output and varince has 1 dimensional so we stack whole image together

We can use numpy and rasterio(for reading raster)

import numpy as np
import os
import glob
import rasterio as rio

rootdir = path/to/emsemble/collection

predict_pattern = "abg**_pred**"  
variance_pattern  = "abg**_var**"  

predict_file  = glob.glob(os.path.join(rootdir, predict_pattern))
variance_file  = glob.glob(os.path.join(rootdir, variance_pattern))

predicts_img = np.array([rio.open(file).read() for file in predict_file])
variances_img = np.array([rio.open(file).read() for file in variance_file])

Pixel-wise average output predictions

Avarage of output can be implement be numpy.mean

Output Equation : output prediction of the model , calculate by averaging each pixel for all M candidate

# final predictions (average over model ensemble)
pred_ensemble = np.mean(predicts_img, axis=0)

Calculate Pixel-wise epistemic uncertainty (model uncertainty) of ensemble model

Can be done by calculate pixel-wise varince of all output predictions

epistemic_var = np.var(predicts_img, axis=0)

Calculate Pixel-wise aleatoric uncertainty (data uncertainty) of ensemble model

Can be done by calculate pixel-wise avarage on all variances of ensemble

aleatoric_var = np.mean(variances_img, axis=0)

Combine epistemic and aleatoric uncertainty

Pixel-wise summation epistemic and aleatoric uncertainty according to total variance formula

Total Variance Equation : Epistemic (terms 1 and 2) + Aleatoric Uncertainty (term 3) Uncertainty

predictive_var = epistemic_var + aleatoric_var

Write result numpy array to raster image.

Using rasterio to get raster metadata from any output prediction(same geodata)

img_tmp = rio.open('path/to/emsemble/collection/abg1_pred.tif')
out_profile = img_tmp.profile.copy()
out_profile.update(count=1)

dst_prediction = rio.open('path/to/emsemble/output/ensemble_pred_AGB.tif', 'w', **out_profile)
dst_var = rio.open('path/to/emsemble/output/ensemble_variance_AGB.tif', 'w', **out_profile)

dst_prediction.write(pred_ensemble)
dst_var.write(predictive_var)

dst_prediction.close()
dst_var.close()