(AWS Workshop) Machine Learning Operations for Incremental Training

MorrisLin
3 min read · May 7, 2021

https://github.com/catwhiskers/mlops_incremental_learning

  1. Training
  2. Retraining (incremental training)

Why retrain?

  • Model drift: the data the model is tested on differs from the data it was trained on
  • Robustness: people affected by ML models will deliberately alter their responses
  • Ground truth not available at training time: user behavior is not predictable
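To make the model drift point concrete, here is a minimal sketch of one common way to put a number on the difference between two data samples: the population stability index (PSI). It is not part of the workshop code. The function name, the bin count, and the 0.2 threshold mentioned below are assumptions used only for illustration.

```python
import math
from typing import List, Sequence


def population_stability_index(
    expected: Sequence[float], actual: Sequence[float], bins: int = 10
) -> float:
    """PSI between a reference sample and a newer sample (both non-empty)."""
    lo, hi = min(expected), max(expected)
    # Fall back to a width of 1.0 when every reference value is identical.
    width = (hi - lo) / bins or 1.0

    def bucket_shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            # Clamp out-of-range values into the first or last bucket.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor keeps log() defined for empty buckets.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = bucket_shares(expected), bucket_shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

A PSI close to 0 means the two samples are distributed alike. A value above roughly 0.2 is often quoted as a rule of thumb for a shift large enough to consider retraining, but the right threshold depends on your data.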

Problems

  • Labeling tools maintenance
  • Passing human labeled results around manually
  • Triggering retraining manually
  • How to manage models
  • Updating endpoints without downtime
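The last problem, updating an endpoint without downtime, can be sketched with the SageMaker API: create a new endpoint configuration that points at the retrained model, then call update_endpoint so that SageMaker switches traffic over while the existing endpoint keeps serving. This is a minimal sketch, assuming a boto3 SageMaker client is passed in. The function name, the config naming scheme, the variant name, and the instance type are hypothetical and not taken from the workshop repository.

```python
from typing import Any


def deploy_new_model(sm_client: Any, endpoint_name: str, model_name: str,
                     instance_type: str = "ml.m5.large") -> str:
    """Point an existing endpoint at a newly created model."""
    config_name = f"{model_name}-config"  # hypothetical naming scheme
    sm_client.create_endpoint_config(
        EndpointConfigName=config_name,
        ProductionVariants=[{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InitialInstanceCount": 1,
            "InstanceType": instance_type,
        }],
    )
    # SageMaker provisions the new config, then shifts traffic to it.
    sm_client.update_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=config_name,
    )
    return config_name
```

With real credentials this would be called as deploy_new_model(boto3.client("sagemaker"), "my-endpoint", "my-model-v2"), where both names are placeholders for resources in your own account.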

For example, NLP involves many labeling tasks, such as labeling content or positive/negative sentiment.

This workshop builds the following architecture:

  • Training
  • Deploy to Endpoint
  • Submit an A2I (Amazon Augmented AI) workflow
  • Retraining
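In the A2I step, a typical pattern is to send only uncertain predictions to human reviewers. The following is a minimal sketch of that idea, assuming a boto3 sagemaker-a2i-runtime client is passed in. The confidence threshold, the payload keys, and the function name are hypothetical; the flow definition ARN comes from your own A2I setup, and the workshop notebook may structure this differently.

```python
import json
import uuid
from typing import Any, Optional


def maybe_start_human_loop(a2i_client: Any, flow_definition_arn: str,
                           image_s3_uri: str, confidence: float,
                           threshold: float = 0.7) -> Optional[str]:
    """Start a human review loop when the model is not confident enough."""
    if confidence >= threshold:
        return None  # confident prediction, no human review needed
    loop_name = f"review-{uuid.uuid4()}"
    a2i_client.start_human_loop(
        HumanLoopName=loop_name,
        FlowDefinitionArn=flow_definition_arn,
        # InputContent must be a JSON string. Its keys have to match the
        # worker task template; the keys used here are placeholders.
        HumanLoopInput={"InputContent": json.dumps(
            {"taskObject": image_s3_uri, "confidence": confidence}
        )},
    )
    return loop_name
```

The human-labeled results from such loops are what later feed the retraining step.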

Switch to SageMaker and create a notebook instance

  1. Create the notebook instance
  2. Configure its settings
  3. Open Jupyter
  4. Open a terminal and clone the workshop repository:

cd ~/SageMaker/
git clone https://github.com/catwhiskers/mlops_incremental_learning.git

You can now see the data in Jupyter.

Then go to SageMaker Studio and create one.

Then go to Ground Truth > Labeling workforces. There are three workforce types:

  • Amazon Mechanical Turk: pay workers on Mechanical Turk to do the labeling
  • Private: for your own employees
  • Vendor: for professional labeling vendors

Run the 02-a2i-object-detection-and-retraining.ipynb notebook in Jupyter

and follow the code in it.

You can then see the pipeline in Amazon SageMaker Studio.
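Besides looking in the Studio UI, the pipeline state can also be checked from code. A minimal sketch, assuming a boto3 SageMaker client is passed in and that you know the pipeline name created by the notebook; the function name is hypothetical.

```python
from typing import Any, Optional


def latest_pipeline_status(sm_client: Any, pipeline_name: str) -> Optional[str]:
    """Return the status of the most recent execution, or None if there is none."""
    resp = sm_client.list_pipeline_executions(
        PipelineName=pipeline_name,
        SortBy="CreationTime",
        SortOrder="Descending",
        MaxResults=1,
    )
    summaries = resp.get("PipelineExecutionSummaries", [])
    return summaries[0]["PipelineExecutionStatus"] if summaries else None
```

With real credentials, latest_pipeline_status(boto3.client("sagemaker"), "<your-pipeline-name>") returns a status string such as "Executing" or "Succeeded".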

AWS SageMaker Studio

Next, open 03-prepare-lambda-functions.ipynb in Jupyter.

Running its code creates the Lambda functions; you can view their code in the AWS Lambda console.

It also sets up SQS, which you can see in the Amazon SQS console.
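For readers who have not connected Lambda to SQS before, here is a minimal sketch of what a Lambda handler receiving SQS messages looks like. It is not the workshop's Lambda code. The message contents and the returned dictionary are assumptions; only the event shape (a "Records" list whose items carry a "body" string) is the standard SQS-to-Lambda format.

```python
import json
from typing import Any, Dict, List


def extract_messages(event: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Decode the JSON body of every SQS record in a Lambda event."""
    return [json.loads(record["body"]) for record in event.get("Records", [])]


def lambda_handler(event: Dict[str, Any], context: Any) -> Dict[str, int]:
    messages = extract_messages(event)
    # A real handler would act on each message here, for example by
    # collecting human-labeled results or starting a retraining run.
    return {"processed": len(messages)}
```

Keeping the parsing in its own function makes the handler easy to test locally with a hand-written event before deploying it.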

AWS SQS

You need to follow the guide in the Jupyter notebook to set up the IAM role.

AWS Lambda


MorrisLin

Back-end engineer turned blockchain software engineer