Build, Train, and Deploy a Machine Learning Model with Amazon SageMaker
Step-by-step guide to using SageMaker notebook instances to train an XGBoost binary classification model and deploy it as a real-time inference endpoint.
October 10, 2021 · 3 min read · By Kshitiz Regmi
Amazon SageMaker is a fully managed ML platform that enables data scientists and ML engineers to prepare, build, train, and deploy models quickly — without managing infrastructure. It supports Jupyter notebooks, TensorFlow, PyTorch, XGBoost, and more.
Why Amazon SageMaker?
Traditional ML workflows require:
- Setting up and managing Jupyter environments
- Provisioning GPU/CPU compute for training jobs
- Building, containerizing, and maintaining inference servers
SageMaker handles all of this. A SageMaker Notebook Instance is a managed Jupyter environment backed by EC2 — write code, SageMaker handles compute.
What You'll Build
A binary classification model (customer churn prediction) using SageMaker's built-in XGBoost algorithm, trained on S3 data and deployed as a real-time HTTPS endpoint.
Step 1: Create a Notebook Instance
- Open the SageMaker console → Notebook instances → Create notebook instance
- Name: ml-tutorial
- Instance type: ml.t3.medium
- IAM role: Create new (allow S3 access)
- Click Create notebook instance and wait for status InService
Step 2: Set Up the Session
import boto3
import sagemaker
from sagemaker import get_execution_role
session = sagemaker.Session()
role = get_execution_role()
bucket = session.default_bucket()
prefix = "xgboost-churn"
print(f"Role: {role}")
print(f"Bucket: {bucket}")
Step 3: Prepare and Upload Data to S3
SageMaker training jobs read data from S3. The XGBoost built-in algorithm expects CSV data with the target column first and no header row:
import pandas as pd
from sklearn.model_selection import train_test_split
# Assume df is your churn dataset
# Target: 'churn' (0/1), features: all other columns
df = pd.read_csv("churn.csv")
# Move target column to front (SageMaker XGBoost requirement)
cols = ['churn'] + [c for c in df.columns if c != 'churn']
df = df[cols]
train_df, test_df = train_test_split(df, test_size=0.2, random_state=42)
train_df.to_csv("train.csv", index=False, header=False)
test_df.to_csv("test.csv", index=False, header=False)
# Upload to S3
train_path = session.upload_data("train.csv", bucket=bucket, key_prefix=f"{prefix}/train")
test_path = session.upload_data("test.csv", bucket=bucket, key_prefix=f"{prefix}/test")
print("Train data:", train_path)
print("Test data:", test_path)
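Before uploading, it can help to confirm that the files really have the label-first, headerless layout. The sketch below uses only the standard library. `check_xgboost_csv` is a hypothetical helper, not part of the SageMaker SDK, and it assumes a binary 0/1 label and purely numeric feature columns.

```python
import csv
import io


def check_xgboost_csv(text):
    """Return a list of problems found in headerless, label-first CSV text.

    Hypothetical helper for illustration; assumes a binary 0/1 label
    in the first column and numeric values everywhere else.
    """
    problems = []
    width = None
    for line_no, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row:
            continue  # ignore blank lines
        try:
            values = [float(cell) for cell in row]
        except ValueError:
            # A header row or an unencoded categorical column ends up here.
            problems.append(f"line {line_no}: non-numeric value")
            continue
        if values[0] not in (0.0, 1.0):
            problems.append(f"line {line_no}: label {row[0]!r} is not 0 or 1")
        if width is None:
            width = len(values)  # first valid row fixes the column count
        elif len(values) != width:
            problems.append(
                f"line {line_no}: expected {width} columns, got {len(values)}"
            )
    return problems
```

Reading `train.csv` into a string and passing it to `check_xgboost_csv` should return an empty list for a well-formed file. A non-empty list usually points at a leftover header or a feature that still needs encoding.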
Step 4: Configure the XGBoost Estimator
SageMaker provides built-in algorithms as Docker images:
from sagemaker.estimator import Estimator
xgboost_image = sagemaker.image_uris.retrieve(
"xgboost", session.boto_region_name, "1.5-1"
)
estimator = Estimator(
image_uri=xgboost_image,
role=role,
instance_count=1,
instance_type="ml.m5.xlarge",
output_path=f"s3://{bucket}/{prefix}/output",
sagemaker_session=session,
)
estimator.set_hyperparameters(
objective="binary:logistic",
num_round=100,
max_depth=5,
eta=0.2,
eval_metric="auc",
subsample=0.8,
colsample_bytree=0.8,
)
Step 5: Train
from sagemaker.inputs import TrainingInput
train_input = TrainingInput(train_path, content_type="text/csv")
val_input = TrainingInput(test_path, content_type="text/csv")
estimator.fit({"train": train_input, "validation": val_input})
SageMaker:
- Spins up an ml.m5.xlarge training instance
- Pulls the XGBoost container
- Downloads data from S3
- Runs training, logging metrics every round
- Saves the model artifact to S3
- Terminates the training instance
You're billed only for the training duration.
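To make the billing point concrete, here is a rough cost estimate based on the fields that the DescribeTrainingJob API reports. `training_cost` is a hypothetical helper, and the hourly rate and duration are placeholder values rather than real prices or measurements. Check current pricing for your region.

```python
def training_cost(job_description, hourly_rate_usd):
    """Estimate the cost of one training job from a DescribeTrainingJob-style dict.

    hourly_rate_usd is an assumption supplied by the caller; look up the
    current price for the instance type and region.
    """
    billable_seconds = job_description["BillableTimeInSeconds"]
    instance_count = job_description["ResourceConfig"]["InstanceCount"]
    # Billable time is wall-clock time, so multiply by the number of instances.
    return billable_seconds * instance_count * hourly_rate_usd / 3600


# Illustrative values only; neither the duration nor the rate is real data.
example = {
    "BillableTimeInSeconds": 180,
    "ResourceConfig": {"InstanceType": "ml.m5.xlarge", "InstanceCount": 1},
}
print(f"Estimated cost: ${training_cost(example, hourly_rate_usd=0.25):.4f}")
```

In the notebook, the real dictionary would presumably come from `session.sagemaker_client.describe_training_job(TrainingJobName=estimator.latest_training_job.name)`.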
Step 6: Deploy as a Real-Time Endpoint
predictor = estimator.deploy(
initial_instance_count=1,
instance_type="ml.m5.large",
)
This creates a persistent HTTPS endpoint. To run inference, send a feature row as CSV:
from sagemaker.serializers import CSVSerializer
# Serialize request payloads as text/csv, which the XGBoost container accepts
predictor.serializer = CSVSerializer()
# Send a feature row (without label)
result = predictor.predict("45,1,200,3,1,2500,0,1")
print(result) # e.g., b'0.73' — probability of churn
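The raw response is a byte string, so you will usually want to convert it before using it. The sketch below turns it into probabilities and 0/1 labels. `parse_predictions` is a hypothetical helper, the 0.5 threshold is an assumption you should tune for your data, and the code accepts both commas and newlines because the separator for multi-row responses may differ between container versions.

```python
def parse_predictions(payload, threshold=0.5):
    """Turn a raw CSV response into (probability, label) pairs.

    Accepts bytes or str, and splits on commas and newlines because the
    separator between scores is assumed to vary by container version.
    """
    if isinstance(payload, bytes):
        payload = payload.decode("utf-8")
    tokens = payload.replace("\n", ",").split(",")
    scores = [float(tok) for tok in tokens if tok.strip()]
    # Label 1 (churn) when the probability reaches the threshold.
    return [(score, int(score >= threshold)) for score in scores]
```

For example, `parse_predictions(result)` on the response above would give a single pair, with the label set to 1 when the score is at or above the threshold.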
Step 7: Clean Up
predictor.delete_endpoint()
Endpoints are billed by the hour even when idle. Always delete after testing.
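One way to avoid forgetting the cleanup is to wrap the predictor in a context manager so the endpoint is deleted even if a test request raises. This is a minimal sketch. `temporary_endpoint` is a hypothetical name, and it relies only on the `delete_endpoint()` method used above.

```python
from contextlib import contextmanager


@contextmanager
def temporary_endpoint(predictor):
    """Yield the predictor, then delete its endpoint even if the body raises."""
    try:
        yield predictor
    finally:
        predictor.delete_endpoint()


# Intended usage in the notebook (not executed here):
# with temporary_endpoint(estimator.deploy(
#         initial_instance_count=1, instance_type="ml.m5.large")) as predictor:
#     predictor.serializer = CSVSerializer()
#     print(predictor.predict("45,1,200,3,1,2500,0,1"))
```

This pattern suits short experiments. For an endpoint that should stay up, call `delete_endpoint()` explicitly when you retire it.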
SageMaker Architecture
Notebook Instance
│
├── Training Job (ml.m5.xlarge — spun up and terminated automatically)
│ │
│ └── Model artifact → S3
│
└── Inference Endpoint (ml.m5.large — persistent until deleted)
Key Advantages Over DIY
| | DIY | SageMaker |
|---|---|---|
| Infra setup | Manual EC2 + Docker | Fully managed |
| Distributed training | Complex | instance_count > 1 |
| Model registry | Build yourself | Built-in |
| Monitoring | Custom | CloudWatch + Model Monitor |
| Autoscaling | Manual | Built-in with target tracking |
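The autoscaling row refers to target tracking through Application Auto Scaling. As a rough sketch, the function below only builds the request arguments and calls no AWS API. `autoscaling_requests` is a hypothetical helper, and the capacity limits, target value, and policy name are placeholder assumptions. `AllTraffic` is assumed to be the variant name of an endpoint created with `estimator.deploy()`.

```python
def autoscaling_requests(endpoint_name, variant_name="AllTraffic",
                         min_instances=1, max_instances=3,
                         invocations_per_instance=100.0):
    """Build arguments for register_scalable_target and put_scaling_policy."""
    resource_id = f"endpoint/{endpoint_name}/variant/{variant_name}"
    dimension = "sagemaker:variant:DesiredInstanceCount"
    target = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "MinCapacity": min_instances,
        "MaxCapacity": max_instances,
    }
    policy = {
        "PolicyName": f"{endpoint_name}-invocations-target",  # placeholder name
        "ServiceNamespace": "sagemaker",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            # Scale so each instance handles roughly this many requests per minute.
            "TargetValue": invocations_per_instance,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance",
            },
        },
    }
    return target, policy


# Applying the settings (not executed here):
# client = boto3.client("application-autoscaling")
# target, policy = autoscaling_requests(predictor.endpoint_name)
# client.register_scalable_target(**target)
# client.put_scaling_policy(**policy)
```

Keeping the arguments in plain dictionaries makes them easy to review before any AWS call is made.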
What's Next
- SageMaker Pipelines — orchestrate multi-step ML workflows
- SageMaker Model Monitor — detect data drift and model degradation
- SageMaker Experiments — track hyperparameter experiments
- SageMaker Studio — fully integrated web-based IDE for ML