Nepali Cash Detection and Recognition Using TensorFlow and CNN

Transfer learning with InceptionV3 to detect and classify 7 Nepali banknote denominations, achieving 93% accuracy. Updated in 2023 with more classes and a larger dataset.

April 1, 2019 · 4 min read · By Kshitiz Regmi

This project detects and classifies Nepali banknotes using transfer learning with the InceptionV3 architecture, achieving 93.03% accuracy on both test and validation data. The model was originally built in 2019 and updated in 2023 to include additional denominations and a larger training dataset.

Problem Statement

Programmatic banknote identification has real-world applications:

Assistive technology for the visually impaired
ATM and cash-handling automation
Financial fraud detection

For Nepali currency, the challenge is identifying 7+ denominations with varying colors, sizes, and security features under different lighting conditions.

Dataset

7 banknote classes: Rs. 5, Rs. 10, Rs. 20, Rs. 50, Rs. 100, Rs. 500, Rs. 1000

import os

data_dir = "nepali_notes/"
classes = sorted(os.listdir(data_dir))
print("Classes:", classes)
# ['rs10', 'rs100', 'rs1000', 'rs20', 'rs5', 'rs50', 'rs500']  (sorted() orders strings lexicographically)

for cls in classes:
    count = len(os.listdir(os.path.join(data_dir, cls)))
    print(f"  {cls}: {count} images")
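The order of the class folders matters later, because Keras's `flow_from_directory` assigns label indices in alphanumeric order of the subdirectory names, not by face value. Here is a minimal sketch of the difference, assuming the folders follow the `rs<amount>` naming shown in the listing above. The helper `denomination` is hypothetical and not part of the original project.

```python
# Folder names assumed from the listing above; adjust if yours differ.
folders = ["rs5", "rs10", "rs20", "rs50", "rs100", "rs500", "rs1000"]

def denomination(name: str) -> int:
    """Extract the numeric face value from a folder name such as 'rs100'."""
    return int(name[len("rs"):])

# String order: this is the order that determines the label indices.
alphabetical = sorted(folders)
# Numeric order: useful only for display, e.g. in reports or a confusion matrix.
by_value = sorted(folders, key=denomination)

class_indices = {name: i for i, name in enumerate(alphabetical)}

print(alphabetical)          # ['rs10', 'rs100', 'rs1000', 'rs20', 'rs5', 'rs50', 'rs500']
print(by_value)              # ['rs5', 'rs10', 'rs20', 'rs50', 'rs100', 'rs500', 'rs1000']
print(class_indices["rs5"])  # 4
```

Any code that turns a predicted index back into a label should therefore use the generator's own `class_indices` mapping rather than a hand-written list in face-value order.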

Data Preprocessing and Augmentation

InceptionV3 uses 299×299 RGB inputs by default. We apply augmentation to the training images to improve generalization:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation for training
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True,
    validation_split=0.2
)

# No augmentation for validation — only rescale
val_datagen = ImageDataGenerator(rescale=1./255, validation_split=0.2)

train_generator = train_datagen.flow_from_directory(
    data_dir,
    target_size=(299, 299),
    batch_size=64,
    class_mode='categorical',
    subset='training'
)

val_generator = val_datagen.flow_from_directory(
    data_dir,
    target_size=(299, 299),
    batch_size=64,
    class_mode='categorical',
    subset='validation'
)
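The generators above scale pixels to [0, 1] with `rescale=1./255`. For comparison, Keras's `tf.keras.applications.inception_v3.preprocess_input` maps pixels to [-1, 1], the range used when the ImageNet weights were trained. The sketch below only illustrates the arithmetic of the two scalings on a few sample values. The function names are hypothetical, and the original post does not measure whether switching would change the reported accuracy.

```python
def scale_unit(pixel: float) -> float:
    """Map [0, 255] to [0, 1], equivalent to rescale=1./255."""
    return pixel / 255.0

def scale_symmetric(pixel: float) -> float:
    """Map [0, 255] to [-1, 1], the arithmetic of Inception-style preprocessing."""
    return pixel / 127.5 - 1.0

for p in (0.0, 127.5, 255.0):
    print(f"{p:6.1f} -> unit {scale_unit(p):.3f}, symmetric {scale_symmetric(p):+.3f}")
```

To try the symmetric scaling, one option is to pass `preprocessing_function=preprocess_input` to `ImageDataGenerator` in place of `rescale`. The same transformation would then also have to be applied at inference time.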

Transfer Learning with InceptionV3

Why InceptionV3?

InceptionV3 is pre-trained on ImageNet (1.2M images, 1000 classes). Its convolutional backbone has already learned rich visual features — edges, textures, shapes, and complex patterns. For a domain-specific task like banknote classification, we only need to:

Freeze the pre-trained convolutional layers
Replace the top classification head with our 7-class output

from tensorflow.keras.applications import InceptionV3
from tensorflow.keras import layers, Model
import tensorflow as tf

# Load InceptionV3 without the top classification layer
base_model = InceptionV3(
    input_shape=(299, 299, 3),
    include_top=False,
    weights='imagenet'
)

# Freeze all base model weights
base_model.trainable = False

# Add a custom classification head
x = base_model.output
x = layers.GlobalAveragePooling2D()(x)      # 2048-dim feature vector
x = layers.Dense(256, activation='relu')(x)
x = layers.Dropout(0.5)(x)                  # Regularization
output = layers.Dense(7, activation='softmax')(x)  # 7 Nepali note classes

model = Model(inputs=base_model.input, outputs=output)

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

print(f"Total params:     {model.count_params():,}")
print(f"Trainable params: {sum(tf.size(v).numpy() for v in model.trainable_variables):,}")
# Total params: 22,329,127
# Trainable params: 526,343  (only the head we added)
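The size of the trainable head can be checked by hand. A dense layer has one weight per input-output pair plus one bias per output, while `GlobalAveragePooling2D` and `Dropout` add no parameters. The following sketch works through that arithmetic for the head defined above; `dense_params` is a hypothetical helper.

```python
def dense_params(n_in: int, n_out: int) -> int:
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out

hidden = dense_params(2048, 256)  # pooled InceptionV3 features -> Dense(256)
output = dense_params(256, 7)     # Dense(256) -> 7-way softmax
head_total = hidden + output

print(f"Dense(256): {hidden:,}")      # Dense(256): 524,544
print(f"Dense(7):   {output:,}")      # Dense(7):   1,799
print(f"Head total: {head_total:,}")  # Head total: 526,343
```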

Training

from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau, ModelCheckpoint

callbacks = [
    EarlyStopping(monitor='val_accuracy', patience=5, restore_best_weights=True),
    ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=3, min_lr=1e-6),
    ModelCheckpoint('best_model.h5', monitor='val_accuracy', save_best_only=True),
]

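# Batch size is already set on the generators; passing batch_size to fit()
# together with a generator input raises a ValueError in TF 2.x.
history = model.fit(
    train_generator,
    epochs=10,
    validation_data=val_generator,
    callbacks=callbacks
)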
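To make the `ReduceLROnPlateau` settings above concrete (factor 0.5, patience 3, floor 1e-6), here is a simplified pure-Python simulation of the plateau logic. It is a sketch, not Keras's implementation: it treats any strictly lower `val_loss` as an improvement and ignores the `min_delta` and `cooldown` options. The loss values are hypothetical and not taken from the actual run.

```python
def simulate_lr(val_losses, lr=1e-3, factor=0.5, patience=3, min_lr=1e-6):
    """Return the learning rate in effect after each epoch's plateau check."""
    best = float("inf")
    wait = 0
    schedule = []
    for loss in val_losses:
        if loss < best:
            # Improvement: remember it and reset the patience counter.
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                # Plateau: shrink the rate, but never below the floor.
                lr = max(lr * factor, min_lr)
                wait = 0
        schedule.append(lr)
    return schedule

# Hypothetical validation losses for nine epochs.
losses = [0.60, 0.45, 0.46, 0.47, 0.48, 0.44, 0.45, 0.46, 0.47]
for epoch, rate in enumerate(simulate_lr(losses), start=1):
    print(f"epoch {epoch}: lr = {rate:.6f}")
```

With these numbers the rate is halved after epoch 5 and again after epoch 9, each time following three epochs without a new best loss.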

Epoch 1/10 — loss: 1.2341, accuracy: 0.6812, val_accuracy: 0.8754
Epoch 5/10 — loss: 0.2143, accuracy: 0.9321, val_accuracy: 0.9187
Epoch 10/10 — loss: 0.1421, accuracy: 0.9498, val_accuracy: 0.9303

Figures: training and validation accuracy; training and validation loss.

Results

Metric	Training	Validation
Accuracy	95.0%	93.03%
Precision	—	93.03%
Recall	—	93.03%
F1-Score	—	93.03%

Balanced performance across all 7 denominations with no significant class bias.
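The table reports the same value for accuracy, precision, recall, and F1. The post does not say how the per-class scores were averaged. One reading that would produce identical numbers is micro-averaging, which for single-label multiclass classification always equals accuracy. The sketch below illustrates this with a hypothetical 3-class confusion matrix (not the project's data) and contrasts it with macro-averaging.

```python
# Hypothetical confusion matrix: rows = true class, columns = predicted class.
cm = [
    [45, 3, 2],
    [4, 40, 6],
    [1, 2, 47],
]
n = len(cm)
total = sum(sum(row) for row in cm)

tp = [cm[i][i] for i in range(n)]                                    # correct per class
fp = [sum(cm[r][i] for r in range(n)) - cm[i][i] for i in range(n)]  # column minus diagonal
fn = [sum(cm[i]) - cm[i][i] for i in range(n)]                       # row minus diagonal

accuracy = sum(tp) / total

# Micro-averaging pools the counts over all classes before dividing.
micro_precision = sum(tp) / (sum(tp) + sum(fp))
micro_recall = sum(tp) / (sum(tp) + sum(fn))
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)

# Macro-averaging takes the unweighted mean of the per-class scores.
macro_precision = sum(tp[i] / (tp[i] + fp[i]) for i in range(n)) / n
macro_recall = sum(tp[i] / (tp[i] + fn[i]) for i in range(n)) / n

print(f"accuracy        {accuracy:.4f}")
print(f"micro P / R / F {micro_precision:.4f} / {micro_recall:.4f} / {micro_f1:.4f}")
print(f"macro P / R     {macro_precision:.4f} / {macro_recall:.4f}")
```

Every misclassified sample counts once as a false positive for the predicted class and once as a false negative for the true class, so the pooled counts are equal and the micro scores collapse to accuracy. Per-class or macro-averaged scores are the ones that would reveal class bias.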

Figure: confusion matrix.

Inference

from tensorflow.keras.preprocessing import image
import numpy as np

class_labels = {v: k for k, v in train_generator.class_indices.items()}

def predict_banknote(img_path: str):
    img = image.load_img(img_path, target_size=(299, 299))
    x = image.img_to_array(img) / 255.0
    x = np.expand_dims(x, axis=0)
    
    predictions = model.predict(x, verbose=0)
    class_idx = np.argmax(predictions[0])
    confidence = predictions[0][class_idx]
    
    print(f"Predicted: {class_labels[class_idx]}")
    print(f"Confidence: {confidence:.2%}")
    
    # Show top 3 predictions
    top3 = np.argsort(predictions[0])[::-1][:3]
    for idx in top3:
        print(f"  {class_labels[idx]}: {predictions[0][idx]:.2%}")

predict_banknote("test_note.jpg")
# Predicted: rs100
# Confidence: 97.34%
#   rs100: 97.34%
#   rs50: 2.11%
#   rs500: 0.55%
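The top-3 step above relies on `np.argsort(...)[::-1][:3]`, which sorts indices by ascending probability, reverses them, and keeps the first three. The same idea can be sketched without NumPy. `top_k` is a hypothetical helper, and the probability vector is made up to mirror the example output, with labels in the alphanumeric order used for the class indices.

```python
def top_k(probs, labels, k=3):
    """Return the k (label, probability) pairs with the highest probability."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(labels[i], probs[i]) for i in order[:k]]

# Labels in alphanumeric (class index) order; probabilities are illustrative.
labels = ["rs10", "rs100", "rs1000", "rs20", "rs5", "rs50", "rs500"]
probs = [0.0, 0.9734, 0.0, 0.0, 0.0, 0.0211, 0.0055]

for name, p in top_k(probs, labels):
    print(f"{name}: {p:.2%}")
# rs100: 97.34%
# rs50: 2.11%
# rs500: 0.55%
```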

Figures: Streamlit app home screen; inference example 1; inference example 2.

Fine-Tuning (Optional)

For higher accuracy, unfreeze the top layers of InceptionV3 and fine-tune with a lower learning rate:

# Unfreeze the top 50 layers
for layer in base_model.layers[-50:]:
    layer.trainable = True

# Recompile with a very low learning rate
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

# Fine-tune for a few more epochs
model.fit(train_generator, epochs=5, validation_data=val_generator)

Fine-tuning can push validation accuracy above 95% when sufficient data is available.
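One refinement worth considering when unfreezing part of a pretrained network: the Keras transfer-learning guide recommends keeping `BatchNormalization` layers in inference mode during fine-tuning, so that their running statistics are not disturbed. The selection logic can be sketched without TensorFlow using stand-in objects. `LayerStub` and `unfreeze_top` are hypothetical, and in real Keras code the check would be `isinstance(layer, tf.keras.layers.BatchNormalization)`.

```python
from dataclasses import dataclass

@dataclass
class LayerStub:
    """Stand-in for a Keras layer: a name, a type label, and a trainable flag."""
    name: str
    kind: str
    trainable: bool = False

def unfreeze_top(layers, n, keep_frozen=("BatchNormalization",)):
    """Mark the last n layers trainable, except kinds listed in keep_frozen."""
    for layer in layers[-n:]:
        layer.trainable = layer.kind not in keep_frozen
    return [layer.name for layer in layers if layer.trainable]

# A toy five-layer stack, all frozen to begin with.
stack = [
    LayerStub("conv_a", "Conv2D"),
    LayerStub("bn_a", "BatchNormalization"),
    LayerStub("conv_b", "Conv2D"),
    LayerStub("bn_b", "BatchNormalization"),
    LayerStub("mixed", "Concatenate"),
]

print(unfreeze_top(stack, 3))  # ['conv_b', 'mixed']
```

Whether this changes the result for the banknote model is not something the original post reports, so treat it as an option to test rather than a required step.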

Key Takeaways

Transfer learning sharply reduces data requirements: InceptionV3 pretrained on ImageNet generalizes well to banknote classification with a modest dataset.
Freeze first, fine-tune later: training only the head establishes a good baseline cheaply, and fine-tuning can then recover the last few percentage points.
299×299 is InceptionV3's default input size: with include_top=False Keras accepts other sizes (75×75 or larger), but the pretrained features were learned at 299×299. Use MobileNetV2 (224×224 default, fewer parameters) if inference speed is the priority.
Data augmentation matters: rotation, shifts, and flips simulate real-world capture conditions and help reduce overfitting on small per-class datasets.

2023 Update

The model was retrained in 2023 with additional classes (Rs. 2, new Rs. 500, new Rs. 1000 designs) and a significantly larger dataset, maintaining comparable accuracy across all denominations.