Nepali Cash Detection and Recognition Using TensorFlow and CNN
Transfer learning with InceptionV3 to detect and classify 7 Nepali banknote denominations, achieving 93% accuracy. Updated in 2023 with more classes and a larger dataset.
April 1, 2019 · 4 min read · By Kshitiz Regmi
This project detects and classifies Nepali banknotes using transfer learning with the InceptionV3 architecture, achieving 93.03% accuracy on both test and validation data. The model was originally built in 2019 and updated in 2023 to include additional denominations and a larger training dataset.
Problem Statement
Programmatic banknote identification has real-world applications:
- Assistive technology for the visually impaired
- ATM and cash-handling automation
- Financial fraud detection
For Nepali currency, the challenge is identifying 7+ denominations with varying colors, sizes, and security features under different lighting conditions.
Dataset
7 banknote classes: Rs. 5, Rs. 10, Rs. 20, Rs. 50, Rs. 100, Rs. 500, Rs. 1000
import os
data_dir = "nepali_notes/"
classes = sorted(os.listdir(data_dir))
print("Classes:", classes)
# ['rs10', 'rs100', 'rs1000', 'rs20', 'rs5', 'rs50', 'rs500']
for cls in classes:
    count = len(os.listdir(os.path.join(data_dir, cls)))
    print(f"  {cls}: {count} images")
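Keras's `flow_from_directory` assigns label indices from the alphanumerically sorted folder names, so index order is not the numeric order of the denominations. Here is a minimal sketch of that mapping, assuming the folder names shown in the comment above; `denomination` is a hypothetical helper, not part of the original project.

```python
# Folder names as listed above; the exact names on disk are an assumption.
folders = ["rs5", "rs10", "rs20", "rs50", "rs100", "rs500", "rs1000"]

# Lexicographic sort: this is the order that label indices follow.
class_indices = {name: i for i, name in enumerate(sorted(folders))}


def denomination(folder_name: str) -> int:
    """Hypothetical helper: turn 'rs100' into the integer 100."""
    return int(folder_name[2:])


# Numeric order, useful for display only (labels still use class_indices).
by_value = sorted(folders, key=denomination)

print(class_indices)
print(by_value)
```

Keeping this mapping next to the saved model avoids mislabeling predictions when the model is loaded without the original generator.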
Data Preprocessing and Augmentation
InceptionV3 expects 299×299 RGB inputs. We apply augmentation to improve generalization:
from tensorflow.keras.preprocessing.image import ImageDataGenerator
# Augmentation for training
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True,
    validation_split=0.2
)
# No augmentation for validation — only rescale
val_datagen = ImageDataGenerator(rescale=1./255, validation_split=0.2)
train_generator = train_datagen.flow_from_directory(
    data_dir,
    target_size=(299, 299),
    batch_size=64,
    class_mode='categorical',
    subset='training'
)
val_generator = val_datagen.flow_from_directory(
    data_dir,
    target_size=(299, 299),
    batch_size=64,
    class_mode='categorical',
    subset='validation'
)
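To get a feel for what `validation_split=0.2` and `batch_size=64` imply, the sketch below estimates the split sizes and the number of batches per epoch. It assumes the validation fraction is taken from each class folder separately and rounded down. The image counts are hypothetical, since the post does not give the real dataset sizes, and `split_sizes` is an illustrative helper.

```python
import math


def split_sizes(images_per_class, validation_split=0.2, batch_size=64):
    """Estimate train/validation image counts and batches per epoch.

    Assumes the validation fraction is taken per class and rounded down.
    """
    val = sum(int(n * validation_split) for n in images_per_class.values())
    train = sum(images_per_class.values()) - val
    return {
        "train_images": train,
        "val_images": val,
        # A generator yields ceil(n / batch_size) batches per epoch;
        # the last batch may be smaller than batch_size.
        "train_steps": math.ceil(train / batch_size),
        "val_steps": math.ceil(val / batch_size),
    }


# Hypothetical counts for three of the seven classes.
counts = {"rs5": 500, "rs10": 520, "rs20": 480}
sizes = split_sizes(counts)
print(sizes)
```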
Transfer Learning with InceptionV3
Why InceptionV3?
InceptionV3 is pre-trained on ImageNet (1.2M images, 1000 classes). Its convolutional backbone has already learned rich visual features — edges, textures, shapes, and complex patterns. For a domain-specific task like banknote classification, we only need to:
- Freeze the pre-trained convolutional layers
- Replace the top classification head with our 7-class output
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras import layers, Model
import tensorflow as tf
# Load InceptionV3 without the top classification layer
base_model = InceptionV3(
    input_shape=(299, 299, 3),
    include_top=False,
    weights='imagenet'
)
# Freeze all base model weights
base_model.trainable = False
# Add a custom classification head
x = base_model.output
x = layers.GlobalAveragePooling2D()(x) # 2048-dim feature vector
x = layers.Dense(256, activation='relu')(x)
x = layers.Dropout(0.5)(x) # Regularization
output = layers.Dense(7, activation='softmax')(x) # 7 Nepali note classes
model = Model(inputs=base_model.input, outputs=output)
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss='categorical_crossentropy',
    metrics=['accuracy']
)
print(f"Total params: {model.count_params():,}")
print(f"Trainable params: {sum(tf.size(v).numpy() for v in model.trainable_variables):,}")
# Total params: 22,329,127
# Trainable params: 526,343 (only the head we added)
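The trainable count can be checked by hand, because only the two `Dense` layers in the head carry weights; `GlobalAveragePooling2D` and `Dropout` have none. The sketch below is a quick arithmetic check worth comparing against the counts printed by the code above. It takes the 2048-dimensional pooled feature vector from the comment in the model code, and `dense_params` is an illustrative helper.

```python
def dense_params(inputs: int, units: int) -> int:
    """Parameters of a fully connected layer: weight matrix plus one bias per unit."""
    return inputs * units + units


hidden = dense_params(2048, 256)   # pooled features -> 256 hidden units
output = dense_params(256, 7)      # hidden units -> 7 note classes
head = hidden + output

print(f"Dense(256): {hidden:,}")
print(f"Dense(7):   {output:,}")
print(f"Head total: {head:,}")
```

The total parameter count is this head plus the frozen backbone, whose weights do not receive gradient updates while `base_model.trainable` is `False`.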
Training
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau, ModelCheckpoint
callbacks = [
    EarlyStopping(monitor='val_accuracy', patience=5, restore_best_weights=True),
    ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=3, min_lr=1e-6),
    ModelCheckpoint('best_model.h5', monitor='val_accuracy', save_best_only=True),
]
# The batch size comes from the generators, so it is not passed to fit()
history = model.fit(
    train_generator,
    epochs=10,
    validation_data=val_generator,
    callbacks=callbacks
)
Epoch 1/10 — loss: 1.2341, accuracy: 0.6812, val_accuracy: 0.8754
Epoch 5/10 — loss: 0.2143, accuracy: 0.9321, val_accuracy: 0.9187
Epoch 10/10 — loss: 0.1421, accuracy: 0.9498, val_accuracy: 0.9303
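With `patience=5` and only 10 epochs, early stopping fires only if validation accuracy fails to improve for five epochs in a row. The sketch below is a simplified simulation of that rule, not Keras's implementation: it ignores options such as `min_delta` and `baseline`. The second score list is hypothetical, and `early_stopping_epoch` is an illustrative helper.

```python
def early_stopping_epoch(val_scores, patience=5):
    """Return (stop_epoch, best_epoch), both 1-based.

    stop_epoch is None when training runs to the end without triggering.
    """
    best = float("-inf")
    best_epoch = None
    wait = 0
    for epoch, score in enumerate(val_scores, start=1):
        if score > best:
            # Strict improvement resets the patience counter.
            best, best_epoch, wait = score, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch


# The three validation accuracies shown in the log above: always improving.
logged = early_stopping_epoch([0.8754, 0.9187, 0.9303])

# Hypothetical run that plateaus after the third epoch.
plateau = early_stopping_epoch(
    [0.80, 0.85, 0.90, 0.89, 0.90, 0.88, 0.89, 0.90, 0.87, 0.86]
)
print(logged, plateau)
```

In the plateau case, `restore_best_weights=True` would roll the model back to the weights from the best epoch rather than keeping the last ones.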

Results
| Metric | Training | Validation |
|---|---|---|
| Accuracy | 95.0% | 93.03% |
| Precision | — | 93.03% |
| Recall | — | 93.03% |
| F1-Score | — | 93.03% |
Balanced performance across all 7 denominations with no significant class bias.
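Identical precision, recall, and F1 values equal to the accuracy are what micro-averaging produces on a single-label problem; whether the table above was micro-averaged is an assumption. The sketch below computes per-class metrics, which are the numbers that would actually reveal class bias. The labels are hypothetical, and `per_class_metrics` is an illustrative helper written in plain Python.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return (per-class report, micro-averaged F1, accuracy)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    def ratio(num, den):
        return num / den if den else 0.0

    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        p = ratio(tp[label], tp[label] + fp[label])
        r = ratio(tp[label], tp[label] + fn[label])
        report[label] = {"precision": p, "recall": r, "f1": ratio(2 * p * r, p + r)}

    hits = sum(tp.values())
    micro_p = ratio(hits, hits + sum(fp.values()))
    micro_r = ratio(hits, hits + sum(fn.values()))
    micro_f1 = ratio(2 * micro_p * micro_r, micro_p + micro_r)
    return report, micro_f1, ratio(hits, len(y_true))


# Hypothetical labels for six test images.
y_true = ["rs5", "rs5", "rs10", "rs10", "rs100", "rs100"]
y_pred = ["rs5", "rs10", "rs10", "rs10", "rs100", "rs5"]
report, micro_f1, accuracy = per_class_metrics(y_true, y_pred)
print(report)
print(micro_f1, accuracy)
```

Here the micro-averaged F1 matches the accuracy, while the per-class rows differ from one another.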

Inference
from tensorflow.keras.preprocessing import image
import numpy as np
class_labels = {v: k for k, v in train_generator.class_indices.items()}
def predict_banknote(img_path: str):
    img = image.load_img(img_path, target_size=(299, 299))
    x = image.img_to_array(img) / 255.0
    x = np.expand_dims(x, axis=0)
    predictions = model.predict(x, verbose=0)
    class_idx = np.argmax(predictions[0])
    confidence = predictions[0][class_idx]
    print(f"Predicted: {class_labels[class_idx]}")
    print(f"Confidence: {confidence:.2%}")
    # Show top 3 predictions
    top3 = np.argsort(predictions[0])[::-1][:3]
    for idx in top3:
        print(f"  {class_labels[idx]}: {predictions[0][idx]:.2%}")
predict_banknote("test_note.jpg")
# Predicted: rs100
# Confidence: 97.34%
# rs100: 97.34%
# rs50: 2.11%
# rs500: 0.55%
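For uses such as assistive technology, it may be safer to abstain than to announce a low-confidence guess. The sketch below shows one way to do that on top of the softmax output. The `decide` helper and the 0.80 threshold are assumptions, not part of the original project, and the threshold would need tuning on held-out images.

```python
def decide(probabilities, labels, threshold=0.80):
    """Return (label, confidence), or (None, confidence) when below threshold.

    `probabilities` is one softmax vector; `labels[i]` names class index i.
    """
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    confidence = probabilities[best]
    if confidence < threshold:
        return None, confidence
    return labels[best], confidence


# Values from the sample output above, reduced to the three classes shown.
labels = ["rs100", "rs50", "rs500"]
confident = decide([0.9734, 0.0211, 0.0055], labels)

# Hypothetical ambiguous image: no class is clearly ahead.
unsure = decide([0.50, 0.30, 0.20], labels)
print(confident, unsure)
```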

Fine-Tuning (Optional)
For higher accuracy, unfreeze the top layers of InceptionV3 and fine-tune with a lower learning rate:
# Unfreeze the top 50 layers
for layer in base_model.layers[-50:]:
    layer.trainable = True
# Recompile with a very low learning rate
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
    loss='categorical_crossentropy',
    metrics=['accuracy']
)
# Fine-tune for a few more epochs
model.fit(train_generator, epochs=5, validation_data=val_generator)
Fine-tuning can push validation accuracy above 95% when sufficient data is available.
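A common variation when fine-tuning is to leave `BatchNormalization` layers frozen so their ImageNet statistics are not disturbed by small batches. The sketch below is one way to do that and is not the author's method. `unfreeze_top_layers` is a hypothetical helper that matches layers by class name, so it works on `base_model.layers` or on any sequence of objects with a `trainable` attribute.

```python
def unfreeze_top_layers(layers, n=50, keep_frozen=("BatchNormalization",)):
    """Mark the last n layers trainable, skipping the named layer types.

    Returns the number of layers that were unfrozen.
    """
    if n <= 0:
        return 0
    unfrozen = 0
    for layer in layers[-n:]:
        if type(layer).__name__ in keep_frozen:
            continue  # leave normalization statistics untouched
        layer.trainable = True
        unfrozen += 1
    return unfrozen


# Stand-in layer classes so the sketch runs without TensorFlow installed.
class Conv2D:
    def __init__(self):
        self.trainable = False


class BatchNormalization:
    def __init__(self):
        self.trainable = False


stack = [Conv2D(), BatchNormalization(), Conv2D(), BatchNormalization(), Conv2D()]
count = unfreeze_top_layers(stack, n=3)
flags = [layer.trainable for layer in stack]
print(count, flags)
```

With the real model, the call would be `unfreeze_top_layers(base_model.layers, n=50)`, followed by the same recompile step shown above.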
Key Takeaways
- Transfer learning dramatically reduces data requirements — InceptionV3 pretrained on ImageNet generalizes well to banknote classification with a modest dataset.
- Freeze first, fine-tune later — training only the head first establishes a good baseline cheaply, then fine-tuning squeezes out the last few percentage points.
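- 299×299 is InceptionV3's native input size: with include_top=False Keras accepts other sizes (75×75 minimum), but the ImageNet weights were trained at 299×299. Use MobileNetV2 (224×224, fewer params) if inference speed is the priority.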
- Data augmentation is critical — rotation, shifts, and flips simulate real-world capture conditions and prevent overfitting on small per-class datasets.
2023 Update
The model was retrained in 2023 with additional classes (Rs. 2, new Rs. 500, new Rs. 1000 designs) and a significantly larger dataset, maintaining comparable accuracy across all denominations.