FAI-CLS (FocoosAI Classification)#
Overview#
Fai-cls is a versatile image classification model developed by FocoosAI that can utilize any backbone architecture for feature extraction. This model is designed for both single-label and multi-label image classification tasks, offering flexibility in architecture choices and training configurations.
The model employs a simple yet effective approach: a configurable backbone extracts features from input images, followed by a classification head that produces class predictions. This design enables easy adaptation to different domains and datasets while maintaining high performance and computational efficiency.
Available Models#
Currently, you can find 3 fai-cls models on the Focoos Hub, all trained on COCO dataset for image classification.
| Model Name | Architecture | Domain (Classes) | Dataset | Metric | FPS Nvidia-T4 |
|---|---|---|---|---|---|
| fai-cls-n-coco | Classification (STDC-Small) | Common Objects (80) | COCO | F1: 48.66 Precision: 58.48 Recall: 41.66 |
- |
| fai-cls-s-coco | Classification (STDC-Small) | Common Objects (80) | COCO | F1: 61.92 Precision: 68.69 Recall: 56.37 |
- |
| fai-cls-m-coco | Classification (STDC-Large) | Common Objects (80) | COCO | F1: 66.98 Precision: 73.00 Recall: 61.88 |
- |
Supported dataset#
-
ROBOFLOW_COCO (multi-class)
Neural Network Architecture#
The FAI-CLS architecture consists of two main components:
Backbone#
- Purpose: Feature extraction from input images
- Design: Configurable backbone network (ResNet, EfficientNet, STDC, etc.)
- Output: High-level feature representations
- Feature Selection: Uses specified feature level (default: "res5" for highest-level features)
- Flexibility: Supports any backbone that provides the required output shape
Classification Head#
- Architecture: Multi-layer perceptron (MLP) with configurable depth
-
Components:
-
Global Average Pooling (AdaptiveAvgPool2d) for spatial dimension reduction
- Flatten layer to convert 2D features to 1D
- Linear layers with ReLU activation
- Dropout for regularization
- Final linear layer for class predictions
-
Configurations:
-
Single Layer: Direct mapping from features to classes
- Two Layer: Hidden layer with ReLU and dropout for better feature transformation
Configuration Parameters#
Core Model Parameters#
num_classes(int): Number of classification classesbackbone_config(BackboneConfig): Backbone network configuration
Architecture Configuration#
hidden_dim(int, default=512): Hidden layer dimension for two-layer classifierdropout_rate(float, default=0.2): Dropout probability for regularizationfeatures(str, default="res5"): Feature level to extract from backbonenum_layers(int, default=2): Number of classification layers (1 or 2)
Loss Configuration#
use_focal_loss(bool, default=False): Use focal loss instead of cross-entropyfocal_alpha(float, default=0.75): Alpha parameter for focal lossfocal_gamma(float, default=2.0): Gamma parameter for focal losslabel_smoothing(float, default=0.0): Label smoothing factormulti_label(bool, default=False): Enable multi-label classification
Supported Tasks#
Single-Label Classification#
- Output: Single class prediction per image
-
Use Cases:
- Image categorization (animals, objects, scenes)
- Medical image diagnosis
- Quality control in manufacturing
- Content moderation
- Agricultural crop classification
-
Loss: Cross-entropy or focal loss
- Configuration: Set
multi_label=False
Multi-Label Classification#
- Output: Multiple class predictions per image
-
Use Cases:
- Multi-object recognition
- Image tagging and annotation
- Scene attribute recognition
- Medical condition classification
- Content-based image retrieval
-
Loss: Binary cross-entropy with logits
- Configuration: Set
multi_label=True
Model Outputs#
Training Output (ClassificationModelOutput)#
logits(torch.Tensor): Shape [B, num_classes] - Raw class predictionsloss(Optional[dict]): Training loss including:loss_cls: Classification loss (cross-entropy, focal, or BCE)
Inference Output#
For each detected object:
conf(float): Confidence scorecls_id(int): Class identifierlabel(Optional[str]): Human-readable class name
Losses#
The model supports multiple loss function configurations:
Cross-Entropy Loss (Default)#
- Use Case: Standard single-label classification
- Features: Optional label smoothing for better generalization
- Activation: Softmax for probability distribution
Binary Cross-Entropy Loss#
- Use Case: Multi-label classification tasks
- Features: Independent probability for each class
- Activation: Sigmoid for per-class probabilities
Architecture Variants#
Single-Layer Classifier#
1 | |
Two-Layer Classifier#
1 | |
Training Strategies#
Standard Training#
- Use cross-entropy loss with appropriate learning rate scheduling
- Apply data augmentation for better generalization
- Monitor validation accuracy for early stopping
Imbalanced Data#
- Enable focal loss with appropriate α and γ parameters
- Consider class weighting strategies
- Use stratified sampling for validation
Multi-Label Scenarios#
- Set
multi_label=Truein configuration - Use appropriate evaluation metrics (F1-score, mAP)
- Consider threshold optimization for final predictions
This flexible architecture makes FAI-CLS suitable for a wide range of image classification applications, from simple binary classification to complex multi-label scenarios, while maintaining computational efficiency and ease of use.
Quick Start with Pre-trained Model#
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | |
Custom Model Configuration#
Single-Label Classification Setup#
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 | |
Multi-Label Classification Setup#
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 | |