Skip to content

FAI-CLS (FocoosAI Classification)#

Overview#

FAI-CLS is a versatile image classification model developed by FocoosAI that can utilize any backbone architecture for feature extraction. This model is designed for both single-label and multi-label image classification tasks, offering flexibility in architecture choices and training configurations.

The model employs a simple yet effective approach: a configurable backbone extracts features from input images, followed by a classification head that produces class predictions. This design enables easy adaptation to different domains and datasets while maintaining high performance and computational efficiency.

Neural Network Architecture#

The FAI-CLS architecture consists of two main components:

Backbone#

  • Purpose: Feature extraction from input images
  • Design: Configurable backbone network (ResNet, EfficientNet, STDC, etc.)
  • Output: High-level feature representations
  • Feature Selection: Uses specified feature level (default: "res5" for highest-level features)
  • Flexibility: Supports any backbone that provides the required output shape

Classification Head#

  • Architecture: Multi-layer perceptron (MLP) with configurable depth
  • Components:
  • Global Average Pooling (AdaptiveAvgPool2d) for spatial dimension reduction
  • Flatten layer to convert 2D features to 1D
  • Linear layers with ReLU activation
  • Dropout for regularization
  • Final linear layer for class predictions
  • Configurations:
  • Single Layer: Direct mapping from features to classes
  • Two Layer: Hidden layer with ReLU and dropout for better feature transformation

Configuration Parameters#

Core Model Parameters#

  • num_classes (int): Number of classification classes
  • backbone_config (BackboneConfig): Backbone network configuration

Architecture Configuration#

  • hidden_dim (int, default=512): Hidden layer dimension for two-layer classifier
  • dropout_rate (float, default=0.2): Dropout probability for regularization
  • features (str, default="res5"): Feature level to extract from backbone
  • num_layers (int, default=2): Number of classification layers (1 or 2)

Loss Configuration#

  • use_focal_loss (bool, default=False): Use focal loss instead of cross-entropy
  • focal_alpha (float, default=0.75): Alpha parameter for focal loss
  • focal_gamma (float, default=2.0): Gamma parameter for focal loss
  • label_smoothing (float, default=0.0): Label smoothing factor
  • multi_label (bool, default=False): Enable multi-label classification

Supported Tasks#

Single-Label Classification#

  • Output: Single class prediction per image
  • Use Cases:
  • Image categorization (animals, objects, scenes)
  • Medical image diagnosis
  • Quality control in manufacturing
  • Content moderation
  • Agricultural crop classification
  • Loss: Cross-entropy or focal loss
  • Configuration: Set multi_label=False

Multi-Label Classification#

  • Output: Multiple class predictions per image
  • Use Cases:
  • Multi-object recognition
  • Image tagging and annotation
  • Scene attribute recognition
  • Medical condition classification
  • Content-based image retrieval
  • Loss: Binary cross-entropy with logits
  • Configuration: Set multi_label=True

Model Outputs#

Training Output (ClassificationModelOutput)#

  • logits (torch.Tensor): Shape [B, num_classes] - Raw class predictions
  • loss (Optional[dict]): Training loss including:
    • loss_cls: Classification loss (cross-entropy, focal, or BCE)

Inference Output#

For each detected object:

  • conf (float): Confidence score
  • cls_id (int): Class identifier
  • label (Optional[str]): Human-readable class name

Losses#

The model supports multiple loss function configurations:

Cross-Entropy Loss (Default)#

  • Use Case: Standard single-label classification
  • Features: Optional label smoothing for better generalization
  • Activation: Softmax for probability distribution

Focal Loss#

  • Use Case: Imbalanced datasets with hard-to-classify examples
  • Parameters:
  • Alpha (α): Controls importance of rare class
  • Gamma (γ): Focuses learning on hard examples
  • Benefits: Improved performance on imbalanced datasets

Binary Cross-Entropy Loss#

  • Use Case: Multi-label classification tasks
  • Features: Independent probability for each class
  • Activation: Sigmoid for per-class probabilities

Architecture Variants#

Single-Layer Classifier#

1
AdaptiveAvgPool2d(1) → Flatten → Dropout → Linear(features → num_classes)
- Benefits: Faster inference, fewer parameters - Use Case: Simple datasets or when computational efficiency is critical

Two-Layer Classifier#

1
AdaptiveAvgPool2d(1) → Flatten → Linear(features → hidden_dim) → ReLU → Dropout → Linear(hidden_dim → num_classes)
- Benefits: Better feature transformation, improved accuracy - Use Case: Complex datasets requiring more sophisticated feature processing

Training Strategies#

Standard Training#

  • Use cross-entropy loss with appropriate learning rate scheduling
  • Apply data augmentation for better generalization
  • Monitor validation accuracy for early stopping

Imbalanced Data#

  • Enable focal loss with appropriate α and γ parameters
  • Consider class weighting strategies
  • Use stratified sampling for validation

Multi-Label Scenarios#

  • Set multi_label=True in configuration
  • Use appropriate evaluation metrics (F1-score, mAP)
  • Consider threshold optimization for final predictions

This flexible architecture makes FAI-CLS suitable for a wide range of image classification applications, from simple binary classification to complex multi-label scenarios, while maintaining computational efficiency and ease of use.

Example Usage#

Single-Label Classification Setup#

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
from focoos.models.fai_cls.config import ClassificationConfig
from focoos.models.fai_cls.modelling import FAIClassification
from focoos.nn.backbone.resnet import ResnetConfig

# Configure with ResNet backbone
backbone_config = ResnetConfig(
    model_type="resnet",
    depth=50,
    pretrained=True,
)

config = ClassificationConfig(
    backbone_config=backbone_config,
    num_classes=1000,  # ImageNet classes
    resolution=224,
    num_layers=2,
    hidden_dim=512,
    dropout_rate=0.2,
)

# Create model
model = FAIClassification(config)

Multi-Label Classification Setup#

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
from focoos.models.fai_cls.config import ClassificationConfig
from focoos.models.fai_cls.modelling import FAIClassification
from focoos.nn.backbone.resnet import ResnetConfig

# Configure with ResNet backbone
backbone_config = ResnetConfig(
    model_type="resnet",
    depth=50,
    pretrained=True,
)

config = ClassificationConfig(
    backbone_config=backbone_config,
    num_classes=1000,  # ImageNet classes
    resolution=224,
    num_layers=2,
    hidden_dim=512,
    dropout_rate=0.2,
    multi_label=True,
)

# Create model
model = FAIClassification(config)