ML Engineer
We are looking for a talented Machine Learning Engineer with expertise in Natural Language Processing (NLP) and generative modeling, on behalf of a technology company with more than 30 years of experience developing security and automation software. Its products are used worldwide for data analysis and verification.
Strong familiarity with the following is required:
1. Multi-Modal Learning Fundamentals
Core concepts: understanding modalities and their embeddings (text, image, video, audio)
Alignment techniques: contrastive learning (e.g., CLIP), cross-attention (see the sketch after this list)
Fusion methods: late fusion vs. early fusion, transformer-based fusion
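To make the alignment item above concrete: a minimal PyTorch sketch of a CLIP-style symmetric contrastive (InfoNCE) loss. The batch size, embedding dimension, and temperature are illustrative assumptions, and random tensors stand in for real image and text encoder outputs.

```python
# Hypothetical illustration of CLIP-style contrastive alignment; random
# tensors stand in for encoder outputs.
import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_emb: torch.Tensor,
                          text_emb: torch.Tensor,
                          temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss over a batch of paired (image, text) embeddings."""
    # L2-normalize so the dot product is cosine similarity.
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)

    # Pairwise similarity logits: (batch, batch); row i pairs with column i.
    logits = image_emb @ text_emb.t() / temperature
    targets = torch.arange(logits.size(0), device=logits.device)

    # Average the image-to-text and text-to-image cross-entropy, as in CLIP.
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

img = torch.randn(8, 512)  # stand-ins for image-encoder outputs
txt = torch.randn(8, 512)  # stand-ins for text-encoder outputs
print(clip_contrastive_loss(img, txt))
```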

2. Model Architectures & Techniques:
Vision-Language Models (VLMs): CLIP, BLIP, Flamingo, LLaVA, GPT-4V; contrastive learning for joint embeddings
Video Understanding Models: video transformers (e.g., TimeSformer, ViViT); temporal modeling (3D CNNs, optical flow)
Multi-Modal LLMs (MLLMs): OpenAI GPT-4V, Google Gemini, Meta CM3leon; instruction tuning for multi-modal tasks
- Preprocessing & Feature Extraction:
Text: tokenization (SentencePiece, BPE), embeddings (BERT, RoBERTa); see the sketch after this list
Images: CNNs (e.g., ResNet), vision transformers (ViT), DINOv2, SAM (Segment Anything)
Video: Frame sampling, optical flow, spatio-temporal features
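As an illustration of the text branch above, a minimal sketch of tokenization plus embedding extraction, assuming the Hugging Face transformers library; bert-base-uncased and mean pooling are illustrative choices, not requirements of the role.

```python
# Hypothetical text-preprocessing sketch using Hugging Face transformers.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased").eval()

# Subword tokenization (WordPiece here; SentencePiece/BPE work the same way).
batch = tokenizer(["a query about a scanned document"],
                  return_tensors="pt", padding=True, truncation=True)

with torch.no_grad():
    hidden = model(**batch).last_hidden_state      # (batch, seq_len, 768)
    mask = batch["attention_mask"].unsqueeze(-1)   # (batch, seq_len, 1)
    # Mean-pool token states into one sentence embedding, ignoring padding.
    sentence_emb = (hidden * mask).sum(dim=1) / mask.sum(dim=1)

print(sentence_emb.shape)  # torch.Size([1, 768])
```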

3. Training & Fine-Tuning Multi-Modal Models
- Training Strategies:
Multi-task learning (e.g., captioning + VQA)
Efficient fine-tuning: LoRA, QLoRA, adapter layers (a LoRA sketch follows this list)
Self-supervised learning: Masked autoencoding (MAE), contrastive loss
- Optimization & Scaling:
Mixed-precision training (FP16/FP8)
Gradient checkpointing for memory efficiency
Distributed training (multi-GPU, TPU pods)
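For the efficient fine-tuning item, a minimal sketch of attaching LoRA adapters, assuming the Hugging Face peft and transformers libraries; GPT-2 and its c_attn attention projection are placeholder choices.

```python
# Hypothetical LoRA setup with Hugging Face peft; model and target module
# are illustrative, not specified by this posting.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")
config = LoraConfig(
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling applied to the update
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
# Only the small adapter matrices train; the base weights stay frozen,
# which is what makes LoRA memory-efficient.
model.print_trainable_parameters()
```

QLoRA follows the same pattern with the frozen base model loaded in 4-bit precision.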


Nice to know:
Natural Language Processing (NLP):
Text preprocessing, tokenization, and embeddings.
Understanding of transformer architectures (e.g., GPT, BERT, T5).
Fine-tuning pre-trained models for specific tasks.
Prompt engineering and optimization.
Knowledge of attention mechanisms and positional encoding (a scaled dot-product attention sketch follows this list).
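For the attention item, a minimal PyTorch sketch of scaled dot-product attention, the core operation in GPT/BERT/T5-style transformers; tensor shapes are illustrative.

```python
# Hypothetical single-function illustration of scaled dot-product attention.
import math
import torch

def attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    # q, k, v: (batch, heads, seq_len, d_k)
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    weights = torch.softmax(scores, dim=-1)  # each query attends over all keys
    return weights @ v

q = k = v = torch.randn(1, 8, 16, 64)
print(attention(q, k, v).shape)  # torch.Size([1, 8, 16, 64])
```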
Model Training and Optimization:
Distributed training techniques (e.g., data parallelism, model parallelism).
Hyperparameter tuning and optimization.
Evaluation and Metrics:
Familiarity with NLP evaluation metrics (e.g., BLEU, ROUGE, perplexity); see the perplexity sketch after this list.
Ability to design and implement custom evaluation pipelines.
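As an example of one listed metric, a minimal sketch of computing perplexity for a causal language model, assuming the Hugging Face transformers library; GPT-2 and the sample sentence are illustrative.

```python
# Hypothetical perplexity evaluation; checkpoint and text are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

inputs = tokenizer("The model is evaluated on held-out text.",
                   return_tensors="pt")
with torch.no_grad():
    # With input_ids passed as labels, the model returns the mean per-token
    # negative log-likelihood (it shifts the labels internally).
    nll = model(**inputs, labels=inputs["input_ids"]).loss

print(f"perplexity = {torch.exp(nll).item():.2f}")
```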
Generative Modeling:
Understanding of diffusion processes.
Knowledge of noise schedules and reverse diffusion.
Experience with generative adversarial networks (GANs) and variational autoencoders (VAEs) as complementary techniques.
Training diffusion models from scratch or fine-tuning pre-trained models.
Knowledge of loss functions specific to diffusion models (e.g., noise prediction loss); a minimal sketch follows.
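For the final item, a minimal sketch of the DDPM noise-prediction objective: corrupt clean data at a random timestep of a noise schedule, then regress the added noise. The linear beta schedule and the model's (x_t, t) call signature are assumptions for illustration.

```python
# Hypothetical DDPM-style training loss; schedule constants are illustrative.
import torch
import torch.nn.functional as F

T = 1000
betas = torch.linspace(1e-4, 0.02, T)               # linear noise schedule
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)  # cumulative alpha_bar_t

def noise_prediction_loss(model, x0: torch.Tensor) -> torch.Tensor:
    b = x0.size(0)
    t = torch.randint(0, T, (b,))      # one random timestep per sample
    eps = torch.randn_like(x0)         # the noise the model must predict
    a_bar = alphas_cumprod[t].view(b, *([1] * (x0.dim() - 1)))
    # Forward (noising) process: x_t = sqrt(a_bar)*x_0 + sqrt(1 - a_bar)*eps
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * eps
    # Simple DDPM objective: MSE between true and predicted noise.
    return F.mse_loss(model(x_t, t), eps)
```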