
Staffpick
Computer Vision / AI Engineer
- VLM
- AI
- LoRA
- Computer Vision
- AWS
- YOLO
- PyTorch
- English: B2 (Upper-Intermediate)
About the Role
Staffpick is seeking a talented Mid-Level AI Engineer on behalf of one of our innovative clients, the creators of an intelligent camera system that acts as a virtual doorman.
The product detects visitors, understands context, and interacts naturally. We need an AI engineer with strong skills in video-based deep learning models to implement our first working demo and lay the foundation for our edge-to-cloud AI pipeline.
You’ll work directly with the founders to:
- Integrate open-source vision-language models (VLMs) such as Florence-2 and SmolVLM2 into a live camera pipeline.
- Build event-driven triggers from video, audio, and face recognition.
- Fine-tune edge models (LoRA) for security/doorman-specific use cases.
- Prepare the system for eventual edge deployment on an Ambarella CV72 SoC.
Responsibilities
- Select, integrate, and optimize video understanding models (detection, captioning, VQA).
- Implement event detection from camera feeds (object/person/package recognition, audio events).
- Fine-tune edge models for domain-specific scenarios using LoRA or similar methods.
- Develop a cloud-based inference pipeline for the demo (GPU-enabled servers).
- Work with backend engineers to connect the AI pipeline to user-facing applications.
- Maintain and document AI model training/inference workflows.
Requirements
- 3+ years of experience in machine learning / computer vision.
- Proficiency with PyTorch, ONNX, and model quantization/optimization.
- Experience with video or multi-modal models (e.g., YOLO, MobileNet, VLMs).
- Familiarity with LoRA fine-tuning and domain adaptation.
- Knowledge of cloud deployment of ML models (AWS/GCP/Azure).
- Comfortable working with open-source weights and adapting them for product needs.
- Strong problem-solving skills and ability to work in a startup environment.
- Fluent English is mandatory.
Nice to Have
- Experience with Ambarella, NVIDIA Jetson, Qualcomm, or other edge AI hardware.
- Speech-to-text and audio classification experience.
- Familiarity with FastAPI and real-time streaming APIs.
What We Offer
- Competitive salary for the Almaty market.
- Opportunity to own a critical component of a deep-tech startup from day one.
- Flexible work arrangements (hybrid possible).
- Direct collaboration with experienced founders and access to cutting-edge AI tools.