Staffpick

Computer Vision / AI Engineer

From 3,800 USD
  • Almaty
  • Full-time
  • Remote work
  • 3 to 6 years of experience
  • VLM
  • AI
  • LoRA
  • Computer Vision
  • AWS
  • YOLO
  • PyTorch
  • English: B2 (Upper-Intermediate)

About the Role

Staffpick is seeking a talented Mid-Level AI Engineer on behalf of one of our innovative clients – the creators of an intelligent camera system that acts as a virtual doorman.

The product detects visitors, understands context, and interacts naturally. We need a Senior AI engineer with strong skills in video-based deep learning models to implement our first working demo and lay the foundation for our edge-to-cloud AI pipeline.

You’ll work directly with the founders to:

- Integrate open-source vision-language models (VLMs) such as Florence-2 and SmolVLM2 into a live camera pipeline.

- Build event-driven triggers from video, audio, and face recognition.

- Fine-tune edge models (LoRA) for security/doorman-specific use cases.

- Prepare the system for eventual edge deployment on an Ambarella CV72 SoC.

Responsibilities

- Select, integrate, and optimize video understanding models (detection, captioning, VQA).

- Implement event detection from camera feeds (object/person/package recognition, audio events).

- Fine-tune edge models for domain-specific scenarios using LoRA or similar methods.

- Develop a cloud-based inference pipeline for the demo (GPU-enabled servers).

- Work with backend engineers to connect the AI pipeline to user-facing applications.

- Maintain and document AI model training/inference workflows.

Requirements

- 3+ years of experience in machine learning / computer vision.

- Proficiency with PyTorch, ONNX, and model quantization/optimization.

- Experience with video or multi-modal models (e.g., YOLO, MobileNet, VLMs).

- Familiarity with LoRA fine-tuning and domain adaptation.

- Knowledge of cloud deployment of ML models (AWS/GCP/Azure).

- Comfortable working with open-source weights and adapting them for product needs.

- Strong problem-solving skills and ability to work in a startup environment.

- Fluent English is mandatory.

Nice to Have

- Experience with Ambarella, NVIDIA Jetson, Qualcomm, or other edge AI hardware.

- Speech-to-text and audio classification experience.

- Familiarity with FastAPI and real-time streaming APIs.

What We Offer

- Competitive salary for the Almaty market.

- Opportunity to own a critical component of a deep-tech startup from day one.

- Flexible work arrangements (hybrid possible).

- Direct collaboration with experienced founders and access to cutting-edge AI tools.