Senior Deep Learning Engineer
AI21 Labs
Posted on Jul 3, 2024
- Engineering
- Tel Aviv, Israel
- Full-time
Description
Our team is looking for a Senior Deep Learning Engineer.
AI21 is one of the few companies to have trained multi-billion-parameter Large Language Models (LLMs), a feat that requires advanced engineering, including large-scale distributed training on thousands of cores. Serving these LLMs efficiently requires cutting-edge technology as well. As a deep learning engineer on the team, you will be responsible for maintaining and improving our training infrastructure; developing, scaling, and testing new ideas; and adapting our code to run on, and make the best use of, the newest and most advanced hardware accelerators.
Role and Responsibilities
- Develop Large Language Models as part of our applied research projects and in support of AI21 Platform, including designing, implementing, and training massive-scale deep language models
- Implement, optimize, scale, and test new cutting-edge ideas and architectures
- Perform large-scale evaluations and comparisons of trained models across a range of benchmarks, and add support for new benchmarks
Requirements
- B.Sc. in computer science, software engineering, or equivalent
- Self-learner with a proven record of removing technical roadblocks
- 5+ years of experience developing software for production systems and/or internal infrastructure/tools
- Prior experience working with cloud computing platforms and container tooling (e.g. AWS, GCP, Docker, Kubernetes)
- Skilled at writing production-grade Python code
- Hands-on experience in deep learning and machine learning (e.g. TensorFlow, PyTorch)
- Experience in at least one of the following:
  - Optimization of deep learning model training (e.g. parallelization, Megatron, DeepSpeed, FSDP)
  - Custom kernel development (C++/CUDA and/or Triton)
  - Distributed systems, in particular distributed deep learning training/serving