Title: Intel Labs China is hiring a Model Parallelism Performance Analysis and Optimization intern
Author: BrianLiu
Time: 2023-10-10 15:33

Qualifications:
- Strong implementation skills in at least one of C++, C, Python.
- Solid experience with PyTorch DDP.
- Familiarity with CUDA/DPC++ development and the related profiling tools (Nsight Systems/VTune).
- Experience with Megatron-LM/DeepSpeed/FSDP is preferred.
- Good teamwork skills.
- Available for an internship of at least 6 months.
Job description:
- Analyze the performance of different model parallelism implementations.
- Advise on model/data parallelism configuration for the target environment (GPU and CPU).
- Develop a projection model to estimate training performance in the target environment.
- Implement and optimize kernels based on CUDA/Triton/oneDNN.