Open Positions

At the moment my group has openings for one postdoctoral researcher and one research software engineer.

For prospective PhD students: we are always looking for bright and motivated students to work with on complex problems in the general area of distributed systems (with an emphasis on ML systems). If you are looking to do a PhD with me, thank you for your interest, but please read this first. If you don't, I will know, and I'm afraid I will have to ignore your message.

Postdoctoral Researcher

We are looking for motivated applicants interested in working on optimizing distributed machine learning (ML) systems.

Today's ML solutions are achieving remarkable success in many fields reliant on big data because of their ability to learn complex models. Meanwhile, the number of parameters in ML models is growing, with current models already ranging from millions to billions of parameters. Running ML algorithms in parallel across multiple resources helps with scaling to such large models. This is known as distributed ML, a practice that is computationally expensive and requires tens or hundreds of nodes and GPUs to run the algorithms. However, efficiently training complex machine learning models at scale remains challenging. For instance, one of the main scalability bottlenecks in distributed ML is the communication between the many nodes, because bandwidth for transferring large model updates is limited.

The research will focus on creating scalable algorithms, techniques and systems for (deep) ML systems. Candidates should be motivated to work at the intersection of machine learning, (optimization) algorithms, distributed systems and networking.

As an example of work in this research area, we recently proposed a paradigm shift based on in-network aggregation for addressing the communication needs of distributed ML. In the context of DAIET, we developed SwitchML, a communication library that uses programmable network switches to efficiently aggregate model updates directly in the network.

We are seeking a highly motivated candidate with a willingness to experiment and explore, and with publications in top-tier international venues.

To apply, you will need to have a strong computing or engineering background, experience in building and working with large software systems and tools, and proven knowledge in at least one of the following areas:

You must have a PhD (or equivalent) in an area pertinent to the subject area, i.e. Computer Science or Computer Engineering.
You will have excellent communication skills and be able to organize your own work with minimal supervision and prioritize work to meet deadlines. Preference will be given to applicants with a proven research record and publications in the relevant areas. All applicants must be fluent in spoken and written English.

To apply, email the following materials to .

The position is available immediately. The initial appointment will be for one year, which may be renewed as a KAUST postdoctoral fellowship.
We evaluate candidates on an ongoing basis, until the position is filled, so please submit the materials as soon as they are available.

For further details on the position and the project, please contact Prof. Marco Canini at .

Research Software Engineer

We are looking for motivated applicants interested in working on optimizing distributed machine learning (ML) systems. Take a look at our SwitchML system: if you would love building such a system, we want to hear from you.

Today's ML solutions are achieving remarkable success in many fields reliant on big data because of their ability to learn complex models. Meanwhile, the number of parameters in ML models is growing, with current models already ranging from millions to billions of parameters. Running ML algorithms in parallel across multiple resources helps with scaling to such large models. This is known as distributed ML, a practice that is computationally expensive and requires tens or hundreds of nodes and GPUs to run the algorithms. However, efficiently training complex machine learning models at scale remains challenging. For instance, one of the main scalability bottlenecks in distributed ML is the communication between the many nodes, because bandwidth for transferring large model updates is limited.

We're looking for motivated applicants who have a strong background in distributed systems and networking, love hacking, and have a proven track record of building systems, contributing to open source projects, etc. Candidates have the opportunity to be involved in all aspects of system design, from early-stage prototyping to testing and deployment.
Experience with machine learning frameworks, modern software development, and prototyping will be considered a plus.

You will have excellent communication skills and be able to organize your own work with minimal supervision and prioritize work to meet deadlines. You will also collaborate with several researchers and PhD students working on related projects. All applicants must be fluent in spoken and written English.

To apply, email the following materials to .

The position is available immediately. The initial appointment will be for one year, which may be renewed in the role of a KAUST research engineer.
We evaluate candidates on an ongoing basis, until the position is filled, so please submit the materials as soon as they are available.

For further details on the position and the project, please contact Prof. Marco Canini at .

Last updated: Tuesday, 14-May-2019 19:26:11 | © 2012-2019 Marco Canini, all rights reserved