Post-Doctoral Research Visit F/M Cooperative Inference Strategies - San Francisco, California
Company: Inria Location: San Francisco, California
Posted On: 11/20/2024
Level of qualifications required: PhD or equivalent
Function: Post-Doctoral Research Visit

About the research centre or Inria department

The Inria centre at Université Côte d'Azur includes 37 research teams and 8 support services. The centre's staff (about 500 people) is made up of scientists of different nationalities, engineers, technicians, and administrative staff. The teams are mainly located on the university campuses of Sophia Antipolis and Nice, as well as Montpellier, in close collaboration with research and higher-education laboratories and institutions (Université Côte d'Azur, CNRS, INRAE, INSERM, ...), but also with regional economic players. With a presence in the fields of computational neuroscience and biology, data science and modelling, software engineering and certification, and collaborative robotics, the Inria centre at Université Côte d'Azur is a major player in scientific excellence through its results and its collaborations at both European and international levels.

Context

This Post-Doctoral position is funded by the Inria-Nokia Bell Labs challenge LearnNet (Learning Networks).

Assignment

Introduction

An increasing number of applications rely on complex inference tasks based on machine learning (ML). Currently, two options exist to run such tasks: they are either served directly by the end device (e.g., smartphones, IoT equipment, smart vehicles) or offloaded to a remote cloud. Both options may be unsatisfactory for many applications: local models may have inadequate accuracy, while the cloud may fail to meet delay constraints. In [SSCN+24], researchers from the Inria NEO and Nokia AIRL teams presented the novel idea of inference delivery networks (IDNs): networks of computing nodes that coordinate to satisfy ML inference requests while achieving the best trade-off between latency and accuracy.
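A natural node-level rule in such a network is to answer a request locally when the local model seems confident enough, and to forward it to a more accurate remote model otherwise. The following is a minimal illustrative sketch, not the method of [SSCN+24]; all names and the confidence threshold are hypothetical:

```python
# Sketch of a serve-locally-or-forward decision at one IDN node.
# The function names, threshold, and cost model are illustrative assumptions.

def serve_or_forward(local_probs, delay_penalty, expected_gain,
                     confidence_threshold=0.8):
    """Decide whether to serve the local prediction or forward the request.

    local_probs          : class-probability vector from the local model
    delay_penalty        : utility lost to the extra round-trip to a remote node
    expected_gain        : expected accuracy improvement from the remote model
    confidence_threshold : below this local confidence, forwarding is considered
    """
    confidence = max(local_probs)  # simple confidence proxy: top-1 probability
    # Forward only when the local model is unsure AND the expected quality
    # gain outweighs the delay/resource cost of the remote query.
    if confidence < confidence_threshold and expected_gain > delay_penalty:
        return "forward"
    return "serve_locally"
```

In practice, the confidence proxy, the threshold, and the gain/cost estimates would all be design choices of the delivery strategy under study.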
IDNs bridge the dichotomy between device and cloud execution by integrating inference delivery at the various tiers of the infrastructure continuum (access, edge, regional data center, cloud). Nodes with heterogeneous capabilities can store a set of monolithic machine-learning models with different computational/memory requirements and different accuracies, and inference requests can be forwarded to other nodes when the local answer is not considered accurate enough.

Research goal

Given a placement of AI models in an IDN, we will study the inference delivery strategies to be implemented at each node. For example, a simple inference delivery strategy is to provide the inference from the local AI model if it seems accurate enough, or to forward the input to a more accurate model at a different node if the improvement in inference quality (e.g., in terms of accuracy) compensates for the additional delay or resource consumption. Beyond this serve-locally-or-forward policy, we will investigate more complex inference delivery strategies that may combine inferences from models at different nodes. To this purpose, we will rely on ensemble learning approaches [MS22] such as bagging [Bre96] or boosting [Sch99], adapting them to the distinct characteristics of IDNs. For example, in an IDN, models may or may not have been trained jointly, may have been trained on different datasets, and may have different architectures, which rules out some ensemble learning techniques. Moreover, queries to remote models incur a cost, which leads us to prefer ensemble learning techniques that do not require the joint evaluation of all available models.

In an IDN, models could be jointly trained on local datasets using federated learning algorithms [KMA+21]. We will study how the selected inference delivery strategy may require changes to such algorithms to account for the statistical heterogeneity induced by the delivery strategy itself.
In particular, nodes hosting more sophisticated models will receive inference requests for difficult samples from nodes with simpler, less accurate models, so the data distribution seen at inference differs from that of the local dataset. Some preliminary results on training early-exit networks in this setting appear in [KSR+24].

Main activities

Research. If the selected candidate is interested, he/she may also be involved in student supervision (master's and PhD level) and teaching activities.

Skills

Candidates must hold a PhD in Applied Mathematics, Computer Science, or a closely related discipline. Candidates must also show evidence of research productivity (e.g., papers, patents, presentations) at the highest level. We prefer candidates with a strong mathematical background (in optimization, statistical learning, or privacy) who, more generally, are keen on using mathematics to model real problems and gain insights. The candidate should also be knowledgeable about machine learning and have good programming skills. Previous experience with PyTorch or TensorFlow is a plus.

Benefits package