Research topics

Component-Based and Service-Oriented development techniques take a step towards software productivity and reusability by decoupling the various facets of software and promoting off-the-shelf models and programs. Model-Driven Engineering makes it possible to capture know-how in models that are independent of the technological platform while maintaining traceability with the running applications. In addition, through the principles of software as a service, cloud computing and Grid computing, large-scale infrastructures are becoming a seamless support for the execution of complex applications that manipulate huge data sets, run very complex workflows, and need a large number of resources distributed all over the world.

Our challenge is to capitalize on these paradigms in order to offer business-oriented environments that master and hide this software complexity. Our main target business domain is medical image analysis, but other business applications relying on very large underlying infrastructures are also considered.
To achieve this objective we have to face two main issues: modeling service architectures and modeling performance. The first aims at staying close to user requirements and resources, while the second aims at controlling the infrastructure.

Modeling service architectures

Modeling service architectures addresses the design of workflows and their deployment on large-scale infrastructures, as well as the specification of functional and extra-functional properties (QoS). Our research work currently focuses on the following three challenges.

Workflow expressiveness, evolution and optimization

Workflow management is a very active research area which has received particular attention from the distributed computing community in recent years. In many scientific areas, such as medical image analysis, complex data processing procedures are needed to analyze huge amounts of data. We aim at helping end users exploit production grids by:

  • Making the parallelism transparent to the user (a minimal sketch follows this list).
  • Formalizing the evolutions and adaptations of workflows.
  • Enabling the reuse of already existing resources (components, services, concerns…).
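
As an illustration of the first point above, the following minimal sketch (hypothetical function names, written in Python rather than in any particular workflow language) expresses a medical image analysis step as a plain mapping over a data set; an enactment engine remains free to dispatch each iteration to a different grid node, so the parallelism never appears in the user's description.

    from concurrent.futures import ThreadPoolExecutor

    def segment(image_path):
        # Placeholder for a remote service call that segments one image;
        # on a production grid this would be a job submission.
        return f"segmented({image_path})"

    def run_workflow(image_paths):
        # The user only writes a mapping over the data set; the engine
        # decides how many segmentations actually run concurrently.
        with ThreadPoolExecutor(max_workers=8) as pool:
            return list(pool.map(segment, image_paths))

    if __name__ == "__main__":
        print(run_workflow(["patient01.nii", "patient02.nii", "patient03.nii"]))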

Software adaptability and quality of service

The composition and execution of reusable entities (services, workflows…) on large and ever-changing infrastructures raise the challenge of maintaining both business continuity and flexible software development. This leads us to provide appropriate architectures that reconcile verification techniques applied as early as possible (at design, deployment and execution time) with adaptation techniques (change detection, verification and execution).

  • Platform-independent modeling of functional and extra-functional properties including QoS: Service Level Agreements (SLA), software contracts, variability in QoS properties…
  • SLA-related monitoring and consistency issues, integration of contract mechanisms into platforms, SLA negotiation…
  • Dynamic adaptation and autonomic computing, i.e. providing appropriate models at runtime so that software adaptation loops (monitoring, analyzing, planning, executing) can be based on SLAs and fine-grained information (a minimal sketch follows this list).
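
The following toy sketch, assuming a single SLA objective on response time and stubbed monitor/executor hooks (hypothetical names, not our platform's API), only illustrates the structure of such a monitor–analyze–plan–execute loop:

    import random
    import time

    SLA_MAX_RESPONSE_TIME = 2.0  # seconds; hypothetical SLA objective

    def monitor():
        # Collect one response-time sample (stubbed with random data).
        return random.uniform(0.5, 3.5)

    def analyze(sample):
        # Compare the sample against the SLA objective.
        return sample > SLA_MAX_RESPONSE_TIME

    def plan(violated):
        # Decide on an adaptation action.
        return "add_replica" if violated else "keep_configuration"

    def execute(action):
        # Apply the action (here we only log it).
        print(f"adaptation action: {action}")

    def adaptation_loop(iterations=5, period=1.0):
        for _ in range(iterations):
            execute(plan(analyze(monitor())))
            time.sleep(period)

    if __name__ == "__main__":
        adaptation_loop()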

Variability modeling and large-scale infrastructures

A transversal concern of this challenge is the variability of the business domain (services, workflows, constraints, QoS…). In order to take this manifold variability into account, we are investigating how to adapt current variability modeling approaches to large-scale infrastructures.
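
As a toy illustration (hypothetical feature names, far simpler than real variability models), a workflow deployment's variability can be captured as a set of optional features plus cross-cutting constraints, against which candidate configurations are checked:

    # Toy variability model: optional features of a deployment plus one constraint.
    FEATURES = {"gpu_nodes", "data_replication", "encrypted_transfer", "low_latency_sla"}

    def valid(configuration):
        # A configuration is a subset of FEATURES satisfying the constraints.
        if not configuration <= FEATURES:
            return False
        # Example constraint: a low-latency SLA requires GPU nodes and data replication.
        if "low_latency_sla" in configuration:
            return {"gpu_nodes", "data_replication"} <= configuration
        return True

    print(valid({"low_latency_sla", "gpu_nodes"}))                      # False
    print(valid({"low_latency_sla", "gpu_nodes", "data_replication"}))  # True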

Modeling performance

Production grids are characterized by a high, non-stationary load as well as a large geographical extension. As a consequence, latency, measured as the duration between the submission of a job and the time it starts executing, can be very high and prone to large variations. For example, on the EGEE production grid, the average latency is on the order of 5 minutes, with a standard deviation also on the order of 5 minutes. This variability is known to strongly impact application performance and therefore has to be taken into account.

We model the performance of grid infrastructures and study different submission strategies to cope with the latency problem from the user's point of view. We also study the impact of execution context parameters and of frequent updates of the parameters of our probabilistic models.
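
As a rough illustration of the kind of strategy studied, the sketch below estimates by simulation the expected latency with and without a cancel-and-resubmit strategy; the log-normal latency model and its parameters are illustrative assumptions (chosen only to reproduce the orders of magnitude quoted above), not values fitted to EGEE traces.

    import random

    def sample_latency():
        # One grid latency in minutes, drawn from an assumed log-normal model
        # (mean and standard deviation both around 5 minutes).
        return random.lognormvariate(1.2, 0.8)

    def latency_with_resubmission(timeout):
        # Cancel and resubmit whenever the job has not started after `timeout` minutes.
        waited = 0.0
        while True:
            latency = sample_latency()
            if latency <= timeout:
                return waited + latency
            waited += timeout  # time spent waiting before cancelling

    def expected(strategy, runs=100_000):
        return sum(strategy() for _ in range(runs)) / runs

    if __name__ == "__main__":
        print("no resubmission  :", expected(sample_latency))
        print("timeout = 10 min :", expected(lambda: latency_with_resubmission(10.0)))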