Session Types for Reliable Distributed Systems

Distributed software systems are an essential part of the infrastructure of modern society. Such systems typically comprise diverse software components deployed across networks of hosts. Ensuring their reliability is challenging, as software components must correctly communicate and synchronise with each other, and any of the hardware or software components may fail. Failure and service "outage" is extremely costly, with worldwide financial losses due to software failures in 2017 estimated at US$1.7tn, up from US$1.1tn in 2016.

Failures can occur at all levels of the system stack: hardware, operating systems, networks, software, and users. Here we focus on using advanced programming language technologies to enable the software level to handle failures that arise from any level of the stack. Our aim is to provide software-level reliability for distributed systems by combining fault prevention with fault tolerance. The key objective is to combine the communication-structuring mechanism of session types with the scalability and fault-tolerance of actor-based software architectures.

The result will be a well-founded theory of reliable actor programming, supported by a collection of libraries and tools, and validated on a range of case studies. Key aims are to deliver tools that provide lightweight support for developers – e.g. warning of potential issues – and to allow developers to continue to use established idioms. By doing so we aim to deliver a step change in the engineering of reliable distributed software systems.

STARDUST is funded by EPSRC (grants EP/T014512/1, EP/T014628/1, EP/T014709/1) between 1st October 2020 and 30th September 2024.


Approach.

The research approach of STARDUST is organised into three strands: Principles, Programming, and Practice.

Principles

We will develop foundational theories of reliable session types for distributed actor systems. Our models will capture the two prevalent means for failure recovery in actor-based languages: supervision trees and timeouts.

Programming

We will design, engineer and evaluate usable, open-source session type implementations and supporting tools for Erlang and Scala with Akka. We will develop type checking support for the enhanced session developed in the Principles strand, as well as a tool to help users to design session types for existing systems. These tools will be integrated with appropriate IDEs.

Practice

The theory and tools will be guided by, and evaluated against, a set of case studies provided by our industrial partners. The application domains are: industrial automation (Actyx); server engineering (Erlang Solutions); microservices (Lightbend); testing tools (Quviq); and organisation simulation (Tata).

Project approach: Reliable session types, programming, and practice

Mailing Lists.

There are two mailing lists for STARDUST: stardust-announce, for announcements, and stardust-list, for more general discussion.

To subscribe, send an email to sympa@lists.cent.gla.ac.uk with the following in the Subject line:

SUB LISTNAME NAME

where LISTNAME is one of stardust-announce or stardust-list, and NAME is your first name followed by your last name.

To unsubscribe, send an email to sympa@lists.cent.gla.ac.uk with nothing in the Subject line. In the message body put the following:

UNSUBSCRIBE LISTNAME EMAIL

where EMAIL is an optional email address, useful if different from your "From:" address. If you normally have an automatic signature at the end of your email, please put the word QUIT just before your signature.

People.

Glasgow

Oxford

Kent


Publications.

2024

  • Laura Bocchi, Andy King and Maurizio Murgia. Asynchronous Subtyping by Trace Relaxation. TACAS 2024, to appear.

2023

2022

2021