PhD Opportunities

The main streams of our research are within exciting and important areas in software engineering and the digital arts/humanities:

Search Based Software Engineering

Search Based Software Engineering (SBSE) uses advanced computational search (for example, genetic algorithms, evolutionary computation, and other meta- and hyper-heuristics) to solve complex software engineering problems. Prof Harman was instrumental in founding this field of Software Engineering and coined the term “Search Based Software Engineering” in 2001. Since then, there has been an explosion of research activity and interest in this area. As of July 2012, there were more than 1000 SBSE authors from more than 300 institutions spread over more than 40 countries. SBSE can be thought of as the application of advanced AI techniques to software engineering, with a particular focus on optimisation. A recent tutorial paper on SBSE can be found here:

CREST has worked with ABB, Daimler, Ericsson, Google, IBM, Microsoft and Motorola on the development of SBSE, so PhD students have many opportunities to work with companies should they wish to do so. Prof Harman has been awarded a programme grant (£6.7m, 2012-2018) from the EPSRC, so there are also opportunities for students to continue their work in postdoctoral positions after successfully completing their PhD. We currently have 6 postdocs working on SBSE, 4 of whom are former PhD students who worked on SBSE.

PhD topics in this area cover the complete spectrum of activity in software engineering (in its traditional and emergent forms). CREST is particularly well known for work on SBSE applications to software analysis and testing, refactoring, management and requirements engineering. However, as a founder of the field, Prof Harman is always willing to discuss SBSE projects with PhD candidates who are interested in developing new application areas as well as these more established areas. He has given many keynote talks and invited papers that set out open problems and research agendas for exciting developments in SBSE, all of which are available on his publications page. Prospective students can contact him via email to discuss potential PhD projects on SBSE.

Other CREST staff are also very active in SBSE: Dr Yoo is interested in SBSE for regression testing and the interface between testing and information theory. Dr Krinke is interested in SBSE for clone detection. Dr Gold is interested in SBSE for program comprehension, program analysis, and musicology.

Mark Harman


Testing Software Product Lines

Software product lines are sets of features that are combined into software products that satisfy a specific need of a customer or a specific market. The number of products that can be generated ranges from a few to thousands. Because of tangled components and complex configuration, verification and validation of product lines is challenging. Testing is one approach to verification and validation of the products and/or the product line. The challenges to be faced and solved in testing product lines are that it may not be possible to test each possible product individually, and that there is no way to test a product line independently of its products.
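To make the scale problem concrete, here is a minimal sketch in Python. It is an illustration only: the feature names, the single constraint, and the use of greedy pairwise sampling are all assumptions made for this example, not a description of the approach taken in this research stream. It enumerates the valid products of a toy product line and then selects a smaller subset that still exercises every achievable pair of feature settings.

```python
from itertools import combinations, product

# Hypothetical optional features of a toy product line (not from the text).
FEATURES = ["gps", "camera", "bluetooth", "wifi"]


def is_valid(config):
    """Hypothetical constraint: the 'gps' feature requires 'wifi'."""
    return not config["gps"] or config["wifi"]


def all_products():
    """Enumerate every valid product, i.e. every valid feature selection."""
    products = []
    for values in product([False, True], repeat=len(FEATURES)):
        config = dict(zip(FEATURES, values))
        if is_valid(config):
            products.append(config)
    return products


def pairs_covered(config):
    """All pairs of (feature, value) settings that one product exercises."""
    return set(combinations(sorted(config.items()), 2))


def greedy_pairwise_sample(products):
    """Greedily pick products until every achievable pair is covered."""
    uncovered = set().union(*(pairs_covered(p) for p in products))
    sample = []
    while uncovered:
        # Choose the product that covers the most still-uncovered pairs.
        best = max(products, key=lambda p: len(pairs_covered(p) & uncovered))
        sample.append(best)
        uncovered -= pairs_covered(best)
    return sample


if __name__ == "__main__":
    products = all_products()
    sample = greedy_pairwise_sample(products)
    print(f"{len(products)} valid products, {len(sample)} selected for testing")
```

With n optional features there can be up to 2^n products, so exhaustive enumeration as done in all_products() stops being feasible quickly; the sketch only shows why some form of sampling or product-independent testing is needed.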

Jens Krinke

Dependence Analysis and Change Impact Analysis

Software systems can be seen as collections of parts that may or may not depend on each other. These range from statements and functions that are linked to each other, through larger components like classes or packages, to non-source-code artefacts like requirements and models that are linked to each other or to other artefacts. In this research stream we analyse software systems to identify the dependencies between the different parts and use them to solve software engineering problems. One example of such a problem is change impact analysis, where the question is which elements of a software system are affected by a (potential) change to an element of the system. Change impact analysis is an important technique that is extremely useful during software maintenance. For example, it can be used to establish which test cases have to be rerun after a change has been applied to a system.

Jens Krinke

Source Code Provenance

With so much source code available for reuse and the huge number of developers involved in large projects, it is becoming more and more important to establish the provenance of the current code within a project. This raises questions like “where is this code coming from”, “who has modified the code”, or “where has this code been reused”. Such questions arise, for example, when establishing whether one is allowed to use the software according to its license, or simply when working out who can best answer questions about the code. Approaches to answering such questions use large-scale string matching and software repository mining.

Jens Krinke

Secure Information Flow

A lot of effort in making software secure goes into a kind of “fire-fighting”, i.e. discovering and closing vulnerabilities in code. However, even if all vulnerabilities have been eliminated, code still may not be secure. It may contain covert channels that allow attacks on confidentiality of information, integrity of information, and the privacy and anonymity of the user. Such covert channels leaking information may be within the logic of the program or be so-called “side channels”, such as execution times or heat dissipation measured over different runs. The challenge of constructing security-critical code for contemporary software is enormous. Semantics-based program analysis and type systems have a large role to play in guaranteeing end-to-end information security for networked systems.

David Clark

Semantic Analysis of Found Code

Found code is code that the analyser did not write. It may be source code or binaries. The task is to understand what it does, and the (common) motivation is defence against malicious behaviours. Techniques include reverse engineering, program analysis, testing, and use of SMT solvers. There is a whole range of problem domains in this area, for example malware classification and techniques for combatting packing, encryption and rewriting engines.

David Clark

Testing Information Transformers

The hypothesis for this research is that some long-standing questions in the theory of testing programs can be answered, or the existing answers improved, by viewing programs as transformers of information. Using techniques from information theory and the measurement of information flow in programs, the aim is to build an information-theoretic theory of testing that answers questions such as the following. How do I select the test suite? When is it adequate? What does adequate mean in terms of information? How do I order the tests? There is also a security aspect to this research. An improvement in testing generally will improve the “fire fighting” of vulnerabilities and exploits, particularly when harnessed with information about attack vector templates.

David Clark

Human-Computer Music Performance

There are many professional and amateur ensembles that perform popular music (e.g. jazz, rock, folk, music theatre, and contemporary church music) and would benefit from a computer stepping in when a human musician is absent and unable to play.  Popular music has a steady beat and reasonably well-defined structures (e.g. chord patterns), but typically involves improvisation at many levels including sectionalised scores (re-arrangeable during performance), and improvised generation of the musical surface.   Live interactive performance in this genre is thus a complex and interesting domain in which to undertake research.  Many disciplines and research methods are needed to tackle this problem including ethnography, computer vision, natural user interfaces, computer music (generation, machine listening, music information retrieval, representation), musicology, music performance, and real-time systems.   Applications are thus welcome from potential students who have experience and expertise from a range of backgrounds.

Nicolas Gold

Computational Musicology

Computers have many applications in musicology, ranging from relatively simple applications to support musicologists in answering particular questions, through to complex models and algorithms to support entire research areas.  Previous work in CREST has focused on computationally-enhanced studies of musical performance (piano performance, shaping in music) and continues by applying our expertise in information theory and search to problems in music analysis.  Much of this work is collaborative (e.g. with the AHRC Centre for Musical Performance as Creative Practice and the UCL Centre for Digital Humanities).

Nicolas Gold

Interdisciplinary Source Code Analysis

Combining CREST’s expertise in source code analysis and our interest in interdisciplinary applications, this strand of research focuses on the challenges posed by non-traditional source code (e.g. Max/MSP patches) and the opportunities available to enhance practice and research by developing new ways to analyse and develop in such languages.  We have worked on clone detection in graphical data-flow languages like Max/MSP and Pure Data and welcome applications from potential students with interests in arts computing and/or source code analysis.

Nicolas Gold

Probabilistic Modelling of S/W Testing

Most existing white-box testing techniques rely on structural adequacy criteria, such as statement or path coverage. The purpose of defining test adequacy criteria is to achieve a balance between the effectiveness and efficiency of testing. However, structural criteria fail to scale up to the systems that can really benefit from better testing, simply because fine-grained metrics like code coverage lose their relevance for large and complex systems. This strand of research combines CREST’s existing expertise in information theory and testing to form a probabilistic view of software testing, allowing us to assess and predict the system’s reliability with confidence.

Gamification of Software Engineering

The success of Search-Based Software Engineering (SBSE) suggests an interesting observation: many software engineering tasks can be viewed as combinatorial optimization. This research will investigate whether it is possible to create an entertaining gaming experience, essentially by extending the meta-heuristics used in SBSE to more interactive ones. The grand challenge is to completely encapsulate the original software engineering problem and to present a playable game instead, seeking human insights into the problem solving. The research will also consider whether any non-conventional user interface and/or visualization can help specific software engineering tasks.

This page was last modified on 11 Jul 2018.