Copyright is owned by the Author of the thesis.  Permission is given for 
a copy to be downloaded by an individual for the purpose of research and 
private study only.  The thesis may not be reproduced elsewhere without 
the permission of the Author. 
 

TOWARDS EXPLAINING

BLACKBOX MODELS USING

GENETIC NETWORK

PROGRAMMING

A thesis presented in partial fulfilment of the requirements for the

degree of

Master of Information Science

in

Computer Science

at Massey University, Albany,

New Zealand.

Haotian Zhu

2024


Contents

Abstract ix

Acknowledgements x

1 Introduction 1

1.1 Overview of the Current State of Technology . . . . . . . . . . . . . . . 1

1.1.1 Explainable AI . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.1.2 Evolutionary Algorithm . . . . . . . . . . . . . . . . . . . . . . . 2

1.2 Scope of Research and Problem Domain . . . . . . . . . . . . . . . . . . 3

1.2.1 Scope of research . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

1.2.2 Problem Domain: The Taxi Problem . . . . . . . . . . . . . . . . 4

1.3 Research Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

1.4 Significance of the Research . . . . . . . . . . . . . . . . . . . . . . . . . 6

1.5 Research Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

1.6 Structure of the Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

2 Background and Literature Review 10

2.1 Genetic Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

2.1.1 Life-Cycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

2.1.2 Chromosome Structure . . . . . . . . . . . . . . . . . . . . . . . 11

2.1.3 Evaluation and selection . . . . . . . . . . . . . . . . . . . . . . . 12

2.1.4 Evolution Operation . . . . . . . . . . . . . . . . . . . . . . . . . 12

2.2 Genetic Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

2.2.1 Life-Cycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

2.2.2 Chromosome Structure . . . . . . . . . . . . . . . . . . . . . . . 14

2.2.3 Evaluation and Selection . . . . . . . . . . . . . . . . . . . . . . . 14

2.2.4 Evolution Operations . . . . . . . . . . . . . . . . . . . . . . . . 14

2.3 Genetic Network Programming . . . . . . . . . . . . . . . . . . . . . . . 16

2.3.1 Chromosome Structure . . . . . . . . . . . . . . . . . . . . . . . 16

2.3.2 Function Library . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

ii


2.3.3 Chromosome Transaction . . . . . . . . . . . . . . . . . . . . . . 17

2.3.4 Evaluation and Selection . . . . . . . . . . . . . . . . . . . . . . . 17

2.3.5 Evolution Operator . . . . . . . . . . . . . . . . . . . . . . . . . . 17

2.3.6 Evaluation and Selection . . . . . . . . . . . . . . . . . . . . . . . 17

2.3.7 Evolution Operator . . . . . . . . . . . . . . . . . . . . . . . . . . 18

2.4 Literature Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

2.4.1 Improved GNP Operators and Structure . . . . . . . . . . . . . . 20

2.4.2 GNP Combined with Other Algorithms . . . . . . . . . . . . . . 22

2.4.2.1 GNP with EDAs . . . . . . . . . . . . . . . . . . . . . . 22

2.4.2.2 GNP with Reinforcement Learning . . . . . . . . . . . . 24

2.4.2.3 GNP with Ant Colony Optimization . . . . . . . . . . . 25

2.4.2.4 GNP with ACO Life Cycle . . . . . . . . . . . . . . . . 26

2.4.3 Multi-goal Path-finding problem with evolution algorithm . . . . 27

2.4.3.1 Travelling Salesman Problem . . . . . . . . . . . . . . . 28

2.4.3.2 Robotic path planing . . . . . . . . . . . . . . . . . . . 29

2.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

3 General Strategies 33

3.1 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

3.2 Top-Bottom Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

3.2.1 ”Black Box” Conception . . . . . . . . . . . . . . . . . . . . . . . 35

3.2.2 Bridging ”Black Box” and Computational Graphs . . . . . . . . 36

3.2.3 Benefits in GNP Research . . . . . . . . . . . . . . . . . . . . . . 37

3.3 Sources of Ground Truth for the Problem . . . . . . . . . . . . . . . . . 37

3.3.1 Approach 1: Handcrafted Solution Dataset . . . . . . . . . . . . 38

3.3.2 Approach 2: GNP Solution Dataset . . . . . . . . . . . . . . . . 38

3.4 Divide and Conquer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

3.4.1 Introduction to the Customized Strategy . . . . . . . . . . . . . 39

3.4.2 Complementing the Top-Bottom Approach . . . . . . . . . . . . 39

3.4.3 Implementation in GNP . . . . . . . . . . . . . . . . . . . . . . . 39

3.5 Hand-crafting a GNP Solution . . . . . . . . . . . . . . . . . . . . . . . 40

3.5.1 Design Principle . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

3.5.2 Decision-Making Process Outline: . . . . . . . . . . . . . . . . . . 41

3.5.3 Graph representation . . . . . . . . . . . . . . . . . . . . . . . . . 43

3.5.4 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

4 Using a Black Box as a Fitness Function in a GNP 47

4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

4.2 GNP Life-Cycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

iii


4.3 Individual Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

4.3.1 Phenotype Structure . . . . . . . . . . . . . . . . . . . . . . . . . 51

4.3.2 Genotype Structure . . . . . . . . . . . . . . . . . . . . . . . . . 52

4.3.3 Function Library: . . . . . . . . . . . . . . . . . . . . . . . . . . 53

4.3.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

4.4 Fitness function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

4.4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

4.4.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

4.4.3 Structure of the fitness function . . . . . . . . . . . . . . . . . . . 62

4.4.4 Running Individual Structure . . . . . . . . . . . . . . . . . . . . 68

4.4.5 Fitness equation . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

4.5 Parent Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

4.6 Evolution Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

4.6.1 Crossover . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

4.6.2 Mutation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

4.6.3 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

5 Black Box System to GNP Conversion 82

5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82

5.2 Dataset Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

5.2.1 Hand-crafted solution dataset . . . . . . . . . . . . . . . . . . . . 85

5.2.2 Pattern Identification . . . . . . . . . . . . . . . . . . . . . . . . 86

5.3 Sub-path Extraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

5.3.1 Function find subsequences of length . . . . . . . . . . . . . 86

5.3.2 Function findUniqueSubsequences . . . . . . . . . . . . . . . . 88

5.4 GNP solution construction . . . . . . . . . . . . . . . . . . . . . . . . . . 92

6 Experiments and Analysis of Results 97

6.1 Hand-crafting a GNP Solution . . . . . . . . . . . . . . . . . . . . . . . 97

6.1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97

6.1.2 Experiment Design . . . . . . . . . . . . . . . . . . . . . . . . . . 97

6.1.3 Result . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99

6.2 Using a Black Box System as Fitness Function in a GNP . . . . . . . . 99

6.2.1 Learning Ability of the Proposed GNP variant . . . . . . . . . . 99

6.3 Experiment on Proposed GNP vs Original GNP . . . . . . . . . . . . . 104

6.3.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104

6.3.2 Experiment Design . . . . . . . . . . . . . . . . . . . . . . . . . . 104

6.3.3 Target: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104

6.3.4 Training Results: . . . . . . . . . . . . . . . . . . . . . . . . . . . 104

iv


6.4 GNP solution refinement . . . . . . . . . . . . . . . . . . . . . . . . . . . 106

6.4.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106

6.4.2 Experiment Design . . . . . . . . . . . . . . . . . . . . . . . . . . 106

6.4.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108

6.5 Experiment on Finding the Minimum Training Patterns . . . . . . . . . 108

6.5.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108

6.5.2 Experiment Design . . . . . . . . . . . . . . . . . . . . . . . . . . 108

6.5.3 Observations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109

6.5.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109

6.6 Limitation of this research . . . . . . . . . . . . . . . . . . . . . . . . . . 111

7 Conclusion and Future Work 112

7.1 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112

7.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114

7.2.1 Scalability Studies . . . . . . . . . . . . . . . . . . . . . . . . . . 114

7.2.2 Error Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114

7.2.3 Reduce the Need for a Priori Knowledge . . . . . . . . . . . . . . 114

7.2.4 Interpretability and Transparency Enhancements . . . . . . . . . 114

Bibliography 115

v


List of Figures

1.1 Classification of metaheuristic Algorithms . . . . . . . . . . . . . . . . . 2

1.2 Taxi Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

2.1 GA evolution life cycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

2.2 GA chromosome structure . . . . . . . . . . . . . . . . . . . . . . . . . . 12

2.3 GA Crossover . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

2.4 GA Mutation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

2.5 GP Chromosome . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

2.6 GP evolution operations . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

2.7 GNP Chromosome . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

2.8 GNP Crossover . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

2.9 GNP Mutation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

2.10 GNP-RL chromosome structure . . . . . . . . . . . . . . . . . . . . . . . 25

2.11 GNP-RL transection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

2.12 GNP vs ACO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

2.13 GNP-ACO life-cycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

2.14 Chromosome structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

2.15 RTP-GA life cycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

3.1 The continuous decision-making cycle of an agent in the Taxi Problem. . 41

3.2 The visual representation of a computer program as a directed graph. . 44

3.3 Function library. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

3.4 Function library. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

4.1 Proposed GNP variant life cycle. . . . . . . . . . . . . . . . . . . . . . . 48

4.2 GNP graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

4.3 Graph structure for ”pick-up” sub-problem. . . . . . . . . . . . . . . . . 51

4.4 Graph structure for ”drop-off” sub-problem. . . . . . . . . . . . . . . . . 52

4.5 Genotype Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

4.6 Taxi Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

4.7 Runing the individual . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

vi


4.8 Runing the individual with black box . . . . . . . . . . . . . . . . . . . . 75

5.1 GNP solution Refinement process. . . . . . . . . . . . . . . . . . . . . . 82

5.2 Dataset Generation process. . . . . . . . . . . . . . . . . . . . . . . . . . 83

5.3 Dataset sample. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

5.4 Example of a single path in the handcraft solution dataset . . . . . . . . 87

5.5 Unique sub-paths . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

5.6 Final constructed graph. . . . . . . . . . . . . . . . . . . . . . . . . . . . 96

5.7 Graph visualization. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96

6.1 Hand-crafted solution test result . . . . . . . . . . . . . . . . . . . . . . 99

6.2 GNP parameter settings . . . . . . . . . . . . . . . . . . . . . . . . . . . 100

6.3 Proposed GNP variant test result . . . . . . . . . . . . . . . . . . . . . . 101

6.4 Average Population Fitness vs. Environment Configuration . . . . . . . 101

6.5 Environment 1: pick-up . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

6.6 Environment 1: drop off . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

6.7 Environment 2: pick-up . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

6.8 Environment 2: drop off . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

6.9 Environment 3: pick-up . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

6.10 Environment 3: drop off . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

6.11 Environment 4: pick-up . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

6.12 Environment 4: drop off . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

6.13 Environment 5: pick-up . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

6.14 Environment 5: drop off . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

6.15 Environment 6: pick-up . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

6.16 Environment 6: drop off . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

6.17 Original GNP vs Proposed GNP for pick-up sub-problem . . . . . . . . 105

6.18 Original GNP vs Proposed GNP for drop-off sub-problem . . . . . . . . 105

6.19 Sample Original GNP vs Proposed GNP graph solution for drop-off sub-

problem (Environment 11) . . . . . . . . . . . . . . . . . . . . . . . . . . 105

6.20 Proposed Methodology test result . . . . . . . . . . . . . . . . . . . . . . 109

6.21 Incorrect computer graph . . . . . . . . . . . . . . . . . . . . . . . . . . 110

vii


List of Algorithms

1 Hand-crafted Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

2 Fitness evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

3 A* Pathfinding for distance calculation . . . . . . . . . . . . . . . . . . . 67

4 Graph Traversal Function . . . . . . . . . . . . . . . . . . . . . . . . . . 73

5 Tournament Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

6 Mutate Node Connections . . . . . . . . . . . . . . . . . . . . . . . . . . 80

7 Handcraft Dataset Generation . . . . . . . . . . . . . . . . . . . . . . . . 85

8 Reconstruct GNP solution from Paths . . . . . . . . . . . . . . . . . . . 93

9 Find Unique Paths in a Graph . . . . . . . . . . . . . . . . . . . . . . . 107

10 Find Positions of sub-paths in a complete path . . . . . . . . . . . . . . 107

11 Extract and Concatenate Sub-arrays . . . . . . . . . . . . . . . . . . . . 107

viii


Abstract

With the emergence of deep learning systems capable of learning intricate robot ma-

noeuvres, control, team coordination, and planning both from data and through inter-

action with the environment, numerous complex and non-linear problems have found

solutions. However, these systems often function as ’black boxes,’ lacking the ability to

provide human-interpretable solutions. This study addresses the interpretability chal-

lenge in the field of Explainable AI by employing a ’black box’ system as a target model

and subsequently transforming it into a computational graph, equivalent to a Genetic

Network Program (GNP). Furthermore, we present a methodology for refining and re-

ducing the size of the GNP solution. Lastly, we also test if we could utilize the black

box system as a guide to the fitness function in a GNP architecture. To illustrate its

efficacy, we use a multi-goal path-finding problem from the OpenAI Gym framework.

The experimental results demonstrate the efficacy of the converted and refined GNP

solution in successfully addressing the taxi problem across 500 environments, constitut-

ing a comprehensive dataset. Notably, the refined GNP solution exhibits no redundant

or unnecessary nodes. Despite the research’s focused scope, centering on a single agent

with multiple goals, the algorithms introduced in this study lay the groundwork for the

development of more sophisticated and interpretable algorithms. These advancements

are poised to tackle more intricate challenges in the future.

ix


Acknowledgements

I would like to extend my deepest gratitude to my supervisor, Dr. Napoleon Reyes,

for his invaluable guidance, patience, and support throughout this research journey.

He put in a lot of effort during the research. He made slides, provided the research

direction, and had meetings with me every week, each lasting at least one and a half

hours. His insights and expertise have been fundamental to the completion of this

thesis. I am profoundly grateful to my family for their unwavering love, understanding,

and encouragement. Their support has been my strength. Moreover, I would also like to

extend my thanks to my colleague, for their support and encouragement. Balancing full-

time study with my responsibilities as a full-time developer was a challenging journey.

x


Chapter 1

Introduction

1.1 Overview of the Current State of Technology

1.1.1 Explainable AI

Explainable AI (XAI) refers to methods and techniques in artificial intelligence (AI)

that make the AI systems understandable to humans. These methods aim to let hu-

man users understand and trust the results and output created by machine learning

algorithms[1].

From the 1970s to the 1990s, symbolic reasoning systems like MYCIN, GUIDON,

SOPHIE, and PROTOS could represent, reason about, and explain their reasoning for

diagnostic, instructional, or machine-learning (explanation-based learning) purposes[2].

From the 1980s to the early 1990s, truth maintenance systems (TMS) enhanced the

capabilities of causal-reasoning, rule-based, and logic-based inference systems. Many

researchers started investigating whether it is possible to meaningfully extract the non-

hand-coded rules generated by opaque trained neural networks[3] [4].

Nowadays, some approaches have been developed to make complex models such as

machine learning and evolutionary algorithms more explainable and interpretable[5].

Other approaches explain some specific prediction made by a (nonlinear) black-box

model, a goal referred to as ”local interpretability” [6] [7]. This lack of transparency

can be a serious problem in critical applications like healthcare, finance, and legal

systems, where understanding the reasoning behind a decision is just as important as

the decision itself.

1


CHAPTER 1. INTRODUCTION 2

1.1.2 Evolutionary Algorithm

Metaheuristic algorithms are special methods used in various areas like economics,

engineering, and management to solve complex problems[8]. Many of these algorithms

are inspired by natural selection, swarm behavior, and the laws of physics. There are

two main types of metaheuristic algorithms (Fig. 1.1): single-solution and population-

based metaheuristics.

Figure 1.1: Classification of metaheuristic Algorithms

Single-solution metaheuristics: These focus on improving a single solution through

local search. However, they can sometimes get trapped in local optima[9]. Examples

include simulated annealing and tabu search.

Population-based metaheuristics: These work with multiple possible solutions at

the beginning of the algorithm. This approach helps maintain chromosome diversity

and prevents the evolution from getting stuck in local optima. These methods are

generally better for finding the best solution across a complex set of possible solutions.

Some well-known examples are Genetic Algorithms, Particle Swarm Optimization, and

Ant Colony Optimization.

Among the population-based metaheuristics, Holland proposed the Genetic Algo-

rithm (GA) in 1975[10]. Inspired by the Darwinian biological evolution process, the GA

evolution begins with a population of completely random individuals. The individuals

are composed of genes, which are represented in binary or other representations. The

fitness of the entire population is evaluated in each generation, and multiple individuals

are randomly selected from the current population (based on their fitness) to produce

a new population of life through natural selection and mutation, which becomes the


CHAPTER 1. INTRODUCTION 3

current population in the next iteration of the algorithm.

In 1992, Koza conducted a comprehensive study on Genetic Programming (GP) [11].

Unlike GA, GP uses a tree-type data structure as the chromosome, each chromosome

is a computer program, which can represent a solution for a given problem domain.

The amount of computation required for GP is so large (dealing with a large number

of candidate computer programs) that it could only be used to solve simple problems

back in the 1990s. In recent years, with the development of GP and the exponential

increase in the computational power of central processors, GP started to produce a

large number of remarkable results.

In 2002, Genetic Network Programming (GNP) [12] was introduced. As a special

type of GP, it uses a graph-type data structure instead of a tree-type data structure.

Compared to GP, GNP is more efficient in solving complex problems in a changing

environment. Subsequently, many other flavors of GNP have been developed to enhance

its efficiency. These include GNP chromosome structure modification[13], features of

mutation and crossover modification[14], and combinations of GNP with other machine

learning algorithms such as reinforcement learning, Ant Colony algorithms, and fuzzy

logic.

In recent years, GNP has been widely used in many real-life applications, includ-

ing robot navigation, optimization of financial portfolios, image and signal processing

techniques, and data routing for network design. The usage of these techniques is grow-

ing in many other fields along with the development of GNP, such as environmental

modeling, transportation planning, and optimization of energy systems.

1.2 Scope of Research and Problem Domain

1.2.1 Scope of research

With the development of A.I., many complex and non-linear problems have been solved.

However, these solutions, while very powerful, are not human-interpretable; they op-

erate like “black boxes” with millions of neurons and billions of connection weights.

The scope of this research is to interpret these ”black boxes” by converting them into

a computational graph, which is interpretable and reusable. The proposed approach

assumes that we have a set of judgment nodes and processing nodes that can be utilized

in a computational graph, tailored to the problem domain. We have only tested the

proposed method on the taxi problem, but other similar problems can be tackled as


CHAPTER 1. INTRODUCTION 4

well (e.g., tile world, maze problem, etc.).

On the other hand, the multi-goal path-finding problem is a classic research problem

in GNP. An improved GNP is needed for this kind of question, and the solution should

be optimal in real life. The scope of this research includes using GNP and other

methodologies to find the optimal solution for a single-agent multi-goal problem in a

stochastic environment. The research uses the taxi problem as a case study to conduct

a comprehensive analysis. This study defines a methodology for converting a ”black

box” system into a computational graph representation and solving the multi-goal

path-finding problem with an optimal GNP solution in general.

1.2.2 Problem Domain: The Taxi Problem

The Taxi problem [15] is a popular toy problem used to demonstrate and test algo-

rithms. It involves an environment where a taxi must pick up a passenger at one

location and drop them off at another within a grid world. The goal is for the taxi to

find the shortest, collision-free path from its current location to the pick-up/drop-off

location.

Actions:

Basically, The taxi has 6 kind of movements:

1. Move Up (MU)

2. Move Down(MD)

3. Turn Left (TL)

4. Turn Right (TR)

5. Pick Up (PU)

6. Drop Off (DO)

• Environment Simulation

The research uses the ’Taxi-v3’ environment provided by the OpenAI Gym library

as the simulation platform for the algorithm. The environment described is a 5x5

two-dimensional grid world, featuring walls, obstacles, and four distinct locations

marked as R (Red), G (Green), Y (Yellow), and B (Blue), as depicted in (Fig. 1.2).


CHAPTER 1. INTRODUCTION 5

The taxi must navigate any of the 25 grid squares and solve a total of 500 unique

environmental scenarios with six deterministic actions: move south, north, east,

west, pick up, and drop off the passenger.

Figure 1.2: Taxi Problem[15]

.

The implementation part of this study is developed using Python with the PY-

GAD library. The GNP will guide the taxi to find the optimal solutions for

different environments.

1.3 Research Objectives

The main objective of this research is to convert a “black box” solution into a com-

putational graph solution. This research focuses on providing a general, innovative

approach to achieve this. Additionally, this research develops a novel enhanced version

of Genetic Network Programming (GNP) based on the “black box” solution to address

the challenge of single-agent, multi-goal path planning in stochastic environments. The

goal is to establish a generalized approach for effectively solving this specific type of

path planning issue with GNP. Furthermore, the research provides unique techniques

aimed at ensuring that the produced GNP solution is optimal for this type of problem.

The following outlines the specific objectives:

• Convert a “black box” solution into a computational graph solution.

• Design a solution to simulate the ”black box” system for the specified problem

domain.

• Test and validate the generalizability of the designed solution across different

scenarios.


CHAPTER 1. INTRODUCTION 6

• Design the GNP chromosome based on the optimal solution.

• Solve the multi-goal path-finding challenges with our proposed Divide-and-Conquer

approach.

• Accelerate the GNP evolution using the optimal solution.

• Develop an improved GNP capable of solving these problems effectively.

• Create a technique to refine the result and generate the optimal GNP solution.

• Conduct tests to assess and validate the generalizability of the solution.

• Define and propose a general comprehensive approach for addressing multi-goal

path-finding challenges using GNP.

1.4 Significance of the Research

This thesis can be distinctly distinguished from other GNP works based on the main ap-

proach that we have taken (from top to bottom): we started by creating a hand-crafted

solution as a stand-in for the ”black box” system (top), which guides the evolution of

the best GNP individual (bottom). The ”black box” system serves as the perfect model

and may come in the form of a deep neural network or other advanced algorithms that

do not have any explanation capabilities. On the other hand, the target solution is a

computational graph that is elegant, highly interpretable, and reusable. Our proposed

method details how we can utilize the ”black box” system in constructing a chromosome

structure, fitness function, and modified crossover and mutation operators for our novel

GNP variant. Moreover, there are some challenging multi-agent problems, such as the

TileWorld problem, that cannot be solved 100 percent using GNP[14], so resorting to

using other advanced algorithms and then converting them to interpretable solutions

is an interesting avenue of research to pursue. In this research, the GNP solution will

be refined to generate the perfect GNP solution.

This research has the following innovations: First, this study proposes and defines

the ”Top-Bottom” approach to convert an unexplained but very powerful “black box”

solution into a computational graph solution. In previous studies, various methods

have been explored to address this problem, but there has been no systematic and effec-

tive approach specifically using Genetic Network Programming. Second, this research

presents a novel Genetic Network Programming variant that employs an innovative

fitness evaluation method, customized chromosome structure, and evolution operators


CHAPTER 1. INTRODUCTION 7

to accelerate GNP evolution. This enhancement enables quicker decision-making for

agents in changing environments. Third, we have developed a new methodology that

refines solutions to ensure the generated solution is perfect for the problem domain that

we used as a testbed. The methodology can guarantee the generated result is perfect,

which is crucial for path planning in real-life scenarios.

1.5 Research Methodology

The research was conducted through a combination of literature review and software

development. The methodology and procedure are structured into the following steps:

• Develop a hand-crafted solution as the ”black box” system for the research.

• Study and understand the input and output of the ”black box” system.

• Conduct a comparative study of the original GA, GP, and GNP, highlighting the

differences among these approaches.

• Develop a prototype Genetic Network Programming software to address a simple

problem, with a focus on understanding each component of GNP.

• Investigate current multi-goal path planning challenges using GA, GP, or GNP,

exploring the methods employed in existing research.

• Break down the taxi problem into smaller, more manageable sub-problems.

• Develop basic GNP software capable of solving these sub-problems in the taxi

scenario.

• Analyze how the ”black box” system assists the basic GNP in solving the taxi

problem.

• Conduct research on fitness functions, particularly those relevant to path planning

and multi-goal path-planning issues.

• Enhance the GNP with a unique fitness function and integrate the ”black box”

system to effectively solve the taxi problem.

• Design a comprehensive test plan to evaluate the proposed solution.

• Observe and analyze the pattern of solutions, verifying them with software im-

plementation.


CHAPTER 1. INTRODUCTION 8

• Develop an approach specifically designed to refine the solution.

• Conduct extensive testing of the developed approach.

• Summarize the entire research process, including the methodologies and proce-

dures employed.

1.6 Structure of the Thesis

This thesis is structured according to nine main parts, including the Introduction.

• Chapter 2: Background

This chapter introduces the background of GA, GP, and GNP in detail. It sum-

marizes the advantages and disadvantages of GA, GP, and GNP.

• Chapter 3: Literature Review

This chapter provides a comprehensive review of existing literature on different

variants of GNP and related works for ”black box” interpretation.

• Chapter 4: Methods

This section explains the detailed ’Top-Bottom’ approach, which is an approach

to convert the ’black box’ to the GNP solution. It also explains the hand-crafted

solution, which simulates the ’black box’ system and serves as the ground truth

of this research.

• Chapter 5: Proposed GNP Variant

A detailed exploration of GNP implementation is presented here, including the

structure of chromosomes, the development of a fitness function, and the cus-

tomization of crossover and mutation processes.

• Chapter 6: Solution Refinement

This chapter explains the proposed methodology for refining the GNP solution

to an optimal solution.

• Chapter 7: Experiment

This part of the thesis discusses the methods and results of testing and validating

our proposed GNP and methodology, ensuring its reliability and accuracy.


CHAPTER 1. INTRODUCTION 9

• Chapter 8: Discussion

This chapter discusses the results of the experiment and covers any missed details

from the previous chapters.

• Chapter 9: Summary and Future Work

The final chapter provides a summary of the findings and contributions of the

thesis. It also outlines potential areas for future research and development in this

field.


Chapter 2

Background and Literature

Review

To better understand the context of this research, it is important to understand some

knowledge and background related to this research. In this section, GA, GP, and GNP

will be explained and compared with each other. This section not only explains how

these evolutionary algorithms work but also summarizes the advantages and disadvan-

tages of GA, GP, and GNP. This section is the foundation of our study.

2.1 Genetic Algorithm

Inspired by the biological evolution process, an evolutionary algorithm called the Ge-

netic Algorithm (GA) was invented by Holland in 1975. It is used to find the optimal

solution for a given problem domain. Similar to natural selection, the optimal solu-

tion can be evolved from a population of candidate solutions. The evolution process is

controlled by parent selection, mutation, and crossover.

2.1.1 Life-Cycle

The figure (Fig. 2.1) shows the evolution life cycle of the GA. There are five phases in

the GA life cycle: initial population, fitness function, parent selection, mutation, and

crossover. For a certain problem domain, a population is initialized and the population

10


CHAPTER 2. BACKGROUND AND LITERATURE REVIEW 11

is composed of a set of chromosomes. The fitness function will evaluate every single

chromosome, and for each chromosome, it will return a fitness score. In the parent

selection phase, a set of chromosomes will be selected based on the fitness score for

the crossover phase. In the crossover phase, a pair of chromosomes will be paired for

swapping their genes to generate new offspring. In the mutation phase, one of the genes

in the chromosome will be mutated with a certain mutation rate. The new generation

is generated after these five phases, and the new generation will be used as the new

population for the next evolution life cycle. After several generations, the solution will

be found.

Figure 2.1: GA evolution life cycle

2.1.2 Chromosome Structure

The initial population consists of a set of chromosomes or individuals. Each chromo-

some represents a solution in the search space for a given problem. The solution is

encoded in a certain bit string format, and the encoding scheme depends on the prob-

lem domain. For example, in a binary encoding scheme, each chromosome (Fig. 2.2) is

a fixed-length vector, and each gene is a one-bit string (either 0 or 1). At the beginning

of the GA evolution, the chromosomes in the population are randomly initialized.


CHAPTER 2. BACKGROUND AND LITERATURE REVIEW 12

Figure 2.2: GA chromosome structure

2.1.3 Evaluation and selection

• Fitness

The fitness function plays an important role in evaluation; it is used to evaluate

the quality of each chromosome. The fitness is a maximization function: the bet-

ter the quality of the chromosome, the higher the fitness score it has. Depending

on the problem domain, different problems use different fitness functions. For

example, Manhattan distance and Euclidean distance are widely used in path-

finding problems.

• Selection

Parent selection, also known as elitism, selects the chromosomes with the highest

fitness scores. The selected chromosomes in the current generation will be used

in the next generation. The common parent selection techniques are tournament,

rank, roulette wheel, and stochastic universal sampling.

2.1.4 Evolution Operation

• Crossover

In GA, the idea of crossover is to select more than one chromosome from the par-

ent selection step as parents and swap parts of their genes with certain crossover

probabilities to produce one or more offspring. The well-known crossover oper-

ators are single-point, two-point, k-point, uniform, order, precedence preserving

crossover, partially matched, shuffle, and reduced surrogate. Here is an example

(Fig. 2.3) of single-point crossover: Given two selected chromosomes, parent A1

and parent A2, some of their genes are exchanged randomly with crossover prob-

ability, and two new chromosomes, offspring A5 and offspring A6, are produced.


CHAPTER 2. BACKGROUND AND LITERATURE REVIEW 13

Figure 2.3: GA Crossover

• Mutation This concept involves adding random genes to the offspring in a pop-

ulation. This is done to ensure diversity in the population from one generation

to the next, which helps prevent the population from becoming too similar too

quickly. Here (Fig. 2.4) is what happens after mutation.

Figure 2.4: GA Mutation

2.2 Genetic Programming

2.2.1 Life-Cycle

GP has the same evolution life cycle, which also contains an initial population, fitness

function, parent selection, mutation, and crossover. The main difference between GP

and GA is the chromosome structure in the initial population. The chromosome struc-

ture of GA is a vector of bit strings, which is randomly initialized according to the

problem domain. However, the chromosome structure of GP is a randomly generated

computer program. The chromosome for GA is raw data, while the chromosome for

GP is processed data.


CHAPTER 2. BACKGROUND AND LITERATURE REVIEW 14

2.2.2 Chromosome Structure

The representation of the GP chromosome structure is expressed as a syntax tree. The

figure (Fig. 2.5) shows the GP chromosome structure. The syntax tree is composed of

nodes and links. The nodes are if-else logic functions that indicate the instructions to

execute, and the links are terminals that indicate the actions for each instruction. The

start node for the chromosome is the tree root, and the terminals are at the deepest

level of the tree.

Figure 2.5: GP Chromosome

In Genetic Programming, the chromosome is structured as a tree. This tree com-

prises a single root node, along with multiple non-terminal and terminal nodes. The

non-terminal nodes in GP serve as functions, which can include arithmetic functions,

Boolean functions, and conditional statements. The terminal nodes are either inputs

to the GP program or are used for specific processing tasks that are relevant to the

problem at hand.

2.2.3 Evaluation and Selection

GP utilizes the same techniques for evaluation and selection as GA. The fitness function

will evaluate each chromosome and return a fitness score. Chromosomes with higher

fitness scores have a higher chance of being chosen.

2.2.4 Evolution Operations

GP employs a variety of genetic operators. These typically include:

• Crossover: Sub-trees are exchanged between parent programs.


CHAPTER 2. BACKGROUND AND LITERATURE REVIEW 15

The crossover (Fig. 2.6) applies to two selected chromosomes to generate two new

offspring. During the crossover, a node is randomly picked from both selected

chromosomes, and they will swap their picked nodes and their children.

• Mutation: Random alterations are made to parts of the tree.

Similar to GA, the key idea of the GP mutation (Fig. 2.6) is to avoid premature

convergence. The algorithm will randomly generate a new sub-tree and replace

the randomly selected node with the new sub-tree.

Figure 2.6: GP evolution operations

Compared to GA, GP is another kind of GA. The main difference is that the

chromosome for GA is a vector of bit strings (raw data), while the chromosome for GP

is a tree-type computer program (processed data). GA has some limitations, while the

performance of GP is better in solving some types of problems, such as problems where

there is no given ideal solution and problems with a stochastic environment where the

situation is constantly changing. For the robot navigation problem, the agent needs to

find the path and avoid obstacles. The tree-type chromosome with if-else logic nodes

is good for decision-making, as the agent can judge the situation and take the correct

action.

The tree-type chromosome has some advantages when solving decision-making is-

sues; however, it still has some limitations. The limitation of GP is the inefficient

search for optimal solutions. This inefficiency is primarily due to the ”bloat” of GP.

In GP, the term ”bloat” describes the propensity for the solution search space to grow

unnecessarily large. The reason for this expansion is that the GP tree structures can

become increasingly deep and complicated over time, presenting the algorithm with

a vast number of potential solutions to consider. Because of this, finding the best

solutions in GP becomes less effective, especially as the tree’s maximum depth grows.


CHAPTER 2. BACKGROUND AND LITERATURE REVIEW 16

2.3 Genetic Network Programming

As the foundation of our research, Genetic Network Programming (GNP) was intro-

duced as a powerful method for solving more complex decision-making problems in a

dynamic environment. GNP is another flavor of GA. The chromosome structure of

GNP is also a computer program; however, this computer program is a non-terminal

directed graph. To better understand how this design relates to our research, let’s delve

into the distinct representation and operational mechanisms of GNP as detailed in this

paper.

2.3.1 Chromosome Structure

GNP adopts a network structure for its chromosomes. This structure is characterized

by two primary types of nodes with time delay: judgment nodes and processing nodes.

These nodes in GNP are analogous to the non-terminal and terminal nodes in GP,

respectively. The network structure in GNP offers a distinctive approach to problem-

solving compared to the tree structure used in GP.

Figure 2.7: GNP Chromosome

Processing nodes are responsible for executing specific tasks or functions, while

judgment nodes are used to make decisions based on certain conditions. The time delay

is the period from the end of processing in one node to the start in another. During

the graph transaction, within a given time frame in a problem domain, judgment nodes

decide which node to activate next based on specific conditions, while processing nodes

perform certain actions or computations.


CHAPTER 2. BACKGROUND AND LITERATURE REVIEW 17

2.3.2 Function Library

The function library is a library that contains judgment functions and processing func-

tions. The judgment function responds to decision-making, and the processing function

responds to actions. When the network transitions to a certain judgment/processing

node, the corresponding judgment/processing function will be called to instruct the

agent to judge the current situation or perform a certain action.

2.3.3 Chromosome Transaction

In general, within a given time frame, the graph transaction starts from the start node,

which is usually a judgment node. The judgment node will be called based on the

current environment, and the judgment will return the information to determine the

next node. The time delay for the current node will be deducted from the time frame,

and the transaction now moves to node 2. The transaction will stop when the given

time is reached.

2.3.4 Evaluation and Selection

GNP use fitness function to evaluate the performance of the chromosome. After the

chromosome transaction, the fitness function will assign a fitness score for this chromo-

some, the value of the fitness score usually depends on the chromosome performance,

the feedback from the environment, and the time usage. GNP can use any kind of

fitness equation for different kinds of problems. The chromosome with relative hog

fitness score will be picked by the selection function, the same as the GA and GP, GNP

use the same parent selection techniques.

2.3.5 Evolution Operator

2.3.6 Evaluation and Selection

GNP uses a fitness function to evaluate the performance of the chromosome. After

the chromosome transaction, the fitness function will assign a fitness score to this

chromosome. The value of the fitness score usually depends on the chromosome’s

performance, feedback from the environment, and time usage. GNP can use any kind

of fitness equation for different kinds of problems. The chromosome with a relatively


CHAPTER 2. BACKGROUND AND LITERATURE REVIEW 18

high fitness score will be picked by the selection function. Similar to GA and GP, GNP

uses the same parent selection techniques.

2.3.7 Evolution Operator

GNP utilizes operators like mutation and crossover but in the context of its network

structure. This includes changing links between nodes or altering the functions of

nodes themselves. For instance, one mutation type in GNP involves randomly selecting

a point in the network and either altering the connection at that node or replacing

the node entirely with a new, randomly generated one. This demonstrates the unique

network-based approach of GNP compared to the tree-based structure of GP.

• Crossover

As shown in Figure (Fig. 2.8), given two graph chromosomes, Parent 1 and Parent

2, they generate two new offspring, Offspring 1 and Offspring 2. Both parents and

their offspring have the same amount and type of nodes. In contrast, node 1 and

node 9 for Parent 1 and Parent 2 have different connections. In Parent 1, node 1

is connected to node 2, whereas in Parent 2, node 1 is connected to nodes 4 and

6. Parent 1 and Parent 2 only swap the connections to generate their offspring.

In GNP, the crossover operator only swaps the connections between two selected

chromosomes randomly with a certain crossover probability.

• Mutation The mutation in GNP randomly changes the connections for one of

the nodes in the selected chromosome, this will generate one new offspring. In

Figure (Fig. 2.9) below, to mutate the randomly selected chromosome, one of

the nodes in this chromosome will be selected randomly, and its connection will

be changed to connect to another node with mutation probability. The mutated

individual will be saved for the next generation.


CHAPTER 2. BACKGROUND AND LITERATURE REVIEW 19

Figure 2.8: GNP Crossover

Figure 2.9: GNP Mutation


CHAPTER 2. BACKGROUND AND LITERATURE REVIEW 20

2.4 Literature Review

In this section, we will conduct a literature review of the relevant GNP variants, includ-

ing the improved GNP and GNP combined with other algorithms. The scope of the

review for this chapter emphasizes how these GNP variants work and where they have

been enhanced to improve GNP efficiency. This includes changes to the chromosome

structure, modified evolution operators, and the combination of GNP with other algo-

rithms. Through reviewing the literature, we will learn and determine how we want to

improve our GNP variant.

2.4.1 Improved GNP Operators and Structure

Although GNP is very powerful compared to GA and GP, it still has some limitations.

Recently, there have been significant improvements in GNP, mainly aimed at making

it more efficient and effective at solving problems. The concept of ”transition by ne-

cessity” in GNP was proposed in this paper [14]. In this context, only some judgment

functions (necessary judgments) are made to determine actions, where the process of

evolution determines the necessity of these judgments. This means that not all judg-

ment functions need to be considered in every scenario, only those deemed necessary

through evolutionary processes.

The traditional GNP does not consider ”transition by necessity,” which leads to two

major drawbacks: invalid evolution and negative evolution. Invalid evolution occurs

when the genetic operators modify parts of the graph that are not actively used or

necessary, while negative evolution indicates a situation where, as the evolutionary

process progresses, the fitness scores of the populations decrease instead of improving.

These issues can lead to deteriorated evolutionary efficiency.

To resolve these problems, the paper proposes novel evolution strategies with sim-

plified genetic operators. For the crossover, the simplified crossover is restricted to the

region of transited branches, where it intersects between crossover individual A and

crossover individual B. For the mutation, the simplified mutation is restricted to the

region of the mutation individual. These new operators are designed to reduce the

negative impacts of invalid and negative evolution, reducing unnecessary transactions

in GNP and improving the performance of GNP.

Apart from ”transition by necessity,” another important feature in GNP is the

”Reusability of nodes,” which plays a vital role in the GNP graph. This paper [16]

addresses this issue and introduces a new method to improve GNP performance.


CHAPTER 2. BACKGROUND AND LITERATURE REVIEW 21

The ”Reusability of nodes” refers to the frequency of node usage during evolution;

the more important the node, the higher its usage. In traditional GNP, all nodes

are treated equally, regardless of their contribution to successful outcomes, leading to

invalid evolution and negative evolution. To increase the usage of important nodes,

the paper introduces adaptive Genetic Network Programming. By identifying and

focusing on frequently reused, effective nodes, adaptive GNP can enhance evolution

efficiency. Adaptive GNP has customized operators that dynamically adjust crossover

and mutation probabilities based on the reusability of nodes within the GNP graph.

The reusability is quantified by counting the frequency of node usage, influencing the

evolutionary pressure for each variable (branch) of the individual GNP. The adaptation

process involves assessing each variable’s contribution to the solution’s fitness, allowing

the GNP to self-adjust its evolution strategy.

To enhance GNP and make it more efficient, some studies have modified the mech-

anism of evolution operators. They have implemented ”transition by necessity” evolu-

tion and increased the reusability of important nodes. Other research has focused on

altering the chromosome structure itself.

In traditional GNP, the judgment nodes, responsible for decision-making, are not

fully utilized, and the performance of the evolution highly depends on the usage of these

judgment nodes. To address this issue, Genetic Network Programming with Control

Nodes (GNPCN) is proposed [17]. The control nodes, similar to start nodes in the

original GNP, act as checkpoints, helping nodes transact more efficiently and ensuring

all parts of the chromosome are effectively used.

GNPCN is achieved by transferring the activated node to a control node after

a certain number of processing nodes are activated. This method ensures that node

transactions do not just randomly jump between nodes. Instead, it follows a structured

path or branch, where a certain amount of processing is done before moving to the next

stage (control node). GNPCN structures the chromosome in breadth and depth; the

breadth refers to the number of control nodes, and the depth refers to the number

of activated processing nodes. GNPCN evolves both the breadth and depth of the

chromosome before returning to a control node, meaning more structured paths or

branches will be involved during chromosome evolution.


CHAPTER 2. BACKGROUND AND LITERATURE REVIEW 22

2.4.2 GNP Combined with Other Algorithms

2.4.2.1 GNP with EDAs

Estimation of Distribution Algorithms (EDAs) [18] are a type of evolutionary com-

putation. Unlike traditional evolutionary algorithms that rely on random methods to

mimic biological evolution for creating new populations, this algorithm utilizes a prob-

abilistic model. This model is created using statistical or machine learning techniques

to approximate the current population’s probability distribution, and then it is used to

generate new populations. The life cycle of EDAs can be summarized in the following

steps: Firstly, it creates an initial population. Secondly, it selects some elites. Thirdly,

it develops a probabilistic model based on the selected individuals. Fourthly, this model

is then utilized to create a new population. Finally, the process returns to the second

step and repeats these steps until predefined stopping criteria are met.

Evolution operators such as crossover and mutation randomly change and swap

genes to evolve the chromosome. However, each GNP gene in the chromosomes is

highly dependent on each other, and these operators sometimes might break the useful

branch and decrease the evolution efficiency. To overcome this issue, Genetic Network

Programming with Estimation of Distribution Algorithms (GNP-EDAs) [19] combines

GNP and EDAs, replacing the crossover and mutation with the probabilistic model,

which can generate a useful new population by calculating the connection probabilities

between different genes.

The probabilistic model is constructed in this way: First, consider a method for

estimating the likelihood of connections among distinct nodes within a network. Dur-

ing the t-th iteration, this approach calculates the probability Pt(i, k, j) that a link

emanates from the k-th branch of node i to node j. The formula for this probability is

given by the ratio of the sum of connection strengths from node i to j over the sum of

all connection strengths from node i to any node. Mathematically, it is represented as:

Pt(i, k, j) =

∑
n∈N δnt (i, k, j)∑

j∈A(i,k)

∑
n∈N δnt (i, k, j)

Second, both the connection information and transition information between differ-

ent genes are considered. In a chromosome, not every node might be involved in the

problem-solving process. A node transition is determined by choosing the necessary

nodes.


CHAPTER 2. BACKGROUND AND LITERATURE REVIEW 23

Thus, in this method, the transition information is factored into the probabilistic

model. The probability that a connection from the k-th branch of node i will lead to

node j is dependent on how often such a transition occurs. The more frequent the

transition, the more valuable the connection is deemed. The formula for this method

is given by:

Pt(i, k, j) =

∑
n∈N δnt (i, k, j) + ηqnt (i, k, j)∑

j∈A(i,k)

∑
n∈N δnt (i, k, j) + ηqnt (i, k, j)

where δnt (i, k, j) represents the connection strength from node i to j at the n-th

node in the t-th generation, and qnt (i, k, j) accounts for the transition frequency from

node i to j.

With the implementation of the probabilistic model, the GNP evolution can gener-

ate more useful populations. However, GNP-EDAs have some obvious drawbacks: In

EDAs, the distributions of node connections are computed to guide the search process.

This leads to the propensity of EDAs to converge on local optima, which means that

they can get stuck in what seems like the best solution but is not (local optimum).

Another issue is that when calculating these probabilities, the method does not con-

sider how good (or fit) each solution is; it treats the best and the tenth-best solutions

the same. Additionally, the method counts every connection from good solutions, even

if they do not help make decisions, which can confuse the process of finding the best

solution. The biggest problem is that it does not handle loops well. When connections

in a loop are used too much (the probability distribution is too high), they seem more

important than they should be, which can make the method settle on a local optimum

too quickly.

To address the above issues and maintain chromosome diversity, papers [19] [20] used

an estimation of distribution algorithm, as in the previous study, to create the next set

of chromosomes. They also used reinforcement learning to build a model based on the

better solutions found in the group. A detailed analysis of these studies is available in

this paper [21]. The paper presents an extended probabilistic model building a genetic

network programming (PMBGNP) algorithm to speed up the GNP evolution. The key

idea of PMBGNP is to pick the sub-branches of the underperforming (bad) individuals

for building a probabilistic model. This model can improve GNP efficiency by using

Sarsa learning to calculate Q values to identify the good structure. When using the

probabilistic model with RL, if a task is completed successfully, only the final step that

achieves the goal gets a reward. This reward is distributed slowly over the system,

which can slow down how quickly it learns and evolves. Moreover, these algorithms


CHAPTER 2. BACKGROUND AND LITERATURE REVIEW 24

become less effective when the problems they are trying to solve get more complicated.

2.4.2.2 GNP with Reinforcement Learning

Integrating GNP with Reinforcement Learning (RL) represents an important advance-

ment in the development of GNP. A significant enhancement was the integration of

GNP with Q-learning, marking the first instance of GNP being combined with RL, as

introduced by [22]. This was used to refine the transfer rules between states within

GNP, and this approach was specifically applied to the prisoner’s dilemma problem.

Further studies [23] combined GNP with Q-learning to better suit the environment. In

other papers [24] [25], GNP was integrated with the SARSA algorithm to enhance the

GNP evolution efficiency, particularly in controlling the Khepera robots process.

To make GNP more efficient during evolution, Genetic Network Programming with

Reinforcement Learning (GNP-RL) [26] was introduced. GNP-RL combines GNP and

RL, where RL can help GNP generate chromosomes with relatively high fitness scores.

In the original GNP algorithm, the structure of the graph remains constant during

the evolution process and becomes fixed once an optimal individual is chosen during

the testing phase. Conversely, integrating GNP with reinforcement learning allows for

the alteration of node connections during the testing phase, enhancing the system’s

performance in changing environments. As an extension of GNP-RL, this paper [27]

attempted to use multiple start nodes to derive multiple programs from a GNP structure

in parallel.

Chromosome Structure The GNP-RL chromosome structure (Fig. 2.10) contains

nodes and sub-nodes. Different from the GNP chromosome structure, each sub-node

has a Q-value, which corresponds to each state-action pair. The key idea is to link a

specific state to a corresponding action. In this scenario, the ’state’ refers to the node

ID, while the ’action’ pertains to one of its sub-nodes. A node will execute only one of its

sub-nodes at a given time, and the choice of which sub-node to execute is determined

by the reinforcement learning algorithm. In a certain situation, an agent’s possible

actions are represented as distinct types of processing nodes within the chromosome.

In contrast, the GNP-RL algorithm treats these actions as sub-nodes under a single

processing node. The reinforcement learning algorithm will pick one of them to execute.

GNP-RL Transaction The figure (Fig. 2.11) illustrates the GNP-RL transaction,

which differs from the GNP life cycle in the following way: Once the initial population


CHAPTER 2. BACKGROUND AND LITERATURE REVIEW 25

Figure 2.10: GNP-RL chromosome structure

is created, each chromosome is evaluated using a fitness function, and Reinforcement

Learning trains the agents to determine the most effective sub-node. In the subsequent

generation, the new population will be trained and evaluated by the fitness function

again before being subjected to elite selection and evolutionary operations.

Figure 2.11: GNP-RL transection

2.4.2.3 GNP with Ant Colony Optimization

Integration of GNP with swarm intelligence algorithms is another crucial advancement

in the development of GNP. Ant Colony Optimization (ACO) was combined with GNP

in [28] [29] for the Evaluator group control system. The unique evolution operation and


CHAPTER 2. BACKGROUND AND LITERATURE REVIEW 26

pheromone intensity calculation methodology can improve GNP’s exploitation ability.

In GNP with ACO, the integrated approach dictates that for every ten GNP iterations,

one iteration is specifically allocated to execute ACO.

2.4.2.4 GNP with ACO Life Cycle

ACO mimics how ants find food. Ants use a scent called pheromone to mark paths from

the colony to food; stronger pheromone trails can attract more ants. As pheromone

evaporates over time, less popular routes disappear, helping ants choose the best paths.

As the table shows (Fig. 2.12), the similarity between GNP and ACO lies in some

common logic. GNP with ACO is proposed [28] to enhance GNP’s searchability.

Figure 2.12: GNP vs ACO

Figure 2.13: GNP-ACO life-cycle

The life cycle (Fig. 2.13) starts with population initialization, where the quantities

of different nodes and connections within the chromosome are set randomly. During the

pheromone calculation phase, the fitness function is used to evaluate the performance of

the chromosomes, and based on this evaluation, along with the transition frequencies,

the pheromone levels are calculated with the following equation:


CHAPTER 2. BACKGROUND AND LITERATURE REVIEW 27

hnm(i, k, a) =
Fn − fn

m

Fn
· γnm(i, k, a), (2.1)

After updating the pheromone, the life cycle proceeds to the genetic operation

phase. In this phase, the generation is separated into general generation and spe-

cial generation. Crossover occurs conventionally in each generation, with pheromone

treated as a branch attribute of GNP. Mutations are typical in general generations.

However, in special generations, pheromone information is used to create new individu-

als instead of mutation. Branches with more pheromone have a higher chance of being

passed on to the new GNP individual.

Furthermore, the Artificial Bee Colony (ABC) algorithm was proposed in work

[30]. The GNP chromosomes were evolved by an ABC-based evolutionary approach.

The ABC algorithm consists of three sequential phases: the Employed bee phase, the

Onlooker bee phase, and the Scout bee phase. During the Employed and Onlooker

bee phases, individuals within the population attempt to disseminate their knowledge

and spawn new individuals. A new individual will only replace its predecessor if it

proves to be superior. In the Scout bee phase, any individual who fails to demonstrate

improvement over a set number of iterations is replaced by a newly generated individual.

In other research [28] [29], it has been found that algorithms can struggle with more

complex problems. To help Genetic Network Programming (GNP) work better, ACO

was used to boost its ability to find good solutions. Yet, they did not eliminate crossover

and mutations, which sometimes meant losing parts that worked well. Inspired by GNP

with ACO, the ACNP algorithm was proposed in [31] to solve these issues. During

evolution, the ACNP creates an experience table according to the structure of the

chromosome, and a new population is generated according to the current experience

table. After parent selection and evolution operations, a new experience table is created.

The graph structure of GNP is optimized in this way.

2.4.3 Multi-goal Path-finding problem with evolution algorithm

Path-finding(path planning) and multi-goal problems are critical areas of AI study [32].

Unlike the A* algorithm for solving traditional path-finding issues (path-finding issues

with a constrained environment, the evolutionary algorithm has better performance on

path-finding for dynamic environments and real-time systems.


CHAPTER 2. BACKGROUND AND LITERATURE REVIEW 28

2.4.3.1 Travelling Salesman Problem

The Travelling Salesman Problem (TSP) [33] is a classic issue in path-finding. Many

researchers use the TSP as a case study for exploring evolutionary algorithms, aiming

to find efficient solutions for TSP. [34] uses GA to solve the Unmanned Aerial Vehicle

(UAV) problem (finding an optimal solution to a single UAV path is the same as

solving a TSP problem). However, the chromosome of GA is a vector of bit strings,

and a computer program with an if-else logic type chromosome is more suitable for

solving this kind of problem.

To address this issue, [35] uses GP to solve the UAV problem. Overall, path-finding

is a multi-objective optimization problem and can be defined by the formula below:

min f(x) = (f1(x), f2(x), f3(x))
T (2.2)

f1(x) =

n∑
i=1

l2i ,

f2(x) =

n∑
i=1

h2i ,

f3(x) =

n∑
i=1

fTi .

(2.3)

Where f1(x) defines the cost function of path distance, f2(x) defines the cost func-

tion of flight altitude, and f3(x) defines the evaluation value for all threats.

Chromosome Representation The solution is described by a set of feasible func-

tions. The chromosome structure is a syntax tree, where each non-terminal node rep-

resents a feasible function. There are a total of Nf non-terminal nodes, i.e., F =

{f1, f2, . . . , fNf
}, and Nterm terminal nodes, i.e., T = {a1, a2, . . . , aNterm}.

The purpose of using an evolutionary algorithm to solve the path-finding problem

is to find the computer program that can represent the problem’s nature, which means

the computer program with the most optimal fitness score. Both the input and output

of GP are computer programs. Thus, compared to the chromosome of GA, GP is more

suitable for solving this problem.

The Generalized Traveling Salesman Problem with Neighborhoods (GTSPN) is an

extended problem for TSP. In GTSPN, instead of having discrete cities as in TSP, each

cluster consists of a ”neighborhood,” which is a set of regions or points. This paper


CHAPTER 2. BACKGROUND AND LITERATURE REVIEW 29

[36] solved this multi-goal path planning problem with a Hybrid Random-Key Genetic

Algorithm (HRKGA).

• Input: a set V = {1, . . . , n} of the indices of the neighborhood sets Si, the

neighborhood sets Si, n sets Wi = {1, . . . ,mi} of the indices of the neighborhoods
Qi,k, and the neighborhoods Qi,k.

• Output: a minimal-cost cyclic tour T = {qτ(1), . . . , qτ(n)} with objective value

L.

GA needs to discover the optimal neighborhood within the neighborhood set. In

the chromosome encoding, each gene contains an index k, which is an index for the

neighborhood within the neighborhood set, and a vector part qi, which is the node

configuration within each neighborhood. Both k and qi are randomly generated. Each

gene is assigned a random number uniformly and encoded into binary format [0, 1].

In general, each chromosome is a linear equation. For instance, in the case of V =

{1, . . . , 4} and Wi = {1, . . . , 5}, the following chromosome:

1q1.22 4q2.72 2q3.45 3q4.02

is decoded into the following route:

q4 ∈ Q4,3 → q1 ∈ Q1,1 → q3 ∈ Q3,2 → q2 ∈ Q2,4

GA and GP are widely used with different equations to solve TSP and related

problems. As can be observed, for path-finding issues or multi-goal path-finding issues,

the optimal solution is an equation or other type of computer program. GP has better

performance compared to GA, as a computer program input is more suitable for this.

Therefore, the computer program type chromosome should be adapted. Thus, GP

and GNP should be used for the path-finding problem and multi-goal path planning

problem.

2.4.3.2 Robotic path planing

Robotic path planning is the problem where the aim is to find the most efficient path

for a robot to travel without any collisions. This involves connecting several target


CHAPTER 2. BACKGROUND AND LITERATURE REVIEW 30

points within the operational area of the robot. This particular challenge is known as

a multi-goal path planning problem (MTP), and it is a well-researched area within the

evolution algorithm field.

GA This paper [37] implements path planning using GA on mobile robots. The

goal is to let GA help the robot find the shortest collision-free path from the current

position of the agent to the destination. In GA, each chromosome represents a possible

path, which is stored in waypoints. The path is decoded by measuring equal intervals

along a straight trajectory from the starting point to the endpoint. As shown in the

figure (Fig. 2.14) below, waypoints are stored as pairs of x and y coordinates, which

are integers. However, this design is for a constrained environment, not for a dynamic

environment.

Figure 2.14: Chromosome structure

RTP-GA The paper [38] introduced real-time path-finding with Genetic Algorithm

(RTP-GA) to solve the path-finding issue in a changing environment. RTP-GA is GA-

based and combines with A*, a heuristic search algorithm, to increase the optimization

capabilities and adaptability of GA.

RTP-GA Life Cycle The RTP-GA life cycle is shown below (Fig. 2.15). RTP-

GA is integrated with the environment. Unlike GA, the fitness evaluation and parent

selection occur after the evolution operation. After parent selection, the agent will take

a series of actions and move through the environment. During the movement through

the A* phase, Greedy A* is employed for navigating the environment until the target

location is farther away than the Manhattan distance of the original location. Thus, it

is necessary for the navigating entity to regularly update and track its location on the

map and denote the areas it has already explored.

GP As mentioned in the previous section, for path-finding issues, GP has better

performance than GA due to the difference in chromosome structure. This paper (ref)

implements GP for multi-robot planning. Each chromosome is a computer program,

similar to the chromosome in GA. After running the chromosome, it will return a


CHAPTER 2. BACKGROUND AND LITERATURE REVIEW 31

Figure 2.15: RTP-GA life cycle

path. The figure shows the design of the chromosome, where the non-terminal nodes

represent the judgment for different situations, and the terminal nodes represent the

movement of the agent. To implement an evolutionary algorithm for solving path-

finding or multi-goal planning problems, the chromosome represents a solution for a

given problem domain; it should be a computer program, where the running result is

a path or a mathematical equation.

2.5 Conclusion

In this section, we have reviewed the state-of-the-art GNP variants. There are several

directions to improve GNP, such as improving the chromosome structure and modifying

the evolution operators. The efficiency of GNP can also be enhanced by integrating

with other algorithms, such as GNP with EDAs, RL, and ACO. These algorithms

usually help GNP pick the right nodes or actions for the evolution.

To improve the efficiency of our proposed GNP variant, we can modify the chro-

mosome structure, such as reducing the number of nodes, or we can use customized

crossover and mutation. Furthermore, we can allow other algorithms to help GNP


CHAPTER 2. BACKGROUND AND LITERATURE REVIEW 32

navigate the evolution process.


Chapter 3

General Strategies

In this section, we delve into the unique methodology that sets our Genetic Network

Programming (GNP) research apart from existing works in the field. Our approach is

characterized by a distinctive ”Top-Bottom” strategy, where the journey begins with

the conception of a complex ”black box” solution, which can subsequently guide the

evolution towards the most effective GNP individual.

On the other hand, we aim to derive a GNP computational graph solution from the

”black box.” The objective of this section is to explain the ”Top-Bottom” approach,

how this research utilizes the ”black box” for our novel GNP variant with the ”Taxi

problem” case study, and our unique version of the ”Divide and Conquer” strategy,

specially made for multi-goal path-finding challenges.

3.1 Related Work

The object of this research is to convert or interpret the ”black box” system into

a computational graph. The ”black box” system refers to systems that hide their

internal logic from the users [39]. The term ”interpretation” refers to providing a

clear explanation of concepts. In data mining and machine learning, interpretability

is defined as the ability to explain or provide meaning in understandable terms to a

human [40].

According to these papers [41] [42], an interpretable model has three characteristics:

Interpretability, Accuracy, and Fidelity. Humans can easily understand and interpret

33


CHAPTER 3. GENERAL STRATEGIES 34

these models, which include decision trees, rules, and linear models. A decision tree

can be linearized into decision rules using the if-then form [43]:

If condition1 ∧ condition2 ∧ condition3, then outcome.

This paper [7] categorizes various approaches to explain the ”black box.” In general,

the ”black box” involves transforming the complex AI model into a block of ”IF-THEN”

logic, which is human-readable. The paper broadly categorizes the approaches into

three main types:

• Model-specific explanations: This approach focuses on interpreting specific

types of models, such as deep neural networks or decision trees.

• Model-agnostic explanations: This approach provides insights that can be

applied to different types of ”black box” models.

• Techniques focusing on specific aspects of interpretability: This approach

helps us understand how the input affects the output of the model.

The work in [44] provides a comprehensive review of how GP can be leveraged to

enhance the interpretability of machine learning models. The focus is on two main

categories: The first category is intrinsic interpretability, which seeks to directly evolve

more effective (and interpretable) models through GP. The second category is post-hoc

interpretability, which uses GP to explain other black-box machine learning models or

the models that have evolved from simpler models, like linear models.

The XGP review section discusses intrinsic interpretability XGP approaches that

take into account model complexity (including size), number of features, meaningful

combinations between features, and questionnaire-based interpretability measures.

XGP focuses on the following properties to improve interpretability:

Smaller GP Model Size: Smaller GP models have better simulatability. To

achieve this, GP can be modified to reduce the size of the chromosome, penalize large

models during fitness evaluation and parent selection, and utilize customized operators.

Lower (Nonsize) GP Model Complexity: This can be categorized into Struc-

tural Complexity Reduction and Functional Complexity Control. The structural com-

plexity of models directly impacts their interpretability. Simplicity in model structure

can improve understanding of feature importance, interactions, and prediction infer-

ence. The following methods are used: [45] [46] [47]. To reduce Functional Complexity,

the following methods are used: [48] [49].


CHAPTER 3. GENERAL STRATEGIES 35

Fewer Distinct Features in GP Models: The GP model simplifies interpre-

tation by using fewer distinct features and concepts. Feature selection and feature

construction can be used to address this problem.

Interpretable GP (Sub-)Model Structures: Existing strategies fall into three

categories: interpretable function nodes, interpretable combinations between nodes,

and problem decomposition.

The post-hoc interpretability methods focus on explaining the ”black-box” models.

These methods are divided into two main categories: global and local interpretability.

Global interpretability methods use GP to generate interpretable models that explain

the overall behavior of the black-box model across all instances. Local interpretability,

on the other hand, seeks to explain the behavior of the black-box model for specific

instances, resulting in explanations that focus on the immediate context surrounding a

given data point or decision.

In our research, the ”black box” will be converted to a GNP computer graph rather

than an ”IF-THEN” logic system or using GP. The GNP graph is more elegant in terms

of reusability, modularity, and scalability:

• The chromosome structure of the GNP is a network. The GNP does not experi-

ence ”bloat” as the problem’s complexity increases.

• The GNP chromosome has better interpretable function nodes and interpretable

combinations between nodes. The GNP comes with a function library, which

consists of judgment functions and processing functions, achieving better inter-

pretability in this way.

To enhance the GNP interpretation further, this research will utilize the ”Top-

Bottom” approach with the GNP to generate the GNP computer graph, and we will

use the ”divide and conquer” approach to decompose the complex problem into multiple

sub-problems.

3.2 Top-Bottom Approach

3.2.1 ”Black Box” Conception

In this context, the complex solution (or target model) serves as a ”black box”; there-

fore, we do not examine its computational components, which could include millions of


CHAPTER 3. GENERAL STRATEGIES 36

neurons and billions of connection weights. We can only access the inputs and outputs

of the ”black box” system, nothing more. The target model knows the correct action

for any given scenario but does not provide any reasoning for its actions. This solu-

tion is often embodied by advanced algorithms like neural networks, known for their

exceptional ability to handle complex and nonlinear problems. Although they are not

completely transparent or easily interpretable, they were chosen for their ability to

guide us to a highly reusable and interpretable computational graph.

3.2.2 Bridging ”Black Box” and Computational Graphs

The ”Top-Bottom” approach is the bridge in this research that connects the ”black

box” with computational graphs. The cornerstone of this methodology lies in trans-

forming the conceptual ”black box,” often represented by complex algorithms, into a

format that aligns with the reusable, interpretable nature of computational graphs.

This transformation involves a series of steps:

• Abstraction and Decomposition

Initially, the complex problem is abstracted and decomposed into fundamental

components or sub-problems. This process involves identifying key operations,

decision-making processes, and data flow within the solution. For instance, in the

multi-goal path-finding problem, the decision-making for different cases and the

related actions will be identified, such as when the agent should move and the

transitions between them (what action the agent will perform when encountering

certain scenarios). This involves careful design to maintain the integrity of the

solution’s logic and performance. The primary challenge is to ensure that the

computational graph remains functionally equivalent to the original solution.

• Mapping to GNP representation

The GNP solution will be designed based on the ”black box” system. This critical

step converts the complex solution into a visual and structured format. Once the

action and judgment parts in the problem domain are defined, the judgment and

processing nodes will be created, and the chromosome for GNP will be designed

accordingly.

During the parent selection process, the fitness function will utilize the black box

system to help the GNP evolve. Additionally, the crossover and mutation will

swap and alter specific genes to guide the evolution in the right direction.

• Iterative Refinement The computational graph is then iteratively refined to


CHAPTER 3. GENERAL STRATEGIES 37

ensure that it accurately represents the complex solution while also benefiting

from GNP’s evolutionary processes. The evolved solution may contain duplicate

or unused branches. To create the perfect solution, our proposed method will be

used to further refine the generated solution.

3.2.3 Benefits in GNP Research

The top-bottom approach in Genetic Network Programming (GNP) offers a novel per-

spective on solving complex problems. While it presents several advantages:

• Guided Evolutionary Process Starting with a conception of the ideal solu-

tion provides a clear target for the evolutionary process. It helps in directing

the evolution of GNP individuals more purposefully towards this target, poten-

tially reducing the time and resources spent on aimless exploration. This focused

direction can lead to more efficient convergence towards effective solutions.

• Benchmarking Excellence

Having a defined ”black box” solution serves as a benchmark for evaluating the

progress and performance of evolving solutions. It allows researchers to measure

how close each generation of GNP individuals gets to this ideal solution. The

GNP design will also benefit from this, such as customized chromosomes and

evolution operators, which can help the evolution avoid randomly generated bad

genes.

• Adaptability to Complex Problems

Complex problems often require complex solutions. The black box model solution

enhances the adaptability of GNP, allowing it to handle more challenging scenar-

ios within dynamic environments. This adaptability is crucial in fields where

problems are not just complex but also continuously changing.

3.3 Sources of Ground Truth for the Problem

Here are the two possible approaches for obtaining the optimal GNP solution, or the

ground truth for the problem. The key idea is to use the ground truth to guide the

algorithm for transforming a string of judgment/processing node executions into a GNP

solution.


CHAPTER 3. GENERAL STRATEGIES 38

3.3.1 Approach 1: Handcrafted Solution Dataset

The first approach starts with a handcrafted solution. This handcrafted solution re-

ceives the current state of the environment to determine the best series of actions. The

steps are described as follows.

• Input State Processing: The state of the current environment is input into

the handcrafted solution.

• Node Execution: Feed inputs to the handcrafted solution to get the best action.

Based on this input, a sequence of nodes will be executed by the handcrafted

solution. These executed nodes are the correct actions for the current states.

After running the handcrafted solution, a list of executed nodes will be generated.

• Generating a Dataset: As the handcrafted solution runs against 400 different

environmental scenarios in the training phase, it creates a 400-node execution

list. These lists can comprise a rich dataset.

• Pattern Recognition and Duplication Handling: Within this dataset, it

can be observed that some sequences of node executions are repeated with certain

patterns. By identifying unique partial solutions and applying specific techniques,

these patterns can be analyzed and distilled.

• Deriving the Perfect GNP Solution: By identifying patterns and unique

sequences, we can refine the dataset using our proposed algorithm and formulate

a ’perfect’ GNP solution.

3.3.2 Approach 2: GNP Solution Dataset

• Run our GNP Across Many Scenarios: The GNP system is put to the test

across 400 different cases. This is similar to the first approach where a handcrafted

solution was tested.

• Record Node Activation Sequences: For each current state of the environ-

ment, the GNP executes a series of nodes. For example, given input A, it might

run through nodes J1, J3, J4, and then P2. These sequences are the responses

of the GNP solution, and they represent partial solutions to that particular envi-

ronment.

• Compile a Dataset: The sequences of executed nodes in response to each

environment are appended to the dataset. This dataset is a combination of the

GNP’s decisions and actions.


CHAPTER 3. GENERAL STRATEGIES 39

• Analyze for Patterns: Similar to the first approach, the pattern of executed

nodes will be identified. We will use our proposed algorithm or approach to

convert the string of judgment/processing node executions into an optimal GNP

solution.

3.4 Divide and Conquer

3.4.1 Introduction to the Customized Strategy

The Divide and Conquer strategy, a timeless algorithmic approach, has been uniquely

adapted in our research to address the complexities of multi-goal pathfinding problems

within the GNP framework. This adaptation plays a crucial role in complementing the

top-bottom approach, offering a structured method to break down complex, multi-goal

problems into more manageable sub-problems.

3.4.2 Complementing the Top-Bottom Approach

Alignment with Ideal Solutions: The Divide and Conquer strategy aligns seam-

lessly with the top-bottom approach, where the ’perfect’ solution is conceptualized

first.

3.4.3 Implementation in GNP

Decomposition of Multi-Goal Challenges: In the context of GNP, the Divide and

Conquer strategy breaks down the ideal solution into smaller, more understandable

chunks, each representing a sub-goal in problem-solving. In the taxi problem, this

strategy divides the ideal path into sections, each focusing on a specific sub-goal, such

as reaching a passenger or getting to a specific drop-off location.

The Divide and Conquer strategy is essential for simplifying and managing complex

tasks such as the taxi problem. Without the Divide and Conquer approach, every

decision impacts the entire task, as each action must be evaluated not only on its

immediate merits but also on how it affects the overall process. For example, in the

taxi problem, the action of dropping off a passenger is linked to the preceding action of

picking them up; if the taxi does not pick up the passenger first, it cannot drop them

off.


CHAPTER 3. GENERAL STRATEGIES 40

The Divide and Conquer strategy, on the other hand, breaks down a large problem

into smaller, separate parts, significantly reducing the problem’s complexity. This

separation enables simpler, more focused strategies for each component. Furthermore,

it provides a clear framework for troubleshooting and refining strategies, as issues can

be isolated and addressed within their specific sub-task. It also aids in quickly resolving

problems in one part without disrupting the others.

Evolution of Specialized Sub-Solutions: For each sub-goal, specialized GNP

individuals are evolved. These individuals are optimized to solve specific aspects of the

path-finding problem, ensuring efficiency and effectiveness in each GNP evolution.

In the previous section, we discussed several methods for increasing the efficiency

of GNP. The first method involves modifying the chromosome to reduce the number

of nodes, thereby reducing computational complexity. The second method involves

refining the evolution operators, which facilitates convergence. Integrating GNP with

the Divide and Conquer strategy complements these gains. Using Divide and Conquer

allows the chromosome to eliminate unnecessary nodes. This means that GNP evolution

only involves essential nodes for each sub-goal, and customized evolution operators will

only introduce the appropriate genes, making the evolution process more streamlined

and effective for complex problems.

Integration of Sub-Solutions: Once optimal or near-optimal solutions are evolved

for each sub-goal, the partial solutions will be combined to form a comprehensive so-

lution for the entire multi-goal path-finding problem. However, this combined solution

is not perfect and may include redundant parts. To address this problem, the gener-

ated solution will be refined based on the ”black box” solution. By constantly refining

against an ideal model, we improve the solution’s applicability to complex path-finding

challenges. This process ensures that the final solution is effective for each sub-goal

and optimal overall.

3.5 Hand-crafting a GNP Solution

As a stand-in for the perfect solution, we have hand-crafted a solution in the form of

a computational graph. This conveniently provides us with the ground truth for all

possible scenarios and allows us to verify the conciseness of our derived graph solution.

However, during the transformational process, the target model is only used as a fitness

function to guide the creation of a computational graph.


CHAPTER 3. GENERAL STRATEGIES 41

3.5.1 Design Principle

The design is to let the agent rely on continuous feedback from the environment to

inform its decisions and actions. The design reflects a cycle of perception and action,

which is a fundamental principle in robotics and artificial intelligence. This model

underscores the ”black box” solution design for the intelligent taxi, where it constantly

updates its state based on its past and current positions and uses a combination of

sensory data and pre-computed knowledge (like heuristic distances) to navigate towards

a goal.

3.5.2 Decision-Making Process Outline:

The design principle is shown in Figure 3.1: A looped arrow with the phrase ”Sense,

decide, act,” represents the continuous cycle of the agent’s process to make decisions.

Below this, a three-step sequence is detailed:

Figure 3.1: The continuous decision-making cycle of an agent in the Taxi Problem.

• Sense: The agent senses the environment to collect important information. This

is likely the step where the agent acquires data from its surroundings, which could

be the immediate grid squares it can move to.

• Decide: The agent utilizes knowledge about the environment. It may combine

sensed information with estimations (like heuristic distance from the target) to

make informed decisions. This implies a level of reasoning or computation, such

as path planning or avoiding obstacles.


CHAPTER 3. GENERAL STRATEGIES 42

• Act: The agent takes action, such as moving robot wheels or, in this context,

moving to another grid position.

In the Taxi problem, the objective is to have the taxi pick up a passenger and then

drop them off at a specified destination. This multi-goal path-finding challenge involves

treating the location of the passenger as the initial destination and the drop-off point

as the final destination. To navigate the taxi toward each destination, the position of

the taxi is compared with that of the current destination to determine what action to

take.

In this context, a three-step sequence is detailed:

• Sense: The taxi will collect the current state of the environment, such as the

position of the taxi and the goal.

• Decide: The taxi will utilize the knowledge of the environment, especially com-

paring the position of the taxi and the destination. For example, if the destination

is to the southeast of the taxi, the taxi should move south and east. Additionally,

the taxi should also judge whether it can move south or east. If there is a wall in

front of the taxi to the south or east, the taxi should choose another direction to

avoid the wall.

• Act: Based on the decision, the taxi will move in a certain direction, pick up, or

drop off the passenger.

As observed, the key logic of the design is the IF-ELSE-THEN logic. We can use

judgment functions to perform the IF-ELSE feature and processing nodes to perform

THEN logic. For instance, if the passenger is north of the taxi, the judgment function

will assess the positions of the taxi and passenger and make the decision, while the

processing function will move the taxi. Based on this, the following judgment and

processing functions are designed as follows:

• J1 - Should the taxi pick up or drop off the passenger?

• J2 (Drop off branch) - Compares the terminal/taxi location and determines the

appropriate direction for the taxi or if it should drop off the passenger when they

are at the same location.

• J3 (Pick up branch) - Compares the passenger/taxi location and decides the

appropriate direction for the taxi or if it should pick up the passenger when they

are at the same location.


CHAPTER 3. GENERAL STRATEGIES 43

• J4 (Should Move East Branch) - Decides whether the taxi should move east.

• J5 (Avoid the obstacle to the East) - Decides whether the taxi should move up

or down if there’s a wall to the east of the taxi.

• J6 (Should Move West Branch) - Decides whether the taxi should move west.

• J7 (Avoid the obstacle to the West) - Decides whether the taxi should move up

or down if there’s a wall to the west of the taxi.

• P1 - Directs the taxi to move north.

• P2 - Directs the taxi to move south.

• P3 - Directs the taxi to move east.

• P4 - Directs the taxi to move west.

• P5 - Instructs the taxi to pick up the passenger.

• P6 - Instructs the taxi to drop off the passenger.

Each function is a void type function. When a certain function is called, that

function will perform a judgment or processing feature and return a string, which is

the next function to be called. For example, when function P1 is called, the taxi will

move north, and P1 will return to J1. J1 will be called to judge whether the taxi should

pick up or drop off the passenger. If the taxi still needs to pick up the passenger, J1

will return ”J3,” etc. The details of these functions will be explained in a later chapter.

3.5.3 Graph representation

We will get a computer program if we connect all the judgment and processing functions.

To visualize this computer program, each node will represent a function, and the whole

computer program will be a directed graph. As seen in the image below (see Figure 3.2),

the judgment nodes represent nodes with judgment functions, the processing nodes

represent nodes with processing functions, and the connections represent how these

nodes relate to each other.

The graph begins at node J1, which judges whether the taxi should pick up or drop

off the passenger. If the taxi should pick up the passenger, the graph goes to the J3

branch; otherwise, it goes to the J2 branch. After this, the program will compare the

position of the taxi and the passenger. If they are at the same position, processing


CHAPTER 3. GENERAL STRATEGIES 44

node P5 will be executed to pick up the passenger. If the passenger is east of the taxi,

judgment nodes J4 and J5 will be executed to judge whether the taxi should move

east and if there is a wall in front of the taxi. If the path is clear, processing node P3

will be executed to move the taxi east. All the processing nodes point to J1, which is

the start node of the graph.

Figure 3.2: The visual representation of a computer program as a directed graph.

3.5.4 Implementation

Since we have all the judgment/processing functions and the complete graph, we need

to implement the computer program to execute the complete graph to solve the taxi

issue. To simulate the GNP graph transaction, the implementation is divided into two

parts: the function library and the computer program.

Function Library

The function library is simply a library that contains all the judgment and process-

ing functions. It functions more like a search engine: when the graph needs a certain

function, the function library will find its directory according to the function name, and

this found function will be called. The function library (see Figure 5.2) is a key-value

pair data structure, using an integer number as the key and the function name as the

value. When the key is found in the function library, the corresponding function will

be called.

Computer program


CHAPTER 3. GENERAL STRATEGIES 45

Figure 3.3: Function library.

The computer program is just a loop which can execute the judgement and process-

ing function iteratively. As we can see from the Pseudo code below (Algorithm 1), the

implementation is a while loop. It will keep calling judgment and processing functions

until the taxi has dropped the passenger successfully. The current is a pointer, which

points to the current executed node. With the given total time frame, each processing

and judgement function will take a certain time to execute, the loop will stop when it

reaches the given time frame.

Algorithm 1 Hand-crafted Solution

1: current← J1
2: while not reached the given time frame do
3: if taxi dropped the passenger then
4: break
5: end if
6: current function← functionsLibrary[current]
7: returned value← current function()
8: time← time+ used time
9: current← returned value

10: end while

Algorithm walk through

Here is a snapshot (Fig. 3.4) of utilizing Algorithm 1 to pick up the passenger

successfully. On the left-hand side of the snapshot is the taxi map; the yellow object

represents the taxi, and the blue letter represents the pick-up location. On the right-

hand side of the snapshot is the output of Algorithm 1. J1 (Should a taxi pick up

or drop off the passenger) will be executed in the first step, and the related judgment

function will be called (referring to step 6 in Algorithm 1). A value is returned, which

is the next node to be executed. In this case, J3 (Pick up branch) is returned, which


CHAPTER 3. GENERAL STRATEGIES 46

Figure 3.4: Function library.

means the taxi should pick up the passenger. The current node is set to J3, and the

judgment function J3 will be called. The function will take the current state of the

map as the input and return the next node to be executed. Therefore, J4 (Should Move

East Branch) will be executed. As we can see from the map, the pick-up location is

east of the taxi, and there is no obstacle. P2 (move taxi to the East) will be executed.

After certain steps, the taxi will pick up and drop off the passenger. (The details of

the judgment/processing functions will be explained in the proposed GNP Chapter.)

In this chapter, we established a handcrafted solution as the ’black box’ solution for

our ’Top-Bottom’ approach. As the ’Top’ part, it adapts the fundamental principles in

robotics and artificial intelligence, iteratively receiving information from the environ-

ment and performing a series of actions. The handcrafted solution is structured around

different judgment and processing functions, each tailored to specific scenarios within

the Taxi Problem. As the ground truth of the ’Top-Bottom’ approach, this research

utilizes this handcrafted solution as the ’black box’ solution to develop our GNP variant

and the proposed algorithm in later sections.


Chapter 4

Using a Black Box as a Fitness

Function in a GNP

4.1 Introduction

This section explores our proposed new variant of GNP in detail. It utilizes the ’Divide

and Conquer’ strategy to solve the Taxi problem, as well as a black box system to

guide the GNP. This GNP is built based on the design of our hand-crafted solution.

The focus is on several key aspects:

• GNP Life-Cycle: We will explain the details of the evolutionary cycle and how

the ’black box’ solution interacts with our GNP variant.

• Chromosome Structure: Based on the ’black box’ solution, the problem is

separated into ”pick-up” and ”drop-off” sub-problems. These two sub-problems

will be explained in both genotype and phenotype.

• Customized Evolution Operators: These are specially crafted based on the

’black box’ solution to guide the evolutionary process effectively toward the ideal

solution, ensuring meaningful progress in each evolution step.

• Innovative Fitness Function: In this fitness function, we combine the fitness

function with other algorithms and utilize the ’black box’ solution to guide and

evaluate each individual.

47


CHAPTER 4. USING A BLACK BOX AS A FITNESS FUNCTION IN A GNP 48

We will delve into each of these parts and show how they work together to make

our new GNP variant solve the multi-goal path-finding issues ef