1 09-Implemention


Which EA is best?
How do I choose an EA?

2 What do you want an EA to do?

“I want the EA solve my problem…”
Different objectives imply different ways of designing and working with an EA.
A good first step is to examine the given problem context.
We can roughly distinguish two main types of problems:
* design (one-off) problems
* repetitive problems
* including on-line control problems as special cases

3 Working with Evolutionary Algorithms

4 Experimentation

4.1 Various overlapping goals

5 Example: Production Perspective

Logistics: Optimizing delivery routes.
* Different destinations each day
* Limited time to run algorithm each day
* Must always be reasonably good route in limited time

6 Example: Design Perspective

7 Perspectives of goals

Design perspective:
find a very good solution at least once

Production perspective:
find a good solution at almost every run

Publication perspective:
must meet scientific standards (heh?)

Application perspective:
good enough is good enough (verification!)

These perspectives have very different implications on evaluating the results.
However, the are often left implicit…

8 Algorithm design

Design a representation
Design a way of mapping a genotype to a phenotype
Design a way of evaluating an individual

Introduce diversity
Design suitable mutation operator(s)
Design suitable recombination operator(s)

Select individuals
Decide how to select individuals to be parents
Decide how to select individuals for the next generation (how to manage the population )

Instantiate
Decide how to start: initialization method
Decide how to stop: termination criterion

9 Test problems

One can generalize many representations to numerical.
In assuming this unified representation space,
we can define purely numerical lanscapes with different shapes.

10 Bad example

What did I (my readers) not learn:
* How relevant are these results (test functions)?
* What is the scope of claims about the superiority of the tricky GA?
* Is there a property distinguishing the 7 good and the 2 bad functions?
* Are my results generalizable? (Is the tricky GA applicable for other problems? Which ones?)

11 Getting Problem Instances

11.1 Testing on real data

11.2 Standard data sets in problem repositories, e.g.:

* OR-Library
* <http://www.ms.ic.ac.uk/info.html>
* UCI Machine Learning Repository www.ics.uci.edu/~mlearn/MLRepository.html

11.3 Problem instance generators produce simulated data for given parameters, e.g.:

* GA/EA Repository of Test Problem Generators

12 Basic rules of experimentation

13 Things to Measure

Many different ways to measure, for example:
* Average result in given time
* Average time for given result
* Proportion of runs within % of target
* Best result over n runs
* Amount of computing required to reach target in given time with % confidence
* …

14 What time units do we use?

15 Measures

Performance measures (off-line)
* Efficiency (alg. speed)
* CPU time
* No. of steps, i.e., generated points in the search space
* Effectivity (alg. quality)
* Success rate
* Solution quality at termination

“Working” measures (on-line)
* Population distribution (genotypic)
* Fitness distribution (phenotypic)
* Improvements per time unit or per genetic operator
* …

16 Performance measures

17 Fair experiments

18 Example: off-line performance measure evaluation

Which algorithm is better?
Why?
When?
Populations mean (or best) fitness

Implemention/impl02.png
Comparing algorithms A and B by their scale-up behaviour.
Algorithm B can be considered preferable because its scale-up curve is less steep.

Implemention/impl04.png
Comparing algorithms on problem instances with a scalable parameter

More abstract distributions:
Implemention/impl03.png
Comparing algorithms by histograms of the best found fitness values

19 Example: on-line performance measure evaluation

Implemention/impl_ppt_01.png
Which algorithm is better?
Why?
When?

Implemention/impl01.png
Comparing algorithms A and B by after terminating at time T1 and T2 (for a minimisation problem).
Algorithm A clearly wins in the first case, while B is better in the second one

20 Example: averaging on-line measures

Implemention/impl_ppt_02.png
Averaging can “choke” or filter out interesting information

21 Example: overlaying on-line measures

Implemention/impl_ppt_03.png
Overlay of curves can lead to very “cloudy” figures

Best: show continuous mean, sem, STD over x axis, plus various randomized controls!
Like the supplementary figures in this paper of mine:
https://www.nature.com/articles/srep18112

22 Statistical Comparisons and Significance

23 Example

Is the new method better?
Implemention/impl_ppt_04.png

24 Example (cont’d)

25 Statistical tests

26 Better example: problem setting

27 Better example: experiments

28 Better example: evaluation

29 Some tips