~ruther/ctu-fee-eoa

aaf2bc68d200389de87196a9b98551f620a06648 — Rutherther a month ago b56050a
chore: part of report
1 files changed, 207 insertions(+), 0 deletions(-)

# Intro

This is a report for hw01 of the EOA course at CTU FEE. The goal
has been to solve the traveling salesperson problem by
means of evolutionary algorithms. The report covers the implemented
algorithms and the results on 10 TSP instances from TSPLIB. All of
the chosen instances are 2D and use the Euclidean metric.

# Representations

Two representations have been chosen for the TSP problem:

1. Node permutation
2. Binary string half-matrix denoting if city i comes before city j

## Node permutation
Implemented as a vector of city indices.
A couple of perturbations and crossovers have been implemented:

- Perturbations
  - Move single random city to random position
  - Swap two cities
  - Reverse subsequence of cities
- Crossovers
  - Cycle crossover
  - Partially mapped crossover
  - Edge recombination crossover

Apart from a random initializer, two additional initializers are implemented:
one based on a minimal spanning tree and one based on nearest neighbors. Detailed
descriptions of these algorithms follow in the next sections.

### Crossovers
All of the crossovers take two parents; two offspring can be produced
from the same pair of parents by swapping the parent roles
(which parent is first and which is second).

#### Cycle crossover
The cycle crossover creates a cycle, takes parts of the first parent
from the cycle and fills the rest from the second parent.

The cycle is created as follows:
1. Start with index 0, call it current index
2. Save current index
3. Look at current index in first parent, let's call it current element
4. Find the same element in second parent and update current index to this new index
5. Repeat 2 - 4 until reaching index 0 again

Then the offspring is created as follows:
1. Clone the second parent
2. Iterate the saved cycle indices, take element at index from first parent and copy it to offspring at the same index
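
The two phases above can be sketched in Rust as follows (a hedged sketch; `Perm`, the function name, and the helper array are illustrative, not the actual library types):

```rust
type Perm = Vec<usize>;

fn cycle_crossover(p1: &Perm, p2: &Perm) -> Perm {
    let n = p1.len();
    // position of every element in the second parent, for O(1) lookups
    let mut pos_in_p2 = vec![0usize; n];
    for (i, &e) in p2.iter().enumerate() {
        pos_in_p2[e] = i;
    }
    // steps 1-5: build the cycle of indices starting from index 0
    let mut cycle = Vec::new();
    let mut current = 0usize;
    loop {
        cycle.push(current);
        current = pos_in_p2[p1[current]];
        if current == 0 {
            break;
        }
    }
    // clone the second parent and copy the cycle positions from the first
    let mut offspring = p2.clone();
    for &i in &cycle {
        offspring[i] = p1[i];
    }
    offspring
}
```

The second offspring is obtained by calling the function with the parents swapped.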

#### Partially mapped crossover
The partially mapped/matched crossover randomly selects two cross points. At the end,
the offspring should contain the first parent's elements between the two cross points.

The way to ensure that, while still keeping the result a valid permutation, is to
always swap elements instead of overwriting them.

The offspring is created as follows:
1. Clone the second parent
2. Then, for every index i between the cross points:
3. Take the element at index i from the first parent
4. Find that element in the offspring, let's call its index j
5. Swap the elements at indices i and j
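
The swapping loop can be sketched as follows (a hedged sketch; in the actual implementation the cross points are drawn randomly, here they are passed in for clarity):

```rust
type Perm = Vec<usize>;

fn pmx(p1: &Perm, p2: &Perm, lo: usize, hi: usize) -> Perm {
    // step 1: clone the second parent
    let mut offspring = p2.clone();
    // position of every element in the offspring, updated while swapping
    let mut pos = vec![0usize; offspring.len()];
    for (i, &e) in offspring.iter().enumerate() {
        pos[e] = i;
    }
    // steps 2-5: swap so that offspring[i] == p1[i] between the cross points
    for i in lo..=hi {
        let j = pos[p1[i]];
        offspring.swap(i, j);
        pos[offspring[i]] = i;
        pos[offspring[j]] = j;
    }
    offspring
}
```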

#### Edge recombination crossover
Edge recombination is the most complicated of the three crossover operators.

First, an adjacency list is created for both parents and the two lists are merged,
so that every city knows its neighbors in either parent tour. The offspring tour is
then built city by city: the current city is removed from all adjacency lists, and
the next city is chosen as the neighbor whose own adjacency list is the shortest
(falling back to an unvisited city when the current city has no remaining neighbors).
Preferring cities with few remaining edges helps avoid cutting cities off from
their parent edges.
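
A hedged sketch of the standard edge recombination procedure (the actual implementation may differ in tie-breaking and in the fallback choice, which is usually random):

```rust
use std::collections::BTreeSet;

type Perm = Vec<usize>;

fn edge_recombination(p1: &Perm, p2: &Perm) -> Perm {
    let n = p1.len();
    // merged adjacency lists: neighbors of each city in either parent tour
    let mut adj: Vec<BTreeSet<usize>> = vec![BTreeSet::new(); n];
    for p in [p1, p2] {
        for i in 0..n {
            let (a, b) = (p[i], p[(i + 1) % n]);
            adj[a].insert(b);
            adj[b].insert(a);
        }
    }
    let mut visited = vec![false; n];
    let mut tour = Vec::with_capacity(n);
    let mut current = p1[0];
    loop {
        tour.push(current);
        visited[current] = true;
        // remove the current city from every adjacency list
        for s in adj.iter_mut() {
            s.remove(&current);
        }
        if tour.len() == n {
            break;
        }
        // prefer the neighbor with the fewest remaining neighbors
        current = adj[current]
            .iter()
            .copied()
            .min_by_key(|&c| adj[c].len())
            // fallback: any unvisited city (chosen randomly in practice)
            .unwrap_or_else(|| (0..n).find(|&c| !visited[c]).unwrap());
    }
    tour
}
```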

## Binary string
Classical perturbations and crossovers have been
implemented for the binary string representation, specifically:

- Perturbations
    - Flip each bit with probability p
    - Flip single random bit
    - Flip of N bits (N is chosen beforehand, not randomly)
    - Flip of whole string

- Crossover
    - N point crossover

As for initialization, a random initializer has been implemented.

The fitness function is implemented in the form of a wrapper that converts
the BinaryString into a NodePermutation; the same fitness function as for
the node permutation representation is then used.
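
One plausible way such a conversion can work (an illustrative sketch only, assuming the bits are stored row by row for the pairs i < j; the actual wrapper may order the bits differently): give each city a precedence count and sort the cities by it, which yields a valid permutation even for inconsistent bit strings.

```rust
// bits[k] == true means "city i comes before city j" for the k-th
// pair (i, j) with i < j, enumerated row by row (an assumption)
fn decode(bits: &[bool], n: usize) -> Vec<usize> {
    // count, for every city, how many other cities come before it
    let mut before = vec![0usize; n];
    let mut k = 0;
    for i in 0..n {
        for j in (i + 1)..n {
            if bits[k] {
                before[j] += 1; // i precedes j
            } else {
                before[i] += 1; // j precedes i
            }
            k += 1;
        }
    }
    // sort cities by their count; the stable sort breaks ties by
    // city index, so the result is always a valid permutation
    let mut perm: Vec<usize> = (0..n).collect();
    perm.sort_by_key(|&c| before[c]);
    perm
}
```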

### N-point crossover
The N-point crossover works on two parents and is
capable of producing two offspring.
The crossover first chooses N cross points randomly.

Then the cross points are sorted in ascending order;
bits are taken from the first parent until a
cross point is encountered, then from the
second parent until the next cross point is reached, and
so on until the end of the string.

Also, one-point and two-point crossovers
have been implemented separately, as they allow a more efficient
implementation than the generic N-point.
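
The scheme can be sketched as follows (a hedged sketch; the cross points are assumed sorted and distinct here, while in practice they are drawn randomly and sorted first):

```rust
fn n_point_crossover(p1: &[bool], p2: &[bool], cross_points: &[usize]) -> Vec<bool> {
    let mut offspring = Vec::with_capacity(p1.len());
    let mut from_first = true;
    for i in 0..p1.len() {
        // switch the source parent whenever a cross point is reached
        if cross_points.contains(&i) {
            from_first = !from_first;
        }
        offspring.push(if from_first { p1[i] } else { p2[i] });
    }
    offspring
}
```

The second offspring is obtained by passing the parents in the opposite order.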

# Prepared algorithms

## Evolution algorithm
The evolutionary algorithm keeps a whole population: each generation, parents
are selected, crossover and perturbation (mutation) are applied, and a
replacement strategy chooses the next generation.

## Local search
Local search keeps a single candidate solution and repeatedly perturbs it,
accepting the perturbed candidate when it improves the fitness.

## Random search
Random search generates new random elements of the search space each
iteration and keeps a new element if it is better than the best found so far.
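
The loop can be sketched generically (an illustrative sketch, not the library's actual API):

```rust
// `sample` produces a random element, `fitness` is minimized
// (e.g. tour length)
fn random_search<T, G, F>(mut sample: G, fitness: F, iterations: usize) -> (T, f64)
where
    G: FnMut() -> T,
    F: Fn(&T) -> f64,
{
    // start from one random element
    let mut best = sample();
    let mut best_fitness = fitness(&best);
    for _ in 1..iterations {
        let candidate = sample();
        let candidate_fitness = fitness(&candidate);
        // keep the new element only if it beats the best found so far
        if candidate_fitness < best_fitness {
            best = candidate;
            best_fitness = candidate_fitness;
        }
    }
    (best, best_fitness)
}
```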

# Heuristics
Instead of starting with random solutions, two heuristics have been tried to
make the initial populations for the evolutionary algorithms.

## Nearest neighbors
The heuristic starts at a given node and finds it's nearest neighbor. Then
adds that neighbor to the permutation and moves to it. Then it repeats search
for the nearest neighbor, making sure to not select nodes twice. The whole
chromosome is built like this.

Moreover, the possibility to select the second-nearest neighbor instead of the
first has been incorporated into the heuristic as well; the probability of
choosing the second neighbor is configurable.

This makes it possible to generate many different initial solutions: the
heuristic can be started from each city, and the probability of choosing the
second neighbor can be tweaked.

To initialize the whole population:
1. Generate chromosome starting from each city, choosing the first neighbor.
2. Generate chromosome starting from each city, but always choosing the second neighbor.
3. The rest of the population is initialized from randomly selected cities with randomly selected probability of choosing second neighbor.
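
A hedged sketch of the basic construction (omitting the second-neighbor probability for brevity; names are illustrative):

```rust
// `dist` is a full distance matrix; the tour starts at `start`
fn nearest_neighbor_tour(dist: &[Vec<f64>], start: usize) -> Vec<usize> {
    let n = dist.len();
    let mut visited = vec![false; n];
    visited[start] = true;
    let mut tour = vec![start];
    let mut current = start;
    for _ in 1..n {
        // pick the closest city that has not been visited yet
        let next = (0..n)
            .filter(|&c| !visited[c])
            .min_by(|&a, &b| dist[current][a].partial_cmp(&dist[current][b]).unwrap())
            .unwrap();
        visited[next] = true;
        tour.push(next);
        current = next;
    }
    tour
}
```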

## Minimum spanning tree
The heuristic builds a minimum spanning tree over the cities and derives a
tour by visiting the cities in the order of a depth-first (preorder)
traversal of the tree.
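
A hedged sketch of one common way to derive a tour from an MST, using Prim's algorithm and a preorder walk (illustrative; the actual implementation may differ):

```rust
fn mst_tour(dist: &[Vec<f64>], root: usize) -> Vec<usize> {
    let n = dist.len();
    // Prim: grow the tree from `root`, tracking each city's cheapest link
    let mut in_tree = vec![false; n];
    let mut cost = vec![f64::INFINITY; n];
    let mut parent = vec![usize::MAX; n];
    cost[root] = 0.0;
    let mut children = vec![Vec::new(); n];
    for _ in 0..n {
        let u = (0..n)
            .filter(|&c| !in_tree[c])
            .min_by(|&a, &b| cost[a].partial_cmp(&cost[b]).unwrap())
            .unwrap();
        in_tree[u] = true;
        if parent[u] != usize::MAX {
            children[parent[u]].push(u);
        }
        for v in 0..n {
            if !in_tree[v] && dist[u][v] < cost[v] {
                cost[v] = dist[u][v];
                parent[v] = u;
            }
        }
    }
    // a preorder walk of the tree gives the tour
    let mut tour = Vec::with_capacity(n);
    let mut stack = vec![root];
    while let Some(u) = stack.pop() {
        tour.push(u);
        for &c in children[u].iter().rev() {
            stack.push(c);
        }
    }
    tour
}
```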

# Results
To compare all the algorithms on the various instances, at least 10 runs of each
algorithm have been made on every instance. All the P_s (TODO) graphs were then
constructed by averaging over the runs. The fitness charts sometimes show fewer
instances so as not to be overwhelming.

## Comparing perturbations on LS


## Comparing algorithms
To compare the algorithms,
it has first been ensured that each algorithm was tweaked to produce the best results (the best
that the author has been capable of). Then they were run on 10 instances of TSP and averaged in the following chart:

## Comparing crossovers
During evaluation of the various crossovers, it has become apparent that with the currently
chosen

## Comparing heuristics

# Things done above minimal requirements

- 1 point - Compared 2 representations (permutation and binary string with precedences)
- 1 point - Compared 4 LS perturbations (swap, reverse subsequence, move city, combination)
- 1 point - Compared 3 crossover operators (edge recombination, partially mapped, cycle)
- 1 point - Compared all algorithms on 10 instances
- 1 point - Initialized with two constructive heuristics (nearest neighbors, solutions from minimal spanning tree)

# Code structure
Rust has been chosen as the language. There are three subdirectories, `eoa_lib`, `tsp_hw01` and `tsp_plotter`.

`eoa_lib` is the library with the generic operators defined, with random search, local search and evolution
algorithm functions. It also contains the most common representations, perturbations and crossovers for them.

`tsp_hw01` contains the TSP implementation and runs all of the algorithms. It then produces CSV results
with the best candidate evaluations for a given fitness function evaluation count. This is then utilized by
`tsp_plotter`. The node permutation TSP representation itself is implemented in `tsp.rs`. The configurations
of all algorithms used are located in `main.rs`. All the instances used are located in `tsp_hw01/instances` and
the solutions are written to `tsp_hw01/solutions`.

`tsp_plotter` contains hard-coded presets for charts to create for the report.

# Usage of LLM

While I was working on this homework, I have used an LLM for certain tasks,
specifically the tasks that I do not like doing much myself:
- Loading data,
- Plotting graphs,
- Refactoring in a lot of places (i.e. at first I chose to make perturbation copy the chromosome,
  but afterwards I realized this is very inefficient for EAs and changed it to mutate in place),
- Writing some of the tests

As I am not proficient with Rust, I sometimes asked the LLM to help with the syntax as well,
to find a library that would help solve a task, or to find out what functions are available
for a specific task.

I have used the LLM only minimally for implementing the algorithms or for deciding
how to make the implementation, mainly in the form of checking whether an algorithm
looks correct (and it did find a few issues, which I then sometimes let it fix and
sometimes fixed myself). This is because I believe that I can learn the most by
writing the implementations myself.

I used Claude from within claude-code, a CLI tool that is capable
of reading files, executing commands and giving the model live feedback,
such as the outputs of the commands it runs. This means the LLM is able to
iterate by itself, without my intervention, for example when fixing errors.