The Metric Travelling Salesman Problem: The Experiment on Pareto-optimal Algorithms

The Metric Travelling Salesman Problem is a subcase of the Travelling Salesman Problem (TSP) in which the triangle inequality holds. It is a key problem in combinatorial optimization. Solutions of the Metric TSP are generally used for cost-minimization tasks in logistics, manufacturing, genetics and other fields. Since this problem is NP-hard, heuristic algorithms providing near-optimal solutions in polynomial time are considered instead of exact ones. The aim of this article is to experimentally find Pareto-optimal heuristics for Metric TSP under the criteria of error rate and run-time efficiency. Two kinds of real-life inputs are compared: VLSI Data Sets, based on very large scale integration schemes, and National TSPs, which use geographic coordinates of cities. This paper provides an overview and prior estimates of seventeen heuristic algorithms implemented in C++ and tested on both data sets. The details of the research methodology are provided, and the computational scenario is presented. In the course of computational experiments, comparative figures are obtained, and multi-objective optimization is performed on their basis. Overall, the group of Pareto-optimal algorithms for different N consists of some of the MC, SC, NN, DENN, CI, GRD, CI + 2-Opt, GRD + 2-Opt, CHR and LKH heuristics.


Introduction
The Travelling Salesman Problem (TSP) is one of the most widely known questions in the class of combinatorial optimization problems. Essentially, to meet the challenge of the TSP is to find a Hamiltonian circuit of minimal length. A subcase of the TSP is Metric TSP, where all of the edge costs are symmetric and satisfy the triangle inequality. Methods for solving the TSP have been developed for many years, and since the problem is NP-hard, it continues to be topical. The TSP has seen applications in the areas of logistics, genetics, manufacturing, telecommunications and neuroscience [1]. The most common practical interpretation of the TSP relates to the movement of people and vehicles around tours, such as searching for the shortest tour through a set of cities, school bus route planning, and postal delivery. In addition, the TSP plays an important role in very large-scale integration (VLSI). The purpose of this study is to determine the group of Pareto-optimal algorithms, among a set of selected ones, for Metric TSP by the criteria of running time and qualitative performance.
Clearly, a study of this type is inevitably restricted by various constraints; in this research, only heuristic algorithms constructing near-optimal solutions in polynomial time are considered instead of exact ones. The paper is structured as follows. First, the theoretical basis is described: the definition of the resource-efficiency parameters, Pareto optimization and, finally, the formulation of the aim of the project. Then the methods to be used are described together with their prior estimates. After that, the details of the research methodology and the expected results are presented.

Theoretical basis
In this paper, the mathematical formulation of Metric TSP is adopted as stated in [2].

Parameters for Pareto-optimality
Let A be the set of selected heuristic algorithms for Metric TSP. There are two resource-efficiency parameters for each a ∈ A and each number of vertices N in a data set: f_q(a, N), the qualitative performance, and f_t(a, N), the running time. Qualitative performance is calculated as f_q(a, N) = (L(T_a) − L(T_opt)) / L(T_opt) · 100%, where L(T_a) is the obtained tour length and L(T_opt) is the optimal tour length. The values of optimal tour lengths are taken from the open libraries VLSI Data Sets and National TSPs as the lengths of the best found (exact) or reported solutions for each of the instances [3] [4].

The aim of the study
The aim is to find the set P = {a ∈ A : (∀a′ ∈ A) (a′ ≠ a) ⇒ f_q(a′) > f_q(a) ∨ f_t(a′) > f_t(a)} of Pareto-optimal algorithms for Metric TSP by the criteria of running time and qualitative performance.
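The dominance test behind this definition can be sketched in C++ as follows. This is an illustrative sketch, not the experiment's own code: the `Result` struct, its field names and the `paretoFront` function are hypothetical.

```cpp
#include <string>
#include <vector>

// One measured algorithm: name, running time f_t and relative error f_q.
struct Result {
    std::string name;
    double time;   // f_t(a, N), seconds
    double error;  // f_q(a, N), percent above the optimum
};

// a dominates b if it is no worse on both criteria and better on at least one.
bool dominates(const Result& a, const Result& b) {
    return a.time <= b.time && a.error <= b.error &&
           (a.time < b.time || a.error < b.error);
}

// Keep exactly those results that no other result dominates.
std::vector<Result> paretoFront(const std::vector<Result>& all) {
    std::vector<Result> front;
    for (const Result& r : all) {
        bool dominated = false;
        for (const Result& other : all)
            if (dominates(other, r)) { dominated = true; break; }
        if (!dominated) front.push_back(r);
    }
    return front;
}
```

Applied to the measured (f_t, f_q) pairs of all seventeen algorithms for a fixed N, this yields exactly the Pareto-optimal group plotted in the charts.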

Algorithms
Algorithms for solving the TSP may be divided into two classes: exact algorithms and heuristic (or approximate) algorithms. Exact algorithms are aimed at finding optimal solutions. However, their major drawback is time efficiency: no known exact algorithm runs in polynomial time, so only small datasets can be solved in reasonable time. For example, the 4410-vertex problem is believed to be the largest Metric TSP ever solved to optimality [3]. In this paper, some algorithms from the class of heuristic search algorithms are taken into account. They are designed to run quickly and to obtain an approximate solution to a given problem. Heuristic algorithms are subdivided into two groups. The first group comprises tour construction algorithms, which have one feature in common: the tour is built by adding a new vertex at each step. The second group consists of tour-improving algorithms that, according to Applegate, '…take as input an approximate solution to a problem and attempt to iteratively improve it by moving to new solutions that are close to the original'. A full classification of heuristic algorithms has already been presented in [2]. In order to restrict the investigation, it was decided to choose only three types of tour-improving algorithms: the simplest local-optimal method (2-Opt), the most promising one (LKH) and one of the best swarm intelligence methods, the qCABC algorithm based on bee colony agents. The list of algorithms used for Metric TSP is as follows.

Nearest Neighbour (NN)
The key to NN is to initially choose a random vertex and then repeatedly add the vertex nearest to the last appended one, until all vertices are used [5].
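A minimal sketch of NN in C++. The `Point` struct and function names are illustrative; this is not the authors' published implementation.

```cpp
#include <cmath>
#include <vector>

struct Point { double x, y; };

double dist(const Point& a, const Point& b) {
    return std::hypot(a.x - b.x, a.y - b.y);
}

// Nearest Neighbour: start from `start`, then repeatedly move to the
// closest vertex that is not yet in the tour.
std::vector<int> nearestNeighbour(const std::vector<Point>& pts, int start) {
    const int n = (int)pts.size();
    std::vector<bool> used(n, false);
    std::vector<int> tour = {start};
    used[start] = true;
    for (int step = 1; step < n; ++step) {
        int last = tour.back(), best = -1;
        double bestD = 0;
        for (int v = 0; v < n; ++v) {
            if (used[v]) continue;
            double d = dist(pts[last], pts[v]);
            if (best < 0 || d < bestD) { best = v; bestD = d; }
        }
        used[best] = true;
        tour.push_back(best);
    }
    return tour;
}
```

The inner scan makes the method O(N²) overall, which is consistent with its position among the fast tour-construction heuristics.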
Avdoshin S.M., Beresneva E.N. The Metric Travelling Salesman Problem: The Experiment on Pareto-optimal Algorithms. Trudy ISP RAN/Proc. ISP RAS, vol. 29, issue 4, 2017, pp. 123-138.

Double Ended Nearest Neighbour (DENN)
This algorithm is a modification of NN. Unlike NN, not only the last appended vertex is taken into consideration: the vertex closest to either endpoint of the current path is added [6].

Greedy (GRD)
The Greedy heuristic constructs a tour by repeatedly adding the shortest remaining edge, rejecting any edge that would create a cycle with k edges, k < N, or raise the degree of any vertex above two [7].
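A possible C++ sketch of this edge-selection rule, using a union-find structure to detect premature cycles. All identifiers here are illustrative, not taken from the experiment's code.

```cpp
#include <algorithm>
#include <cmath>
#include <numeric>
#include <utility>
#include <vector>

struct Edge { int u, v; double w; };

// Greedy edge selection: sort all candidate edges by length and accept an
// edge unless it would give some vertex degree three or close a cycle on
// fewer than n vertices. The n accepted edges form a Hamiltonian cycle.
std::vector<Edge> greedyTour(const std::vector<std::pair<double, double>>& pts) {
    const int n = (int)pts.size();
    std::vector<Edge> edges;
    for (int i = 0; i < n; ++i)
        for (int j = i + 1; j < n; ++j)
            edges.push_back({i, j, std::hypot(pts[i].first - pts[j].first,
                                              pts[i].second - pts[j].second)});
    std::sort(edges.begin(), edges.end(),
              [](const Edge& a, const Edge& b) { return a.w < b.w; });

    std::vector<int> deg(n, 0), parent(n);
    std::iota(parent.begin(), parent.end(), 0);
    auto find = [&](int x) {           // union-find with path halving
        while (parent[x] != x) x = parent[x] = parent[parent[x]];
        return x;
    };

    std::vector<Edge> tour;
    for (const Edge& e : edges) {
        if (deg[e.u] == 2 || deg[e.v] == 2) continue;          // degree bound
        int ru = find(e.u), rv = find(e.v);
        if (ru == rv && (int)tour.size() != n - 1) continue;   // short cycle
        parent[ru] = rv;
        ++deg[e.u]; ++deg[e.v];
        tour.push_back(e);
        if ((int)tour.size() == n) break;
    }
    return tour;
}
```

Materializing all N(N−1)/2 candidate edges, as this sketch does, is also what produces the O(N²) memory footprint noted for GRD in the results.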

Nearest Addition (NA)
The fundamental idea of NA is to start with an initial subtour made of the shortest edge and then repeatedly add the vertex closest to the vertices already in the cycle. It should be noted that the insertion place is not specially calculated: the new vertex is always added after its nearest vertex in the cycle. The algorithm terminates when all vertices have been inserted into the tour.
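A compact sketch of NA under exactly these rules (shortest starting edge, splice after the nearest cycle vertex); the identifiers are illustrative.

```cpp
#include <cmath>
#include <vector>

struct Pt { double x, y; };

double d(const Pt& a, const Pt& b) { return std::hypot(a.x - b.x, a.y - b.y); }

// Nearest Addition: grow a cycle from the shortest edge; at every step take
// the outside vertex closest to some cycle vertex and splice it in right
// after that vertex (no search for the cheapest insertion place).
std::vector<int> nearestAddition(const std::vector<Pt>& p) {
    const int n = (int)p.size();
    int si = 0, sj = 1;                       // shortest edge (si, sj)
    for (int i = 0; i < n; ++i)
        for (int j = i + 1; j < n; ++j)
            if (d(p[i], p[j]) < d(p[si], p[sj])) { si = i; sj = j; }
    std::vector<int> cycle = {si, sj};
    std::vector<bool> in(n, false);
    in[si] = in[sj] = true;
    while ((int)cycle.size() < n) {
        int bu = -1, bv = -1;
        double best = 1e300;
        for (int ui = 0; ui < (int)cycle.size(); ++ui)
            for (int v = 0; v < n; ++v)
                if (!in[v] && d(p[cycle[ui]], p[v]) < best) {
                    best = d(p[cycle[ui]], p[v]);
                    bu = ui; bv = v;
                }
        cycle.insert(cycle.begin() + bu + 1, bv);  // right after nearest vertex
        in[bv] = true;
    }
    return cycle;
}
```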

Nearest Insertion (NI), Cheapest Insertion (CI), Farthest Insertion (FI), Arbitrary Insertion (AI), Nearest Segment Insertion (NSI)
The start step of these algorithms is similar to NA (except for FI, where the longest edge is used). Next, the other vertices are added repeatedly according to various rules. Depending on the algorithm, the vertex not yet in the cycle is selected so that:
- in NI, it is the closest to any node in the tour;
- in CI, its addition to the tour gives the smallest increase in tour length;
- in FI, it is the farthest from any node in the cycle;
- in AI, it is a random vertex not yet in the cycle;
- in NSI, the distance between the node and any edge in the tour is minimal.
This step is repeated until all vertices are added to the cycle. A feature of these methods is an additional computation that selects the best insertion place for each inserted node [6] [8].
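As one representative of this family, Cheapest Insertion can be sketched as follows; the inner loop over tour edges is exactly the "best insertion place" computation mentioned above. Identifiers are illustrative.

```cpp
#include <cmath>
#include <vector>

struct P2 { double x, y; };

double dst(const P2& a, const P2& b) { return std::hypot(a.x - b.x, a.y - b.y); }

// Cheapest Insertion: start from the shortest edge; at each step choose the
// outside vertex v and tour edge (a, b) minimising the length increase
// dst(a, v) + dst(v, b) - dst(a, b), and insert v between a and b.
std::vector<int> cheapestInsertion(const std::vector<P2>& p) {
    const int n = (int)p.size();
    int si = 0, sj = 1;
    for (int i = 0; i < n; ++i)
        for (int j = i + 1; j < n; ++j)
            if (dst(p[i], p[j]) < dst(p[si], p[sj])) { si = i; sj = j; }
    std::vector<int> tour = {si, sj};
    std::vector<bool> in(n, false);
    in[si] = in[sj] = true;
    while ((int)tour.size() < n) {
        int bestV = -1, bestPos = -1;
        double bestInc = 1e300;
        for (int v = 0; v < n; ++v) {
            if (in[v]) continue;
            for (int i = 0; i < (int)tour.size(); ++i) {
                int a = tour[i], b = tour[(i + 1) % tour.size()];
                double inc = dst(p[a], p[v]) + dst(p[v], p[b]) - dst(p[a], p[b]);
                if (inc < bestInc) { bestInc = inc; bestV = v; bestPos = i; }
            }
        }
        tour.insert(tour.begin() + bestPos + 1, bestV);
        in[bestV] = true;
    }
    return tour;
}
```

The other four variants differ only in the selection rule for the next vertex; the insertion-place search stays the same.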

Double Minimum Spanning Tree (DMST)
The DMST method is based on the construction of a minimum spanning tree (MST) on the set of all vertices. After the MST is built, its edges are doubled in order to obtain an Eulerian cycle containing each vertex at least once. Finally, a Hamiltonian circuit is obtained from the Eulerian circuit by sequentially (greedily) removing repeated occurrences of each node [9].
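A sketch of DMST. Walking the Euler tour of the doubled tree while skipping already-visited vertices is equivalent to a DFS preorder of the MST, which the sketch exploits directly; identifiers are illustrative.

```cpp
#include <cmath>
#include <vector>

struct V2 { double x, y; };

double w(const V2& a, const V2& b) { return std::hypot(a.x - b.x, a.y - b.y); }

// Double-MST: build an MST (Prim), conceptually double its edges to get an
// Eulerian multigraph, then shortcut repeated vertices. The shortcut walk
// equals a DFS preorder of the MST, giving a 2-approximate tour.
std::vector<int> doubleMstTour(const std::vector<V2>& p) {
    const int n = (int)p.size();
    std::vector<int> parent(n, -1);
    std::vector<double> key(n, 1e300);
    std::vector<bool> inMst(n, false);
    key[0] = 0;
    for (int it = 0; it < n; ++it) {              // Prim's MST, O(n^2)
        int u = -1;
        for (int v = 0; v < n; ++v)
            if (!inMst[v] && (u < 0 || key[v] < key[u])) u = v;
        inMst[u] = true;
        for (int v = 0; v < n; ++v)
            if (!inMst[v] && w(p[u], p[v]) < key[v]) {
                key[v] = w(p[u], p[v]);
                parent[v] = u;
            }
    }
    std::vector<std::vector<int>> adj(n);
    for (int v = 1; v < n; ++v) adj[parent[v]].push_back(v);
    std::vector<int> tour, stack = {0};           // preorder DFS from the root
    while (!stack.empty()) {
        int u = stack.back();
        stack.pop_back();
        tour.push_back(u);
        for (int i = (int)adj[u].size() - 1; i >= 0; --i) stack.push_back(adj[u][i]);
    }
    return tour;
}
```

By the triangle inequality, each shortcut is no longer than the tree path it replaces, which is where the factor-2 guarantee comes from.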

Double Minimum Spanning Tree Modified (DMST-M)
This algorithm is a modification of DMST. Unlike DMST, duplicate nodes are removed from the Eulerian cycle using the triangle inequality instead of the greedy method.

Christofides (CHR)
This method is a modification of DMST that was proposed by Christofides [10]. The difference between CHR and DMST is the addition of a minimum-weight matching computation on the odd-degree vertices of the MST.

Moore Curve (MC)
This is a recursive geometric method: vertices are visited in the order in which the space-filling curve passes their locations on the plane. Only the two-dimensional variant of the Moore curve is implemented [11]. Figure 1 shows the order of the cells after one, two and three subdivision steps respectively [11].
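The Moore curve is a closed variant of the Hilbert curve. As an illustrative sketch of space-filling-curve ordering (a standard textbook routine, not the paper's code), the following computes the classic Hilbert index of a grid cell; sorting vertices by this index yields a space-filling-curve tour.

```cpp
// Hilbert index d of grid cell (x, y) on an n x n grid, n a power of two.
// Vertices sorted by this index are visited in space-filling-curve order.
int hilbertIndex(int n, int x, int y) {
    int d = 0;
    for (int s = n / 2; s > 0; s /= 2) {
        int rx = (x & s) > 0 ? 1 : 0;
        int ry = (y & s) > 0 ? 1 : 0;
        d += s * s * ((3 * rx) ^ ry);
        if (ry == 0) {                     // rotate the quadrant
            if (rx == 1) { x = s - 1 - x; y = s - 1 - y; }
            int t = x; x = y; y = t;
        }
    }
    return d;
}
```

Computing one index is O(log n), so ordering all vertices costs O(N log N), which explains why MC and SC sit at the fast end of the Pareto front.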

Sierpinski Curve (SC)
This algorithm, like MC, belongs to the family of space-filling-curve combinatorial algorithms. SC is more symmetric than MC [12]. Figure 2 shows the order of the cells after one, two and three subdivision steps respectively.
Fig. 2. The order for the Sierpinski curve after 1, 2 and 3 subdivision steps

2-Opt
The main idea behind 2-Opt is to take a tour that has one or more self-intersections and to remove them repeatedly: two edges of the tour are replaced by two new edges reconnecting the same four vertices whenever this shortens the tour.

Lin-Kernighan-Helsgaun (LKH)
LKH takes the principle of the 2-Opt algorithm and generalizes it. In this heuristic, k-Opt with k = 2..√N is applied, so switches of two or more edges are made in order to improve the tour. The method is adaptive: the decision about how many edges should be replaced is taken at each step [14]. It should be noted that, because of the complexity of the LKH algorithm, it was not implemented by the authors of this research. The original open source code [15] was used to carry out the experiments. All parameters were left at their default values.
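The basic 2-Opt improvement step can be sketched as follows; the identifiers are illustrative and the paper's implementation may differ in details such as move order.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

struct XY { double x, y; };

double len(const XY& a, const XY& b) { return std::hypot(a.x - b.x, a.y - b.y); }

// 2-Opt: while an improving move exists, replace edges (t[i], t[i+1]) and
// (t[j], t[j+1]) by (t[i], t[j]) and (t[i+1], t[j+1]), i.e. reverse the
// segment between them. In the Euclidean plane this removes crossings.
void twoOpt(const std::vector<XY>& p, std::vector<int>& t) {
    const int n = (int)t.size();
    bool improved = true;
    while (improved) {
        improved = false;
        for (int i = 0; i < n - 1; ++i)
            for (int j = i + 2; j < n; ++j) {
                int a = t[i], b = t[i + 1], c = t[j], e = t[(j + 1) % n];
                if (a == e) continue;  // the two edges share a vertex
                double delta = len(p[a], p[c]) + len(p[b], p[e])
                             - len(p[a], p[b]) - len(p[c], p[e]);
                if (delta < -1e-12) {
                    std::reverse(t.begin() + i + 1, t.begin() + j + 1);
                    improved = true;
                }
            }
    }
}
```

Each pass is O(N²), and the loop repeats until the tour is 2-optimal, which is why 2-Opt is applied here only on top of fast construction heuristics.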

Quick Combinatorial Artificial Bee Colony (qCABC)
This is one of the swarm intelligence methods, based on a colony of bees. The algorithm divides all agents into three groups: scout bees (looking for new random solutions), employed bees (keeping information and sharing it) and onlooker bees (choosing a solution to explore) [16].

Estimates
Estimated upper bounds for the algorithms are expressed as the ratio L(T_a)/L(T_opt) of the obtained tour length to the optimal one.
According to [1], for any k-Opt algorithm with k ≤ N/4, problems may be constructed such that the error is almost 100%, so the 2-Opt and LKH algorithms have an approximate upper bound of 2. Upper-bound estimates and running times of the algorithms are presented in Table 1. A limit on an algorithm's running time is introduced: 11 800 seconds (≈ 3 hours and 17 minutes) at the maximum. This means the computational time for one experiment cannot exceed 11 800 seconds × 11 runs ≈ 36 hours ≈ 1.5 days.

Results
Experimental results showed that the qCABC algorithm takes a large amount of time (more than 'the slowest' CHR) and improves accuracy even less than 'the roughest' 2-Opt. So qCABC as a tour-improving algorithm is judged to be "unviable". We selected 10 pairs of data sets from VLSI and National TSPs with similar numbers of vertices (see Table 3) to plot charts that illustrate the Pareto fronts. The charts for the pair with N = 22 775 and N = 22 777 are shown below (see Fig. 5, Fig. 6, Fig. 7, Fig. 8). The name of each TSPLIB instance is shown in the chart title. The horizontal axis represents the time performance of the methods in seconds. The vertical axis shows the gap between the optimal and obtained solutions, expressed in percent.
Pareto-optimal methods are highlighted in red, and the points representing Pareto-optimal solutions are drawn larger than non-Pareto-optimal ones. In two charts (see Fig. 6, Fig. 8) not all algorithms are compared: these auxiliary charts are enlarged copies of their originals, and their role is to illustrate the Pareto-optimal algorithms at a larger scale.
Results on the VLSI data sets only are reported in more detail in [2]. The Pareto-optimal solutions suggested on the basis of both data sets are shown in Table 4, sorted in order of increasing running time:
- Moore Curve (MC);
- Sierpinski Curve (SC): this algorithm depends on the type of input data, so its qualitative performance estimates are unstable;
- Nearest Neighbour (NN);
- Double Ended Nearest Neighbour (DENN);
- Cheapest Insertion (CI): Pareto-optimal if N ≲ 400 000 because of the introduced time limit; if N ≲ 3 500, CI's behaviour fluctuates;
- Greedy (GRD): Pareto-optimal if N ≲ 30 000 because of memory limits, since O(N²) candidate edges (vertex pairs) need to be kept simultaneously;
- Cheapest Insertion and 2-Opt (CI + 2-Opt): Pareto-optimal if 30 000 ≲ N ≲ 100 000;
- Greedy and 2-Opt (GRD + 2-Opt): Pareto-optimal if N ≲ 800;
- Christofides (CHR): Pareto-optimal if N ≲ 2 000;
- Helsgaun's Lin and Kernighan Heuristic (LKH): this algorithm works excellently if N ≲ 55 000; if the input size exceeds 55 000, the time limit is met.
The "+" sign means that the algorithm in the same row is supposed to be Pareto-optimal over the range of vertices defined in the same column. The "±" sign shows that the experiments did not clearly determine whether it is Pareto-optimal or not.

Conclusion
The presented study is undertaken to determine which heuristics for Metric TSP should be used in specific circumstances with limited resources. This paper provides an overview of seventeen heuristic algorithms implemented in C++ and tested on both the VLSI data sets and the National TSP instances. In the course of the computational experiments, comparative figures are obtained, and on their basis multi-objective optimization is performed. Overall, the group of Pareto-optimal algorithms for different N consists of some of the MC, SC, NN, DENN, CI, GRD, CI + 2-Opt, GRD + 2-Opt, CHR and LKH heuristics.
In our future work, we are going to fine-tune the parameters of the LKH method using genetic search-optimization algorithms. Further, it is possible to increase the number of heuristic algorithms, to move to other types of test data and to conduct experiments using different metrics in order to check that the Pareto-optimal group is stable.
The practical value of our findings is a set of Pareto-optimal algorithms that yield solutions with maximum accuracy under the given resource limitations. The results can be used for scientific purposes by other researchers and for cost-minimization tasks.

Fig. 3. Computational scenario

The metrics used in the scenario have the following meanings:
- f_q(a, DT): qualitative performance of a (one iteration);
- f_t(a, DT): running time of a (one iteration);
- best qualitative performance of a;
- average running time of a (sec);
- standard deviation of the running-time estimates over 10 iterative runs;
- expected value of the qualitative performance of a for one DT;
- standard deviation of the qualitative performance of a for one DT;
- maximum and minimum values of the qualitative performance of a for one DT.
Qualitative performance metrics are represented in Table 2. Table color scheme varies from green (the best result in a column) to red (the worst value in a column).

Table 1. Upper-bound estimates and running time of algorithms

This section documents the details of the research methodology. The experiment is carried out on a 1.3 GHz Intel Core i5 MacBook Air. It covers the qualitative performance and the run-time efficiency of the current implementations. The heuristics are implemented in C++. Two types of data sets from the open library TSPLIB are selected. The first is the VLSI data sets [3]: there are 102 instances in the VLSI collection, ranging in size from 131 up to 744 710 vertices, and all of them are tested. The second is the National TSPs, which include 25 instances that vary from 29 to 71 009 points [4]. There is one data set for each number of vertices across all input data. The integer Euclidean metric is used, so the coordinates of nodes and the distances between them have integer values. The distance d between nodes v and w is calculated as d(v, w) = round(sqrt((x_v − x_w)² + (y_v − y_w)²)).
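The integer Euclidean distance described here matches the TSPLIB EUC_2D convention of rounding the real distance to the nearest integer; a minimal sketch:

```cpp
#include <cmath>

// TSPLIB-style integer Euclidean (EUC_2D) distance: the real Euclidean
// distance between two points, rounded to the nearest integer.
int euc2d(double x1, double y1, double x2, double y2) {
    double dx = x1 - x2, dy = y1 - y2;
    return (int)(std::sqrt(dx * dx + dy * dy) + 0.5);
}
```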

Table 2. Qualitative performance of algorithms. Table color scheme varies from green (the best result in a column) to red (the worst value in a column).

Table 2 .
Running time of algorithms

Table 3 .
Pairs of input datasets from VLSI and National TSPs