I'm interested in finding similarities between two lists. I have the count of duplicates in the first column, and the pattern is in the second column. What would be the most logical way to compare these two lists so I don't need to do it manually?
List1 is:
11 | 55
4 | 31
4 | 1
3 | 22
2 | 13
1 | 81
List2 is:
7 | 31
6 | 22
6 | 13
4 | 88
3 | 14
1 | 55
First, the data could be stored better: with plain lists, searching for a pattern is O(n). Each list can instead be stored as a bag (a multiset, i.e. a set that allows duplicates) of patterns, or as a dictionary whose keys are the patterns and whose values are the counts. These are usually implemented with a binary tree or a hash table, giving O(log n) or (on average) O(1) lookups.
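As a minimal sketch of that storage, assuming Python and its `collections.Counter` as the bag (the names `list1_rows`, `list2_rows` and `to_bag` are made up for illustration), the two lists from the question could be loaded like this:

```python
from collections import Counter

# (count, pattern) rows copied from the question
list1_rows = [(11, 55), (4, 31), (4, 1), (3, 22), (2, 13), (1, 81)]
list2_rows = [(7, 31), (6, 22), (6, 13), (4, 88), (3, 14), (1, 55)]

def to_bag(rows):
    """Build a bag keyed by pattern, with the duplicate count as the value."""
    bag = Counter()
    for count, pattern in rows:
        bag[pattern] += count
    return bag

bag1 = to_bag(list1_rows)
bag2 = to_bag(list2_rows)

print(bag1[55])  # 11
print(bag2[55])  # 1
print(bag1[88])  # 0, because a Counter returns 0 for a pattern it has never seen
```

`Counter` is hash-based, so each lookup is O(1) on average, and a missing pattern simply reads as a count of zero.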
You will then need to iterate over the patterns stored in both bags. A language-agnostic way to do this is to form a set (so you don't get duplicates) of all the patterns in both bags. In many languages, however, you can avoid storing a new set by writing a custom iterator or generator with knowledge of both bags.
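Both approaches might look roughly like this in Python, with plain dictionaries standing in for the bags (the function name `all_patterns` is hypothetical):

```python
# Pattern -> count, using the data from the question
bag1 = {55: 11, 31: 4, 1: 4, 22: 3, 13: 2, 81: 1}
bag2 = {31: 7, 22: 6, 13: 6, 88: 4, 14: 3, 55: 1}

# Option 1: build an explicit set holding every pattern from both bags.
union_set = set(bag1) | set(bag2)

# Option 2: a generator that knows about both bags and stores nothing extra.
def all_patterns(first, second):
    """Yield each pattern that occurs in either bag exactly once."""
    yield from first
    for pattern in second:
        if pattern not in first:  # already yielded from the first bag
            yield pattern

print(len(union_set))                  # 8
print(list(all_patterns(bag1, bag2)))  # [55, 31, 1, 22, 13, 81, 88, 14]
```

The generator relies on the membership test being cheap, which holds for hash-based dictionaries.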
How you interpret the results of the comparison depends on how detailed an output you want. Do you just want to know whether there is any difference, or a count of the total differences? Do you want to know which patterns differ, or the total difference for each pattern? And do you need to know which list has more or fewer of each pattern?
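To make those levels of detail concrete, here is a rough Python sketch. It assumes a signed per-pattern difference (List1 minus List2) as the starting point, and it reads "total differences" as the sum of the absolute per-pattern differences, which is only one possible interpretation. All variable names are made up for illustration.

```python
from collections import Counter

bag1 = Counter({55: 11, 31: 4, 1: 4, 22: 3, 13: 2, 81: 1})
bag2 = Counter({31: 7, 22: 6, 13: 6, 88: 4, 14: 3, 55: 1})

# Signed difference per pattern: positive means List1 has more,
# negative means List2 has more.
diff_by_pattern = {p: bag1[p] - bag2[p] for p in set(bag1) | set(bag2)}

has_any_difference = any(d != 0 for d in diff_by_pattern.values())
differing_patterns = sorted(p for p, d in diff_by_pattern.items() if d != 0)
total_difference = sum(abs(d) for d in diff_by_pattern.values())
more_in_list1 = sorted(p for p, d in diff_by_pattern.items() if d > 0)
more_in_list2 = sorted(p for p, d in diff_by_pattern.items() if d < 0)

print(has_any_difference)   # True
print(differing_patterns)   # [1, 13, 14, 22, 31, 55, 81, 88]
print(total_difference)     # 32
print(more_in_list1)        # [1, 55, 81]
print(more_in_list2)        # [13, 14, 22, 31, 88]
```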
Assuming you just want the total differences, the algorithm would be to: