I have a list of unique words, and I have to calculate the Hamming distance between two lists of strings. Suppose the two lists are:
a = ['a' , 'b', 'c' ]
b = ['b' , 'a', 'd' ]
And let the unique words list be:
u = ['a', 'b', 'c', 'd', 'e']
I need to create two lists from a and b that have the same length as u. Call them va and vb. Each element of va and vb is either 0 or 1: it is 1 if the corresponding element of u exists in a (for va) or in b (for vb), and 0 otherwise. For example,
va = [1, 1, 1, 0, 0]
vb = [1, 1, 0, 1, 0]
I will then calculate the Hamming distance between va and vb using sklearn's pairwise distance metric. What is the most efficient way to calculate va and vb from a, b and u?
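
To make the intended construction concrete, here is a minimal sketch of how va and vb are defined, assuming plain Python lists and set lookups. The helper name indicator_vector is hypothetical and only for illustration; this is a baseline, not necessarily the most efficient approach.

```python
def indicator_vector(words, vocabulary):
    """Return a 0/1 list aligned with vocabulary: 1 where the word occurs in words."""
    present = set(words)  # a set gives O(1) average-case membership tests
    return [1 if w in present else 0 for w in vocabulary]


a = ['a', 'b', 'c']
b = ['b', 'a', 'd']
u = ['a', 'b', 'c', 'd', 'e']

va = indicator_vector(a, u)
vb = indicator_vector(b, u)
print(va)  # [1, 1, 1, 0, 0]
print(vb)  # [1, 1, 0, 1, 0]

# Count of positions where the two vectors differ (unnormalized Hamming distance).
mismatches = sum(x != y for x, y in zip(va, vb))
print(mismatches)  # 2
```

Converting each list to a set first keeps the cost at roughly len(u) lookups per vector, rather than scanning a or b once for every word in u.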