# We’ve had this conversation before: A Novel Approach to Measuring Dialog Similarity

This paper adapts edit distance to measure the similarity between two dialogs. Specifically, for the following minimum edit distance recurrence:

$$d_{i,j} = \min \begin{cases} d_{i-1,j} + w_{del}(a_i) \\ d_{i,j-1} + w_{ins}(b_j) \\ d_{i-1,j-1} + w_{sub}(a_i, b_j) \end{cases}$$

the authors propose to define the substitution cost as:

$$w_{sub}(u_1^i, u_2^j) = \alpha \times \bigl(1 - \cos(e_1^i, e_2^j)\bigr)$$

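where $e_1^i$ and $e_2^j$ are the Universal Sentence Encoder embeddings of the utterances $u_1^i$ and $u_2^j$ (the elements $a_i$ and $b_j$ in the recurrence), and $\alpha$ scales the substitution cost.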
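
To make the recurrence and the cosine-based substitution cost concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names `cosine_sub_cost` and `dialog_edit_distance` are hypothetical, insertion and deletion costs are assumed to be constants (the recurrence above allows them to depend on the utterance), and the toy 2-d vectors stand in for real Universal Sentence Encoder embeddings.

```python
import math


def cosine_sub_cost(e1, e2, alpha=1.0):
    """Substitution cost alpha * (1 - cos(e1, e2)) for two embedding vectors."""
    dot = sum(x * y for x, y in zip(e1, e2))
    norm1 = math.sqrt(sum(x * x for x in e1))
    norm2 = math.sqrt(sum(y * y for y in e2))
    return alpha * (1.0 - dot / (norm1 * norm2))


def dialog_edit_distance(emb_a, emb_b, alpha=1.0, w_ins=1.0, w_del=1.0):
    """Weighted edit distance between two dialogs.

    emb_a and emb_b are lists of utterance embeddings, one per utterance.
    Assumption: w_ins and w_del are constant costs, chosen here for simplicity.
    """
    n, m = len(emb_a), len(emb_b)
    # d[i][j] = cost of aligning the first i utterances of A with the first j of B.
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + w_del
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + w_ins
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + w_del,  # delete a_i
                d[i][j - 1] + w_ins,  # insert b_j
                d[i - 1][j - 1]
                + cosine_sub_cost(emb_a[i - 1], emb_b[j - 1], alpha),  # substitute
            )
    return d[n][m]


# Toy embeddings (placeholders for sentence-encoder output).
dialog_a = [[1.0, 0.0], [0.0, 1.0]]
dialog_b = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(dialog_edit_distance(dialog_a, dialog_b))  # → 1.0 (one extra utterance inserted)
```

Because the substitution cost shrinks toward zero as two utterances become semantically closer, near-paraphrases align almost for free, while unrelated utterances cost up to $\alpha$ (or more, if the cosine is negative). The relative size of $\alpha$ and the insertion/deletion costs therefore decides when substituting is preferred over deleting and inserting.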

https://aclanthology.org/2021.emnlp-main.89/