This C++ package computes the similarity between two strings.
Click here to download the package.
The package also includes example scripts that compute the similarity between strings in two different text files. The similarity measures included are:
- Cosine Similarity
- Jaccard Similarity
- Length-weighted Kernel
- P-Spectrum Kernel
- Levenshtein Similarity
Here is a README that explains what is included and gives usage instructions.
- C. S. Leslie, E. Eskin, and W. S. Noble. The spectrum kernel: A string kernel for SVM protein classification. In Pacific Symposium on Biocomputing 2002, pp. 566-575.
- S. V. N. Vishwanathan and Alex Smola. Fast Kernels for String and Tree Matching. In NIPS 2004.