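Levenshtein distance is a type of edit distance: the minimum number of single-character edits needed to transform one string into another. The permitted edits are substituting one character for another, inserting a character, and deleting a character. A typical use is comparing one string against many candidate strings to find the closest match.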
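The definition is usually computed with a dynamic-programming recurrence. The notation here is an assumption made for illustration and does not come from the original post: a and b are the two strings, with lengths m and n, and D(i, j) is the distance between the first i characters of a and the first j characters of b.

```latex
\[
D(i,0) = i, \qquad D(0,j) = j,
\]
\[
D(i,j) = \min
\begin{cases}
D(i-1,j) + 1 & \text{(deletion)} \\
D(i,j-1) + 1 & \text{(insertion)} \\
D(i-1,j-1) + [a_i \neq b_j] & \text{(substitution, free when the characters match)}
\end{cases}
\]
```

The bracket [a_i ≠ b_j] is 1 when the two characters differ and 0 otherwise, and the distance between the full strings is D(m, n).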
For example, turning kitten into sitting takes three edits:
1. sitten (k -> s)
2. sittin (e -> i)
3. sitting (append g)
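The three steps can be replayed as plain string operations. This is only a small sketch: the helper names substitute, insert, and delete are hypothetical and chosen for illustration, one per permitted edit.

```python
def substitute(s, index, ch):
    """Replace the character at position index with ch."""
    return s[:index] + ch + s[index + 1:]


def insert(s, index, ch):
    """Insert ch so that it ends up at position index."""
    return s[:index] + ch + s[index:]


def delete(s, index):
    """Remove the character at position index."""
    return s[:index] + s[index + 1:]


word = "kitten"
step1 = substitute(word, 0, "s")         # k -> s
step2 = substitute(step1, 4, "i")        # e -> i
step3 = insert(step2, len(step2), "g")   # append g
print(step1, step2, step3)
```

Each call performs exactly one edit, so reaching sitting after three calls shows that the distance is at most 3. Showing that no shorter sequence exists is what the dynamic-programming computation is for.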
Java implementation:
/**
 * String similarity based on edit distance (Levenshtein distance).
 */
public class test {

    /** Returns the edit distance between str and target, ignoring case. */
    public static int compare(String str, String target) {
        int strLength = str.length();
        int targetLength = target.length();
        if (strLength == 0) {
            return targetLength;
        }
        if (targetLength == 0) {
            return strLength;
        }

        // d[i][j] is the distance between the first i characters of str
        // and the first j characters of target.
        int[][] d = new int[strLength + 1][targetLength + 1];

        // Initialize the first column and the first row.
        for (int i = 0; i <= strLength; i++) {
            d[i][0] = i;
        }
        for (int j = 0; j <= targetLength; j++) {
            d[0][j] = j;
        }

        for (int i = 1; i <= strLength; i++) {
            char c1 = str.charAt(i - 1);
            for (int j = 1; j <= targetLength; j++) {
                char c2 = target.charAt(j - 1);
                // Case-insensitive comparison of the two characters.
                int temp = Character.toLowerCase(c1) == Character.toLowerCase(c2) ? 0 : 1;
                d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + temp);
            }
        }
        return d[strLength][targetLength];
    }

    public static int min(int one, int two, int three) {
        return Math.min(Math.min(one, two), three);
    }

    /** Returns a similarity in [0, 1]; 1 means the strings are identical. */
    public static float getSimilarityRatio(String str, String target) {
        int max = Math.max(str.length(), target.length());
        if (max == 0) {
            return 1f; // two empty strings are identical
        }
        return 1 - (float) compare(str, target) / max;
    }

    public static void main(String[] args) {
        String a = "kitten";
        String b = "sitting";
        System.out.println(getSimilarityRatio(a, b));
    }
}
Python implementation:
def compare(source, target):
    """Return the edit distance between source and target."""
    n = len(source) + 1
    m = len(target) + 1
    # If one string is empty, the distance is the length of the other.
    if n == 1:
        return m - 1
    if m == 1:
        return n - 1
    # Build the n x m matrix.
    distance_matrix = [[0] * m for _ in range(n)]
    # Initialize the first column and the first row.
    for i in range(n):
        distance_matrix[i][0] = i
    for j in range(m):
        distance_matrix[0][j] = j
    for i in range(1, n):
        for j in range(1, m):
            deletion = distance_matrix[i - 1][j] + 1
            insertion = distance_matrix[i][j - 1] + 1
            substitution = distance_matrix[i - 1][j - 1]
            if source[i - 1] != target[j - 1]:
                substitution += 1
            distance_matrix[i][j] = min(insertion, deletion, substitution)
    return distance_matrix[n - 1][m - 1]


def getCompareRatio(a, b):
    """Return a similarity in [0, 1]; 1 means the strings are identical."""
    resultValue = compare(a, b)
    maxValue = max(len(a), len(b))
    if maxValue == 0:
        return 1.0  # two empty strings are identical
    return 1 - resultValue / maxValue


if __name__ == '__main__':
    a = 'kitten'
    b = 'sitting'
    print(getCompareRatio(a, b))
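The introduction mentions comparing one string against many candidates to find the closest one. A minimal sketch of that use follows. The names levenshtein and closest_match and the candidate list are hypothetical, and the distance helper keeps only two rows of the matrix rather than the full table, which is a common variant and not the approach used in the post.

```python
def levenshtein(source, target):
    """Edit distance, keeping only the previous and current matrix rows."""
    previous = list(range(len(target) + 1))
    for i, source_char in enumerate(source, start=1):
        current = [i]
        for j, target_char in enumerate(target, start=1):
            cost = 0 if source_char == target_char else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def closest_match(query, candidates):
    """Return the candidate with the smallest edit distance to query."""
    return min(candidates, key=lambda candidate: levenshtein(query, candidate))


candidates = ["sitting", "mitten", "kitchen"]
print(closest_match("kitten", candidates))  # mitten
```

When several candidates tie, min returns the first one in the list, so the order of candidates matters in that case.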
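Your encouragement is my biggest motivation for sharing technical posts. If you spot any mistakes, please point them out; I would be grateful.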
Published: 2024-02-01 20:44:40
Article link: https://www.4u4v.net/it/170679147939314.html