Saturday, 13 December 2014

EDIT DISTANCE AND LCS

The edit distance and lcs are very much related
and the relation is 
\left|SCS(X,Y)\right| = n + m - \left|LCS(X,Y)\right|.

http://en.wikipedia.org/wiki/Longest_common_subsequence_problem

but
The edit distance when only insertion and deletion is allowed (no substitution), or when the cost of the substitution is the double of the cost of an insertion or deletion, is: in terms of lcs


spoj edist soultion


#include<stdio.h>
#include<string.h>
char a[1000000],b[1000000];
int k[20001][2001];
int mi(int a,int b,int c)
{
    if(a<=b && a<=c )
    return a;
    if(b<=a && b<=c )
    return b;
    if(c<=a && c<=b )
    return c;
}
int main()
{
    int l1,l2,t;
   // printf("%d",mi(1,0,3));
    scanf("%d",&t);
    while(t--)
    {

    scanf("%s",a);
    scanf("%s",b);
    l1=strlen(a);
    l2=strlen(b);
    for(int i=0;i<=l1;i++)
        k[i][0]=i;
    for(int i=0;i<=l2;i++)
        k[0][i]=i;

    for(int i=1;i<=l1;i++)
    {   int c=0;

        for(int j=1;j<=l2;j++)
        {
            if(a[i-1]==b[j-1])
                c=0;
            else
                c=1;
            k[i][j]=mi(k[i-1][j]+1,k[i][j-1]+1,k[i-1][j-1]+c);
        }
    }
    printf("%d\n",k[l1][l2]);
    }
}

explanation

we find the minimum from all the three up, left and up-left diagonal
the up and left diagonal will always be added with one and computed where as

if a[i]==a[j] then the upleft diagonal will be added with zero (0)
 else if a[i] != a[j] , then the upleft diagonal will be adder with 1 and computed .