Longest Subsequence Problem

Leetcode Problem 1143

Given two strings text1 and text2, return the length of their longest common subsequence. It is known as longest subsequence problem.

A subsequence of a string is a new string generated from the original string with some characters(can be none) deleted without changing the relative order of the remaining characters. (eg, “ace” is a subsequence of “abcde” while “aec” is not). A common subsequence of two strings is a subsequence that is common to both strings.

If there is no common subsequence, return 0.

Example 1:
Input: text1 = “abcde”, text2 = “ace”
Output: 3
Explanation: The longest common subsequence is “ace” and its length is 3.

Example 2:
Input: text1 = “abc”, text2 = “abc”
Output: 3
Explanation: The longest common subsequence is “abc” and its length is 3.

Example 3:
Input: text1 = “abc”, text2 = “def”
Output: 0
Explanation: There is no such common subsequence, so the result is 0.


Constraints:
1 <= text1.length <= 1000
1 <= text2.length <= 1000
The input strings consist of lowercase English characters only.

Analysis of the Problem Statement

The longest subsequence problem statement says that we are given two strings and we have to find the longest common subsequence in this problem.

Let us understand the problem statement more clearly.

Let’s say text1 = “abcdhe” and text2 = “aedfhr”, what is the longest common subsequence in these two strings ??

Longest Common Subsequence : “adh”
Longest Common Subsequence Length : 3

For the above problem, your function should return 3.

NOTE that the relative order of a subsequence remains the same, although ‘e’ was present in both the strings it will not be considered as the part of the subsequence because their relative order is not the same.

“ae” is one of the common subsequences but “adhe” is not a common subsequence.

I hope the Problem Statement is clear now. It is highly recommended to try this problem once before moving to its solution.

Naive Method

The naive method is to run a recursion and start comparing the string characters from backward.

Two cases will arise – 

(i) If both the character matches, we will simply increment the count by 1. We will remove the last character of both the strings and then we will again call the recursion to the modified strings.

(ii)  If  both the character doesn’t match, again two cases will arise –

(a) We will first search that character in the second string, by removing the last character of it and calling recursion to the modified strings.

(b) Then, we will also search that character in the first string, by removing the last character of it and calling recursion to the modified strings.

At last whichever case gives us maximum LCS, we will take that case.

Let us see the code to understand it clearly.

int LCS(string text1, string text2, int m, int n)
{
	if(m==0 || n ==0) return 0;
	
	if(text1[m-1] == text2[n-1]) {
		return 1 + LCS(text1, text2, m-1, n-1);
	}
	else return max(LCS(text1, text2, m-1, n),LCS(text1, text2, m, n-1);
}
	
int longestCommonSubsequence(string text1, string text2) {
	int m = text1.length();
	int n = text 2.length();
	
	return LCS(text1, text2, m, n);
}

Let us understand the code more clearly,  Let take another example –
text1 = “abc” and text2 = “act”

dynammic programming

This is how recursion will take place, Hope this diagram makes your concept clear and make you visualize what actually is happening.

Analysis of Naive Solution

The above algorithm computes all possible subproblems and hence time complexity of the above algorithm is O(2^n).

Can we reduce its Complexity?

If you observe carefully, there are overlapping subproblems present in the above recursion tree. The Circular mark shows that we are calculating subproblem (ab, a) twice.

It is of no point to calculate the same thing again and again. It increases our overall time complexity. In such a situation, Dynamic Programming comes into play.

Dynamic Programming Solution

In the above solution, we observe that we are calculating the repeated things again and again (Overlapping Subproblems).

We can solve this overlapping problem by storing the solution of sub problems into some array or hash map, so that whenever we need the solution of that subproblem again, we can use this data instead of calculating it again. This technique is called Memoization.

See the code for better explanation.

int dp[MAX][MAX];
int LCS(string text1, string text2, int m, int n)
{
        if(dp[m][n] != -1) return dp[m][n];
        
        if(m==0 || n ==0) dp[m][n] = 0;

        else if(text1[m-1] == text2[n-1]) {
            dp[m][n] = 1 + LCS(text1, text2, m-1, n-1);
        }
        else dp[m][n] = max(LCS(text1,text2,m-1,n),LCS(text1,text2,m,n-1));
        return dp[m][n];
}

This above algorithm takes simply O(m*n) time complexity.

This is a top-down Solution of longest subsequence problem, here we have initialized a 2-D Array of size MAX, we have also initialized it to -1 initially, and later we store the solution of subproblems in it so that whenever we need it later, instead of again computing it, we can use it again.

Bottom-Up Approach (Tabulation)

We have already discussed Top-Down Approach for Solving this above problem. Now we will discuss its Tabulation Solution.

In Tabulation Solution, we make a 2-D array of size [m+1][n+1], (m and n is the size of string text1 and text2 respectively. We fill this array in the bottom up-approach.

One thing to note here is we are taking one extra space in the 2-D array, the reason for taking an extra space is to store the base condition.

Let’s see the code.

int longestCommonSubsequence(string text1, string text2) {
        
        int m = text1.length(), n = text2.length();
        int dp[m+1][n+1];
        
       	memset(dp,0,sizeof(dp));
       	for(int i = 1; i <= m; i++)
       	{
           	for(int j = 1; j <= n; j++)
            	{
                	if(text1[i-1] == text2[j-1]) {
                                dp[i][j] = 1 + dp[i-1][j-1];
                        }
                	else {
                    		dp[i][j] = max(dp[i-1][j], dp[i][j-1]);
                	}
            	}
        }
        return dp[m][n];
}

Let us understand the code more clearly,

The idea is similar to the top-down approach. In the top-down approach or memoized based solution, we were comparing the last character of both the strings and if it is found the same, we were incrementing the count by 1 and recursively calling the modified string by removing its last character.

For function LCS(string text1, string text2, m, n) , we were recursively calling for LCS(text1,text2,m-1,n-1). Here LCS means longest common subsequence

Similarly in the bottom-up approach, if both the last character matches we just add 1 to dp[i-1][j-1].

For the second case, when the last character of both strings were not matching, in the  top down approach we were recursively calling LCS(text1,text2,m,n-1) and LCS(text2,text2,m-1,n) and storing its maximum in dp[][]  array.

In the same way in bottom up approach, when both the last characters don’t match we find max(dp[i][j-1], dp[i-1][j]) and store it in current dp[i][j].

Both solutions look quite similar, the only difference is that in the top-down approach, we start comparing characters of both strings from the end, and in top-down approach, we start comparing it from the beginning.

Let us understand it more clearly by taking an example,

text1 = “abc” and text2 = “act”

dynammic programming

We will create a 2D array of size 4*4. Initially, we will initialize all the elements of the 2-D array to 0.

We will start iterating from i = 1 and j = 1, since both the characters matches (‘a’ from “abc” and ‘a’ from “act”) , we will increment dp[i][j]=1+dp[i-1][j-1].

Similarly, we will move ahead by comparing every character of both the strings and fill the 2D array by following the bottom-up Algorithm. At last, dp[m][n] will give us the required solution.


I hope both the solutions are clear, both the solutions (Memoised and Tabular) takes O(m.n) time complexity which is far better than exponential time complexity.

This is how Dynamic Programming works, first of all you have to come up with a recursion based solution and then you have to write a DP solution to avoid overlapping subproblems by any of the two way discussed above.

That’s all folks..!!!(longest subsequence problem)

Tagged : /

Dynamic Programming

Solving Dynamic Problem Easy

Dynamic Programming (DP) is a technique that solves a few specific form of problems in Polynomial Time. Dynamic Programming solutions are quicker than the exponential brute method and can be effortlessly proved for their correctness. Before we take a look at the way to suppose Dynamically for a problem, we want to study:

Overlapping Subproblems
Optimal Substructure Property

Steps to clear up a DP
1) Identify if it’s miles a DP hassle
2) Decide a nation expression with
least parameters
3) Formulate state dating
four) Do tabulation (or add memoization)

Step 1 : How to categorize a problem as a Dynamic Programming Problem?

Typically, all of the issues that require to maximize or reduce certain quantity or counting problems that say to matter the preparations underneath certain circumstance or positive chance troubles can be solved through the usage of Dynamic Programming.
All dynamic programming troubles satisfy the overlapping subproblems belongings and maximum of the traditional dynamic troubles also satisfy the most reliable substructure belongings. Once, we have a look at these residences in a given problem, be sure that it may be solved the use of DP.

Step 2 : Deciding the kingdom

DP issues are all approximately kingdom and their transition. This is the most simple step which need to be performed very carefully because the state transition relies upon on the choice of country definition you make. So, permit’s see what can we suggest by the time period “country”.

State A nation can be defined as the set of parameters that can uniquely perceive a certain position or status within the given problem. This set of parameters should be as small as feasible to lessen state area.

For example: In our well-known Knapsack hassle, we define our country by using two parameters index and weight i.E DP[index][weight]. Here DP[index][weight] tells us the most income it can make with the aid of taking objects from variety 0 to index having the potential of sack to be weight. Therefore, here the parameters index and weight together can uniquely identify a subproblem for the knapsack problem.

So, our first step might be identifying a kingdom for the problem after figuring out that the trouble is a DP hassle.

As we recognize DP is all approximately the usage of calculated results to formulate the very last end result.
So, our next step can be to find a relation among previous states to attain the modern state.

Step three : Formulating a relation most of the states

This part is the toughest a part of for solving a DP problem and requires quite a few instinct, remark and exercise. Let’s apprehend it by thinking about a sample trouble

Given three numbers 1, three, five, we want to inform
the entire variety of methods we will form a variety of ‘N’
the use of the sum of the given 3 numbers.
(allowing repetitions and extraordinary preparations).

Total variety of ways to form 6 is: 8
1+1+1+1+1+1
1+1+1+three
1+1+three+1
1+three+1+1
three+1+1+1
3+3
1+five
five+1

Let’s think dynamically about this problem. So, to begin with, we determine a country for the given trouble. We will take a parameter n to decide country as it can uniquely pick out any subproblem. So, our country dp will appear like nation(n). Here, country(n) manner the entire variety of arrangements to form n through the usage of 1, 3, five as elements.

Now, we want to compute state(n).

How to do it?
So right here the intuition comes into movement. As we are able to most effective use 1, 3 or 5 to form a given quantity. Let us expect that we recognize the end result for n = 1,2,3,4,five,6 ; being termilogistic let us say we understand the result for the
state (n = 1), nation (n = 2), state (n = three) ……… kingdom (n = 6)

Now, we want to recognise the end result of the state (n = 7). See, we can simplest add 1, three and 5. Now we can get a sum general of 7 by the following three approaches:

1) Adding 1 to all feasible mixtures of state (n = 6)
Eg : [ (1+1+1+1+1+1) + 1]
[ (1+1+1+3) + 1]
[ (1+1+3+1) + 1]
[ (1+3+1+1) + 1]
[ (3+1+1+1) + 1]
[ (3+3) + 1]
[ (1+5) + 1]
[ (5+1) + 1]

2) Adding three to all possible mixtures of kingdom (n = four);

Eg : [(1+1+1+1) + 3]
[(1+3) + 3]
[(3+1) + 3]

three) Adding five to all possible mixtures of kingdom(n = 2)
Eg : [ (1+1) + 5]

Now, assume carefully and fulfill your self that the above 3 instances are overlaying all feasible methods to form a sum overall of 7;

Therefore, we are able to say that end result for
country(7) = nation (6) + country (four) + nation (2)
or
country(7) = kingdom (7-1) + kingdom (7-three) + kingdom (7-5)

In standard,
nation(n) = kingdom(n-1) + nation(n-three) + state(n-5)

So, our code will look like:
filter_none

edit

play_arrow

brightness_4
// Returns the number of preparations to
// form ‘n’
int clear up(int n)

// base case
if (n < zero)
go back 0;
if (n == 0)
return 1;

go back remedy(n-1) + remedy(n-3) + resolve(n-five);

The above code appears exponential as it’s miles calculating the same kingdom time and again. So, we simply want to add a memoization.

Step 4 : Adding memoization or tabulation for the state in Dynamic Problem

This is the easiest part of a dynamic programming solution. We simply need to shop the state solution so that subsequent time that state is needed, we can at once use it from our reminiscence

Adding memoization to the above code
// initialize to -1
int dp[MAXN];

// this function returns the wide variety of
// preparations to shape ‘n’
int remedy(int n)

// base case
if (n < 0)
return zero;
if (n == 0)
return 1;

// checking if already calculated
if (dp[n]!=-1)
go back dp[n];

// storing the end result and returning
return dp[n] = solve(n-1) + resolve(n-three) + remedy(n-five);

Another manner is to add tabulation and make a solution iterative. Please refer to tabulation and memoization for more information.

Dynamic Programming comes with lots of exercises. One has to try fixing diverse traditional DP issues that may be determined here. You may additionally take a look at the beneath troubles first and strive to fix them the use of the above-described steps:-

http://www.Spoj.Com/issues/COINS/
http://www.Spoj.Com/troubles/ACODE/