Solution: Count the Number of Good Subsequences
Let's solve the Count the Number of Good Subsequences problem using the Dynamic Programming pattern.
We'll cover the following
Statement
Count and return the number of good subsequences in the given string s
. You may return the modulo
A subsequence is a sequence formed from another sequence by deleting some or no elements while keeping the order of the remaining elements unchanged.
A good subsequence is a subsequence of a string if it is not empty and the frequency of each character is the same.
Constraints:
s.length
s
will only contain lowercase English characters.
Solution
The solution to this problem involves calculating the combinations. We will start by calculating the frequency of each character and then iterate over the frequencies, from 1 to the highest frequency of any character. We will count subsequences where each character appears for some or all of its frequency, i.e., calculating combinations. The count is summed up in each iteration, and after the loop terminates, 1 is deducted from the final count. The reason for deducting 1 is that the empty subsequence will also be counted. Because empty subsequence does not qualify as a good subsequence, we will remove its count from the final calculation.
Computing the combinations is a typical
Because calculating the factorial[i]
requires calculating the [factorial[i-1]..factorial[2]]
again and again, therefore we will use dynamic programming to calculate the factorial[i]
by multiplying i
only with factorial[i — 1].
That is, we will reuse the existing results instead of calculating them again.
The algorithm to solve this problem is as follows:
Initialize two arrays:
factorials
andinverses
of sizen + 1
, an integer array,character_frequencies
, of the same size to count the frequencies of each character and an integer variablefinalCount
with. Start a loop from
to n + 1
. Inside the loop, compute the factorial and inverses of eachith
number and store it infactorials[i]
andinverses[i]
, respectively.Start a loop from
to n + 1
and count the frequencies of each character ofs
. Moreover, keep the record of maximum frequency and save it inmax_frequency
.Start a nested loop, where the outer loop, controlled by
i
, runs formax_frequency
times and the inner loop, controlled byj
runs fortimes. Inside the outer loop, initialize a variable
count
with the value. Perform the following steps Inside the inner loop:Compute the number of combinations, that is,
n
choosek
, wheren
is the value ofcharacter_frequency[j]
, andk
is the value ofi
. The number of combinations can be calculated with(factorials[n] * inverses[k] % MOD) * inverses[n-k] % MOD
.Add
to the number of combinations. This is because we consider the possibility of not choosing a character when constructing subsequences. In other words, for each character that can be included in a subsequence, there are two choices: either include the character or do not. Adding captures these two possibilities. For example, if we have the string , we can either choose one and one or one and no , or no and one , or neither nor . If we don't add to our calculation, we would miss the possibility of having a non-empty subsequence that excludes both and . Update the
count
by multiplying this calculation with the current value of thecount
.
Deduct
from the count
, add it to the current value offinal_count
, and take the modulus of this value. Update thefinal_count
value with this new value. The reason to deductfrom the count
is that we need to exclude the empty subsequence.
After the loop terminates, the value of
final_count
holds the number of good subsequences in the input strings
.
Let’s look at the following illustration to get a better understanding of the solution:
Level up your interview prep. Join Educative to access 80+ hands-on prep courses.