String Segmentation Matching for Dynamic Programming Problems

Foreword

The question is as follows: There is an encrypted string s[1....n], and there is a codebook that records the original-ciphertext data of the key-value type. If the time of the table is Invariant, the result of the lookup table is treated as f(w), where w is a string, if it matches, it returns f(w), if it does not match, it returns null, try dynamic programming to determine the string. Whether s[1...n] can be decrypted.

Thought

First abstract the problem into a string segmentation problem, given a string s, given a dictionary d, and split s into any number of words. If it can be found in d, the match is successful. If any split mode cannot be achieved, the match fails.

So the question is coming, what is dynamic planning?

Check the information to learn about it and see the answer) of Zhihu.

The next step is to solve. One of the characteristics of dynamic programming is that it is no longer necessary to calculate the repeated sub-problems. Suppose the string s is equal to [ethiothionvisyoevvzsed], a total of 22 words, and the dictionary d is {ethio:hello , thio:guys,nvis:this,yo:is,evvz:good,sed:day}, assuming that b(x) is a boolean type judgment function, the flow is as follows:

  1. First calculate b(e) and return false because f(e)==null
  2. Then calculate b(et) and return false because b(e)==false&&f(t)==null
  3. Similarly, b(eth)==false, b(ethi)==false
  4. b(ethio)==true,becausef(ethio)!=null
  5. b(ethiot)==false,becauseb(ethio)&&b(t)==false
  6. Similarly, b(ethiothio)==true, because b(ethio)&(thio)==true

The subsequent steps are similar, omitted and not written. That is, when b(string) returns true, then the string or its substring can be found in d. Use pseudocode to achieve the following:

1
2
3
4
5
6
7
8
9
10
11
12
13
if d contain string x, b(x)==true, else b(x)=false

boolean[] booList =[false,false,...,false]

for i in range(0, len(s) + 1):
if b(s[0:i]) == True:
booList[i] = True
for j in range(0, i):
if booList[j] && b(s[j:i]) == True:
booList[i] = True
break

return booList[len(s)]

The time complexity is O(n^2).

Here is the Python3 code that can be run:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
#!/usr/bin/python
# -*- coding: UTF-8 -*-


def b(str):
if str == "ethio":
return True
if str == "thio":
return True
if str == "nvis":
return True
if str == "yo":
return True
if str == "evvz":
return True
if str == "sed":
return True
else:
return False


s = "ethiothionvisyoevvzsed"

booList = [False for n in range(len(s) + 1)]

for i in range(0, len(s) + 1):
if(b(s[0:i])):
booList[i] = True
for j in range(0, i):
if(booList[j] and b(s[j:i])):
booList[i] = True
break

print(booList[len(s)])

The result is True