2021. 1. 30. 21:04

This is the hint for "Fixing an Inconsistent Character Set"

http://rosalind.info/problems/cset/

 

The most important concept you will use is:

"Two splits S1|Sc1 and S2|Sc2 conflict when all four intersections S1 ∩ S2, S1 Sc2, Sc1 S2, and Sc1 Sc2 are nonempty."

 

Let's say we are splitting a set with six elements: 1, 2, 3, 4, 5, and 6.

The splits can be {1, 2} | {3, 4, 5, 6} or {1, 4, 5} | {2, 3, 6} etc.

 

We will check if two splits conflict using the concept above. 

Let's say S1 = {1, 2}, Sc1 = {3, 4, 5, 6}, S2 = {1, 4, 5}, and Sc2 = {2, 3, 6}.

 

S1 ∩ S2 = {1, 2}{1, 4, 5} = {1}

S1 ∩ Sc2 = {1, 2}{2, 3, 6} = {2}

Sc1 ∩ S2 = {3, 4, 5, 6} {1, 4, 5} = {4, 5}

Sc1 ∩ Sc2 = {3, 4, 5, 6}{2, 3, 6} = {3, 6}

 

As you can see, all four intersections are non-empty. Thus, those two splits conflict.

The other two splits {1, 2} | {3, 4, 5, 6} and {1, 2, 3} | {4, 5, 6} do not conflict because

 

S1 ∩ S2 = {1, 2}

S1 ∩ Sc2 = {}  ← Empty!

Sc1 ∩ S2 = {3}

Sc1 ∩ Sc2 = {4, 5, 6}

 

Now, look at the sample dataset:

100001
000110
111000
100111

We will number the columns as 1, 2, 3, 4, 5, 6 and re-write them as splits. Note it is not important to distinguish between 0 and 1.

100001 → {1, 6} | {2, 3, 4, 5}

000110 → {1, 2, 3, 6} | {4, 5}

111000 → {1, 2, 3} | {4, 5, 6}

100111 → {1, 4, 5, 6} | {2, 3}

 

Then, find a row that conflicts with more than one other row.

 

Conflicts between rows. X indicates "conflict"

Only row 1 and row 3 conflict. Thus, you can delete either one of the rows. Note you will need to remove only one row. This will make computation easier.

 

Please click the heart button to support this blog!

'English > Rosalind' 카테고리의 다른 글

Rosalind hint - Fixing an Inconsistent Character Set  (0) 2021.01.30