This is the hint for "Fixing an Inconsistent Character Set"
http://rosalind.info/problems/cset/
The most important concept you will use is:
"Two splits S1|Sc1 and S2|Sc2 conflict when all four intersections S1 ∩ S2, S1 ∩ Sc2, Sc1 ∩ S2, and Sc1 ∩ Sc2 are nonempty."
Let's say we are splitting a set with six elements: 1, 2, 3, 4, 5, and 6.
The splits can be {1, 2} | {3, 4, 5, 6} or {1, 4, 5} | {2, 3, 6} etc.
We will check if two splits conflict using the concept above.
Let's say S1 = {1, 2}, Sc1 = {3, 4, 5, 6}, S2 = {1, 4, 5}, and Sc2 = {2, 3, 6}.
S1 ∩ S2 = {1, 2} ∩ {1, 4, 5} = {1}
S1 ∩ Sc2 = {1, 2} ∩ {2, 3, 6} = {2}
Sc1 ∩ S2 = {3, 4, 5, 6} ∩ {1, 4, 5} = {4, 5}
Sc1 ∩ Sc2 = {3, 4, 5, 6} ∩ {2, 3, 6} = {3, 6}
As you can see, all four intersections are non-empty. Thus, those two splits conflict.
The other two splits {1, 2} | {3, 4, 5, 6} and {1, 2, 3} | {4, 5, 6} do not conflict because
S1 ∩ S2 = {1, 2}
S1 ∩ Sc2 = {} ← Empty!
Sc1 ∩ S2 = {3}
Sc1 ∩ Sc2 = {4, 5, 6}
Now, look at the sample dataset:
100001
000110
111000
100111
We will number the columns as 1, 2, 3, 4, 5, 6 and re-write them as splits. Note it is not important to distinguish between 0 and 1.
100001 → {1, 6} | {2, 3, 4, 5}
000110 → {1, 2, 3, 6} | {4, 5}
111000 → {1, 2, 3} | {4, 5, 6}
100111 → {1, 4, 5, 6} | {2, 3}
Then, find a row that conflicts with more than one other row.
Only row 1 and row 3 conflict. Thus, you can delete either one of the rows. Note you will need to remove only one row. This will make computation easier.
---
I don't put ads in my blog. If this post helps you, please consider donating a small amount of money that will be a huge motivation.
https://www.buymeacoffee.com/harryincupboard