Codility ‘GenomicRangeQuery’ Solution

Short Problem Definition:

Find the minimal nucleotide from a range of sequence DNA.

Link

GenomicRangeQuery

Complexity:

expected worst-case time complexity is O(N+M);

expected worst-case space complexity is O(N)

Execution:

Remember the last position on which was the genome (A, C, G, T) was seen. If the distance between Q and P is lower than the distance to the last seen genome, we have found the right candidate.

Solution:
def writeCharToList(S, last_seen, c, idx):
    if S[idx] == c:
        last_seen[idx] = idx
    elif idx > 0:
        last_seen[idx] = last_seen[idx -1]

def solution(S, P, Q):
    
    if len(P) != len(Q):
        raise Exception("Invalid input")
    
    last_seen_A = [-1] * len(S)
    last_seen_C = [-1] * len(S)
    last_seen_G = [-1] * len(S)
    last_seen_T = [-1] * len(S)
        
    for idx in xrange(len(S)):
        writeCharToList(S, last_seen_A, 'A', idx)
        writeCharToList(S, last_seen_C, 'C', idx)
        writeCharToList(S, last_seen_G, 'G', idx)
        writeCharToList(S, last_seen_T, 'T', idx)
    
    
    solution = [0] * len(Q)
    
    for idx in xrange(len(Q)):
        if last_seen_A[Q[idx]] >= P[idx]:
            solution[idx] = 1
        elif last_seen_C[Q[idx]] >= P[idx]:
            solution[idx] = 2
        elif last_seen_G[Q[idx]] >= P[idx]:
            solution[idx] = 3
        elif last_seen_T[Q[idx]] >= P[idx]:
            solution[idx] = 4
        else:    
            raise Exception("Should never happen")
        
    return solution
    

If you enjoyed this post, then make sure you subscribe to my Newsletter and/or Feed.

Facebooktwittergoogle_plusredditpinterestlinkedin
  • Joseph

    Hi,
    Thanks for your solution. I have tried it and it does receive 100%. In your code you are creating 4 new arrays (last_seen_A,C,G,T) and each has N elements. Doesn’t this break requirement of space complexity? The challenge requires O(N) space complexity.

    Also shouldn’t solution involve Prefix Sums since it is in the Lesson 3 batch?
    Thanks for the answers

    • Hi Joseph,

      it is valid. The O notation drops all constants. Therefore 4N becomes O(N). The same holds true for situations such as 10n^2 + 100n -> O(N^2).

      The second part of your question: I am using a dynamic programming prefix. It is not a sum, but it is prefix-ish, right?

  • BG

    Thank you for your codes..really helpful
    What about this code

    #include

    // you can write to stdout for debugging purposes, e.g.
    // cout << "this is a debug message" << endl;
    typedef struct nuc_arr{
    int A;
    int C;
    int G;
    int T;
    nuc_arr():A(0),C(0),G(0),T(0){};
    nuc_arr(int a, int c,int g,int t):A(a),C(c),G(g),T(t){};
    }nuc_arr;

    vector solution(string &S, vector &P, vector &Q) {
    // write your code in C++11
    std::vector maps((int)S.size()+1);

    nuc_arr * cum_item = new nuc_arr();
    nuc_arr * init_item = new nuc_arr(0,0,0,0);
    maps[0] = init_item;
    for(int i=0; iA,cum_item->C,cum_item->G,cum_item->T);
    switch (S[i])
    {
    case ‘A’:
    map_item->A = ++(cum_item->A);
    break;
    case ‘C’:
    map_item->C = ++(cum_item->C);
    break;
    case ‘G’:
    map_item->G = ++(cum_item->G);
    break;
    case ‘T’:
    map_item->T = ++(cum_item->T);
    break;
    };
    maps[i+1] = map_item;
    }
    // now fill result
    std::vector result;
    for(int i=0; iA – maps[P[i]]->A > 0)
    result.push_back(1);
    else if (maps[Q[i]+1]->C – maps[P[i]]->C > 0)
    result.push_back(2);
    else if (maps[Q[i]+1]->G – maps[P[i]]->G > 0)
    result.push_back(3);
    else if (maps[Q[i]+1]->T – maps[P[i]]->T > 0)
    result.push_back(4);
    }
    return result;
    }

    • Hi BG,

      it tends to be hard to read code in discus comments. The formatting is off. Anyways, the code is more or less the same.

      Just a C++ question? why do you allocate all items on the heap as pointers?

  • Tito

    Obj-C

    int findMin(NSString *S, int start, int stop) {
    NSDictionary *impacts = @{@”A”:@(1), @”C”:@(2), @”G”:@(3), @”T”:@(4)};
    int min = 10;

    for (int cix = start; cix <= stop; cix++) {
    NSString *currChar = [S substringWithRange:NSMakeRange(cix, 1)];
    int currVal = [impacts[currChar] intValue];

    if(currVal < min) min = currVal;
    }

    return min;
    }

    NSMutableArray * solution(NSString *S, NSMutableArray *P, NSMutableArray *Q) {
    // write your code in Objective-C 2.0
    NSMutableArray *output = [NSMutableArray arrayWithCapacity:[S length]];

    for (int ix = 0; ix < [P count]; ix++) {
    int start = [P[ix] intValue];
    int end = [Q[ix] intValue];
    int minInSeg = findMin(S, start, end);
    [output addObject:@(minInSeg)];
    }

    return output;
    }