Description
I understand a variant growing bigger if the destination reference has an insertion, but shouldn't it shrink back to its original size when it is projected back the other way?
original_hgvs = "NM_015120.4(ALMS1):c.36_38dupGGA"
def print_hgvs(sv):
    length = sv.posedit.pos.end - sv.posedit.pos.start
    print(f"hgvs='{sv}' - {length=}")
var_c = parse(original_hgvs)
print_hgvs(var_c)
var_g = c_to_g(var_c)
print_hgvs(var_g)
var_c2 = g_to_c(var_g, var_c.ac)
print_hgvs(var_c2)
Output:
hgvs='NM_015120.4(ALMS1):c.36_38dup' - length=2
hgvs='NC_000002.12:g.73385937_73385942dup' - length=5
hgvs='NM_015120.4:c.72_77dup' - length=5
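To make the size change concrete, the intervals in the output above can be counted in bases. This is only a minimal sketch that does not touch the hgvs library: `dup_span` is a hypothetical helper that reads the two coordinates out of the printed string. Note that the printed `length` is `end - start`, so it is one less than the number of duplicated bases.

```python
import re


def dup_span(hgvs_string: str) -> int:
    """Count the bases covered by a simple 'start_end dup' interval.

    Hypothetical helper for illustration; it only understands strings
    shaped like 'ACCESSION:c.START_ENDdup' with plain integer positions.
    """
    match = re.search(r":[cgnm]\.(\d+)_(\d+)dup", hgvs_string)
    if match is None:
        raise ValueError(f"not a simple interval dup: {hgvs_string!r}")
    start, end = int(match.group(1)), int(match.group(2))
    return end - start + 1  # coordinates are inclusive


for s in (
    "NM_015120.4(ALMS1):c.36_38dup",
    "NC_000002.12:g.73385937_73385942dup",
    "NM_015120.4:c.72_77dup",
):
    print(f"{s}: {dup_span(s)} bases")
```

So the 3-base duplication becomes a 6-base duplication on the genome and stays at 6 bases after projecting back to the transcript.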
Normalization?
I noticed that if you normalize the variant first, the problem goes away.
I think this is because normalization shifts the variant away from the alignment gap. But should that matter? If normalization is required before projection, perhaps we should do it automatically, or raise a warning or error when the input is not normalized?
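One possible shape for the "normalize automatically, or warn, or error" idea is sketched here. This is not existing hgvs behaviour: `project_c_to_g` and its `on_unnormalized` parameter are hypothetical names, and the `normalize` and `c_to_g` callables are passed in so the sketch makes no assumption about how they are set up. It compares `posedit` rather than the whole string, because the output in this issue shows normalization dropping the gene symbol.

```python
import warnings


def project_c_to_g(var_c, normalize, c_to_g, on_unnormalized="normalize"):
    """Project c. to g., handling un-normalized input explicitly.

    Hypothetical wrapper. on_unnormalized is one of:
      "normalize": silently project the normalized variant
      "warn":      project the input as given, but emit a warning
      "error":     refuse to project
    """
    if on_unnormalized not in ("normalize", "warn", "error"):
        raise ValueError(f"unknown policy: {on_unnormalized!r}")

    normalized = normalize(var_c)
    if str(normalized.posedit) != str(var_c.posedit):
        message = f"{var_c} is not normalized (normalized form: {normalized})"
        if on_unnormalized == "error":
            raise ValueError(message)
        if on_unnormalized == "warn":
            warnings.warn(message)
        else:
            var_c = normalized
    return c_to_g(var_c)
```

With the helpers used in this issue, a call would look like `project_c_to_g(var_c, normalize, c_to_g, on_unnormalized="warn")`.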
var_c_orig = parse(original_hgvs)
var_c = normalize(var_c_orig)
print(f"Normalized: {var_c_orig} => {var_c}")
print_hgvs(var_c)
var_g = c_to_g(var_c)
print_hgvs(var_g)
var_c2 = g_to_c(var_g, var_c.ac)
print_hgvs(var_c2)
Output:
Normalized: NM_015120.4(ALMS1):c.36_38dup => NM_015120.4:c.75_77dup
hgvs='NM_015120.4:c.75_77dup' - length=2
hgvs='NC_000002.12:g.73385940_73385942dup' - length=2
hgvs='NM_015120.4:c.75_77dup' - length=2
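The two runs above amount to a round-trip check, which could be captured as a small regression-style helper. This is a rough sketch only: `roundtrip_c` is a hypothetical name, and `c_to_g` and `g_to_c` are passed in as callables with the same call shapes used in this issue.

```python
def roundtrip_c(var_c, c_to_g, g_to_c):
    """Project c. -> g. -> c. and report whether the variant survived.

    Hypothetical helper. Returns (var_c2, stable), where stable is True
    when the round-tripped posedit matches the input posedit.
    """
    var_g = c_to_g(var_c)
    var_c2 = g_to_c(var_g, var_c.ac)
    stable = str(var_c2.posedit) == str(var_c.posedit)
    return var_c2, stable
```

Going by the outputs above, the un-normalized `c.36_38dup` would come back as `c.72_77dup` and be reported as unstable, while the normalized `c.75_77dup` would be reported as stable.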
Note: while searching the issues, I found a discussion about alignment gaps (on this transcript!) in #514.