Only intern in case of typo when looking for one or two typos #5555

Open
@ManyTheFish

Description

We recently found a search crash caused by the words stored in the search cache when computing the terms with one typo.
This was fixed in #5551.

However, we may have a similar issue in the function below, which interns every derived word before knowing whether it will be visited:

    while let Some((derived_word, state)) = stream.next() {
        let derived_word = std::str::from_utf8(derived_word)?;
        let derived_word_interned = word_interner.insert(derived_word.to_owned());
        // in the case the typo is on the first letter, we know the number of typo
        // is two
        if get_first(derived_word) != get_first(word) {
            let cf = visit(derived_word_interned, NumberOfTypos::Two)?;
            if cf.is_break() {
                break;
            }
        } else {
            // Else, we know that it is the second dfa that matched and compute the
            // correct distance
            let d = second_dfa.distance((state.1).0);
            match d.to_u8() {
                0 => (),
                1 => {
                    let cf = visit(derived_word_interned, NumberOfTypos::One)?;
                    if cf.is_break() {
                        break;
                    }
                }
                2 => {
                    let cf = visit(derived_word_interned, NumberOfTypos::Two)?;
                    if cf.is_break() {
                        break;
                    }
                }
                _ => unreachable!("2 typos DFA produced a distance greater than 2"),
            }
        }
    }

Originally posted by @ManyTheFish in #5551 (comment)

Technical approach

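Based on the fix made in the previous PR (#5551), amend the linked function so that a derived word is interned only when necessary, that is, right before calling the visit closure. Words that match with a distance of 0 should never reach the interner.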
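
To make the intended change concrete, here is a minimal, self-contained sketch of the "intern right before visit" pattern. It is not the actual fix: `WordInterner`, `visit_typo_derivations`, and the `candidates` slice are hypothetical stand-ins for the real interner, the linked function, and the FST stream plus DFA distance, and the first-letter check uses `chars().next()` in place of `get_first`. Error handling with `?` is left out for brevity.

```rust
use std::collections::HashMap;
use std::ops::ControlFlow;

/// Hypothetical stand-in for the search context's word interner.
#[derive(Default)]
struct WordInterner {
    ids: HashMap<String, usize>,
}

impl WordInterner {
    /// Returns the id of `word`, storing it the first time it is seen.
    fn insert(&mut self, word: String) -> usize {
        let next = self.ids.len();
        *self.ids.entry(word).or_insert(next)
    }

    /// Number of distinct words stored so far.
    fn len(&self) -> usize {
        self.ids.len()
    }
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum NumberOfTypos {
    One,
    Two,
}

/// `candidates` stands in for the stream: each item is a derived word and
/// the distance the second DFA would report for it (an assumption made to
/// keep the sketch free of external crates).
fn visit_typo_derivations(
    word: &str,
    candidates: &[(&str, u8)],
    interner: &mut WordInterner,
    mut visit: impl FnMut(usize, NumberOfTypos) -> ControlFlow<()>,
) {
    for &(derived_word, distance) in candidates {
        // Decide the number of typos first, without touching the interner.
        let typos = if derived_word.chars().next() != word.chars().next() {
            // A typo on the first letter means two typos.
            NumberOfTypos::Two
        } else {
            match distance {
                0 => continue, // exact match: nothing to visit, nothing to intern
                1 => NumberOfTypos::One,
                2 => NumberOfTypos::Two,
                _ => unreachable!("2 typos DFA produced a distance greater than 2"),
            }
        };
        // Intern only now, right before the word is handed to `visit`.
        let interned = interner.insert(derived_word.to_owned());
        if visit(interned, typos).is_break() {
            break;
        }
    }
}
```

Computing the typo count before interning also collapses the three duplicated `visit` call sites into one, so there is a single place where a word can enter the interner.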

Labels: bug, good first issue
