Abstract:
In this thesis, we investigate the performance of available methods and tools for sequence
alignment in assembly and de Bruijn graphs. Sequence alignment tools are employed to
detect antimicrobial resistance (AMR) gene sequences within the assembly graphs. Utilizing precise
and efficient tools for identifying these genes enables us to locate their neighboring genes
and find evidence of horizontal gene transfer (HGT) more accurately. To this end, we have
considered three sequence alignment tools, namely Bandage, SPAligner, and GraphAligner.
The tools have similar input and output types. The outputs are analyzed
qualitatively and quantitatively using the pandas, NumPy, and GFA libraries in Python. The
paths returned by each pair of tools for each query are compared to measure the similarity
between them. Furthermore, the output sequences from each software are compared to
the target sequence using a modified version of edit distance. On the datasets tested,
Bandage was the most efficient and precise tool, followed by GraphAligner and then
SPAligner.