Less Sequence: July 2015

When converting a .fastq file to .bfq with MAQ's fastq2bfq command, it sometimes prints this warning:

[seq_read_fastq] Inconsistent sequence name: @XXXXXXXXX. Continue anyway.

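The warning apparently appears when the name repeated after '+' on the third line of a record does not match the name in the '@' header line. Printing it to the terminal for every affected read slows down the file conversion. A possible solution is to remove the content after '+' on the third line of every record [1], so that each record looks like this: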

@ the header info
ATCGATCG...
+
quality scores....

A simple Python script that removes the content after '+' on the third line of each record and writes everything to another file:

======================================
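#!/usr/bin/env python

# Copy the FASTQ file, replacing every '+<name>' line with a bare '+'.
with open("original_fastq_file.fastq") as f, \
     open("new_fastq_file.fastq", 'w') as writer:
    for line in f:
        # Change '+SRR' to the first 4~5 characters of the 3rd line of your fastq file.
        # startswith() avoids matching quality lines that contain '+SRR' elsewhere.
        if line.startswith('+SRR'):
            line = '+\n'
        writer.write(line)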

======================================
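Matching on a prefix such as '+SRR' depends on how the reads are named, and a quality line could in principle start with the same characters. As a rough sketch of an alternative, the '+' line can be picked by its position in the record instead. This assumes a strict four-lines-per-record FASTQ file (no wrapped sequence or quality lines); the function names `strip_plus_names` and `convert` are made up for this illustration and are not part of MAQ:

```python
#!/usr/bin/env python
import sys


def strip_plus_names(lines):
    """Yield FASTQ lines with the third line of each 4-line record reduced to '+'.

    Assumption: every record is exactly four lines long.
    """
    for i, line in enumerate(lines):
        # Index 2 within each group of four lines is the '+' separator line.
        if i % 4 == 2 and line.startswith('+'):
            yield '+\n'
        else:
            yield line


def convert(src_path, dst_path):
    """Copy src_path to dst_path, dropping the names after '+'."""
    with open(src_path) as src, open(dst_path, 'w') as dst:
        dst.writelines(strip_plus_names(src))


# Only runs when exactly two paths are given on the command line.
if __name__ == '__main__' and len(sys.argv) == 3:
    convert(sys.argv[1], sys.argv[2])
```

Saved under a name of your choice (for example strip_plus.py, a hypothetical name), it would be run as: python strip_plus.py original_fastq_file.fastq new_fastq_file.fastq. Because the separator is found by line position, a quality line that happens to begin with '+' is left untouched.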

[1] http://sourceforge.net/p/maq/mailman/maq-help/thread/4D9A9EFD.70104@cb.k.u-tokyo.ac.jp/


Wednesday, July 8, 2015

MAQ: "Inconsistent sequence name" in fastq2bfq