This tool takes one or more unaligned sequence files and removes exact duplicates. The dereplicated sequences are written to the file "all_seqs_derep.fasta" in the root of the result archive. In addition, an ID mapping and a sample mapping are returned. The ID mapping lists the unique sequences, one per line, each with all the IDs that share that sequence string. The sample mapping lists each sequence ID with the name of the sample (in this case, the name of the input file) the sequence came from. Any tool that takes a sample file uses this format.
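To make the three outputs described above concrete, here is a minimal Python sketch of exact-match dereplication. It is not the tool's actual implementation. The function names, the tab- and comma-separated layout of the two mapping files, and the choice of the first ID seen as the representative for each unique sequence are all assumptions made for illustration; the real file layouts may differ.

```python
import io


def read_fasta(handle):
    """Yield (seq_id, sequence) pairs from a FASTA file handle."""
    seq_id, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, "".join(chunks)
            fields = line[1:].split()
            # The ID is taken as the first word of the header line.
            seq_id, chunks = (fields[0] if fields else ""), []
        elif seq_id is not None:
            chunks.append(line)
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def dereplicate(samples):
    """Group IDs by exact sequence string.

    samples maps a sample name (here, a file name) to its (seq_id, sequence)
    records. Returns the IDs grouped by sequence, in first-seen order, and a
    list of (seq_id, sample_name) pairs.
    """
    ids_by_sequence = {}
    sample_of_id = []
    for sample_name, records in samples.items():
        for seq_id, sequence in records:
            # Exact string comparison: no case folding or gap stripping.
            ids_by_sequence.setdefault(sequence, []).append(seq_id)
            sample_of_id.append((seq_id, sample_name))
    return ids_by_sequence, sample_of_id


def format_derep_fasta(ids_by_sequence):
    """One FASTA record per unique sequence (assumed: first ID represents it)."""
    return "".join(f">{ids[0]}\n{seq}\n" for seq, ids in ids_by_sequence.items())


def format_id_mapping(ids_by_sequence):
    """One line per unique sequence (assumed layout: representative, tab, all IDs)."""
    return "".join(f"{ids[0]}\t{','.join(ids)}\n" for ids in ids_by_sequence.values())


def format_sample_mapping(sample_of_id):
    """One line per sequence ID (assumed layout: ID, tab, sample name)."""
    return "".join(f"{seq_id}\t{sample}\n" for seq_id, sample in sample_of_id)


if __name__ == "__main__":
    # Two hypothetical in-memory input files standing in for uploaded samples.
    inputs = {
        "a.fasta": io.StringIO(">s1\nACGT\n>s2\nGGCC\n"),
        "b.fasta": io.StringIO(">s3\nACGT\n"),
    }
    samples = {name: list(read_fasta(handle)) for name, handle in inputs.items()}
    ids_by_sequence, sample_of_id = dereplicate(samples)
    print(format_derep_fasta(ids_by_sequence), end="")
    print(format_id_mapping(ids_by_sequence), end="")
    print(format_sample_mapping(sample_of_id), end="")
```

In this sketch, s1 and s3 carry the same sequence string, so only one record for them appears in the dereplicated output, while the sample mapping still records that s1 came from a.fasta and s3 from b.fasta.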