Biomedical Computation

Fasta Subsample

Description:

Outputs a random subsample of the sequences in a FASTA-formatted file. The random seed is fixed by default, so the same subsample is produced on every run of the program unless a different seed is set.

Input:

Takes a FASTA file and the number of sequences to select at random.

Options:

  • -seed: the seed the random number generator uses to select the sequences; default: 1;
  • -rest: the name of a file to which the sequences not chosen for the output are written; default: none;
  • -off: the offset within each sequence at which printing starts; default: 1 (no offset);
  • -len: the maximum length to which printed sequences are limited; default: print the entire sequence.
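
As an illustration of how the offset and length options might interact, here is a minimal Python sketch. It is not the program's own code. The function name trim_sequence and its parameters are hypothetical, and it assumes the offset is 1-based (consistent with the default of 1 meaning "no offset") and that the length limit is applied after the offset.

```python
def trim_sequence(seq, off=1, length=None):
    """Return the part of seq that would be printed.

    Assumptions (not confirmed by the tool's documentation):
    - `off` is 1-based, so off=1 starts at the first residue;
    - `length` caps the number of residues kept after the offset;
    - length=None means "print the entire remaining sequence".
    """
    start = off - 1            # convert the 1-based offset to a 0-based index
    trimmed = seq[start:]      # drop everything before the offset
    if length is not None:
        trimmed = trimmed[:length]  # keep at most `length` residues
    return trimmed


if __name__ == "__main__":
    print(trim_sequence("ACGTACGT"))                   # defaults: whole sequence
    print(trim_sequence("ACGTACGT", off=3))            # start at the third residue
    print(trim_sequence("ACGTACGT", off=3, length=4))  # four residues from the third
```

Under these assumptions, a sequence shorter than the offset yields an empty string, and a length larger than the remaining sequence simply returns what is left.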

Output:

Writes a FASTA-formatted file to standard output containing the specified subsample of the original file. If -rest is specified, any leftover sequences are written to that file, which is useful for cross-validation.
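
The behaviour described above (a seeded random selection, with the leftover sequences kept separately) can be sketched in Python as follows. This is an illustrative approximation, not the program's implementation. The helper names read_fasta, subsample and write_fasta are hypothetical, and Python's random number generator will not reproduce the same selection as the actual tool for a given seed.

```python
import random
import sys


def read_fasta(lines):
    """Parse FASTA text lines into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def subsample(records, count, seed=1):
    """Split records into (picked, rest) using a seeded random choice.

    A fixed seed gives the same split on every run; input order is kept.
    """
    rng = random.Random(seed)
    chosen = set(rng.sample(range(len(records)), count))
    picked = [r for i, r in enumerate(records) if i in chosen]
    rest = [r for i, r in enumerate(records) if i not in chosen]
    return picked, rest


def write_fasta(records, out):
    """Write (header, sequence) pairs to a file-like object in FASTA format."""
    for header, seq in records:
        out.write(f">{header}\n{seq}\n")


if __name__ == "__main__":
    demo = [">s1", "ACGT", ">s2", "GGCC", "TTAA", ">s3", "AAAA", ">s4", "CCCC"]
    picked, rest = subsample(read_fasta(demo), count=2, seed=1)
    write_fasta(picked, sys.stdout)   # the subsample goes to standard output
    write_fasta(rest, sys.stderr)     # stand-in for the file named by -rest
```

For cross-validation, the picked records would serve as one partition (for example, a training set) and the rest as the held-out partition.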