jade.basic.sequence package¶

jade.basic.sequence.ClustalRunner module¶

class jade.basic.sequence.ClustalRunner.ClustalRunner(fasta_path, clustal_name='clustal_omega', clustal_dir=None)[source]¶

A very simple class wrapper to run clustal omega.

output_alignment(out_dir, out_name, parellel_process=False)[source]¶: Configure the command line and run Clustal Omega.

set_extra_options(extra_options='')[source]¶: Set any extra options as a string which will be added to the end of the command line.

set_fasta_path(fasta_path)[source]¶: Set the fasta path for alignment.

set_hard_wrap(hard_wrap)[source]¶: Set the number of characters per line before Clustal wraps the output. Usually 60-80.

set_output_format(output_format)[source]¶: Set the output format

set_threads(threads)[source]¶: Limit the number of threads for Clustal
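As an illustration of what the setters above feed into, here is a minimal sketch of assembling a Clustal Omega command line. It is not the ClustalRunner implementation: the function name `build_clustalo_command`, its parameters, and the `clustalo` executable name are assumptions made for this example. The flags used (`-i`, `-o`, `--outfmt`, `--threads`, `--wrap`, `--force`) are standard Clustal Omega options. The sketch only builds the argument list and does not run anything.

```python
import shlex


def build_clustalo_command(fasta_path, out_path, clustal_exe="clustalo",
                           output_format="clu", threads=None, hard_wrap=None,
                           extra_options=""):
    """Return the argument list for one Clustal Omega run (nothing is executed).

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    cmd = [clustal_exe, "-i", fasta_path, "-o", out_path,
           "--outfmt=" + output_format, "--force"]
    if threads is not None:
        # Limit the number of threads Clustal Omega may use.
        cmd.append("--threads=%d" % threads)
    if hard_wrap is not None:
        # Number of residues per line before the alignment output wraps.
        cmd.append("--wrap=%d" % hard_wrap)
    # Extra options are appended to the end of the command line.
    cmd.extend(shlex.split(extra_options))
    return cmd


cmd = build_clustalo_command("seqs.fasta", "seqs.clu", threads=4, hard_wrap=80)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the executable is on the PATH.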

jade.basic.sequence.PDBConsensusInfo module¶

class jade.basic.sequence.PDBConsensusInfo.PDBConsensusInfo(resinfo_list)[source]¶

Class to compute frequency and probability from an array of PDBInfo classes. The sequences within PDBInfo do not need to be the same length. Each sequence position is identified and stored in the data maps by its [pdb_num, chain, icode]; use get_position_from_residue(residue) to get this position from a Residue instance.

compute_stats()[source]¶: Compute frequency and probability (0-1) for each position for each amino acid

get_all_sorted_positions()[source]¶

get_consensus(residue)[source]¶

get_consensus_for_position(position)[source]¶

get_consensus_for_residues(residue_list)[source]¶: Get the consensus for an ORDERED list of Residues

get_consensus_sequence()[source]¶

get_frequency(residue, aa)[source]¶

get_frequency_for_position(position, aa)[source]¶

get_position_from_residue(residue)[source]¶

get_probability(residue, aa)[source]¶: Get probability of the current position (starting from 0) and aa

get_probability_for_position(position, aa)[source]¶

init_data_map()[source]¶: Sets all probabilities 0 and appends each map to the stats vector

output_seqlogo(outdir, outname, clustalpath=None)[source]¶

output_seqlogo_bt_residues(outdir, outname, res1, res2, chain)[source]¶

output_seqlogo_for_regions(regions, outdir, outname, chain)[source]¶: regions is an array of Regions instances, essentially start/stop points.

return_initialized_total_map()[source]¶

set_sequences(pdb_info_list)[source]¶: Set a sequence list
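The idea of keying statistics by [pdb_num, chain, icode] rather than by string index can be sketched as follows. This is an independent illustration, not the PDBConsensusInfo code: the function `consensus_by_position`, the tuple layout of the input, and the choice to normalize each probability by the number of observations at that position are all assumptions for this example.

```python
from collections import Counter, defaultdict


def consensus_by_position(structures):
    """Count amino acids per (pdb_num, chain, icode) position.

    `structures` is a list of residue lists; each residue is a hypothetical
    (pdb_num, chain, icode, one_letter_code) tuple. Structures may differ in
    length, because positions are matched by key rather than by index.
    """
    counts = defaultdict(Counter)
    for residues in structures:
        for pdb_num, chain, icode, aa in residues:
            counts[(pdb_num, chain, icode)][aa] += 1
    # Assumption: probabilities are normalized by the observations at each position.
    probabilities = {
        pos: {aa: n / sum(c.values()) for aa, n in c.items()}
        for pos, c in counts.items()
    }
    return counts, probabilities


structures = [
    [(100, "H", " ", "A"), (100, "H", "A", "G"), (101, "H", " ", "Y")],
    [(100, "H", " ", "A"), (101, "H", " ", "W")],  # no insertion at 100A
    [(100, "H", " ", "S"), (100, "H", "A", "G"), (101, "H", " ", "Y")],
]
counts, probabilities = consensus_by_position(structures)
sorted_positions = sorted(counts)
consensus = "".join(counts[pos].most_common(1)[0][0] for pos in sorted_positions)
```

Sorting the tuple keys places insertion-code residues (such as 100A) directly after their parent residue number, which is why a tuple key works for ordered output here.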

jade.basic.sequence.SequenceInfo module¶

class jade.basic.sequence.SequenceInfo.SequenceInfo[source]¶

Simple class for holding and accessing sequence metadata.

Original class for sequence info. Basically deprecated by SequenceStats and PDBConsensusInfo.

get_chain()[source]¶

get_end_residue()[source]¶

get_length()[source]¶

get_pdbID()[source]¶

get_pdbpath()[source]¶

get_region()[source]¶

get_residue(resnum)[source]¶: If a region is given, resnum is the PDB residue number. If not, resnum is the Rosetta residue number.

get_sequence()[source]¶

get_start_residue()[source]¶

set_pdbID(pdbID)[source]¶

set_pdbpath(pdbpath)[source]¶

set_region(region)[source]¶

set_sequence(sequence)[source]¶

jade.basic.sequence.SequenceResults module¶

class jade.basic.sequence.SequenceResults.SequenceResults[source]¶

Simple class for holding, calculating, and accessing result data. Residue numbers are in Rosetta numbering.

Original class for sequence stats. Basically deprecated by SequenceStats and PDBConsensusInfo.

add_reference_residue(resnum, one_letter_code)[source]¶

add_residue(resnum, one_letter_code, decoy)[source]¶

get_all_mutated_positions()[source]¶

get_all_reference_percent_observed()[source]¶: Returns an array of triplets of [position, one_letter_code, percent] for the reference amino acid found.

get_all_residue_numbers()[source]¶

get_all_residues_observed(resnum)[source]¶

get_decoys_with_aa(resnum, one_letter_code)[source]¶: Returns all decoys with a specific mutation at a position.

get_decoys_with_joint_aa(resnum_one_letter_code_pair)[source]¶: Will output decoys that have x, y, z mutations at positions a, b, c

get_freq(resnum, one_letter_code)[source]¶

get_percent(resnum, one_letter_code)[source]¶

get_percent_string(resnum, one_letter_code)[source]¶

get_reference_residue(resnum)[source]¶

get_total(resnum)[source]¶
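The bookkeeping behind per-position percentages and joint-mutation lookups can be sketched with plain dictionaries and sets. This is not the SequenceResults implementation: the storage layout (`observed[resnum][one_letter_code]` mapping to a set of decoy names) and the free functions below are assumptions chosen for this example.

```python
def add_residue(observed, resnum, one_letter_code, decoy):
    """Record that `decoy` has `one_letter_code` at `resnum`."""
    observed.setdefault(resnum, {}).setdefault(one_letter_code, set()).add(decoy)


def get_percent(observed, resnum, one_letter_code):
    """Percent of decoys carrying this amino acid at this position."""
    by_aa = observed.get(resnum, {})
    total = sum(len(decoys) for decoys in by_aa.values())
    if total == 0:
        return 0.0
    return 100.0 * len(by_aa.get(one_letter_code, ())) / total


def decoys_with_joint_aa(observed, pairs):
    """Decoys that match every (resnum, one_letter_code) pair at once."""
    sets = [observed.get(resnum, {}).get(aa, set()) for resnum, aa in pairs]
    return set.intersection(*sets) if sets else set()


observed = {}
designs = {"d1": {10: "A", 11: "K"}, "d2": {10: "A", 11: "R"}, "d3": {10: "G", 11: "K"}}
for decoy, residues in designs.items():
    for resnum, aa in residues.items():
        add_residue(observed, resnum, aa, decoy)

joint = decoys_with_joint_aa(observed, [(10, "A"), (11, "K")])
percent_a = get_percent(observed, 10, "A")
```

Storing decoy names as sets makes the joint query a set intersection, so asking for mutations at three or more positions costs no extra code.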

jade.basic.sequence.SequenceStats module¶

class jade.basic.sequence.SequenceStats.SequenceStats(sequence_list)[source]¶

Class for getting data from an array of strings of sequences (one letter code) of equal length.

compute_stats()[source]¶: Compute frequency and probability (0-1) for each position for each amino acid

get_consensus_sequence()[source]¶

get_frequency(position, aa)[source]¶

get_probability(position, aa)[source]¶: Get the probability of aa at the given position (positions start from 0).

init_data_map()[source]¶: Sets all probabilities 0 and appends each map to the stats vector

return_initialized_total_map()[source]¶

set_sequences(sequence_list)[source]¶: Set a sequence list
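For equal-length sequences, per-position frequency and probability reduce to counting each alignment column. The sketch below illustrates that calculation with `collections.Counter`. It is not the SequenceStats code, and the names `position_stats` and `consensus_sequence` are hypothetical.

```python
from collections import Counter


def position_stats(sequences):
    """Return (frequencies, probabilities), one entry per 0-based position."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(sequences)
    # zip(*sequences) walks the sequences column by column.
    frequencies = [Counter(column) for column in zip(*sequences)]
    probabilities = [{aa: count / n for aa, count in col.items()}
                     for col in frequencies]
    return frequencies, probabilities


def consensus_sequence(frequencies):
    """Most frequent amino acid at each position (ties: first one seen)."""
    return "".join(col.most_common(1)[0][0] for col in frequencies)


sequences = ["ACDE", "ACDF", "AGDF"]
frequencies, probabilities = position_stats(sequences)
print(frequencies[1]["C"], probabilities[1]["C"])
print(consensus_sequence(frequencies))
```

Amino acids never observed at a position are simply absent from these maps, whereas init_data_map() is documented as setting all probabilities to 0 up front; a caller of this sketch would use `probabilities[i].get(aa, 0.0)` to get the same effect.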

jade.basic.sequence.fasta module¶

jade.basic.sequence.fasta.chain_fasta_files_from_biostructure(structure, prefix, outdir)[source]¶

jade.basic.sequence.fasta.chain_fasta_files_from_pose(pose, prefix, outdir)[source]¶: Creates fasta for each chain in the pose. Returns a list of paths for each fasta.

jade.basic.sequence.fasta.chain_fasta_from_biostructure(structure, outname, outdir)[source]¶: Creates a single fasta from biopython structure, split by individual chains.

jade.basic.sequence.fasta.chain_fasta_from_pose(pose, outname, outdir)[source]¶: Creates a single fasta from pose, split by individual chains.

jade.basic.sequence.fasta.fasta_from_pose(pose, fasta_label, outname, outdir)[source]¶: Creates a fasta from the pose.

jade.basic.sequence.fasta.fasta_from_sequences(sequences, outdir, outname)[source]¶: Output a general fasta, with tag being 1_outname etc. Use write_fasta for more control. Returns path to Fasta File written

jade.basic.sequence.fasta.get_biochain_sequence(bio_chain)[source]¶

jade.basic.sequence.fasta.get_label_from_fasta(fasta_path)[source]¶: Gets the first chainID found - Should be a single chain fasta file.

jade.basic.sequence.fasta.get_sequence_from_fasta(fasta_path, label)[source]¶

jade.basic.sequence.fasta.output_fasta_from_pdbs_biopython(path_header_dict, out_path, native_path=None, native_label='native', is_camelid=False)[source]¶: Used only for L and H chains! Concatenates the L and H chains in order if present, otherwise assumes camelid at H.

jade.basic.sequence.fasta.output_weblogo(alignment_path, outdir, outname, tag='Dunbrack Lab - Antibody Database Team')[source]¶

jade.basic.sequence.fasta.output_weblogo_for_sequences(sequences, outdir, outname, tag='Dunbrack Lab - Antibody Database Team')[source]¶

jade.basic.sequence.fasta.read_header_data_from_fasta(fasta_path)[source]¶: Reads the > header lines from a fasta (PDBAA) and returns a defaultdict of pdb_chain: [method, residues, resolution, R factor]

jade.basic.sequence.fasta.split_fasta_from_fasta(fasta_path, prefix, outdir)[source]¶: If we have a multiple-sequence fasta, we split it into individual files. Makes analysis easier. Returns a list of paths, one for each fasta.

jade.basic.sequence.fasta.write_fasta(sequence, label, HANDLE)[source]¶: Writes a fasta with a sequence, chain, and open FILE handle. FULL Sequence on one line seems to be fine with HMMER3.
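Writing a record with the full sequence on one line, and reading a multi-record fasta back into individual records, can be sketched as below. These helpers are illustrations only and are not the functions in jade.basic.sequence.fasta: the names `write_record` and `read_records` and the example labels are assumptions.

```python
import io


def write_record(sequence, label, handle):
    """Write one fasta record to an open handle, full sequence on one line."""
    handle.write(">%s\n%s\n" % (label, sequence))


def read_records(handle):
    """Return a list of (label, sequence) pairs from a fasta handle."""
    records = []
    label, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if label is not None:
                records.append((label, "".join(chunks)))
            label, chunks = line[1:], []
        elif label is not None:
            # Sequence lines may be wrapped; join them back together.
            chunks.append(line)
    if label is not None:
        records.append((label, "".join(chunks)))
    return records


buffer = io.StringIO()
write_record("EVQLVESGG", "1abc_H", buffer)
write_record("DIQMTQSPS", "1abc_L", buffer)
buffer.seek(0)
records = read_records(buffer)
```

Splitting a multi-record file into one file per record then amounts to calling `write_record` once per entry of `records`, each with its own output handle.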