jade.basic.sequence package

jade.basic.sequence.ClustalRunner module

class jade.basic.sequence.ClustalRunner.ClustalRunner(fasta_path, clustal_name='clustal_omega', clustal_dir=None)[source]

A very simple wrapper class for running Clustal Omega.

output_alignment(out_dir, out_name, parellel_process=False)[source]

Configure the command line and run Clustal Omega.

set_extra_options(extra_options='')[source]

Set any extra options as a single string, which will be appended to the end of the command line.

set_fasta_path(fasta_path)[source]

Set the fasta path for alignment.

set_hard_wrap(hard_wrap)[source]

Set the number of characters per line before Clustal wraps the output. Usually 60-80.

set_output_format(output_format)[source]

Set the output format

set_threads(threads)[source]

Limit the number of threads that Clustal may use.
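The setters above each contribute one piece of a command line. A minimal sketch of how such a command could be assembled follows. It is not the jade implementation: build_clustalo_command is a hypothetical helper, the default output format "clu" is an assumption, and the flags used (-i, -o, --outfmt, --threads, --wrap, --force) are standard Clustal Omega options. The command is only built, not executed.

```python
import os
import shlex


def build_clustalo_command(fasta_path, out_dir, out_name,
                           clustal_name="clustal_omega", clustal_dir=None,
                           output_format="clu", threads=None, hard_wrap=None,
                           extra_options=""):
    """Return the argument list for a Clustal Omega run (hypothetical helper)."""
    # Use the bare executable name unless a directory is given.
    exe = os.path.join(clustal_dir, clustal_name) if clustal_dir else clustal_name
    cmd = [exe,
           "-i", fasta_path,
           "-o", os.path.join(out_dir, out_name),
           "--outfmt=" + output_format,
           "--force"]  # overwrite an existing output file
    if threads is not None:
        cmd.append("--threads=%d" % threads)
    if hard_wrap is not None:
        cmd.append("--wrap=%d" % hard_wrap)
    # Extra options go at the very end of the command line.
    cmd.extend(shlex.split(extra_options))
    return cmd


cmd = build_clustalo_command("seqs.fasta", "out", "aln.clu", threads=4, hard_wrap=80)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run once the executable is known to be installed.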

jade.basic.sequence.PDBConsensusInfo module

class jade.basic.sequence.PDBConsensusInfo.PDBConsensusInfo(resinfo_list)[source]

Class to compute frequencies and probabilities from an array of PDBInfo instances. The sequences within the PDBInfo instances do not need to be the same length. Each sequence position is identified, and stored in the data maps, by its [pdb_num, chain, icode]. Use get_position_from_residue(residue) to get this position from a Residue instance.
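Keying each position by PDB number, chain, and insertion code is what allows sequences of different lengths to be compared. A minimal sketch of the idea follows. The Residue namedtuple and position_key function are hypothetical stand-ins, not the jade classes, and a blank insertion code is assumed to be stored as a single space.

```python
from collections import Counter, defaultdict, namedtuple

# Hypothetical stand-in for a residue record; not the jade Residue class.
Residue = namedtuple("Residue", ["aa", "pdb_num", "chain", "icode"])


def position_key(residue):
    """Build a hashable key that identifies a sequence position."""
    return (residue.pdb_num, residue.chain, residue.icode)


# Two sequences of different length; the second has an insertion at 52A.
seq_a = [Residue("G", 51, "H", " "), Residue("S", 52, "H", " "),
         Residue("T", 53, "H", " ")]
seq_b = [Residue("G", 51, "H", " "), Residue("S", 52, "H", " "),
         Residue("N", 52, "H", "A"), Residue("Y", 53, "H", " ")]

# Count amino acids per position key rather than per list index.
counts = defaultdict(Counter)
for seq in (seq_a, seq_b):
    for res in seq:
        counts[position_key(res)][res.aa] += 1

for pos in sorted(counts):
    print(pos, dict(counts[pos]))
```

Because the insertion at 52A has its own key, residue 53 of both sequences still lands in the same bin even though its list index differs.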

compute_stats()[source]

Compute the frequency and probability (0-1) of each amino acid at each position.

get_all_sorted_positions()[source]
get_consensus(residue)[source]
get_consensus_for_position(position)[source]
get_consensus_for_residues(residue_list)[source]

Get the consensus for an ORDERED list of Residues

get_consensus_sequence()[source]
get_frequency(residue, aa)[source]
get_frequency_for_position(position, aa)[source]
get_position_from_residue(residue)[source]
get_probability(residue, aa)[source]

Get the probability of amino acid aa at the current position (positions start from 0).

get_probability_for_position(position, aa)[source]
init_data_map()[source]

Sets all probabilities to 0 and appends each map to the stats vector.

output_seqlogo_bt_residues(outdir, outname, res1, res2, chain)[source]
output_seqlogo_for_regions(regions, outdir, outname, chain)[source]

regions is an array of Regions instances, which are essentially start/stop points.

return_initialized_total_map()[source]
set_sequences(pdb_info_list)[source]

Set a sequence list

jade.basic.sequence.SequenceInfo module

class jade.basic.sequence.SequenceInfo.SequenceInfo[source]

Simple class for holding and accessing sequence metadata.

Original class for sequence info. Largely superseded by SequenceStats and PDBConsensusInfo.

get_chain()[source]
get_end_residue()[source]
get_length()[source]
get_pdbID()[source]
get_pdbpath()[source]
get_region()[source]
get_residue(resnum)[source]

If a region is given, resnum is the PDB residue number. If not, resnum is the Rosetta residue number.

get_sequence()[source]
get_start_residue()[source]
set_pdbID(pdbID)[source]
set_pdbpath(pdbpath)[source]
set_region(region)[source]
set_sequence(sequence)[source]

jade.basic.sequence.SequenceResults module

class jade.basic.sequence.SequenceResults.SequenceResults[source]

Simple class for holding, calculating, and accessing result data. Residue numbers are in Rosetta numbering.

Original class for sequence stats. Largely superseded by SequenceStats and PDBConsensusInfo.

add_reference_residue(resnum, one_letter_code)[source]
add_residue(resnum, one_letter_code, decoy)[source]
get_all_mutated_positions()[source]
get_all_reference_percent_observed()[source]

Returns an array of triplets [position, one_letter_code, percent], where percent is how often the reference amino acid was found at that position.
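The shape of that return value can be illustrated with a small standalone sketch. This is not the jade code: reference_percent_observed is a hypothetical function, the decoys are plain dictionaries mapping residue number to one-letter code, and the percent is assumed to be on a 0-100 scale.

```python
def reference_percent_observed(reference, decoys):
    """Return [resnum, reference_aa, percent] for every reference position.

    reference: dict of resnum -> one-letter code (assumed layout)
    decoys:    dict of decoy name -> dict of resnum -> one-letter code
    """
    triplets = []
    for resnum in sorted(reference):
        ref_aa = reference[resnum]
        # Collect what every decoy has at this position.
        observed = [seq[resnum] for seq in decoys.values() if resnum in seq]
        if observed:
            percent = 100.0 * observed.count(ref_aa) / len(observed)
        else:
            percent = 0.0
        triplets.append([resnum, ref_aa, percent])
    return triplets


reference = {10: "A", 11: "K"}
decoys = {
    "decoy_1": {10: "A", 11: "K"},
    "decoy_2": {10: "A", 11: "R"},
    "decoy_3": {10: "S", 11: "K"},
    "decoy_4": {10: "A", 11: "R"},
}
triplets = reference_percent_observed(reference, decoys)
print(triplets)  # [[10, 'A', 75.0], [11, 'K', 50.0]]
```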

get_all_residue_numbers()[source]
get_all_residues_observed(resnum)[source]
get_decoys_with_aa(resnum, one_letter_code)[source]

Returns all decoys with a specific mutation at a position.

get_decoys_with_joint_aa(resnum_one_letter_code_pair)[source]

Outputs the decoys that have mutations x, y, and z at positions a, b, and c, respectively.
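A joint query of this kind amounts to keeping only the decoys that satisfy every (position, amino acid) pair at once. The sketch below shows that filter on plain dictionaries. decoys_with_joint_aa is a hypothetical function, and the layout of resnum_aa_pairs as a list of (resnum, one_letter_code) tuples is an assumption, not the documented jade format.

```python
def decoys_with_joint_aa(decoys, resnum_aa_pairs):
    """Return sorted names of decoys that match all (resnum, aa) pairs."""
    return sorted(
        name for name, seq in decoys.items()
        if all(seq.get(resnum) == aa for resnum, aa in resnum_aa_pairs)
    )


decoys = {
    "decoy_1": {10: "A", 11: "K", 12: "T"},
    "decoy_2": {10: "A", 11: "R", 12: "T"},
    "decoy_3": {10: "S", 11: "K", 12: "T"},
    "decoy_4": {10: "A", 11: "R", 12: "V"},
}

# Decoys with A at 10 and R at 11.
two_site = decoys_with_joint_aa(decoys, [(10, "A"), (11, "R")])
# Adding a third constraint narrows the result further.
three_site = decoys_with_joint_aa(decoys, [(10, "A"), (11, "R"), (12, "T")])
print(two_site)    # ['decoy_2', 'decoy_4']
print(three_site)  # ['decoy_2']
```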

get_freq(resnum, one_letter_code)[source]
get_percent(resnum, one_letter_code)[source]
get_percent_string(resnum, one_letter_code)[source]
get_reference_residue(resnum)[source]
get_total(resnum)[source]

jade.basic.sequence.SequenceStats module

class jade.basic.sequence.SequenceStats.SequenceStats(sequence_list)[source]

Class for getting data from an array of equal-length sequence strings (one-letter code).

compute_stats()[source]

Compute the frequency and probability (0-1) of each amino acid at each position.

get_consensus_sequence()[source]
get_frequency(position, aa)[source]
get_probability(position, aa)[source]

Get the probability of amino acid aa at the given position (positions start from 0).
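For equal-length sequences, the per-position statistics reduce to counting the letters in each alignment column. The following is a minimal sketch of that calculation, not the jade implementation: compute_stats and consensus are illustrative names, and taking the most frequent letter as the consensus is an assumption about how ties and gaps are handled.

```python
from collections import Counter


def compute_stats(sequences):
    """Return (frequencies, probabilities), one entry per 0-based position."""
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    # zip(*sequences) yields one column of letters per position.
    frequencies = [Counter(column) for column in zip(*sequences)]
    total = float(len(sequences))
    probabilities = [{aa: n / total for aa, n in freq.items()}
                     for freq in frequencies]
    return frequencies, probabilities


def consensus(frequencies):
    """Pick the most frequent letter at each position."""
    return "".join(freq.most_common(1)[0][0] for freq in frequencies)


sequences = ["ACDE", "ACDF", "GCDE", "ACNE"]
freqs, probs = compute_stats(sequences)
print(freqs[0]["A"])     # 3
print(probs[0]["A"])     # 0.75
print(consensus(freqs))  # ACDE
```

Position 0 is the first column, matching the 0-based convention noted above.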

init_data_map()[source]

Sets all probabilities to 0 and appends each map to the stats vector.

return_initialized_total_map()[source]
set_sequences(sequence_list)[source]

Set a sequence list

jade.basic.sequence.fasta module

jade.basic.sequence.fasta.chain_fasta_files_from_biostructure(structure, prefix, outdir)[source]
jade.basic.sequence.fasta.chain_fasta_files_from_pose(pose, prefix, outdir)[source]

Creates a fasta file for each chain in the pose. Returns a list of paths, one per fasta file.

jade.basic.sequence.fasta.chain_fasta_from_biostructure(structure, outname, outdir)[source]

Creates a single fasta from a Biopython structure, split by individual chains.

jade.basic.sequence.fasta.chain_fasta_from_pose(pose, outname, outdir)[source]

Creates a single fasta from a pose, split by individual chains.

jade.basic.sequence.fasta.fasta_from_pose(pose, fasta_label, outname, outdir)[source]

Creates a fasta from the pose.

jade.basic.sequence.fasta.fasta_from_sequences(sequences, outdir, outname)[source]

Outputs a general fasta, with tags of the form 1_outname, etc. Use write_fasta for more control. Returns the path to the fasta file written.

jade.basic.sequence.fasta.get_biochain_sequence(bio_chain)[source]
jade.basic.sequence.fasta.get_label_from_fasta(fasta_path)[source]

Gets the first chain ID found. The input should be a single-chain fasta file.

jade.basic.sequence.fasta.get_sequence_from_fasta(fasta_path, label)[source]
jade.basic.sequence.fasta.output_fasta_from_pdbs_biopython(path_header_dict, out_path, native_path=None, native_label='native', is_camelid=False)[source]

Used only for L and H chains! Concatenates the L and H chains, in that order, if both are present; otherwise assumes a camelid antibody with only the H chain.
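The chain-ordering rule can be sketched in a few lines. This is only an illustration of the rule as described, not the jade function: combine_chains is a hypothetical helper operating on a dictionary of chain ID to sequence, and it ignores the native and is_camelid arguments of the real function.

```python
def combine_chains(chain_sequences):
    """Join L then H if both exist; otherwise return H alone."""
    heavy = chain_sequences["H"]
    light = chain_sequences.get("L")
    if light:
        return light + heavy
    # Heavy-chain-only case, e.g. a camelid antibody.
    return heavy


paired = {"L": "DIQMTQ", "H": "EVQLVE"}
camelid = {"H": "QVQLQE"}
print(combine_chains(paired))   # DIQMTQEVQLVE
print(combine_chains(camelid))  # QVQLQE
```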

jade.basic.sequence.fasta.output_weblogo_for_sequences(sequences, outdir, outname, tag='Dunbrack Lab - Antibody Database Team')[source]
jade.basic.sequence.fasta.read_header_data_from_fasta(fasta_path)[source]

Reads the > header lines from a fasta (PDBAA) and returns a defaultdict mapping pdb_chain to [method, residues, resolution, R factor].

jade.basic.sequence.fasta.split_fasta_from_fasta(fasta_path, prefix, outdir)[source]

Splits a fasta containing multiple sequences into individual files, which makes analysis easier. Returns a list of paths, one per fasta file.
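Splitting a multi-record fasta comes down to starting a new record at every > line and writing each record to its own file. A minimal standalone sketch follows. split_fasta is a hypothetical function, and the prefix_N.fasta naming scheme is an assumption; the names jade actually produces may differ.

```python
import os
import tempfile


def split_fasta(fasta_path, prefix, outdir):
    """Write each record of a multi-record fasta to its own file."""
    records = []  # list of (header, [sequence lines])
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                records.append((line[1:], []))
            elif records:
                records[-1][1].append(line)

    paths = []
    for index, (header, seq_lines) in enumerate(records, start=1):
        path = os.path.join(outdir, "%s_%d.fasta" % (prefix, index))
        with open(path, "w") as out:
            # Wrapped sequence lines are joined into a single line.
            out.write(">%s\n%s\n" % (header, "".join(seq_lines)))
        paths.append(path)
    return paths


workdir = tempfile.mkdtemp()
multi = os.path.join(workdir, "multi.fasta")
with open(multi, "w") as handle:
    handle.write(">chain_L\nDIQMTQ\nSPSSLS\n>chain_H\nEVQLVE\n")

paths = split_fasta(multi, "split", workdir)
print([os.path.basename(p) for p in paths])  # ['split_1.fasta', 'split_2.fasta']
```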

jade.basic.sequence.fasta.write_fasta(sequence, label, HANDLE)[source]

Writes a fasta record given a sequence, a label (chain), and an open file handle. The full sequence is written on one line, which seems to be fine with HMMER3.
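A record written this way is just a > header line followed by the unwrapped sequence. The sketch below re-creates that behaviour under the same argument order. It is a stand-in rather than the jade source, and it writes to an in-memory buffer instead of a file; the numbered labels mimic the 1_outname style of tag.

```python
import io


def write_fasta(sequence, label, handle):
    """Write one fasta record with the full sequence on a single line."""
    handle.write(">%s\n%s\n" % (label, sequence))


buffer = io.StringIO()  # any open, writable text handle works
for index, seq in enumerate(["EVQLVESGG", "DIQMTQSPS"], start=1):
    write_fasta(seq, "%d_example" % index, buffer)

text = buffer.getvalue()
print(text)
```

Passing the handle in, rather than a path, lets the caller append several records to the same file.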