jade.RAbD_BM package¶
jade.RAbD_BM.AnalysisInfo module¶
-
class
jade.RAbD_BM.AnalysisInfo.
AnalysisInfo
(json_path)[source]¶ Simple class that parses a JSON file defining the following (using relative paths):
- exp - The name of the experiment - whatever you want it to be.
- decoy_dir - the directory of the decoys.
- features_db - the db where the features reporters have been run.
The class stores this information and parses the benchmark info in the decoy dir, storing it as a BenchmarkInfo object. Benchmark classes and scripts take lists of these analysis files and use them to generate plots and data.
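As an illustration of the file format described above, the following is a minimal sketch of writing and reading such a JSON file. The helper load_analysis_json, the example values, and the choice to resolve relative paths against the directory containing the JSON file are all assumptions made for this sketch; they are not taken from the jade source.

```python
import json
import os
import tempfile


def load_analysis_json(json_path):
    """Read an analysis JSON file and resolve its relative paths.

    Hypothetical helper, not part of jade. Relative paths are resolved
    against the directory that contains the JSON file (an assumption).
    """
    with open(json_path) as handle:
        info = json.load(handle)

    required = ("exp", "decoy_dir", "features_db")
    missing = [key for key in required if key not in info]
    if missing:
        raise KeyError("analysis file is missing keys: " + ", ".join(missing))

    base_dir = os.path.dirname(os.path.abspath(json_path))
    return {
        "exp": info["exp"],
        "decoy_dir": os.path.normpath(os.path.join(base_dir, info["decoy_dir"])),
        "features_db": os.path.normpath(os.path.join(base_dir, info["features_db"])),
    }


# Write an example file with made-up values, then read it back.
example = {
    "exp": "my_experiment",
    "decoy_dir": "decoys/my_experiment",
    "features_db": "databases/my_experiment.db3",
}
workdir = tempfile.mkdtemp()
example_path = os.path.join(workdir, "my_experiment.json")
with open(example_path, "w") as handle:
    json.dump(example, handle, indent=2)

parsed = load_analysis_json(example_path)
print(parsed["exp"])
```

The real AnalysisInfo class additionally parses the benchmark info found in the decoy directory, which this sketch does not attempt.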
jade.RAbD_BM.AnalyzeRecovery module¶
-
class
jade.RAbD_BM.AnalyzeRecovery.
AnalyzeRecovery
(pyig_design_db_path, analysis_info, native_info, cdrs=None)[source]¶ Pools Recovery and RR data, outputs to DB
-
jade.RAbD_BM.AnalyzeRecovery.
calculate_exp_rr_and_recovery
(exp, result_df)[source]¶ Calculate the overall recovery and risk ratio.
Return type: pandas.DataFrame
-
jade.RAbD_BM.AnalyzeRecovery.
calculate_per_cdr_rr_and_recovery
(exp, cdrs, result_df)[source]¶ Calculate the recovery and risk ratios per CDR.
Return type: pandas.DataFrame
-
jade.RAbD_BM.AnalyzeRecovery.
calculate_recovery_and_risk_ratios
(top_recovery_df, observed_df)[source]¶ Calculate the Risk Ratio and Recovery Percent for each pdb/cdr given dataframes output by the calculators below.
Return a merged dataframe of the top recovery and observed, with the resulting risk ratio data.
Parameters: - top_recovery_df – pandas.DataFrame
- observed_df – pandas.DataFrame
Return type: pandas.DataFrame
jade.RAbD_BM.RunBenchmarksRAbD module¶
-
class
jade.RAbD_BM.RunBenchmarksRAbD.
RunBenchmarksRAbD
[source]¶ Bases:
jade.rosetta_jade.RunRosettaBenchmarks.RunRosettaBenchmarks
Benchmark class specifically for RAbD
Details:
ALL INPUT PDBs should go into
project_root/datasets
Typically, you will have multiple directories - native, relaxed, etc.
This is specified as a benchmark using ‘input_pdb_type’ in your json file.
ALL PDBLISTs for benchmarking should go into
project_root/datasets/pdblists
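The layout described above can be sketched as follows. The temporary directory stands in for a real project root, and the subdirectory names native and relaxed are just the examples named in the text; nothing here is created by jade itself.

```python
import tempfile
from pathlib import Path

# Stand-in for a real project root; a temporary directory keeps the sketch harmless.
project_root = Path(tempfile.mkdtemp())

# One directory per input PDB type, plus the directory for PDB lists.
for name in ("native", "relaxed", "pdblists"):
    (project_root / "datasets" / name).mkdir(parents=True, exist_ok=True)

# Collect the resulting layout relative to the project root.
layout = sorted(
    path.relative_to(project_root).as_posix()
    for path in (project_root / "datasets").iterdir()
)
print(layout)
```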
jade.RAbD_BM.benchmark_plotting module¶
-
class
jade.RAbD_BM.benchmark_plotting.
NativeCDRData
(datatype, native_path, data_table='cdr_metrics')[source]¶
jade.RAbD_BM.recovery_rr_tools module¶
-
jade.RAbD_BM.recovery_rr_tools.
calculate_geometric_means_rr
(df, x, y, hue=None)[source]¶ Example use:
rr_data_lengths = calculate_geometric_means_rr(df_all, x='cdr', y='length_rr', hue='exp')
rr_data_clusters = calculate_geometric_means_rr(df_all, x='cdr', y='cluster_rr', hue='exp')
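To make the idea of a grouped geometric mean concrete, here is a dependency-free sketch. It uses a list of dicts in place of a DataFrame, and the helper geometric_means_by_group is hypothetical; how jade handles zero or missing risk ratios is not documented here, so skipping non-positive values is an assumption of this sketch.

```python
import math
from collections import defaultdict


def geometric_means_by_group(rows, x, y, hue=None):
    """Return {(x_value, hue_value): geometric mean of y} for a list of dict rows.

    Rows with non-positive y are skipped because the logarithm is
    undefined there (an assumption of this sketch).
    """
    logs = defaultdict(list)
    for row in rows:
        if row[y] <= 0:
            continue
        key = (row[x], row[hue] if hue is not None else None)
        logs[key].append(math.log(row[y]))
    # The geometric mean is the exponential of the mean of the logarithms.
    return {key: math.exp(sum(vals) / len(vals)) for key, vals in logs.items()}


# Made-up rows that mimic the column names used in the example above.
rows = [
    {"cdr": "L1", "exp": "a", "length_rr": 2.0},
    {"cdr": "L1", "exp": "a", "length_rr": 8.0},
    {"cdr": "H3", "exp": "a", "length_rr": 1.0},
]
means = geometric_means_by_group(rows, x="cdr", y="length_rr", hue="exp")
```

A geometric mean is a natural choice for ratios because a risk ratio of 2 and one of 0.5 average to 1 rather than 1.25.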
-
jade.RAbD_BM.recovery_rr_tools.
calculate_rr_errors
(df_all_errors)[source]¶ Calculates the risk ratio errors for clusters and lengths using the error propagation equations calculated for the recovery itself. The result is the same for percentages as for raw data, because N cancels out in the equations. http://lectureonline.cl.msu.edu/~mmp/labs/error/e2.htm
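The cancellation of N can be checked with the standard error propagation rule for a quotient. The sketch below assumes the risk ratio is a ratio of two quantities with independent errors; whether jade uses exactly this form is not confirmed by the text, and the function ratio_error and the numbers are made up for illustration.

```python
import math


def ratio_error(numerator, numerator_err, denominator, denominator_err):
    """Propagate independent errors through ratio = numerator / denominator."""
    ratio = numerator / denominator
    # Relative errors of a quotient add in quadrature.
    relative = math.sqrt(
        (numerator_err / numerator) ** 2 + (denominator_err / denominator) ** 2
    )
    return ratio, abs(ratio) * relative


# Raw counts with their errors (made-up numbers).
rr_raw, err_raw = ratio_error(30.0, 3.0, 10.0, 2.0)

# The same quantities as fractions of N = 100: values and errors scale together,
# so the relative errors, and therefore the propagated error, are unchanged.
n = 100.0
rr_frac, err_frac = ratio_error(30.0 / n, 3.0 / n, 10.0 / n, 2.0 / n)
```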
-
jade.RAbD_BM.recovery_rr_tools.
calculate_set_errorbars_hist
(ax, data, x, y, binomial_distro=True, total_column='total_entries', y_freq_column=None, x_order=None, hue_order=None, hue=None, caps=False, color='k', linewidth=0.75, base_columnwidth=0.8, full=True)[source]¶ Calculates the standard deviation of the data and sets error bars for a histogram. The default base_columnwidth for seaborn plots is 0.8.
Optionally give x_order and/or hue_order for the ordering of the columns. Make sure to pass this while plotting.
- Notes:
- If Hue is enabled, this base is divided by the number of hue_names for the final width used for plotting.
- Caps are the horizontal lines at the ends of the error bar.
- ‘full’ means error bars on both vertical sides of the histogram bar.
- Warning:
- linewidth of .5 does not show up in all PDFs for all bars.
-
jade.RAbD_BM.recovery_rr_tools.
calculate_set_errorbars_scatter
(ax, data, x, y, binomial_distro=False, total_column='total_entries', caps=False, color='k', lw=1.5)[source]¶ (Untested) - Calculates the standard deviation of the data, sets error bars for a typical scatter plot
-
jade.RAbD_BM.recovery_rr_tools.
calculate_stddev_binomial_distribution2
(df, x, y, total_column, y_mean_column, hue=None, percent=True)[source]¶ Calculates standard deviations for a binomial distribution and returns them as a dataframe. If percent=True, we divide by the total to normalize the standard deviation. SD of ‘mean’ = SQRT(n*p*q), where p is the probability of success and q is the probability of failure.
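The formula above can be written out for a single group as follows. This is a minimal sketch for scalar inputs; the function name binomial_stddev and the example counts are illustrative, and the real function operates on dataframe columns rather than single numbers.

```python
import math


def binomial_stddev(successes, total, percent=True):
    """Standard deviation sqrt(n*p*q) of a binomial count.

    With percent=True the result is divided by n, giving the standard
    deviation of the success fraction, sqrt(p*q/n).
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total  # probability of success
    q = 1.0 - p            # probability of failure
    sd = math.sqrt(total * p * q)
    return sd / total if percent else sd


# 25 successes out of 100 trials (made-up numbers).
sd_count = binomial_stddev(25, 100, percent=False)
sd_fraction = binomial_stddev(25, 100, percent=True)
```

Dividing by the total expresses the standard deviation as a fraction of the entries; multiply by 100 if the plotted values are on a 0 to 100 scale.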
-
jade.RAbD_BM.recovery_rr_tools.
load_precomputed_recoveries
(db_path='data/all_recovery_and_risk_ratio_data.db', table='full_data')[source]¶ Reads recovery data from a database created via script.
Return type: pandas.DataFrame
-
jade.RAbD_BM.recovery_rr_tools.
order_by_row_group
(df, column, groups)[source]¶ Order a dataframe by groups. Return the dataframe. Probably a better way to do this already, but I don’t know what it is.
-
jade.RAbD_BM.recovery_rr_tools.
remove_pdb_and_cdr
(df, pdbid, cdr)[source]¶ Removes a particular pdbid and cdr from the db. Returns the new df.
-
jade.RAbD_BM.recovery_rr_tools.
set_errorbars_bar
(ax, data, x, y, error_dfs, x_order=None, hue_order=None, hue=None, caps=False, color='k', linewidth=0.75, base_columnwidth=0.8, full=True)[source]¶ Sets error bars for a bar chart.
Default base_columnwidth for seaborn plots is .8
Optionally give x_order and/or hue_order for the ordering of the columns. Make sure to pass this while plotting.
- Notes:
- If Hue is enabled, this base is divided by the number of hue_names for the final width used for plotting.
- Caps are the horizontal lines at the ends of the error bar.
- ‘full’ means error bars on both vertical sides of the histogram bar.
- Warning:
- linewidth of .5 does not show up in all PDFs for all bars.
-
jade.RAbD_BM.recovery_rr_tools.
set_errorbars_bar_rr
(ax, data, x, y, error_dfs, x_order=None, hue_order=None, hue=None, caps=False, color='k', linewidth=0.75, base_columnwidth=0.8, full=True)[source]¶ Sets error bars for a bar chart.
Default base_columnwidth for seaborn plots is .8
Optionally give x_order and/or hue_order for the ordering of the columns. Make sure to pass this while plotting.
- Notes:
- If Hue is enabled, this base is divided by the number of hue_names for the final width used for plotting.
- Caps are the horizontal lines at the ends of the error bar.
- ‘full’ means error bars on both vertical sides of the histogram bar.
- Warning:
- linewidth of .5 does not show up in all PDFs for all bars.
jade.RAbD_BM.tools module¶
jade.RAbD_BM.tools_ab_db module¶
-
jade.RAbD_BM.tools_ab_db.
get_all_clusters_for_length
(db, cdr, length, limit_to_known=True, res_cutoff=2.8, rfac_cutoff=0.3)[source]¶ Get all unique clusters for a length and a cdr
-
jade.RAbD_BM.tools_ab_db.
get_all_lengths
(db, cdr, limit_to_known=True, res_cutoff=2.8, rfac_cutoff=0.3)[source]¶ Get all unique lengths for a CDR
-
jade.RAbD_BM.tools_ab_db.
get_cdr_data_table_df
(db_path)[source]¶ Get a dataframe with typical info from the cdr_data table in the PyIgClassify db.
Parameters: db_con – sqlite3.con
Return type: pandas.DataFrame
-
jade.RAbD_BM.tools_ab_db.
get_cdr_rmsd_for_entry
(db, pdb, original_chain, cdr, length, fullcluster)[source]¶
-
jade.RAbD_BM.tools_ab_db.
get_center_dih_degrees_for_cluster_and_length
(db, cdr, length, cluster)[source]¶ Returns a dictionary of center dihedral angles in positional order, or False if not found.
result["phis"] = [phis as floats]
result["psis"] = [psis as floats]
result["omegas"] = [omegas as floats]
-
jade.RAbD_BM.tools_ab_db.
get_center_for_cluster_and_length
(db, cdr, length, cluster, data_names_array)[source]¶
-
jade.RAbD_BM.tools_ab_db.
get_cluster_enrichment
(df, gene, cdr, cluster)[source]¶ Get the number of matches in the df and pdbid to the cdr and cluster
Parameters: df – pandas.DataFrame
Return type: int
-
jade.RAbD_BM.tools_ab_db.
get_cluster_matches
(df, gene, cdr, cluster)[source]¶ Get a dataframe of the matching (“Recovered”) rows (DataFrame).
Parameters: df – pandas.DataFrame
Return type: pandas.DataFrame
-
jade.RAbD_BM.tools_ab_db.
get_data_for_cluster_and_length
(db, cdr, length, cluster, data_names_array, limit_to_known=True, res_cutoff=2.8, rfac_cutoff=0.3)[source]¶ Get a set of data for a particular length, cdr, and cluster. data_names_array is a list of the types of data to return; it can include the DISTINCT keyword.
Example: data_names_array = ["PDB", "original_chain", "new_chain", "sequence"]
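One way a list like data_names_array can drive a query, including the DISTINCT keyword, is sketched below using an in-memory SQLite database. The helper select_cluster_data, the filter column names (CDR, length, fullcluster), the table schema, and the rows are assumptions made for this sketch; the real function also applies the limit_to_known, res_cutoff, and rfac_cutoff filters, which are omitted here.

```python
import sqlite3


def select_cluster_data(con, cdr, length, cluster, data_names_array):
    """Select the requested columns for one CDR, length and cluster.

    The table name (cdr_data) and the filter columns (CDR, length,
    fullcluster) are assumptions of this sketch. Column names cannot be
    bound as SQL parameters, so data_names_array must come from trusted code.
    """
    columns = ", ".join(data_names_array)
    query = (
        "SELECT {} FROM cdr_data "
        "WHERE CDR = ? AND length = ? AND fullcluster = ?"
    ).format(columns)
    return con.execute(query, (cdr, length, cluster)).fetchall()


# Tiny in-memory database with made-up rows.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE cdr_data (PDB TEXT, original_chain TEXT, "
    "CDR TEXT, length INTEGER, fullcluster TEXT)"
)
con.executemany(
    "INSERT INTO cdr_data VALUES (?, ?, ?, ?, ?)",
    [
        ("1abc", "H", "H1", 13, "H1-13-1"),
        ("1abc", "I", "H1", 13, "H1-13-1"),
        ("2xyz", "H", "H1", 13, "H1-13-2"),
    ],
)

all_rows = select_cluster_data(con, "H1", 13, "H1-13-1", ["PDB", "original_chain"])
unique_pdbs = select_cluster_data(con, "H1", 13, "H1-13-1", ["DISTINCT PDB"])
```

Because DISTINCT is passed as part of the first entry, it applies to the whole selected column list, as in ordinary SQL.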
-
jade.RAbD_BM.tools_ab_db.
get_length_enrichment
(df, gene, cdr, length)[source]¶ Get the number of matches in the df and pdbid to the cdr and length
Parameters: - df – pandas.DataFrame
- length – int
Return type: int
-
jade.RAbD_BM.tools_ab_db.
get_length_matches
(df, gene, cdr, length)[source]¶ Get a dataframe of the matching (“Recovered”) rows (DataFrame).
Parameters: - df – pandas.DataFrame
- length – int
Return type: pandas.DataFrame
-
jade.RAbD_BM.tools_ab_db.
get_pdb_chain_subset
(db, gene)[source]¶ Return a list of tuples of [pdb, chain] of the particular gene
-
jade.RAbD_BM.tools_ab_db.
get_stem_rmsd_for_entry
(db, pdb, original_chain, cdr, length, fullcluster)[source]¶
jade.RAbD_BM.tools_features_db module¶
-
jade.RAbD_BM.tools_features_db.
get_all_entries
(df, pdbid, cdr)[source]¶ Get all entries of a given PDBID and CDR.
Parameters: df – pandas.DataFrame
Return type: pandas.DataFrame
-
jade.RAbD_BM.tools_features_db.
get_cdr_cluster_df
(db_path)[source]¶ Get a dataframe with typical cluster info in it, which was generated by the features reporter framework.
Parameters: db_con – sqlite3.con
Return type: pandas.DataFrame
-
jade.RAbD_BM.tools_features_db.
get_cluster
(df, pdbid, cdr)[source]¶ Get the fullcluster from the dataframe for native or experimental data
Parameters: df – pandas.DataFrame
Return type: str
-
jade.RAbD_BM.tools_features_db.
get_cluster_matches
(df, pdbid, cdr, cluster)[source]¶ Get a dataframe of the matching (“Recovered”) rows (DataFrame).
Parameters: df – pandas.DataFrame
Return type: pandas.DataFrame
-
jade.RAbD_BM.tools_features_db.
get_cluster_recovery
(df, pdbid, cdr, cluster)[source]¶ Get the number of matches in the df and pdbid to the cdr and cluster
Parameters: df – pandas.DataFrame
Return type: int
-
jade.RAbD_BM.tools_features_db.
get_length
(df, pdbid, cdr)[source]¶ Get the length from the dataframe for native or experimental data
Parameters: df – pandas.DataFrame
Return type: int
-
jade.RAbD_BM.tools_features_db.
get_length_matches
(df, pdbid, cdr, length)[source]¶ Get a dataframe of the matching (“Recovered”) rows (DataFrame).
Parameters: - df – pandas.DataFrame
- length – int
Return type: pandas.DataFrame