jade.basic.pandas package
jade.basic.pandas.PandasDataFrame module
-
class
jade.basic.pandas.PandasDataFrame.
GeneralPandasDataFrame
(data=None, index=None, columns=None, dtype=None, copy=False)
Bases: pandas.core.frame.DataFrame
-
get_matches
(column, to_match) Get all the rows that match a particular element of a column.
Parameters: - column – str
- to_match – str
Return type: pandas.DataFrame
-
get_row_matches
(column1, to_match, column2) Get the column2 elements of the rows whose column1 value matches to_match. If the result has a single element, it can be converted to a single value easily enough.
Parameters: - column1 – str
- to_match – str
- column2 – str
Return type: pandas.Series
-
n_matches
(column, to_match) Return the number of matches.
Parameters: - column – str
- to_match – str
Return type: int
-
to_tsv
(path_or_buf=None, na_rep='', float_format=None, columns=None, header=True, index=True, index_label=None, mode='w', encoding=None, compression=None, quoting=None, quotechar='"', line_terminator='\n', chunksize=None, tupleize_cols=False, date_format=None, doublequote=True, escapechar=None, decimal='.')
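The to_tsv signature mirrors pandas' DataFrame.to_csv minus the sep argument, so it presumably writes tab-separated output. The source does not document this, so the following is only a sketch of that assumption using plain pandas, not jade's implementation.

```python
import io

import pandas as pd

df = pd.DataFrame({"exp": ["a", "b"], "score": [1.5, 2.0]})

# Assumption: to_tsv behaves like DataFrame.to_csv with a tab separator.
buffer = io.StringIO()
df.to_csv(buffer, sep="\t", index=False)
tsv_text = buffer.getvalue()
```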
-
-
jade.basic.pandas.PandasDataFrame.
detect_numeric
(df) Detect numeric components
Parameters: df – pd.DataFrame
Return type: pd.DataFrame
-
jade.basic.pandas.PandasDataFrame.
drop_duplicate_columns
(df) Drop duplicate columns from the DataFrame and return the result.
Parameters: df – pandas.DataFrame
Return type: pandas.DataFrame
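A minimal sketch of dropping duplicate columns with plain pandas. It assumes "duplicate" means a repeated column name and that the first occurrence is kept; the source does not say how jade decides this.

```python
import pandas as pd

# Two columns share the name "a"; pandas allows this.
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["a", "b", "a"])

# Assumption: a duplicate is a repeated column name; keep the first occurrence.
deduplicated = df.loc[:, ~df.columns.duplicated()]
```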
-
jade.basic.pandas.PandasDataFrame.
get_columns
(df, columns) Get a new dataframe of only the columns
Parameters: - df – pandas.DataFrame
- columns – list
Return type: pd.DataFrame
-
jade.basic.pandas.PandasDataFrame.
get_match_by_array
(df, column, match_array) Get a new dataframe containing the rows of df whose column value appears in the series match_array.
- Note: The result is a dataframe, but there may be unexpected issues when plotting the data in seaborn. The cause is unknown.
Parameters: - df – pd.DataFrame
- column – str
- match_array – pd.Series
Return type: pd.DataFrame
-
jade.basic.pandas.PandasDataFrame.
get_matches
(df, column, to_match) Get all the rows that match a particular element of a column.
Parameters: - df – pandas.DataFrame
- column – str
- to_match – str
Return type: pd.DataFrame
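Row matching of this kind is usually done with a boolean mask. The sketch below uses plain pandas and made-up data; it assumes an exact equality test, which the source does not confirm.

```python
import pandas as pd

df = pd.DataFrame({"cdr": ["H1", "H2", "H2"], "length": [10, 12, 14]})

# Boolean mask: True for rows where the column equals the value to match.
matches = df[df["cdr"] == "H2"]
```

The original row labels are preserved in the result, so call reset_index(drop=True) if a fresh 0-based index is needed.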
-
jade.basic.pandas.PandasDataFrame.
get_multiple_matches
(df, column, to_match_array) Get all the rows that match any of the values in to_match_array.
Parameters: - df – pandas.DataFrame
- column – str
- to_match_array – list
Return type: pd.DataFrame
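Matching against several values can be sketched with Series.isin. This is plain pandas on made-up data, shown as one plausible way to get the described result rather than jade's own code.

```python
import pandas as pd

df = pd.DataFrame({"cdr": ["H1", "H2", "L1", "L3"], "length": [10, 12, 11, 9]})

to_match_array = ["H1", "L3"]

# isin marks every row whose value is one of the requested values.
subset = df[df["cdr"].isin(to_match_array)]
```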
-
jade.basic.pandas.PandasDataFrame.
get_n_matches
(df, column, to_match) Get the number of matches.
Parameters: - df – pd.DataFrame
- column – str
- to_match
Return type: int
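Counting matches can be sketched by summing a boolean mask, since True counts as 1. The data is made up and the equality test is an assumption.

```python
import pandas as pd

df = pd.DataFrame({"cdr": ["H1", "H2", "H2"]})

# Sum of the mask is the number of matching rows; cast to a plain int.
n = int((df["cdr"] == "H2").sum())
```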
-
jade.basic.pandas.PandasDataFrame.
get_row_matches
(df, column1, to_match, column2) Get the column2 elements of the rows whose column1 value matches to_match. If the result has a single element, it can be converted to a single value easily enough.
Parameters: - df – pd.DataFrame
- column1 – str
- to_match – str
- column2 – str
Return type: pd.Series
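Selecting one column from the matching rows can be sketched with DataFrame.loc. The example assumes column1 is compared by equality and column2 is the column returned, which is an interpretation of the signature.

```python
import pandas as pd

df = pd.DataFrame({"cdr": ["H1", "H2", "H2"], "length": [10, 12, 14]})

# Rows where "cdr" equals "H2", restricted to the "length" column (a Series).
lengths = df.loc[df["cdr"] == "H2", "length"]

# A one-element result can be unwrapped into a single value.
single = df.loc[df["cdr"] == "H1", "length"]
single_value = single.iloc[0]
```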
-
jade.basic.pandas.PandasDataFrame.
get_value
(df, column) Get a single value from a one-row df. This exists to keep calling code self-explanatory, since the iloc syntax is unintuitive.
Parameters: - df – pd.DataFrame
- column – str
Returns: value
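For reference, the iloc idiom that such a helper would wrap looks roughly like this. It is a sketch on made-up data, not jade's implementation.

```python
import pandas as pd

df = pd.DataFrame({"cdr": ["H1", "H2"], "length": [10, 12]})

# Filter down to a single row, then read one cell from it.
one_row = df[df["cdr"] == "H2"]
value = one_row["length"].iloc[0]
```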
-
jade.basic.pandas.PandasDataFrame.
multi_tab_excel
(df_list, sheet_list, file_name) Writes multiple dataframes as separate sheets in an output excel file.
If directory of output does not exist, it will create it.
Author: Tom Dobbs http://stackoverflow.com/questions/32957441/putting-many-python-pandas-dataframes-to-one-excel-worksheet
Parameters: - df_list – [pd.DataFrame]
- sheet_list – [str]
- file_name – str
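The behavior described above can be sketched with pandas.ExcelWriter. The helper name write_sheets is hypothetical, and writing requires an Excel engine such as openpyxl or xlsxwriter to be installed.

```python
import os

import pandas as pd


def write_sheets(df_list, sheet_list, file_name):
    """Hypothetical helper: write each DataFrame to its own sheet."""
    out_dir = os.path.dirname(file_name)
    if out_dir:
        # Create the output directory if it does not exist yet.
        os.makedirs(out_dir, exist_ok=True)
    # ExcelWriter needs an installed engine (openpyxl or xlsxwriter).
    with pd.ExcelWriter(file_name) as writer:
        for frame, sheet in zip(df_list, sheet_list):
            frame.to_excel(writer, sheet_name=sheet, index=False)
```

A call would look like write_sheets([df_a, df_b], ["first", "second"], "out/report.xlsx"), where df_a and df_b are any two DataFrames.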
jade.basic.pandas.stats module
-
jade.basic.pandas.stats.
calculate_stddev
(df, x, y, hue=None) Calculates standard deviations for a normal distribution (numerical data) over X and Hue categories.
If hue is given, the hue column is added, and the overall rows are labeled 'ALL'.
Example DataFrame output (x='exp', y='length_recovery_freq', hue='cdr'):
    SD        cdr  exp                   y
20  6.739596  H2   ALL                   length_recovery_freq
21  7.373650  H2   min.remove_antigen-F  length_recovery_freq
22  6.400637  ALL  min.remove_antigen-T  length_recovery_freq
Parameters: - df – pandas.DataFrame
- x – str
- y – str
- hue – str
Return type: pandas.DataFrame
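A grouped standard deviation with an extra 'ALL' row can be sketched as follows. This uses plain pandas on made-up data and groups on x only. It assumes the sample standard deviation (pandas' default, ddof=1), and the source does not say which estimator jade uses.

```python
import pandas as pd

df = pd.DataFrame({
    "exp": ["A", "A", "B", "B"],
    "length_recovery_freq": [10.0, 20.0, 30.0, 50.0],
})
x, y = "exp", "length_recovery_freq"

# Standard deviation of y within each x category.
per_x = df.groupby(x)[y].std().rename("SD").reset_index()

# Standard deviation over the whole column, labeled 'ALL'.
overall = pd.DataFrame({x: ["ALL"], "SD": [df[y].std()]})

sd_table = pd.concat([per_x, overall], ignore_index=True)
sd_table["y"] = y  # record which column the SD refers to
```

With a hue column, the same idea would group on [x, hue] instead of x alone.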
-
jade.basic.pandas.stats.
calculate_stddev_binomial_distribution
(df, x, y, total_column, y_mean_column, hue=None) Calculates standard deviations for a binomial distribution (such as experiment True/False values) over X and Hue categories.
Typically used for bar plots.
If hue is given, the hue column is added; the output then contains the overall rows, labeled 'ALL', plus the rows for each hue value.
Example DataFrame output (x='exp', y='length_recovery_freq', hue='cdr'):
    SD        cdr  exp                   y
20  6.739596  H2   ALL                   length_recovery_freq
21  7.373650  H2   min.remove_antigen-F  length_recovery_freq
22  6.400637  ALL  min.remove_antigen-T  length_recovery_freq
Parameters: - df – pandas.DataFrame
- x – str
- y – str
- total_column – str
- hue – str
Return type: pandas.DataFrame
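For orientation, the textbook binomial standard deviations are shown below. Which of the two jade reports, and how total_column and y_mean_column feed into it, is not stated in the source. The function names here are hypothetical, with n standing for the number of trials and p for the success fraction.

```python
import math


def binomial_sd(p, n):
    """SD of the success count over n trials with success probability p."""
    return math.sqrt(n * p * (1.0 - p))


def proportion_sd(p, n):
    """SD (standard error) of the observed success fraction over n trials."""
    return math.sqrt(p * (1.0 - p) / n)
```

For example, 100 trials with p = 0.5 give a count SD of 5 and a fraction SD of 0.05.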