mathematical.data_frames
Mathematical operations for Data Frames.
Data:
- ColumnLabelList – Type hint for the column_label_list parameter in the df_*() functions.
Functions:
- df_count() – Count the number of occurrences of a non-NaN value in the specified columns of a data frame.
- df_data_points() – Compile the values for the specified columns in each row into a list.
- df_delta() – Calculate the difference between values in the two columns for each row of a data frame.
- df_delta_relative() – Calculate the relative difference between values in the two columns for each row of a data frame.
- df_log() – Calculate the logarithm of the values in each row for the specified columns of a data frame.
- df_log_stdev() – Calculate the standard deviation of the log10 values in each row for the specified columns of a data frame.
- df_mean() – Calculate the mean of each row for the specified columns of a data frame.
- df_median() – Calculate the median of each row for the specified columns of a data frame.
- df_outliers() – Identify outliers in each row.
- df_percentage() – Returns the value of the specified column as a percentage of the given total.
- df_stdev() – Calculate the standard deviation of each row for the specified columns of a data frame.
- set_display_options() – Set the display options for numpy and pandas.
-
ColumnLabelList
Type hint for the column_label_list parameter in the df_*() functions.
-
df_count(row, column_label_list=None)
Count the number of occurrences of a non-NaN value in the specified columns of a data frame.
Do not call this function directly; use it with df.apply() instead:

    data_frame["Count"] = data_frame.apply(
        func=df_count,
        args=[["Bob", "Alice"]],
        axis=1,
    )
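To make the counting rule concrete, here is a minimal sketch of a row-wise non-NaN count. `count_non_nan` is a hypothetical stand-in written for this illustration, not the library's implementation, and the row is modelled as a plain dict (a pandas Series row supports the same `row[label]` lookup).

```python
import math


def count_non_nan(row, column_label_list):
    """Count the values under the given labels that are not NaN.

    Hypothetical stand-in for illustration only.
    """
    return sum(
        1
        for label in column_label_list
        if not math.isnan(row[label])
    )


# One "row" of a data frame, modelled as a dict for illustration.
example_row = {"Bob": 1.5, "Alice": float("nan"), "Carol": 3.0}
non_nan_count = count_non_nan(example_row, ["Bob", "Alice", "Carol"])
print(non_nan_count)
```

Only "Bob" and "Carol" hold real numbers in this row, so those are the two values counted.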
-
df_data_points(row, column_label_list)
Compile the values for the specified columns in each row into a list.
Do not call this function directly; use it with df.apply() instead:

    data_frame["Data Points"] = data_frame.apply(
        func=df_data_points,
        args=[["Bob", "Alice"]],
        axis=1,
    )
-
df_delta(row, left_column, right_column)
Calculate the difference between values in the two columns for each row of a data frame.
Do not call this function directly; use it with df.apply() instead:

    data_frame["Delta"] = data_frame.apply(
        func=df_delta,
        args=["Bob", "Alice"],
        axis=1,
    )
- Parameters
- Return type
- Returns
The difference between left_column and right_column.
New in version 0.4.0.
-
df_delta_relative(row, left_column, right_column)
Calculate the relative difference between values in the two columns for each row of a data frame: (left - right) / right
Do not call this function directly; use it with df.apply() instead:

    data_frame["Rel. Delta"] = data_frame.apply(
        func=df_delta_relative,
        args=["Bob", "Alice"],
        axis=1,
    )
- Parameters
- Return type
- Returns
The relative difference between left_column and right_column.
New in version 0.4.0.
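As a worked example of the two formulas, the sketch below computes the plain difference and the relative difference, (left - right) / right, for a single row. The helpers `delta` and `delta_relative` are hypothetical stand-ins for illustration, not the library's code, and the row is modelled as a dict.

```python
def delta(row, left_column, right_column):
    """Difference between the left and right values (hypothetical stand-in)."""
    return row[left_column] - row[right_column]


def delta_relative(row, left_column, right_column):
    """Relative difference: (left - right) / right (hypothetical stand-in)."""
    return (row[left_column] - row[right_column]) / row[right_column]


# One "row" of a data frame, modelled as a dict for illustration.
example_row = {"Bob": 12.0, "Alice": 10.0}
absolute = delta(example_row, "Bob", "Alice")           # 12.0 - 10.0
relative = delta_relative(example_row, "Bob", "Alice")  # (12.0 - 10.0) / 10.0
print(absolute, relative)
```

In this sketch a right-hand value of zero raises ZeroDivisionError; the documentation above does not say how the library treats that case.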
-
df_log(row, column_label_list, base=10)
Calculate the logarithm of the values in each row for the specified columns of a data frame.
Do not call this function directly; use it with df.apply() instead:

    data_frame["Bob Log10"] = data_frame.apply(
        func=df_log,
        args=[["Bob"], 10],
        axis=1,
    )
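The effect of the base argument can be sketched with the standard library alone. `log_of_column` is a hypothetical helper for illustration; unlike df_log, it takes a single column label rather than a list, because the documentation above does not say how multiple labels are combined.

```python
import math


def log_of_column(row, column_label, base=10):
    """Logarithm of one value in the row, in the given base (hypothetical stand-in)."""
    return math.log(row[column_label], base)


# One "row" of a data frame, modelled as a dict for illustration.
example_row = {"Bob": 1000.0}
bob_log10 = log_of_column(example_row, "Bob")            # base 10 by default
bob_log2 = log_of_column({"Bob": 8.0}, "Bob", base=2)    # explicit base 2
print(bob_log10, bob_log2)
```

Both results are approximately 3; floating-point rounding may leave the printed value a hair away from the exact integer. Zero and negative inputs raise ValueError in this sketch, and the source does not state how the library handles them.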
-
df_log_stdev(row, column_label_list=None)
Calculate the standard deviation of the log10 values in each row for the specified columns of a data frame.
Do not call this function directly; use it with df.apply() instead:

    data_frame["Log Stdev"] = data_frame.apply(
        func=df_log_stdev,
        args=[["Bob", "Alice"]],
        axis=1,
    )
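The two-step calculation (take log10 of each value, then the standard deviation of the results) might look like the sketch below. `log10_values` is a hypothetical helper, and since the documentation does not say whether the sample or the population standard deviation is used, both are shown.

```python
import math
import statistics


def log10_values(row, column_label_list):
    """log10 of each selected value in the row (hypothetical stand-in)."""
    return [math.log10(row[label]) for label in column_label_list]


# One "row" of a data frame, modelled as a dict for illustration.
example_row = {"Bob": 1.0, "Alice": 10.0, "Carol": 100.0}
logs = log10_values(example_row, ["Bob", "Alice", "Carol"])

# Assumption: which of these the library computes is not stated in the source.
sample_stdev = statistics.stdev(logs)       # divides by n - 1
population_stdev = statistics.pstdev(logs)  # divides by n
print(logs, sample_stdev, population_stdev)
```

Because the values span two decades evenly, the log10 values are evenly spaced, which makes the difference between the two estimators easy to see.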
-
df_mean(row, column_label_list=None)
Calculate the mean of each row for the specified columns of a data frame.
Do not call this function directly; use it with df.apply() instead:

    data_frame["Mean"] = data_frame.apply(
        func=df_mean,
        args=[["Bob", "Alice"]],
        axis=1,
    )
-
df_median(row, column_label_list=None)
Calculate the median of each row for the specified columns of a data frame.
Do not call this function directly; use it with df.apply() instead:

    data_frame["Median"] = data_frame.apply(
        func=df_median,
        args=[["Bob", "Alice"]],
        axis=1,
    )
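The row-wise statistics above all start from the same step: collect the values under the chosen labels, then reduce them. A minimal sketch of that pattern, using the hypothetical helper `data_points` and the standard library's `statistics` module rather than the library's own code:

```python
import statistics


def data_points(row, column_label_list):
    """Collect the selected values of a row into a list (hypothetical stand-in)."""
    return [row[label] for label in column_label_list]


# One "row" of a data frame, modelled as a dict for illustration.
example_row = {"Bob": 1.0, "Alice": 2.0, "Carol": 9.0}
points = data_points(example_row, ["Bob", "Alice", "Carol"])

row_mean = statistics.mean(points)      # (1.0 + 2.0 + 9.0) / 3
row_median = statistics.median(points)  # middle value of the sorted points
print(points, row_mean, row_median)
```

The single large value pulls the mean up to 4.0 while the median stays at 2.0, which is the usual reason to report both.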
-
df_outliers(row, column_label_list=None, outlier_mode=1)
Identify outliers in each row.
This function only returns the list of outliers (if any). If you want the list of values without the outliers, see the functions in mathematical.outliers.
Do not call this function directly; use it with df.apply() instead:

    data_frame["Outliers"] = data_frame.apply(
        func=df_outliers,
        args=[["Bob", "Alice"]],
        axis=1,
    )
- Parameters
The supported outlier modes are:
  - 1 or mathematical.data_frames.MAD – Use the Median Absolute Deviation.
  - 2 or mathematical.data_frames.QUARTILES – Treat values more than 3× the inter-quartile range away from the upper or lower quartile as outliers.
  - 3 or mathematical.data_frames.STDEV2 – Treat values more than rng × stdev away from the mean as outliers.
- Return type
- Returns
The outliers.
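The quartile rule described for mode 2 can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `iqr_outliers` is a hypothetical name, and the quartiles come from `statistics.quantiles` with the "inclusive" method, which may differ from whatever quartile definition the library uses.

```python
import statistics


def iqr_outliers(values, multiplier=3.0):
    """Return values lying more than multiplier * IQR outside the quartiles.

    Hypothetical stand-in; the quartile method is an assumption.
    """
    lower_quartile, _, upper_quartile = statistics.quantiles(
        values, n=4, method="inclusive"
    )
    iqr = upper_quartile - lower_quartile
    lower_bound = lower_quartile - multiplier * iqr
    upper_bound = upper_quartile + multiplier * iqr
    return [v for v in values if v < lower_bound or v > upper_bound]


readings = [10, 11, 12, 13, 14, 100]
outliers = iqr_outliers(readings)
print(outliers)
```

Like df_outliers, the sketch returns only the outlying values, not the cleaned list.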
-
df_percentage(row, column_label, total)
Returns the value of the specified column as a percentage of the given total.
The total is usually the sum of the specified column.
Do not call this function directly; use it with df.apply() instead, passing the column label before the total so that the arguments match the signature:

    data_frame["Bob Percentage"] = data_frame.apply(
        func=df_percentage,
        args=["Bob", 13],
        axis=1,
    )
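To show where a total such as 13 might come from, the sketch below sums a column first and then expresses each row's value as a percentage of that sum. `percentage` is a hypothetical stand-in for illustration, and the rows are modelled as a list of dicts with made-up values.

```python
def percentage(row, column_label, total):
    """Value of one column as a percentage of the total (hypothetical stand-in)."""
    return row[column_label] / total * 100


# Three "rows" of a data frame, modelled as dicts for illustration.
rows = [{"Bob": 2.0}, {"Bob": 3.0}, {"Bob": 8.0}]

# The total is the sum of the column, as the description above suggests.
bob_total = sum(row["Bob"] for row in rows)
bob_percentages = [percentage(row, "Bob", bob_total) for row in rows]
print(bob_total, bob_percentages)
```

When the total is the column sum, the percentages add up to 100, apart from floating-point rounding.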
-
df_stdev(row, column_label_list=None)
Calculate the standard deviation of each row for the specified columns of a data frame.
Do not call this function directly; use it with df.apply() instead:

    data_frame["Stdev"] = data_frame.apply(
        func=df_stdev,
        args=[["Bob", "Alice"]],
        axis=1,
    )
-
set_display_options(desired_width=300, max_columns=15, max_rows=20)
Set the display options for numpy and pandas.
- Parameters
  - desired_width (int) – The desired maximum output width, in characters. Default 300.
  - max_columns (int) – The maximum number of columns to display in a pandas.DataFrame. Default 15.
  - max_rows (int) – The maximum number of rows to display in a pandas.DataFrame. Default 20.
New in version 0.3.0.