gdphelper.gdpdescribe
Module Contents
Functions
- gdphelper.gdpdescribe.gdpdescribe(df, x, y, stats=['mean', 'sd', 'median'], dec=2)
Calculates summary statistics for the numeric variable x, grouped by the categorical variable y.
The function can calculate the following descriptive statistics:
- Mean
- Median
- Standard deviation
- Minimum value
- Maximum value
- Range
- Value of the 75th percentile
- Value of the 25th percentile
- Interquartile range
- Number of missing values
- Parameters
df (pd.DataFrame) – pandas DataFrame containing the variables to analyze
x (str) – name of the numeric column in df for which the descriptive statistics are calculated
y (str) – name of the categorical column in df used as the grouping variable
stats (list, default ["mean", "sd", "median"]) – descriptive statistics to calculate
dec (int, default 2) – number of decimal places to return in the table
- Returns
Table with the summary statistics requested in stats
- Return type
pd.DataFrame
Examples
>>> gdpdescribe(df, "Value", "Location", stats=["mean", "median", "sd", "min", "max", "range_", "q75", "q25", "iqr", "nas"], dec=3)
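To make the statistic names in the call above concrete, here is a minimal sketch of how a comparable grouped table could be built with plain pandas. It is not gdphelper's implementation: the helper `describe_by_group` and the toy data are hypothetical, the column labels are simply borrowed from the `stats` keys in the example, and the choice of the sample standard deviation (ddof=1, the pandas default) is an assumption that may differ from what gdpdescribe does.

```python
import pandas as pd

# Hypothetical toy data; the column names mirror the example call above.
df = pd.DataFrame(
    {
        "Location": ["A", "A", "A", "B", "B", "B"],
        "Value": [1.0, 2.0, 3.0, 10.0, 20.0, None],
    }
)


def describe_by_group(df, x, y, dec=2):
    """Plain-pandas approximation of a grouped summary table (not gdphelper's code)."""
    grouped = df.groupby(y)[x]
    table = pd.DataFrame(
        {
            "mean": grouped.mean(),
            "sd": grouped.std(),  # sample standard deviation (ddof=1); an assumption
            "median": grouped.median(),
            "min": grouped.min(),
            "max": grouped.max(),
            "q25": grouped.quantile(0.25),
            "q75": grouped.quantile(0.75),
            # Count missing values of x within each group of y.
            "nas": df[x].isna().groupby(df[y]).sum(),
        }
    )
    # Derived statistics are computed from the columns above.
    table["range_"] = table["max"] - table["min"]
    table["iqr"] = table["q75"] - table["q25"]
    return table.round(dec)


summary = describe_by_group(df, "Value", "Location", dec=3)
print(summary)
```

Each row of `summary` corresponds to one value of the grouping column and each column to one statistic. Missing values are skipped by the pandas aggregations and reported separately in the `nas` column.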