gdphelper.gdpdescribe

Module Contents

Functions

gdpdescribe(df, x, y, stats=['mean', 'sd', 'median'], dec=2)

Calculates summary statistics for the numeric variable x, grouped by the categorical variable y.

gdphelper.gdpdescribe.gdpdescribe(df, x, y, stats=['mean', 'sd', 'median'], dec=2)

Calculates summary statistics for the numeric variable x, grouped by the categorical variable y.

The function is able to calculate the following descriptive statistics:

  • Mean

  • Median

  • Standard deviation

  • Minimum value

  • Maximum value

  • Range

  • Value of the 75th percentile

  • Value of the 25th percentile

  • Interquartile range

  • Number of missing values

Parameters
  • df (pd.DataFrame) – pandas DataFrame containing the variables to analyze

  • x (str) – name of the numeric column in df for which the descriptive statistics are calculated

  • y (str) – name of the categorical column in df used as the grouping variable

  • stats (list, default ["mean", "sd", "median"]) – descriptive statistics to calculate

  • dec (int, default 2) – number of decimal places to return in the table

Returns

Table with the summary statistics requested in stats, computed for x within each group of y

Return type

pd.DataFrame

Examples

>>> gdpdescribe(df, "Value", "Location", stats=["mean", "median", "sd", "min", "max", "range_", "q75", "q25", "iqr", "nas"], dec=3)
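To make the grouped computation concrete, the following is a minimal sketch of how a comparable table could be built with plain pandas. It is not the gdphelper implementation. The function name describe_by_group_sketch, the sample data, and the definition attached to each key are assumptions for illustration; only the key names ("mean", "sd", "range_", "nas", and so on) are taken from the example call above. The layout of the table that gdpdescribe actually returns may differ (for instance, statistics as rows instead of columns).

```python
import pandas as pd

# Assumed mapping from the stat keys seen in the example call to pandas
# operations. The definitions are a guess at the intended meaning.
_STAT_FUNCS = {
    "mean": lambda s: s.mean(),
    "median": lambda s: s.median(),
    "sd": lambda s: s.std(),  # sample standard deviation (ddof=1)
    "min": lambda s: s.min(),
    "max": lambda s: s.max(),
    "range_": lambda s: s.max() - s.min(),
    "q75": lambda s: s.quantile(0.75),
    "q25": lambda s: s.quantile(0.25),
    "iqr": lambda s: s.quantile(0.75) - s.quantile(0.25),
    "nas": lambda s: s.isna().sum(),
}


def describe_by_group_sketch(df, x, y, stats=("mean", "sd", "median"), dec=2):
    """Hypothetical stand-in: one row per group of y, one column per stat."""
    grouped = df.groupby(y)[x]
    # Apply each requested statistic to every group of x values.
    table = pd.DataFrame(
        {name: grouped.apply(_STAT_FUNCS[name]) for name in stats}
    )
    return table.round(dec)


# Made-up sample data; "Value" and "Location" mirror the example call.
sample = pd.DataFrame(
    {
        "Location": ["CAN", "CAN", "CAN", "USA", "USA", "USA"],
        "Value": [1.0, 2.0, 3.0, 10.0, None, 14.0],
    }
)

result = describe_by_group_sketch(
    sample, "Value", "Location", stats=["mean", "median", "range_", "nas"]
)
print(result)
```

In this sketch, missing values in x are skipped by the numeric statistics and counted separately by "nas", which is the usual pandas behavior; whether gdpdescribe treats them the same way is not stated in this page.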