Introduction

This document introduces functions for exploring and visualizing categorical data.

Cross Tabulation

The ds_cross_table() function creates a two-way table of two categorical variables.

If you want the two-way table as a tibble instead of printed output, use ds_twoway_table().
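
To make the shape of a two-way table concrete, here is a minimal base R sketch. It is not how ds_cross_table() or ds_twoway_table() is implemented. It assumes the built-in mtcars data as a stand-in for mtcarz, and the names counts and counts_long are illustrative only.

```r
# Base R sketch, not descriptr's implementation.
# Assumption: the built-in mtcars stands in for mtcarz.

# wide form: one row per cyl level, one column per gear level
counts <- table(cyl = mtcars$cyl, gear = mtcars$gear)
print(counts)

# long form: one row per (cyl, gear) combination, similar in spirit
# to a data frame or tibble of the same table
counts_long <- as.data.frame(counts, responseName = "count")
print(counts_long)
```

The long form is usually the easier one to filter, join, or pass to a plotting function.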

A plot() method is defined for the object returned by ds_cross_table(). It can generate the following plots:

Grouped Bar Plots

library(descriptr)
# cross table of cyl by gear; plot() draws grouped bars by default
k <- ds_cross_table(mtcarz, cyl, gear)
plot(k)

Stacked Bar Plots

k <- ds_cross_table(mtcarz, cyl, gear)
plot(k, stacked = TRUE)

Proportional Bar Plots

k <- ds_cross_table(mtcarz, cyl, gear)
plot(k, proportional = TRUE)
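
The difference between the three styles is easiest to see from the numbers that feed them. The following is a rough base R sketch using barplot(). The plot() method in the package may lay out and style its plots differently, and mtcars is assumed here as a stand-in for mtcarz.

```r
# Base R sketch of the three bar plot styles; not descriptr's plot() method.
# Assumption: the built-in mtcars stands in for mtcarz.
counts <- table(gear = mtcars$gear, cyl = mtcars$cyl)

# grouped: one bar per gear level, side by side within each cyl level
barplot(counts, beside = TRUE, legend.text = TRUE,
        xlab = "cyl", ylab = "count")

# stacked: gear levels stacked on top of each other within each cyl bar
barplot(counts, legend.text = TRUE,
        xlab = "cyl", ylab = "count")

# proportional: each cyl bar rescaled so its segments sum to 1
props <- prop.table(counts, margin = 2)
barplot(props, legend.text = TRUE,
        xlab = "cyl", ylab = "proportion")
```

Grouped and stacked bars show the same counts. The proportional version discards group size, which makes shares comparable across groups of different sizes.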

Multiple One Way Tables

The ds_auto_freq_table() function creates a one-way frequency table for each categorical variable in a data set. If you do not want a table for every variable, you can specify a subset of variables instead.

ds_auto_freq_table(mtcarz)
#>                              Variable: cyl                              
#> -----------------------------------------------------------------------
#> Levels     Frequency    Cum Frequency       Percent        Cum Percent  
#> -----------------------------------------------------------------------
#>    4          11             11              34.38            34.38    
#> -----------------------------------------------------------------------
#>    6           7             18              21.88            56.25    
#> -----------------------------------------------------------------------
#>    8          14             32              43.75             100     
#> -----------------------------------------------------------------------
#>  Total        32              -             100.00              -      
#> -----------------------------------------------------------------------
#> 
#>                              Variable: vs                               
#> -----------------------------------------------------------------------
#> Levels     Frequency    Cum Frequency       Percent        Cum Percent  
#> -----------------------------------------------------------------------
#>    0          18             18              56.25            56.25    
#> -----------------------------------------------------------------------
#>    1          14             32              43.75             100     
#> -----------------------------------------------------------------------
#>  Total        32              -             100.00              -      
#> -----------------------------------------------------------------------
#> 
#>                              Variable: am                               
#> -----------------------------------------------------------------------
#> Levels     Frequency    Cum Frequency       Percent        Cum Percent  
#> -----------------------------------------------------------------------
#>    0          19             19              59.38            59.38    
#> -----------------------------------------------------------------------
#>    1          13             32              40.62             100     
#> -----------------------------------------------------------------------
#>  Total        32              -             100.00              -      
#> -----------------------------------------------------------------------
#> 
#>                             Variable: gear                              
#> -----------------------------------------------------------------------
#> Levels     Frequency    Cum Frequency       Percent        Cum Percent  
#> -----------------------------------------------------------------------
#>    3          15             15              46.88            46.88    
#> -----------------------------------------------------------------------
#>    4          12             27              37.5             84.38    
#> -----------------------------------------------------------------------
#>    5           5             32              15.62             100     
#> -----------------------------------------------------------------------
#>  Total        32              -             100.00              -      
#> -----------------------------------------------------------------------
#> 
#>                             Variable: carb                              
#> -----------------------------------------------------------------------
#> Levels     Frequency    Cum Frequency       Percent        Cum Percent  
#> -----------------------------------------------------------------------
#>    1           7              7              21.88            21.88    
#> -----------------------------------------------------------------------
#>    2          10             17              31.25            53.12    
#> -----------------------------------------------------------------------
#>    3           3             20              9.38             62.5     
#> -----------------------------------------------------------------------
#>    4          10             30              31.25            93.75    
#> -----------------------------------------------------------------------
#>    6           1             31              3.12             96.88    
#> -----------------------------------------------------------------------
#>    8           1             32              3.12              100     
#> -----------------------------------------------------------------------
#>  Total        32              -             100.00              -      
#> -----------------------------------------------------------------------
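
The columns in these tables can be reproduced by hand, which helps when checking a result. Below is a minimal base R sketch, not the package's implementation. The helper freq_table() and the vars vector are hypothetical names introduced for illustration, and mtcars is assumed as a stand-in for mtcarz.

```r
# Base R sketch; not descriptr's implementation.
# Assumption: the built-in mtcars stands in for mtcarz.
# freq_table() is a hypothetical helper, not a descriptr function.
freq_table <- function(x) {
  counts <- table(x)
  n <- as.vector(counts)
  percent <- 100 * n / sum(n)
  data.frame(
    level = names(counts),
    frequency = n,
    cum_frequency = cumsum(n),
    percent = round(percent, 2),
    cum_percent = round(cumsum(percent), 2)
  )
}

# one table per selected variable; vars is where a subset is chosen
vars <- c("cyl", "gear")
tables <- lapply(mtcars[vars], freq_table)
print(tables$cyl)
```

Cumulative percent is accumulated before rounding, so the last row reaches 100 even when the rounded percentages do not add up exactly.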

Multiple Two Way Tables

The ds_auto_cross_table() function creates a two-way table for each unique pair of categorical variables in a data set. If you do not want every pair, you can specify a subset of variables; the call below uses only cyl, gear, and am.

ds_auto_cross_table(mtcarz, cyl, gear, am)
#>     Cell Contents
#>  |---------------|
#>  |     Frequency |
#>  |       Percent |
#>  |       Row Pct |
#>  |       Col Pct |
#>  |---------------|
#> 
#>  Total Observations:  32 
#> 
#>                                 cyl vs gear                                 
#> ----------------------------------------------------------------------------
#> |              |                           gear                            |
#> ----------------------------------------------------------------------------
#> |          cyl |            3 |            4 |            5 |    Row Total |
#> ----------------------------------------------------------------------------
#> |            4 |            1 |            8 |            2 |           11 |
#> |              |        0.031 |         0.25 |        0.062 |              |
#> |              |         0.09 |         0.73 |         0.18 |         0.34 |
#> |              |         0.07 |         0.67 |          0.4 |              |
#> ----------------------------------------------------------------------------
#> |            6 |            2 |            4 |            1 |            7 |
#> |              |        0.062 |        0.125 |        0.031 |              |
#> |              |         0.29 |         0.57 |         0.14 |         0.22 |
#> |              |         0.13 |         0.33 |          0.2 |              |
#> ----------------------------------------------------------------------------
#> |            8 |           12 |            0 |            2 |           14 |
#> |              |        0.375 |            0 |        0.062 |              |
#> |              |         0.86 |            0 |         0.14 |         0.44 |
#> |              |          0.8 |            0 |          0.4 |              |
#> ----------------------------------------------------------------------------
#> | Column Total |           15 |           12 |            5 |           32 |
#> |              |        0.468 |        0.375 |        0.155 |              |
#> ----------------------------------------------------------------------------
#> 
#> 
#>                          cyl vs am                           
#> -------------------------------------------------------------
#> |              |                     am                     |
#> -------------------------------------------------------------
#> |          cyl |            0 |            1 |    Row Total |
#> -------------------------------------------------------------
#> |            4 |            3 |            8 |           11 |
#> |              |        0.094 |         0.25 |              |
#> |              |         0.27 |         0.73 |         0.34 |
#> |              |         0.16 |         0.62 |              |
#> -------------------------------------------------------------
#> |            6 |            4 |            3 |            7 |
#> |              |        0.125 |        0.094 |              |
#> |              |         0.57 |         0.43 |         0.22 |
#> |              |         0.21 |         0.23 |              |
#> -------------------------------------------------------------
#> |            8 |           12 |            2 |           14 |
#> |              |        0.375 |        0.062 |              |
#> |              |         0.86 |         0.14 |         0.44 |
#> |              |         0.63 |         0.15 |              |
#> -------------------------------------------------------------
#> | Column Total |           19 |           13 |           32 |
#> |              |        0.594 |        0.406 |              |
#> -------------------------------------------------------------
#> 
#> 
#>                          gear vs am                          
#> -------------------------------------------------------------
#> |              |                     am                     |
#> -------------------------------------------------------------
#> |         gear |            0 |            1 |    Row Total |
#> -------------------------------------------------------------
#> |            3 |           15 |            0 |           15 |
#> |              |        0.469 |            0 |              |
#> |              |            1 |            0 |         0.47 |
#> |              |         0.79 |            0 |              |
#> -------------------------------------------------------------
#> |            4 |            4 |            8 |           12 |
#> |              |        0.125 |         0.25 |              |
#> |              |         0.33 |         0.67 |         0.38 |
#> |              |         0.21 |         0.62 |              |
#> -------------------------------------------------------------
#> |            5 |            0 |            5 |            5 |
#> |              |            0 |        0.156 |              |
#> |              |            0 |            1 |         0.16 |
#> |              |            0 |         0.38 |              |
#> -------------------------------------------------------------
#> | Column Total |           19 |           13 |           32 |
#> |              |        0.594 |        0.406 |              |
#> -------------------------------------------------------------
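
The idea of one table per unique pair, with four values per cell, can be sketched in base R as follows. This is an illustration under assumptions, not the package's implementation: mtcars stands in for mtcarz, and pairs and cross_tables are names introduced only for this example.

```r
# Base R sketch; not descriptr's implementation.
# Assumption: the built-in mtcars stands in for mtcarz.
vars <- c("cyl", "gear", "am")

# every unordered pair of variables, one pair per column
pairs <- combn(vars, 2)

cross_tables <- lapply(seq_len(ncol(pairs)), function(i) {
  row_var <- pairs[1, i]
  col_var <- pairs[2, i]
  counts <- table(mtcars[[row_var]], mtcars[[col_var]],
                  dnn = c(row_var, col_var))
  list(
    frequency = counts,
    percent = prop.table(counts),              # share of all observations
    row_pct = prop.table(counts, margin = 1),  # share within each row
    col_pct = prop.table(counts, margin = 2)   # share within each column
  )
})
names(cross_tables) <- paste(pairs[1, ], "vs", pairs[2, ])

print(round(cross_tables[["cyl vs gear"]]$row_pct, 2))
```

With three variables, combn() yields the pairs cyl vs gear, cyl vs am, and gear vs am, the same three tables shown in the output above.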