35.    Seaborn examples gallery study

April 1, 2019

Contents

description

01. row 1, plotting 1 - lmplot for linear regression

02. row 1, plotting 2, barplot

03. row 1, plotting 3, Different cubehelix palettes

                    # step 1:  Set up the matplotlib figure, 3 rows x 3 columns
                    f, axes = plt.subplots(3, 3, ...)

                    # step 2:  instantiate a Python zip object with 2 arguments
                    #   argument 1: axes.flat,
                    #      an iterator over the 9 matplotlib subplots (axes).
                    #   argument 2: np.linspace(0, 3, 10), 10 evenly spaced values
                    #      0.00,  0.33,  0.67,  1.00, 1.33, 1.67, 2.00, 2.33, 2.67, 3.00
                    #      zip stops at the shorter input, so only the first 9 values
                    #      are used in the loop for palette color configuration.

                    # step 3:  loop over the zip
                    #        for each iteration (ax, s)
                    #          - create a cmap from the s value
                    #          - generate a random bivariate dataset,
                    #            x, y = rs.randn(2, 50)
                    #                      a 2-D ndarray of shape (2, 50),
                    #                      unpacked into x and y, 50 points each

                    # step 4:  plot it with kdeplot
                    #          - different data for each subplot
                    #          - different palette for each subplot
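
The pairing described in step 2 can be checked without drawing anything. The sketch below is an illustration only: it uses a plain 3 x 3 NumPy array as a stand-in for the `axes` grid (an assumption, so that no figure is needed) and shows that `zip` stops after 9 pairs, leaving the last value, 3.0, unused.

```python
import numpy as np

# Stand-in for the 3 x 3 grid of matplotlib axes (hypothetical placeholder,
# used only so the example runs without creating a figure).
axes = np.arange(9).reshape(3, 3)

# 10 evenly spaced values between 0 and 3 (step 1/3).
starts = np.linspace(0, 3, 10)

# zip stops at the shorter input, so only the first 9 values are paired.
pairs = list(zip(axes.flat, starts))

print(len(pairs))           # number of (subplot, s) pairs
print(np.round(starts, 2))  # the 10 candidate s values
```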
                

04. row 1, plotting 4, scatterplot, typical

05. row 2, plotting 1, distplot

06. row 2, plotting 2, Timeseries plot

07. row 2, plotting 3, FacetGrid with Projection

    import numpy as np
    import pandas as pd
    
    # ---    step 1, using numpy to generate data
    r = np.linspace(0, 10, num=100)
    print(' r = ' + str(r))  
    # r = [ 0.  0.1010101  0.2020202 ... 9.8989899  10. ]
    # 100 evenly spaced numbers; both endpoints are included, so the step is 10/99 (about 0.101).
    
    
    # ---     step 2, using pandas to create a dataframe - wide form 
    df = pd.DataFrame({'r': r, 'slow': r, 'medium': 2 * r, 'fast': 4 * r})
    print('df = ' + str(df))
    print('df.shape = ' + str(df.shape))
    '''
             r       slow       medium       fast
    0    0.000000   0.000000   0.000000   0.000000
    1    0.101010   0.101010   0.202020   0.404040
    
    100 rows, 4 columns
    '''
    
    # ---     step 3, using the pandas melt method to convert to long form
    # The plotting method expects long-form data.
    df2 = pd.melt(df, id_vars=['r'], var_name='speed', value_name='theta')
    print('df2 = ' + str(df2))
    print('df2.shape = ' + str(df2.shape))
    '''
            r      speed    theta
    0     0.000000  slow   0.000000
    1     0.101010  slow   0.101010
    300 rows, 3 columns
    '''
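
The row count after `melt` follows from the reshape: each of the 3 value columns contributes one block of rows. A minimal sketch of that check, assuming a smaller frame (5 points instead of 100, chosen here only to keep the output short):

```python
import numpy as np
import pandas as pd

# Smaller version of the wide-form frame (5 rows instead of 100).
r = np.linspace(0, 10, num=5)
df = pd.DataFrame({'r': r, 'slow': r, 'medium': 2 * r, 'fast': 4 * r})

# Wide form -> long form: 'r' is kept, the other columns are stacked.
df2 = pd.melt(df, id_vars=['r'], var_name='speed', value_name='theta')

# Each of the 3 value columns contributes len(df) rows.
counts = df2['speed'].value_counts()
print(df2.shape)
print(counts)

# Pivoting the long form restores one column per speed.
wide_again = df2.pivot(index='r', columns='speed', values='theta')
print(wide_again)
```

With the original 100 rows the same arithmetic gives 3 x 100 = 300 rows.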

                

08. row 2, plotting 4, FacetGrid, typical

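        import numpy as np
        import matplotlib.pyplot as plt
        import seaborn as sns
        tips = sns.load_dataset("tips")
        g = sns.FacetGrid(tips, row="sex", col="time", margin_titles=True)
        # 13 bin edges from 0 to 60, i.e. 12 bins of width 5, shared by every facet
        bins = np.linspace(0, 60, 13)
        g.map(plt.hist, "total_bill", color="steelblue", bins=bins)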
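
The `bins` array holds bin edges, not a bin count. A small sketch of what those edges mean, using `np.histogram` on a handful of made-up bill amounts (synthetic values for illustration, not taken from the tips dataset):

```python
import numpy as np

# 13 edges from 0 to 60 define 12 bins, each 5 units wide.
bins = np.linspace(0, 60, 13)
widths = np.diff(bins)

# Made-up bill amounts, used only to show how values fall into the bins.
bills = np.array([3.07, 10.34, 16.99, 21.01, 23.68, 24.59, 48.27])
counts, edges = np.histogram(bills, bins=bins)

print(bins)
print(counts)
```

Passing the same edges to every facet keeps the histograms comparable across the grid.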
                

09. row 3, plotting 1, sns.relplot, kind="line"

10. row 3, plotting 2, Grouped barplot

11. row 3, plotting 3, Grouped boxplot

13. row 4, plotting 1, Annotated heatmaps

14. row 4, plotting 2, Hexbin plot with marginal distributions

15. row 4, plotting 3, Horizontal bar plots

16. row 4, plotting 4, Horizontal boxplot with observations

17. row 5, plot 1, horizontal jitter stripplot

18. row 5, plot 2, sns.jointplot, kind="kde"

20. row 5, plot 4, Plotting large distributions

21. row 6, plot 1, logistic regression

22. row 6, col 2, FacetGrid and its map function, typical

23. row 6, col 3, heatmap on the left-bottom side

24. row 6, col 4, JointGrid - scatter and rug

25. row 7, col 1, Multiple bivariate KDE plots

26. row 7, col 2, Multiple linear regression

27. row 7, col 3, Paired density and scatterplot matrix

28. row 7, col 4, Paired categorical plots

29. row 8, col 1, Pairgrid with dotplots

30. row 8, col 2, categorical variable for x-coordinate, x-y relationship

31. row 8, col 3, jointplot, kind="reg"

32. row 8, col 4, Plot the residuals after fitting a linear model

33. row 9, col 1, relplot, default kind: scatter, typical

34. row 9, col 2, one distribution for many different

35. row 9, col 3, pairplot

37. row 10, col 1, violinplot for high-level example

38. row 10, col 2, df.corr() and sns.clustermap for brain networks

        -------------    step 1:   examine the data...  ------------------------------
        df = sns.load_dataset("brain_networks", header=[0, 1, 2], index_col=0)
        used_networks = [1, 5, 6, 7, 8, 12, 13, 17]
        used_columns = (df.columns.get_level_values("network")
                                  .astype(int)
                                  .isin(used_networks))
        df = df.loc[:, used_columns]
        print(df)
        '''
        network           1                      5                           ...17
        node              1                      1                           ...
        hemi             lh          rh         lh         rh                ...
            0         56.055744   92.031036 -35.898861  -1.889181...
            1         55.547253   43.690075  19.568010  15.902983...
        '''
        
        note 1: The DataFrame header has 3 levels (3 header rows).
        note 2: The column hierarchy of the brain networks data is network, node, hemi (left or right hemisphere).
        note 3: The values look like 56.06, 15.90, ... with one column per hemi.
        
 
        -------------------- step 2:   generate the correlations from the data ----
        df.corr() is passed as the data argument to sns.clustermap.
        It generates the pairwise correlations between the columns.
        ''' 
        network                   1                   5
        node                      1                   1
        hemi                     lh        rh        lh        rh
        network node hemi
        1       1    lh    1.000000  0.881516  0.431619  0.418708
                    rh    0.881516  1.000000  0.431953  0.519916
        5       1    lh    0.431619  0.431953  1.000000  0.822897
                    rh    0.418708  0.519916  0.822897  1.000000
        '''
        
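        note 1: The values look like 0.88, 0.52, 1.0, ...
        note 2: 1 means perfect positive correlation; the diagonal is 1 because each column is compared with itself.
        note 3: 0 means no linear correlation.
        note 4: A negative value means the two columns move in opposite directions.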
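
The notes above can be reproduced on a tiny frame. This is a sketch with made-up columns `a` to `d` (hypothetical names, unrelated to the brain networks data), built so that the correlations come out as 1, -1 and 0:

```python
import numpy as np
import pandas as pd

x = np.arange(10, dtype=float)
demo = pd.DataFrame({
    'a': x,
    'b': 2 * x + 1,       # rises with a      -> correlation  1
    'c': -x,              # falls as a rises  -> correlation -1
    'd': (x - 4.5) ** 2,  # symmetric around the mean of a -> correlation 0
})

# Pairwise Pearson correlations between the columns.
corr = demo.corr()
print(corr.round(2))
```

The result is a square, symmetric table with one row and one column per input column, which is the shape of matrix that `sns.clustermap` receives in this example.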
        
    

39. row 10, col 3, sns.lineplot, data: DataFrame

        The following is the wide-form dataset.                 
                        A         B         C         D     
        2016-01-01  0.167921  0.523505  0.817376  1.703846  
        2016-01-02 -1.979026  1.237704  0.057230  2.743267  
        ....
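
A frame with this layout could be built as in the sketch below. The seed, the random-walk values, and the 365-day length are assumptions made for illustration, so the numbers will not match the ones shown above.

```python
import numpy as np
import pandas as pd

# Arbitrary seed; the values are random walks, not the data shown above.
rs = np.random.RandomState(365)
values = rs.randn(365, 4).cumsum(axis=0)

# One row per day, one column per series: this is the wide form.
dates = pd.date_range("2016-01-01", periods=365, freq="D")
data = pd.DataFrame(values, index=dates, columns=["A", "B", "C", "D"])

print(data.head(2))
print(data.shape)
```

When a wide-form frame like this is passed as `data=` to `sns.lineplot`, the index is used for x and each column is drawn as its own line.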

40. row 10, col 4, multiple violin plots, dataframe, high-level method