33.    Pandas and Public Data

February xx, 2019


Description

  • Goal: fetch public data with a single Python method call so it is ready for further processing.
  • Examples include financial data such as stocks and real estate, among others, but not house data.
  • These labs were run in February 2019.

    01. Using the pandas package with pandas_datareader.data

        import pandas as pd
        import datetime
        import pandas_datareader.data as web

        # datetime.datetime(...): the first name is the module, the second is the class constructor
        start = datetime.datetime(2017, 1, 26)
        end = datetime.datetime(2017, 2, 7)        # start and end are datetime objects, not strings
        df = web.DataReader("AAPL", "yahoo", start, end)
        print(df)

        print('--- 2   plotting -----------')
        import matplotlib.pyplot as plt
        from matplotlib import style
        style.use('fivethirtyeight')      # the plot style (look)
        df['Volume'].plot()               # time-series plot of Volume against Date
        plt.legend()
        plt.show()
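The comments above note that start and end are datetime objects rather than strings. The following minimal sketch, using only the standard library, shows the difference. The name parsed_start and the "%Y-%m-%d" text format are illustrative assumptions, not part of the original lab.

```python
import datetime

# The same start/end values as in the lab; both are datetime objects.
start = datetime.datetime(2017, 1, 26)
end = datetime.datetime(2017, 2, 7)

# A date given as text has to be parsed before it can be compared or subtracted.
# (parsed_start and the text format are assumptions made for this sketch.)
parsed_start = datetime.datetime.strptime("2017-01-26", "%Y-%m-%d")

span = end - start             # subtracting datetimes gives a datetime.timedelta
print(type(start).__name__)    # datetime
print(parsed_start == start)   # True
print(span.days)               # 12
```

Because the values are real date objects, the requested window can be checked (here, 12 days) before any data is downloaded.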
    
                

    02. quandl, the financial data provider

    03. code skeleton, quandl time-series demo

        import pandas as pd
        import quandl 
        quandl.ApiConfig.api_key = "my_quandl_key"
        
        df = quandl.get('EOD/AAPL',        # quandl code
                        paginate=True,     # always for efficiency
                        rows=10)            # last 10 
        
        print(df)  
        '''
                        Open     High     ...      Adj_Close  Adj_Volume
        Date                            ...
        2019-02-05  172.86  175.080     ...         174.18  36101628.0
        2019-02-06  174.65  175.570     ...         174.24  28239591.0
        '''
                
        # You can append the following code for plotting
        close = df['Close']
    
        import matplotlib.pyplot as plt
        plt.plot(close, label='Close')
        
        plt.xlabel("date")
        plt.ylabel("close")
        plt.title("quandl get api test\n2/8/2019")
        plt.legend()
        
        plt.show()
        # Note: R can produce this plot as well; Python offers other options too.
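The demo sets the API key as a literal string. One possible alternative, sketched below, is to read the key from an environment variable so it does not end up in the script. The helper load_api_key and the variable name QUANDL_API_KEY are hypothetical choices made for this sketch, not something the quandl package defines.

```python
import os


def load_api_key(env_var="QUANDL_API_KEY"):
    """Return the API key stored in the given environment variable.

    The default variable name is a hypothetical choice for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


# Usage (needs the quandl package and a real key):
#   import quandl
#   quandl.ApiConfig.api_key = load_api_key()
```

The helper raises an error early when the variable is missing, instead of letting the request fail later.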
                

    04. quandl table demo

        import pandas as pd
        import quandl 
        quandl.ApiConfig.api_key = "my_quandl_key"
        
        df = quandl.get_table('ZACKS/FC',  
                        ticker='AAPL',
                        qopts={'columns':['ticker', 'per_end_date', 'eps_diluted_net']},
                        paginate=True)
        print(df.tail(5))
        '''
                ticker per_end_date  eps_diluted_net
        None
        35     AAPL   2017-12-31             3.89
        36     AAPL   2018-03-31             2.73
        37     AAPL   2018-06-30             2.34
        38     AAPL   2018-09-30             2.95
        39     AAPL   2018-12-31             4.18
        '''
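As a possible next step, the table can be turned into a date-indexed series so it can be plotted like the time-series demo. The sketch below does not call quandl. It rebuilds the five rows printed above as a stand-in DataFrame, and the name eps is an assumption made for illustration.

```python
import pandas as pd

# Stand-in for the get_table result, copied from the printed output above.
df = pd.DataFrame(
    {
        "ticker": ["AAPL"] * 5,
        "per_end_date": ["2017-12-31", "2018-03-31", "2018-06-30",
                         "2018-09-30", "2018-12-31"],
        "eps_diluted_net": [3.89, 2.73, 2.34, 2.95, 4.18],
    },
    index=[35, 36, 37, 38, 39],
)

# Parse the dates (harmless if they are already datetimes) and index by them.
df["per_end_date"] = pd.to_datetime(df["per_end_date"])
eps = df.set_index("per_end_date")["eps_diluted_net"].sort_index()

print(eps)
# eps.plot() would now draw EPS against the period end date.
```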
                    
      Description
    • quandl.get_table is the API method for table data.
    • Each API method has its own calling style.
      The argument "rows=5" works for quandl.get
      but not for get_table.
    • ZACKS/FC is the quandl table name.
    • ticker is the stock symbol.
    • EPS (earnings per share) is one of the most widely used ways to gauge company profitability.
    • To find all the column names in a quandl table:
      • In the above code, remove the qopts line that filters columns.
      • The API method get_table returns a DataFrame.
      • Use the following code to print the names:
        for c in df.columns:
            print(c)
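The two points above (listing column names, and limiting rows when "rows=" is not available) can be tried offline on any DataFrame. In the minimal sketch below, the frame is a hypothetical stand-in: only the column names come from the demo, and the row values are made up.

```python
import pandas as pd

# Hypothetical stand-in for a get_table result; the values are placeholders.
df = pd.DataFrame(
    {
        "ticker": ["AAPL"] * 8,
        "per_end_date": [f"period-{i}" for i in range(8)],
        "eps_diluted_net": [float(i) for i in range(8)],
    }
)

# All column names as a plain Python list.
column_names = list(df.columns)
print(column_names)    # ['ticker', 'per_end_date', 'eps_diluted_net']

# Keep only the last 5 rows on the client side, as the demo does with tail(5).
last_rows = df.tail(5)
print(len(last_rows))  # 5
```

Since get_table returns a regular DataFrame, tail() (or head()) is one simple way to trim the result after it has been fetched.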