
Issues with the data reader fetching yahoo finance #315

Closed
@Crowbeezy

Description

@Crowbeezy

Apologies, this is my first issue/comment on GitHub. I will review the proper protocol. Please correct me if this is not the right place to post this.


RemoteDataError Traceback (most recent call last)
in ()
4 end = dt.datetime(2017, 5, 8)
5
----> 6 INPX = data.DataReader(INPX ,'yahoo', start, end)
7
8 #Convert Volume from Int to Float

C:\Users\randomname\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas_datareader\data.py in DataReader(name, data_source, start, end, retry_count, pause, session)
92 adjust_price=False, chunksize=25,
93 retry_count=retry_count, pause=pause,
---> 94 session=session).read()
95
96 elif data_source == "yahoo-actions":

C:\Users\randomname\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas_datareader\yahoo\daily.py in read(self)
75 def read(self):
76 """ read one data from specified URL """
---> 77 df = super(YahooDailyReader, self).read()
78 if self.ret_index:
79 df['Ret_Index'] = _calc_return_index(df['Adj Close'])

C:\Users\randomname\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas_datareader\base.py in read(self)
176 df = self._dl_mult_symbols(self.symbols.index)
177 else:
--> 178 df = self._dl_mult_symbols(self.symbols)
179 return df
180

C:\Users\randomname\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas_datareader\base.py in _dl_mult_symbols(self, symbols)
195 if len(passed) == 0:
196 msg = "No data fetched using {0!r}"
--> 197 raise RemoteDataError(msg.format(self.__class__.__name__))
198 try:
199 if len(stocks) > 0 and len(failed) > 0 and len(passed) > 0:

RemoteDataError: No data fetched using 'YahooDailyReader'

Activity

rgkimball

rgkimball commented on May 12, 2017

@rgkimball
Contributor

Can you provide a sample that replicates your issue? This works for me:

from datetime import datetime
import pandas_datareader.data as data
start = datetime(2016, 12, 31)
end = datetime.now()
INPX = data.DataReader('INPX', 'yahoo', start, end)
Crowbeezy

Crowbeezy commented on May 15, 2017

@Crowbeezy
Author
benpillet

benpillet commented on May 16, 2017

@benpillet

From my requirements.txt:

pandas-datareader==0.4.0
pandas==0.20.1

and in python shell:

from datetime import *
import pandas_datareader.data as data
start = datetime(2016, 12, 31)
end = datetime.now()
INPX = data.DataReader('INPX', 'yahoo', start, end)

with error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "./venv3.5/lib/python3.5/site-packages/pandas_datareader/data.py", line 117, in DataReader
    session=session).read()
  File "./venv3.5/lib/python3.5/site-packages/pandas_datareader/yahoo/daily.py", line 77, in read
    df = super(YahooDailyReader, self).read()
  File "./venv3.5/lib/python3.5/site-packages/pandas_datareader/base.py", line 157, in read
    params=self._get_params(self.symbols))
  File "./venv3.5/lib/python3.5/site-packages/pandas_datareader/base.py", line 74, in _read_one_data
    out = self._read_url_as_StringIO(url, params=params)
  File "./venv3.5/lib/python3.5/site-packages/pandas_datareader/base.py", line 85, in _read_url_as_StringIO
    response = self._get_response(url, params=params)
  File "./venv3.5/lib/python3.5/site-packages/pandas_datareader/base.py", line 120, in _get_response
    raise RemoteDataError('Unable to read URL: {0}'.format(url))
pandas_datareader._utils.RemoteDataError: Unable to read URL: http://ichart.finance.yahoo.com/table.csv?f=2017&ignore=.csv&b=31&c=2016&g=d&a=11&d=4&s=INPX&e=16

I have a feeling yahoo updated their endpoint to be something else. I get a 502 when I try curl too. The link at https://finance.yahoo.com/quote/SPY/history?p=SPY points to https://query1.finance.yahoo.com/v7/finance/download/SPY?period1=1492372898&period2=1494964898&interval=1d&events=history&crumb=MLOX17FWABw
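
As a sanity check on the failing request, the query string of the old URL can be decoded back into a date range. This is only a sketch: the meaning of the parameters (`a`/`b`/`c` as start month/day/year, `d`/`e`/`f` as end month/day/year, months counted from zero) is an assumption inferred from the values lining up with `datetime(2016, 12, 31)` and the date of this comment, and `decode_legacy_range` is a hypothetical helper, not part of pandas-datareader.

```python
from datetime import date
from urllib.parse import parse_qs, urlsplit

# URL copied from the traceback above.
LEGACY_URL = (
    "http://ichart.finance.yahoo.com/table.csv"
    "?f=2017&ignore=.csv&b=31&c=2016&g=d&a=11&d=4&s=INPX&e=16"
)


def decode_legacy_range(url):
    """Return (symbol, start, end), assuming 'a' and 'd' are zero-based months."""
    query = {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}
    start = date(int(query["c"]), int(query["a"]) + 1, int(query["b"]))
    end = date(int(query["f"]), int(query["d"]) + 1, int(query["e"]))
    return query["s"], start, end


symbol, start, end = decode_legacy_range(LEGACY_URL)
print(symbol, start, end)
```

If that reading is right, the request itself was well formed, which is consistent with the failure being on the server side.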

benpillet

benpillet commented on May 17, 2017

@benpillet

Looks like there's also a cookie that needs to be sent in order to avoid a 401 Unauthorized. https://www.elitetrader.com/et/threads/yahoo-historical-data-did-they-change-the-url-recently.309554/

rgkimball

rgkimball commented on May 17, 2017

@rgkimball
Contributor

Not sure if the icharts failure is a temporary problem, but I submitted a WIP PR (above) to replace the request structure. Even if icharts does come back online, may be a good idea to implement a backup.

IvanTrendafilov

IvanTrendafilov commented on May 17, 2017

@IvanTrendafilov

I don't have the time to fix this in the library, but, essentially, there is another API endpoint that one can use. It's https://query1.finance.yahoo.com. But it requires a matching cookie and crumb to use it. I wrote a little PhantomJS script to get it, whilst I was working on it: https://github.com/IvanTrendafilov/YahooFinanceAPITokens

It can be useful to someone who needs to create a URL that they want to query automatically.

You can also get a valid cookie / crumb combination from the Chrome dev tools in the Network tab.
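
For anyone scripting this, a minimal sketch of how the cookie and crumb might be attached to a request, using only the standard library. It builds the request without sending it. The URL layout is copied from the link quoted earlier in this thread; `build_history_request` is a hypothetical helper, the crumb and cookie values are placeholders that would have to come from the same browser session, and nothing here is confirmed Yahoo API behaviour.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_history_request(symbol, period1, period2, crumb, cookie):
    """Prepare (but do not send) a download request carrying crumb and cookie."""
    query = urlencode({
        "period1": period1,
        "period2": period2,
        "interval": "1d",
        "events": "history",
        "crumb": crumb,  # goes in the query string
    })
    url = f"https://query1.finance.yahoo.com/v7/finance/download/{symbol}?{query}"
    # The cookie travels as a header; per this thread it must match the crumb.
    return Request(url, headers={"Cookie": cookie})


req = build_history_request(
    "SPY", 1492372898, 1494964898,
    crumb="CRUMB_PLACEHOLDER",    # placeholder, not a real crumb
    cookie="COOKIE_PLACEHOLDER",  # placeholder, not a real cookie
)
print(req.full_url)
```

Passing `req` to `urllib.request.urlopen` would perform the actual download; a mismatched cookie and crumb is what reportedly produces the 401.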

IvanTrendafilov

IvanTrendafilov commented on May 18, 2017

@IvanTrendafilov

They've confirmed icharts isn't coming back, so @rgkimball's patch should certainly go in.

https://forums.yahoo.net/t5/Yahoo-Finance-help/Is-Yahoo-Finance-API-broken/td-p/250503/page/3

bkcollection

bkcollection commented on May 18, 2017

@bkcollection

@rgkimball @IvanTrendafilov, can the fix be released so that beginners can get it with a simple pip install upgrade?

Franlodo

Franlodo commented on May 18, 2017

@Franlodo

Yahoo has changed the URL and the way it uses dates. Dates are now Unix time.
For example, to get the historical CSV for AAPL:
https://query1.finance.yahoo.com/v7/finance/download/AAPL?period1=1492510098&period2=1495102098&interval=1d&events=history&crumb=ydacXMYhzrn

period1 and period2 are dates in Unix time: unixtime = (Human time - 25568) * 86400. You must also account for your time zone. For example, I am in Europe at UTC+2, so I must subtract 7200 seconds, and my formula is ((Human time - 25568) * 86400) - 7200. Here Human time is the date (d/mm/yyyy); 25568 is the number of days from 01/01/1900 to 01/01/1970 (I do this in Excel, where that is the minimum date); 86400 is the number of seconds in a day; and 7200 is the number of seconds in my 2-hour offset from UTC.

Interval is day, week or month.
Events: history is historical price data, div|split&filter=split is for splits, and div|split&filter=div is for dividends.
Crumb is the cookie ... I don't really know how it works, but I have been using the same one since Monday.

I'm using this to update my data in Excel and it works. Now I don't need to wait until morning to get the historical data, because it's available less than an hour after the market closes (I'm talking about American markets).

I apologize for not being fluent in English.

I hope this helps.
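
The same date-to-Unix-time conversion can be sketched in Python, which sidesteps the spreadsheet day-count offset because `datetime` works from 1970-01-01 directly. `to_unix` is a hypothetical helper and the dates are example values chosen to match the days in the AAPL link above; the UTC+2 offset mirrors the time-zone adjustment described in this comment.

```python
from datetime import date, datetime, timedelta, timezone


def to_unix(day, utc_offset_hours=0):
    """Unix timestamp of local midnight on `day` at the given UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    local_midnight = datetime(day.year, day.month, day.day, tzinfo=tz)
    return int(local_midnight.timestamp())


# Midnight UTC on the two days covered by the example link.
period1 = to_unix(date(2017, 4, 18))
period2 = to_unix(date(2017, 5, 18))

# Midnight in a UTC+2 zone falls 7200 seconds earlier in Unix time.
period2_utc_plus_2 = to_unix(date(2017, 5, 18), utc_offset_hours=2)

print(period1, period2, period2_utc_plus_2)
```

The values in the link (1492510098 and 1495102098) fall on the same two days, a little after 10:00 UTC.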

bkcollection

bkcollection commented on May 18, 2017

@bkcollection

@Franlodo Have you tried using the new link to download something like 1000 stocks? Will it get blocked? The old API seemed to have no limit, and I am curious whether the new one still allows that. I hope you can try it to validate.

Franlodo

Franlodo commented on May 18, 2017

@Franlodo

The link just fetches the CSV file from the web, so it should work whether you request 2 or 2000. My Excel file has nearly 200 and runs properly.

You can fetch 1000 CSV files and import them with pandas; it works the same whether you save the files first or read them "in the air".
I posted here because the error reported for pandas-datareader was caused by the URL.

Anyway, I will try and comment.

60 remaining items

javadba

javadba commented on Jul 16, 2017

@javadba

@Harrymon12 This is a pandas site. If you have some tips about helping out on PANDAS and specifically this issue please feel free to do so. Otherwise your posts ARE spam.

Harrymon12

Harrymon12 commented on Jul 16, 2017

@Harrymon12

@javadba I understand. Thank you. :)

alisiddiq

alisiddiq commented on Apr 1, 2018

@alisiddiq

Another small package I wrote to overcome the 401 issues

https://github.com/alisiddiq/py_yahoo_prices

JECSand

JECSand commented on Aug 14, 2018

@JECSand

@javadba
@justinlent
@rgkimball

My replacement for the old yahoo-finance module, YahooFinancials, can get all of the historical price data needed by pandas users. YF can return daily, weekly, and monthly historical price and volume JSON data for all stocks, ETFs, indices, cryptocurrencies, currencies, and commodity futures available on Yahoo Finance. Most Stack Overflow questions I have encountered regarding the module seem to revolve around getting it to work with pandas. As long as Yahoo Finance is running on its new React setup (they got a new web app, which is why they killed the old API late last year), my module will get the financial data.

Usage Example:

from yahoofinancials import YahooFinancials

yahoo_financials = YahooFinancials('WFC')
print(yahoo_financials.get_historical_price_data("2018-07-10", "2018-08-10", "monthly"))

Returns

{
    "WFC": {
        "currency": "USD",
        "eventsData": {
            "dividends": {
                "2018-08-01": {
                    "amount": 0.43,
                    "date": 1533821400,
                    "formatted_date": "2018-08-09"
                }
            }
        },
        "firstTradeDate": {
            "date": 76233600,
            "formatted_date": "1972-06-01"
        },
        "instrumentType": "EQUITY",
        "prices": [
            {
                "adjclose": 57.19147872924805,
                "close": 57.61000061035156,
                "date": 1533096000,
                "formatted_date": "2018-08-01",
                "high": 59.5,
                "low": 57.08000183105469,
                "open": 57.959999084472656,
                "volume": 138922900
            }
        ],
        "timeZone": {
            "gmtOffset": -14400
        }
    }
}

Anyway, I'd be happy to fork a branch and build the price data from YahooFinancials into pandas-datareader's get_data_yahoo() method if you all want. I'd also be happy to work with one of your contributors to do so. Just let me know!

More details at:
https://github.com/JECSand/yahoofinancials
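
Since the questions mentioned here revolve around pandas, a rough sketch of turning the `prices` list from the sample response above into a DataFrame may help. It works on a hard-coded copy of that sample rather than calling the module, and `prices_to_frame` is a hypothetical helper, not part of YahooFinancials or pandas-datareader.

```python
import pandas as pd

# Trimmed copy of the sample response shown above.
sample = {
    "WFC": {
        "prices": [
            {
                "adjclose": 57.19147872924805,
                "close": 57.61000061035156,
                "date": 1533096000,
                "formatted_date": "2018-08-01",
                "high": 59.5,
                "low": 57.08000183105469,
                "open": 57.959999084472656,
                "volume": 138922900,
            }
        ]
    }
}


def prices_to_frame(response, symbol):
    """Build a date-indexed DataFrame from one symbol's 'prices' list."""
    frame = pd.DataFrame(response[symbol]["prices"])
    # Use the human-readable date as the index and drop the raw timestamp.
    frame["Date"] = pd.to_datetime(frame.pop("formatted_date"))
    return frame.drop(columns="date").set_index("Date")


df = prices_to_frame(sample, "WFC")
print(df)
```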

magicmathmandarin

magicmathmandarin commented on Apr 1, 2019

@magicmathmandarin

Hi, Yahoo Finance is working for me. I am confused as to why you guys are saying it is deprecated.

GrechTsangWL

GrechTsangWL commented on Apr 5, 2019

@GrechTsangWL
bashtage

bashtage commented on Apr 5, 2019

@bashtage
Contributor
magicmathmandarin

magicmathmandarin commented on Apr 5, 2019

@magicmathmandarin
abdoulayegk

abdoulayegk commented on Jun 29, 2020

@abdoulayegk

Can anyone help me fix this error?

NotImplementedError: data_source=datetime.datetime(2015, 1, 1, 0, 0) is not implemented
Traceback:
File "/home/balde/.local/lib/python3.8/site-packages/streamlit/ScriptRunner.py", line 322, in _run_script
exec(code, module.__dict__)
File "/home/balde/Desktop/Projects/StockProject/stock_app.py", line 14, in <module>
globals()[stock] = DataReader(stock,start, end)
File "/home/balde/.local/lib/python3.8/site-packages/pandas/util/_decorators.py", line 214, in wrapper
return func(*args, **kwargs)
File "/home/balde/.local/lib/python3.8/site-packages/pandas_datareader/data.py", line 376, in DataReader
raise NotImplementedError(msg)
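
The message reports `data_source=datetime.datetime(2015, 1, 1, 0, 0)`, which suggests the start date landed in the `data_source` slot. The traceback at the top of this thread lists the leading parameters of `DataReader` as `name, data_source, start, end`. A minimal sketch, using a hypothetical stand-in function with that parameter order (not the real library call; the symbol, the dates, and the `'yahoo'` source are example values), shows how the positional arguments bind:

```python
import datetime as dt
import inspect


def data_reader_stub(name, data_source=None, start=None, end=None):
    """Hypothetical stand-in mirroring the parameter order in the traceback."""


signature = inspect.signature(data_reader_stub)
start = dt.datetime(2015, 1, 1)
end = dt.datetime(2020, 1, 1)  # example end date

# Three positional arguments: `start` fills the data_source slot.
wrong = signature.bind("AAPL", start, end).arguments

# Naming the source (or using keywords) puts each value where it belongs.
right = signature.bind("AAPL", "yahoo", start=start, end=end).arguments

print(wrong["data_source"])
print(right["data_source"])
```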

krb1971

krb1971 commented on Sep 5, 2020

@krb1971

I recently started to learn python for finance. From this thread, I understood at some level the yahoo fin package related issues but while I started to run the following code, 'ticker' not found error is persistent. Could you please check it and guide me further?

Here are the libraries used and the code:

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout, Bidirectional
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from yahoo_fin import stock_info as si
from collections import deque

import numpy as np
import pandas as pd
import random

import matplotlib
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import matplotlib.dates as mdates
import numpy as np

def load_data(ticker, n_steps=50, scale=True, shuffle=True, lookup_step=1,
              test_size=0.2, feature_columns=['adjclose', 'volume', 'open', 'high', 'low']):
    # see if ticker is already a loaded stock from yahoo finance
    if isinstance(ticker, str):
        # load it from yahoo_fin library
        df = si.get_data(ticker)
    elif isinstance(ticker, pd.DataFrame):
        # already loaded, use it directly
        df = ticker
    # this will contain all the elements we want to return from this function
    result = {}
    # we will also return the original dataframe itself
    result['df'] = df.copy()
    # make sure that the passed feature_columns exist in the dataframe
    for col in feature_columns:
        assert col in df.columns, f"'{col}' does not exist in the dataframe."
    if scale:
        column_scaler = {}
        # scale the data (prices) from 0 to 1
        for column in feature_columns:
            scaler = preprocessing.MinMaxScaler()
            df[column] = scaler.fit_transform(np.expand_dims(df[column].values, axis=1))
            column_scaler[column] = scaler

        # add the MinMaxScaler instances to the result returned
        result["column_scaler"] = column_scaler
    # add the target column (label) by shifting by `lookup_step`
    df['future'] = df['adjclose'].shift(-lookup_step)
    # last `lookup_step` rows contain NaN in the future column
    # get them before dropping NaNs
    last_sequence = np.array(df[feature_columns].tail(lookup_step))
    # drop NaNs
    df.dropna(inplace=True)
    sequence_data = []
    sequences = deque(maxlen=n_steps)
    for entry, target in zip(df[feature_columns].values, df['future'].values):
        sequences.append(entry)
        if len(sequences) == n_steps:
            sequence_data.append([np.array(sequences), target])
    # get the last sequence by appending the last `n_step` sequence with `lookup_step` sequence
    # for instance, if n_steps=50 and lookup_step=10, last_sequence should be of 59 (that is 50+10-1) length
    # this last_sequence will be used to predict in future dates that are not available in the dataset
    last_sequence = list(sequences) + list(last_sequence)
    # shift the last sequence by -1
    last_sequence = np.array(pd.DataFrame(last_sequence).shift(-1).dropna())
    # add to result
    result['last_sequence'] = last_sequence
    # construct the X's and y's
    X, y = [], []
    for seq, target in sequence_data:
        X.append(seq)
        y.append(target)
    # convert to numpy arrays
    X = np.array(X)
    y = np.array(y)
    # reshape X to fit the neural network
    X = X.reshape((X.shape[0], X.shape[2], X.shape[1]))
    # split the dataset
    result["X_train"], result["X_test"], result["y_train"], result["y_test"] = train_test_split(X, y, test_size=test_size, shuffle=shuffle)
    # return the result
    return result

# load the data

data = load_data(ticker, N_STEPS, lookup_step=LOOKUP_STEP, test_size=TEST_SIZE, feature_columns=FEATURE_COLUMNS)

THIS GIVES FOLLOWING ERROR:

NameError Traceback (most recent call last)
in ()
1 # load the data
----> 2 data = load_data(ticker, N_STEPS, lookup_step=LOOKUP_STEP, test_size=TEST_SIZE, feature_columns=FEATURE_COLUMNS)
3
4 # save the dataframe
5 data["df"].to_csv(ticker_data_filename)

NameError: name 'ticker' is not defined

rgkimball

rgkimball commented on Sep 5, 2020

@rgkimball
Contributor

# load the data

data = load_data(ticker, N_STEPS, lookup_step=LOOKUP_STEP, test_size=TEST_SIZE, feature_columns=FEATURE_COLUMNS)

THIS GIVES FOLLOWING ERROR:

NameError Traceback (most recent call last)
in ()
1 # load the data
----> 2 data = load_data(ticker, N_STEPS, lookup_step=LOOKUP_STEP, test_size=TEST_SIZE, feature_columns=FEATURE_COLUMNS)
3
4 # save the dataframe
5 data["df"].to_csv(ticker_data_filename)

NameError: name 'ticker' is not defined

@krb1971,

The error you've shared doesn't have anything to do with the pandas datareader package. You need to define ticker before passing it into the function.
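
To illustrate: a parameter name in a `def` line only exists inside the function, so the caller still has to create a variable (or pass a literal) before making the call. The sketch below uses a hypothetical stand-in, `load_data_stub`, rather than the real function, and the ticker value `"AAPL"` is only an example.

```python
def load_data_stub(ticker, n_steps=50):
    """Hypothetical stand-in for load_data; it only echoes its arguments."""
    return {"ticker": ticker, "n_steps": n_steps}


# At this point no variable called `ticker` exists in the calling code,
# so Python fails while evaluating the argument, before the function runs.
try:
    load_data_stub(ticker)
    name_error_raised = False
except NameError:
    name_error_raised = True

# Define the variable first (example value), then pass it in.
ticker = "AAPL"
result = load_data_stub(ticker, n_steps=50)

print(name_error_raised, result)
```

The same applies to `N_STEPS`, `LOOKUP_STEP`, `TEST_SIZE` and `FEATURE_COLUMNS` in the call above: each needs a value assigned before the line that uses it.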

krb1971

krb1971 commented on Sep 6, 2020

@krb1971

Thank you for responding. When I defined load_data as following, it did not ask for defining 'ticker'

def load_data(ticker, n_steps=50, scale=True, shuffle=True, lookup_step=1,
test_size=0.2, feature_columns=['adjclose', 'volume', 'open', 'high', 'low']):
....................

But when I use ticker below, it complains as "ticker" not defined. As I am novice to this, kindly guide.

data = load_data(ticker, N_STEPS, lookup_step=LOOKUP_STEP, test_size=TEST_SIZE, feature_columns=FEATURE_COLUMNS)

Franlodo

Franlodo commented on Sep 6, 2020

@Franlodo
