MATLAB Answers

How do I read raw text from a webpage (ESPN Fantasy Baseball)?

Asked by James Bopp on 20 May 2019
Latest activity Commented on by James Bopp on 20 May 2019
I am trying to read the raw text from the following url (using Google Chrome, and R2016b MATLAB):
The format of the text is JSON. All I need is the raw text so I can parse it with JSONDECODE. Right now I'm having to manually copy and paste the text from the webpage into a text document, which allows me to read the data as a string. I don't want to do this. I am open to any methods that will allow me to go to the above url and directly read all of the text into a string.
Bonus problem: the above URL is for a public website. However, I would also like to be able to log into ESPN at the following URL: http://www.espn.com/login and then proceed with getting the text from the URL.
To sum up, I want to do the following:
1) Log into ESPN (go to its URL and input the user/password, either manually or programmatically)
2) Go to an ESPN URL to read in its data (either the raw text, which I can decode with JSONDECODE, or read it in directly as a JSON structure).


1 Answer

Answer by Geoff Hayes
on 20 May 2019

James - as a first step, try using webread to read the data from the URL. If the result comes back as text rather than as an already-decoded structure, you can then pass it to jsondecode:
urlToFetch = 'http://fantasy.espn.com/apis/v3/games/flb/seasons/2019/segments/0/leagues/15243217?view=mMatchupScore&view=mRoster&view=mScoreboard&view=mSettings&view=mTopPerformers&view=mTeam&view=modular&view=mNav';
jsonData = webread(urlToFetch);
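If the raw JSON text itself is wanted (as the question asks), one option is to tell webread not to convert the response. The following is a minimal sketch, assuming the server returns JSON for this URL and that the public league is still reachable; the variable names `opts`, `rawText` and `leagueData` and the 30-second timeout are illustrative choices, not part of the original answer.

```matlab
% URL taken from the answer above.
urlToFetch = 'http://fantasy.espn.com/apis/v3/games/flb/seasons/2019/segments/0/leagues/15243217?view=mMatchupScore&view=mRoster&view=mScoreboard&view=mSettings&view=mTopPerformers&view=mTeam&view=modular&view=mNav';

% Request the response as plain text so webread does not convert it.
% The timeout value is an arbitrary illustrative choice.
opts = weboptions('ContentType', 'text', 'Timeout', 30);

% rawText is a character vector holding the JSON exactly as the server sent it.
rawText = webread(urlToFetch, opts);

% Decode the text into MATLAB data (jsondecode requires R2016b or later).
leagueData = jsondecode(rawText);
```

Keeping `rawText` around can be handy for saving the response to disk or inspecting it before decoding.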
As for logging into the website, that may be a little more difficult. Do you know of any public APIs that would allow this? I suspect that when you log in, a token of some kind is returned that you would need to send on subsequent calls to get the other data.

  3 Comments

Thank you, Geoff.
In the case of a public league (using the URL I provided), your solution is perfect. However, a private league gives me an error, which is why I'm looking for a way to log in, or a workaround that reads the text directly from a webpage.
I have tried the following: I used MATLAB's web browser (web.m) to log into the ESPN website, then called the function again with my URL (e.g. [data,h] = web(urlToFetch)). This opens the web page in MATLAB's web browser, but I have no way of extracting the text programmatically (I have to copy/paste manually).
Edit: I did not provide the error I get when attempting to use webread with a private URL. Perhaps that would be useful:
Error using readContentFromWebService (line 45)
The server returned the status 401 with message "" in response to the request to URL
The 401 error means that you are unauthorized and so need to provide the correct login credentials.
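One possible way to supply credentials from MATLAB is to send the session cookie of a logged-in browser along with the request. This is only a sketch under assumptions: that ESPN accepts a browser session cookie for these API calls (not confirmed in this thread), that your MATLAB release supports the `HeaderFields` option of weboptions, and that `cookieString` (a hypothetical placeholder) is replaced with the Cookie header value copied from your own logged-in session. The URL is a shortened form of the one used above.

```matlab
% Hypothetical placeholder: paste the Cookie header value copied from a
% logged-in browser session. Whether ESPN accepts it is an assumption.
cookieString = 'PASTE_COOKIE_HEADER_VALUE_HERE';

% HeaderFields takes name/value pairs that are added to the HTTP request.
opts = weboptions('ContentType', 'text', ...
                  'HeaderFields', {'Cookie', cookieString});

% Shortened form of the league URL from this thread (single view parameter).
privateUrl = 'http://fantasy.espn.com/apis/v3/games/flb/seasons/2019/segments/0/leagues/15243217?view=mRoster';

try
    rawText = webread(privateUrl, opts);
    leagueData = jsondecode(rawText);
catch err
    % webread reports HTTP failures, such as the 401 quoted above, as errors.
    fprintf('Request failed: %s\n', err.message);
end
```

If the request still fails with status 401, the cookie was either not accepted or has expired, and a different authentication route would be needed.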
As for getting the content (the json) from the web browser, could you try something like
urlToFetch = 'http://fantasy.espn.com/apis/v3/games/flb/seasons/2019/segments/0/leagues/15243217?view=mMatchupScore&view=mRoster&view=mScoreboard&view=mSettings&view=mTopPerformers&view=mTeam&view=modular&view=mNav';
[stat,h] = web(urlToFetch);      % h is a handle to the MATLAB web browser
jsonData = get(h, 'HtmlText');   % page content as text, including HTML tags
using the private URL once you have logged in.
Thanks again Geoff,
It's funny, I had tried that before on a different problem I was working on and ran into trouble, so when I came upon this problem I thought I had already tried it. As it turns out, I hadn't. The getHtmlText seems to work (I just need to strip the HTML tags, but that's no problem). When I get home from work I will look into this more, but I believe you have solved the problem for my workaround and I can proceed with my project. Thanks!
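The tag stripping mentioned above could be sketched roughly as follows. The sample `htmlText`, its wrapper tags and the JSON fields `id` and `name` are hypothetical stand-ins for whatever the browser actually returns; the real page may be wrapped differently and may need more entity replacements than the four handled here.

```matlab
% Hypothetical sample of what the browser's HtmlText might look like;
% the real wrapper tags and JSON fields may differ.
htmlText = '<html><head></head><body><pre>{"id":15243217,"name":"A &amp; B"}</pre></body></html>';

% Remove anything that looks like an HTML tag.
jsonText = regexprep(htmlText, '<[^>]*>', '');

% Undo common HTML entity escapes (ampersand last, to avoid unescaping twice).
jsonText = strrep(jsonText, '&quot;', '"');
jsonText = strrep(jsonText, '&lt;', '<');
jsonText = strrep(jsonText, '&gt;', '>');
jsonText = strrep(jsonText, '&amp;', '&');
jsonText = strtrim(jsonText);

% Decode the cleaned text into MATLAB data.
leagueData = jsondecode(jsonText);
```

Stripping tags before unescaping entities matters: a literal `<` inside a JSON string arrives as `&lt;`, so it survives the tag removal and is restored afterwards.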
Now, if anyone can figure out how to log into ESPN's website and reuse that login for follow-on web-scraping calls to ESPN's site, that would still be greatly appreciated.
