@@ -16,7 +16,7 @@ terminal.
 
 ## Exercise 1 - Crawl Academy Awards for Best Actor/Actress
 
-We’ve prepared a [website](http://10.0.0.1/academyawardnominees/) that
+We’ve prepared a [website](http://disco-crawler-lab.tik.ee.ethz.ch/academyawardnominees/) that
 shows a table with all actors and actresses who have been nominated for
 an Academy Award. Familiarize yourself with the page’s source code by
 using the source inspector of your browser and solve the following
@@ -41,7 +41,7 @@ Hints:
 To keep web traffic low and reduce the risk of being blacklisted, we
 have cloned some Rotten Tomatoes pages and are hosting them locally. You
 can access the detail page through a unique URL. Combine the year and
-movie title like this: <http://10.0.0.1/m/year/title> to access the
+movie title like this: <http://disco-crawler-lab.tik.ee.ethz.ch/m/year/title> to access the
 local clone of the movie detail page. (Transform the movie title to
 lower case. Remove any apostrophe characters (’) and replace spaces and
 slashes (/) with underline characters (\_)).
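
The title rules in the hint above can be sketched in Python as follows. This is a minimal illustration, not part of the lab material: the names `BASE_URL`, `slugify_title`, and `movie_detail_url` are hypothetical, and stripping both the ASCII apostrophe (') and the typographic one (’) is an assumption about how titles appear in the table.

```python
# Base of the local Rotten Tomatoes clone, taken from the hint above.
BASE_URL = "http://disco-crawler-lab.tik.ee.ethz.ch/m"


def slugify_title(title: str) -> str:
    """Apply the title rules from the hint (hypothetical helper)."""
    slug = title.lower()
    # Remove apostrophes. Handling both the ASCII and the typographic
    # form is an assumption, not something the hint specifies.
    for apostrophe in ("'", "\u2019"):
        slug = slug.replace(apostrophe, "")
    # Replace spaces and slashes with underscores.
    for char in (" ", "/"):
        slug = slug.replace(char, "_")
    return slug


def movie_detail_url(year, title: str) -> str:
    """Combine year and transformed title into the detail page URL."""
    return f"{BASE_URL}/{year}/{slugify_title(title)}"


if __name__ == "__main__":
    print(movie_detail_url(2010, "The King's Speech"))
```

The sketch applies only the transformations named in the hint; other punctuation in a title is left untouched, so it may need adjusting if some detail pages cannot be found.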