Web Scraping With Python, Beautiful Soup, and urllib3

In this day and age, information is key. Through the internet, we have a nearly unlimited amount of information and data at our disposal. The problem, however, is that this abundance can leave us as users overwhelmed. Fortunately, programmers can write scripts that do the sorting, organizing, and extracting of this data for us. Work that would take hours to complete by hand can be accomplished with just over 50 lines of code and run in under a minute. Today, using Python, Beautiful Soup, and urllib3, we will do a little web scraping and even scratch the surface of extracting data to an Excel document.

Research

The website that we will be working with is books.toscrape.com, a site made specifically for practicing web scraping. Before we begin, please understand that we won't be rotating our IP addresses or user agents. On other websites, however, this may be a good idea, since they will most likely block you if you're not "polite." (I'll talk more about the concept of being polite in later posts. For now, just know that it means leaving some time between your individual scrapes.)
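To make the idea of being "polite" a little more concrete, here is a minimal sketch of spacing out requests. The function names (fetch_with_urllib3, polite_scrape) and the one-second default delay are my own assumptions for illustration, not something the practice site requires, and real sites may call for longer pauses.

```python
import time


def fetch_with_urllib3(url):
    """Download one page with urllib3 and return the raw response body (bytes)."""
    # Imported here so the pacing helper below can be tried without urllib3 installed.
    import urllib3

    http = urllib3.PoolManager()
    response = http.request("GET", url)
    return response.data


def polite_scrape(urls, fetch=fetch_with_urllib3, delay_seconds=1.0, sleep=time.sleep):
    """Fetch each URL in order, pausing between consecutive requests.

    delay_seconds is an assumed value; tune it to the site you are scraping.
    fetch and sleep are parameters so they can be swapped out when experimenting.
    """
    pages = []
    for index, url in enumerate(urls):
        if index > 0:
            # Wait before every request except the first one.
            sleep(delay_seconds)
        pages.append(fetch(url))
    return pages
```

Calling polite_scrape(["http://books.toscrape.com/"]) would download the home page once. With several URLs in the list, the helper waits delay_seconds before each request after the first.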