Getting Started With Pandas

Today we introduce one of the first chapters in our internal training series on the fundamental tools of data science: Pandas, NumPy, and Matplotlib. Pandas is a third-party library for data analysis and manipulation built on NumPy. It excels at handling labeled one-dimensional (1D) data with Series objects and two-dimensional (2D) data with DataFrame objects.
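To make the two object types concrete, here is a minimal sketch. The day labels, prices, and volumes are invented purely for illustration; only `pd.Series`, `pd.DataFrame`, and label-based indexing come from Pandas itself.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
days = ["mon", "tue", "wed"]

# A Series is labeled 1D data: one value per index label.
prices = pd.Series([10.5, 11.0, 10.8], index=days, name="price")

# A DataFrame is labeled 2D data: rows share an index, columns have names.
table = pd.DataFrame(
    {"price": [10.5, 11.0, 10.8], "volume": [100, 150, 120]},
    index=days,
)

# Values are looked up by label rather than by position.
tuesday_price = prices["tue"]
wednesday_volume = table.loc["wed", "volume"]
print(tuesday_price, wednesday_volume)
```

Each column of the DataFrame is itself a Series, which is why the two types work together so naturally.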

NumPy is a third-party library for numerical computing, optimized for working with single- and multi-dimensional arrays. Its primary type is the array type called ndarray. This library contains many routines for statistical analysis.
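As a rough illustration of the ndarray type and a few of its statistical routines, consider the sketch below. The numbers are made up for the example; the calls shown (`np.array`, `mean`, `std`) are standard NumPy.

```python
import numpy as np

# Hypothetical values, invented for this illustration.
matrix = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])

# The primary NumPy type: a 2D ndarray with 2 rows and 3 columns.
array_type = type(matrix).__name__
array_shape = matrix.shape

# Statistical routines can reduce over the whole array or along one axis.
overall_mean = matrix.mean()        # mean of all six values
column_means = matrix.mean(axis=0)  # one mean per column
row_stds = matrix.std(axis=1)       # one (population) standard deviation per row

print(array_type, array_shape, overall_mean, column_means, row_stds)
```

The `axis` argument is the main design point here: `axis=0` collapses the rows and leaves one result per column, while `axis=1` does the opposite.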

Using Technical Analysis Indicators to Send Buy-or-Sell Trading Signals to a Chatroom

In this article, I will look at how I can use technical analysis indicators to send buy-or-sell trading signals to a chatroom on an ongoing basis — removing the need to keep eyeballing Technical Analysis charts constantly.

According to Investopedia, ‘Technical Analysis is a trading discipline employed to evaluate investments and identify trading opportunities by analyzing statistical trends gathered from trading activity, such as price movement and volume’. It further states that ‘technical analysts focus on patterns of price movements, trading signals … to evaluate a security’s strength or weakness’.

My Newbie Challenges With Matplotlib

In this article, I would like to share the challenges I faced (and the solutions!) as a Python newbie using Matplotlib in anger for the first time…

Recently I was tasked with developing a Python-based workflow as well as an article for my employer. The workflow involved generating some Technical Analysis related charts and the article involved plotting some Volatility Surfaces and various Curves.

Book Review: Machine Learning With Python for Everyone by Mark E. Fenner

Machine learning, one of the hottest tech topics of today, is being used more and more. Sometimes it’s the best tool for the job; at other times it is a buzzword used mainly to make a product look cooler. Without knowing what ML is and how it works behind the scenes, it’s very easy to get lost. This book does a great job of guiding you all the way from very simple math concepts to some sophisticated machine learning techniques.

Today, the Python ecosystem offers a plethora of powerful data science and machine learning packages, such as NumPy, Pandas, Scikit-learn, and many others, which help to tame much of machine learning’s inherent complexity. In case you are wondering, the great hero of this book, in terms of Python packages, is Scikit-learn, often abbreviated as sklearn. Of course, data wrangling is much easier and faster with NumPy and Pandas, so these two packages always have sklearn’s back. Seaborn and Matplotlib, two of the most widely used data visualization packages for Python, are also used here. In chapter 10, patsy makes a brief appearance, and in chapter 15, pymc3 is used in the context of probabilistic graphical models.

Visualizing Distributions With Scatter Plots in Matplotlib

Let's say that we want to study the time between the end of a point and the next serve in a tennis match. After gathering our data, the first thing we can do is draw a histogram of the variable we are interested in:

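import pandas as pd
import matplotlib.pyplot as plt

# Load the serve-time data from the FiveThirtyEight GitHub repository
url = 'https://raw.githubusercontent.com/fivethirtyeight'
url += '/data/master/tennis-time/serve_times.csv'
event = pd.read_csv(url)

# Histogram of the seconds elapsed before the next point, in 10 bins
plt.hist(event.seconds_before_next_point, bins=10)
plt.xlabel('Seconds before next serve')
plt.ylabel('Frequency')

plt.show()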

[Figure: Histogram of seconds before next serve]