Exploring CockroachDB with ipython-sql and Jupyter Notebook

Today, I will demonstrate how ipython-sql can be used to query CockroachDB. This requires a secure instance of CockroachDB, for reasons I explain below.

Running a secure docker-compose instance of CockroachDB (CRDB) is beyond the scope of this tutorial. Instead, I will publish everything you need to get through the tutorial in my repo, including the Jupyter Notebook. You may also use the CRDB docs to stand up a secure instance and change the URL in the notebook to follow along.

This post dives deeper into the Python ecosystem and builds on my previous Python post. Instead of relying on pandas alone, we're going to use a popular SQL extension called ipython-sql, a.k.a. SQLmagic, to execute SQL queries against CRDB.


As stated earlier, we need to use a secure instance of CockroachDB. In fact, from this point forward, I will attempt to write posts only with secure clusters, as that's the recommended approach. ipython-sql uses SQLAlchemy underneath, which expects database URLs in the format postgresql://username:password@hostname:port/dbname. CockroachDB does not support password fields with insecure clusters, as passwords alone will not protect your data.
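To make the URL format concrete, here is a minimal sketch that assembles such a URL in Python. The helper name build_crdb_url, the credentials, and the sslmode query parameter are assumptions for illustration and do not come from the notebook; 26257 and defaultdb are CockroachDB's usual default port and database.

```python
from urllib.parse import quote_plus


def build_crdb_url(username, password, hostname, port=26257,
                   dbname="defaultdb", sslmode="require"):
    """Return a URL shaped like postgresql://username:password@hostname:port/dbname."""
    # Percent-encode the credentials so characters such as '@' or '/' in a
    # password are not mistaken for URL delimiters.
    user = quote_plus(username)
    secret = quote_plus(password)
    # sslmode is an assumed setting for a secure cluster; adjust to your setup.
    return f"postgresql://{user}:{secret}@{hostname}:{port}/{dbname}?sslmode={sslmode}"


# Hypothetical credentials, for illustration only.
url = build_crdb_url("maxroach", "p@ss/word", "localhost")
print(url)

# In a notebook cell, a URL of this shape is what gets passed to ipython-sql:
#   %load_ext sql
#   %sql postgresql://username:password@hostname:port/dbname
```

Encoding the password separately keeps special characters from breaking the URL, which is an easy mistake to make when pasting credentials by hand.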

Getting Started With C# DataFrame and XPlot.Plotly

For the Python programming language, Pandas is an efficient and popular data analysis tool, especially its DataFrame, which is used to manipulate and display data. For the .NET programming languages, we can use Deedle or the Microsoft.Data.Analysis package, available on NuGet, which also provides a DataFrame class used to manipulate, transform, and display data.

This example focuses on the Microsoft.Data.Analysis package by demonstrating some basic features of the DataFrame class in Jupyter Notebook.

Using .NET Core in Jupyter Notebook

Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations, and narrative text. It can be used as a tool for interactively developing and presenting data science projects. It is mostly used with Python and R, which are scripting languages. However, it can also be used with compiled languages, such as the .NET programming languages, Go, and Julia. For a list of supported programming languages, please refer to the Jupyter kernels page on GitHub.

This article explains the steps to set up Jupyter Notebook for .NET Core programming languages on Windows 10. It is based on the .NET Notebooks Preview 2, which supports C#, F#, and PowerShell. This article also provides a few C# examples that demonstrate how to use DataFrame and Charts.

A Primer on ML and Jupyter Notebook

Recently, I was working on an edge computing demo[1] that used ML (machine learning) to detect anomalies for a manufacturing use case. While I had a generic understanding of what ML is, I lacked the practitioner's understanding of how to use it. Similarly, I’d heard of Jupyter Notebook and was vaguely aware that it was connected with ML, but didn’t really know what it was or how to use it. This article is geared towards people who just want to understand ML and Jupyter Notebook. There are plenty of great resources available if you want to learn how to build ML models.

Caution: If you’re a data scientist then this article is not for you! We’ll be using very simple analysis techniques to serve as a teaching aid. 

How Python Can Be Your Secret Weapon As a Data Scientist

Python is highly versatile and one of the most advanced programming languages in the world. There are many reasons why Python is becoming extremely popular these days. Many experts consider it one of the first choices in industry when it comes to programming languages.

It is also often said that the development of future technologies will rely heavily on Python. Fields such as data science, AI, and ML are expected to take the driver's seat in combination with Python, making in-depth research easier and product development better.

Using R on Jupyter Notebook

Overview

R is an interpreted programming language for statistical computing and graphics supported by the R Foundation. It is widely used among statisticians and data miners for developing statistical software and data analysis.

R is available as Free Software under the terms of the Free Software Foundation’s GNU General Public License in source code form. It compiles and runs on a wide variety of UNIX platforms and similar systems (including FreeBSD and Linux), Windows, and macOS.

Simplifying Access to Db2 Databases in Jupyter Notebook

Are you trying to figure out the best way to access Db2 data from within your Jupyter Notebook? Or perhaps you are already using a technique and are looking for ways to simplify things? If so, did you know that there are three ways of connecting to your existing Db2 data?

  1. Use native Python Db2 API calls to connect and manipulate the data
  2. Take advantage of Pandas' built-in support for databases
  3. Install extensions to Jupyter notebooks (Magic commands)
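
As a rough sketch of what the first approach looks like in practice, the connect, execute, and fetch pattern below follows Python's DB-API 2.0. The standard-library sqlite3 module stands in for a Db2 driver here purely so the example runs anywhere; the employee table and its rows are hypothetical. A Db2 driver that follows DB-API 2.0 (the ibm_db package is understood to ship such a wrapper) would be used in a similar way, with its own connection call and credentials.

```python
import sqlite3

# sqlite3 is a stand-in for a Db2 driver; only the connect() call would
# differ for a real DB-API 2.0 compliant Db2 connection.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table and data, created here so the query has something to read.
cur.execute("CREATE TABLE employee (empno INTEGER, lastname TEXT)")
cur.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [(10, "HAAS"), (20, "THOMPSON")],
)
conn.commit()

# Run a query and pull every result row back as a list of tuples.
cur.execute("SELECT empno, lastname FROM employee ORDER BY empno")
rows = cur.fetchall()
print(rows)

conn.close()
```

The other two approaches build on the same kind of connection: Pandas can read query results into a DataFrame, and magic commands hide the cursor handling behind a notebook-friendly syntax.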

Getting access to Db2 data from within a notebook requires that you use the following command to install the appropriate Db2 drivers either from your notebook or from a shell prompt.