How to Do Deep Learning for Java

Deep in thought studying deep learning for Java.

Introduction

Some time ago, I came across a life-cycle management tool (or cloud service) called Valohai, and I was quite impressed by its user interface and the simplicity of its design and layout. I had a good chat about the service with a member of the Valohai team at the time and was given a demo. Before that, I had written a simple pipeline using GNU Parallel, JavaScript, Python, and Bash, and another one using only GNU Parallel and Bash.

I also thought about replacing the moving parts with ready-to-use task/workflow management tools like Jenkins X, Jenkins Pipeline, Concourse, or Airflow, but for various reasons, I did not proceed with the idea.

Machine Learning in Android Using Firebase ML Kit

Back in the day, machine learning was only possible in the cloud, as it required a lot of compute power and high-end hardware. But mobile devices have become much more powerful, and our algorithms more efficient. Together, these changes have made on-device machine learning a practical possibility rather than science fiction.

On-device machine learning is being used everywhere, such as:

The Problem Is, This Jeeves Can’t Think

Circa 2025. An autonomous BMW sedan with a passenger slows down near a crossing in LA. It has sensed an elderly couple on the pavement waiting to cross the road. A couple of minutes pass by, and both parties remain static. The couple, who are actually waiting for their son to pick them up, have no clue why the driverless car has come to a halt in front of them. They gesture for the car to go ahead, even as the passenger fumes in the backseat. But the vehicle has "machine learned" to be polite and careful.

It does not have an alternative course of behavior, unlike the resourceful Jeeves in a PG Wodehouse novel.

How to Build Pivot Tables

Did you know that a pivot table allows you to quickly summarize your data based on group, pivot, and aggregation columns? This summary might include sums, averages, or other statistics, which the pivot table splits in a meaningful way across different subgroups, drawing attention to useful information.

Fig. 1: A pivot table showing the average sunshine hours for each city in each month. This table was constructed by applying the pivoting function to a dataset that contains at least one column for month (group column), one column for city (pivot), and one column for sunshine hours (aggregation column).
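
To make the roles of the three columns concrete, here is a minimal sketch in plain Python. The `pivot_mean` helper, the field names, and the sample records are all hypothetical and invented for illustration; they do not come from the dataset behind Fig. 1.

```python
from collections import defaultdict


def pivot_mean(rows, group, pivot, value):
    """Return {group_value: {pivot_value: mean of value}} for a list of dicts."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for row in rows:
        g, p = row[group], row[pivot]
        sums[g][p] += row[value]   # accumulate the aggregation column
        counts[g][p] += 1          # remember how many rows fell in this cell
    return {g: {p: sums[g][p] / counts[g][p] for p in sums[g]} for g in sums}


# Made-up sample records: one row per observation.
records = [
    {"month": "Jan", "city": "Oslo", "sunshine_hours": 1.0},
    {"month": "Jan", "city": "Oslo", "sunshine_hours": 2.0},
    {"month": "Jan", "city": "Rome", "sunshine_hours": 4.0},
    {"month": "Jul", "city": "Oslo", "sunshine_hours": 8.0},
    {"month": "Jul", "city": "Rome", "sunshine_hours": 10.0},
    {"month": "Jul", "city": "Rome", "sunshine_hours": 12.0},
]

# Rows of the result are months (group), columns are cities (pivot),
# and each cell holds the average sunshine hours (aggregation).
table = pivot_mean(records, group="month", pivot="city", value="sunshine_hours")
```

Swapping the `group` and `pivot` arguments transposes the table, and replacing the mean with a sum or a count only changes the final division.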

Fast Deterministic Prime Test for n Less Than Quintillion

The most common way to test whether a large number is prime is the Miller-Rabin test. If the test says a number is composite, it's definitely composite. Otherwise, the number is very likely, but not certain, to be prime. A pseudoprime is a composite number that slips past the Miller-Rabin test. (Actually, a strong pseudoprime. More on that below.)

Miller-Rabin Test

The Miller-Rabin test is actually a sequence of tests, one for each prime number. First, you run the test associated with 2, then the test associated with 3, then the one associated with 5, and so on. If we know the smallest number for which a given set of these tests fails, then any smaller number that passes all of them is certainly prime. In other words, we can turn the Miller-Rabin test for probable primes into a test for provable primes.
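
The idea can be sketched in Python as follows. This is one possible implementation, not necessarily the one the article goes on to describe. It assumes a bound taken from published tables of strong pseudoprimes rather than from the text above: the smallest composite that passes the tests for the first nine primes (2 through 23) is 3,825,123,056,546,413,051, which is comfortably above a quintillion. The function names are hypothetical.

```python
BASES = (2, 3, 5, 7, 11, 13, 17, 19, 23)
# Assumed bound: smallest strong pseudoprime to all nine bases above.
LIMIT = 3_825_123_056_546_413_051


def is_strong_probable_prime(n, a):
    """Run the Miller-Rabin test associated with base a on an odd n > 2."""
    d, s = n - 1, 0
    while d % 2 == 0:          # write n - 1 = d * 2**s with d odd
        d //= 2
        s += 1
    x = pow(a, d, n)
    if x == 1 or x == n - 1:
        return True
    for _ in range(s - 1):     # square repeatedly, looking for -1 mod n
        x = x * x % n
        if x == n - 1:
            return True
    return False


def is_prime(n):
    """Deterministic primality test, valid only for n < LIMIT."""
    if n >= LIMIT:
        raise ValueError("n is outside the range covered by these bases")
    if n < 2:
        return False
    for p in BASES:            # trial division also handles n <= 23
        if n % p == 0:
            return n == p
    return all(is_strong_probable_prime(n, a) for a in BASES)
```

For example, 2047 = 23 × 89 passes the test associated with 2 alone, so `is_strong_probable_prime(2047, 2)` is true even though `is_prime(2047)` is false; running more bases is what removes such pseudoprimes below the bound.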

Deep Learning and the Human Brain: Inspiration, Not Imitation

Artificial intelligence is the future. Structurally, artificial intelligence is perceived almost as an individual entity influencing every technology. Machine learning is one of the sciences behind this entity, and deep learning is the engine that propels the science.

Deep learning exceeds the human ability to process large volumes of data. With a rush of data and the advent of faster GPUs and TPUs, deep learning is taking giant strides in image analysis, facial recognition, autonomous driving, and other areas.

Why Is Innovation Not A Corporate Priority?

Innovation is supposed to be the most valuable currency in the tech industry right now, as executives strive to cope with the volatile times we find ourselves in. The popular and business press is awash with stories of digital disruption, with an overall impression created that only the most innovative can survive.

One would imagine, therefore, that innovation is a top priority for executives the world over. Except that doesn't appear to be the case, at least not according to a recent survey from Harvard Business School, which found that just 30% of the 5,000 or so executives the researchers quizzed put innovation in their top three concerns. The survey also revealed that just 21% believed technology trends were a pressing concern, leaving the two issues ranked just 5th and 7th respectively.

Using AI To Make Hearing Aids Better

Hearing loss can be debilitating and can significantly hinder the life of an individual. One of the key challenges is in distinguishing voices in noisy environments.

A Danish team believes they may have come up with a solution, with AI being deployed to both recognize and separate voices even in unknown sound environments. The work, which was documented in a recently published paper, aims to improve how hearing aids process sound in such settings.

New Research Highlights the Long Road Still Ahead for AI

The media has been awash with breathless prose about the capabilities of artificial intelligence in recent years. One would be forgiven for thinking that machines are practically at human levels of cognition already, or at least will be very soon.

A recent study from UCLA highlights just how far there still is to go. The study illustrated a number of quite significant limitations that the researchers believe we have to understand and improve upon before we let ourselves get carried away.

Study of Medical AI Boasts Impressive Accuracy, But Doesn’t Tell the Full Story

A new study published recently in Nature Medicine and covered in Quartz suggests that AI systems may someday be able to take the diagnostic reins from physicians, at least when it comes to the diagnosis of common childhood diseases. The study’s deep-learning system was so successful, in fact, that it outperformed some doctors in correctly identifying a range of conditions. Though promising, the study is not without its limitations.

As anyone familiar with how these models work will tell you, these systems are ultimately only as good as the data upon which they’re trained, and in this instance, the data came entirely from one medical center in China. Sure, the system was able to successfully find diagnostic patterns when subsequently put to the test among this very specific community, but can we really assume it would be just as successful in, say, Manhattan (NY, not Kansas), having had no training on this vastly different population? There are certainly models out there – like this one I recently wrote about – that perform quite well in zero-shot environments, but the amount and variety of data required to make this happen are staggering.

Movie Recommendations With Spark Collaborative Filtering

Collaborative filtering (CF)[1] based on the alternating least squares (ALS) technique[2] is another algorithm used to generate recommendations. It produces automatic predictions (filtering) about the interests of a user by collecting preferences from many other users (collaborating). The underlying assumption of the collaborative filtering approach is that if person A has the same opinion as person B on one issue, A is more likely to share B's opinion on a different issue than that of a randomly chosen person. This algorithm gained a lot of traction in the data science community after it was used by the winning team of the Netflix Prize.

The CF algorithm has also been implemented in Spark MLlib[3] with the aim of fast execution on very large datasets. KNIME Analytics Platform with its Big Data Extensions offers it in the Spark Collaborative Filtering node. We will use it here to recommend movies to a new user within a KNIME implementation of the collaborative filtering solution provided in the Infofarm blog post[4].
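
To give a feel for what alternating least squares does, here is a toy sketch in plain Python. It is not the Spark MLlib implementation or the KNIME node: it assumes a rank-one model (a single latent factor per user and per movie), a plain L2 penalty `lam`, and a handful of invented ratings. All names in it are hypothetical.

```python
import random


def objective(ratings, u, v, lam):
    """Regularized squared error of the rank-one model r ~ u[user] * v[item]."""
    err = sum((r - u[a] * v[i]) ** 2 for (a, i), r in ratings.items())
    reg = lam * (sum(x * x for x in u.values()) + sum(x * x for x in v.values()))
    return err + reg


def als_rank_one(ratings, lam=0.1, sweeps=20, seed=0):
    """Alternate closed-form least-squares updates of user and item factors."""
    rng = random.Random(seed)
    users = sorted({a for a, _ in ratings})
    items = sorted({i for _, i in ratings})
    u = {a: rng.uniform(0.5, 1.5) for a in users}
    v = {i: rng.uniform(0.5, 1.5) for i in items}
    history = [objective(ratings, u, v, lam)]
    for _ in range(sweeps):
        for a in users:   # item factors fixed, solve for each user factor
            seen = [(i, r) for (b, i), r in ratings.items() if b == a]
            u[a] = sum(r * v[i] for i, r in seen) / (lam + sum(v[i] ** 2 for i, _ in seen))
        for i in items:   # user factors fixed, solve for each item factor
            seen = [(a, r) for (a, j), r in ratings.items() if j == i]
            v[i] = sum(r * u[a] for a, r in seen) / (lam + sum(u[a] ** 2 for a, _ in seen))
        history.append(objective(ratings, u, v, lam))
    return u, v, history


# Invented ratings: (user, movie) -> stars. "ann" has not rated "C".
ratings = {
    ("ann", "A"): 5, ("ann", "B"): 4,
    ("bob", "A"): 5, ("bob", "B"): 4, ("bob", "C"): 2,
    ("cy", "A"): 4, ("cy", "C"): 1,
}
u, v, history = als_rank_one(ratings)
predicted = u["ann"] * v["C"]   # estimate for the missing (ann, C) rating
```

Because each update exactly minimizes the objective over one block of factors while the other block is held fixed, the values in `history` never increase. A production implementation uses several latent factors per user and item and distributes these per-user and per-item solves across a cluster.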