Next-Gen Data Pipes With Spark, Kafka, and K8s: Part 2

Introduction 

In our previous article, we discussed two emerging options for building new-age data pipes using stream processing. One option leverages Apache Spark for stream processing, and the other uses a Kafka-Kubernetes combination on any cloud platform for distributed computing. The first approach is reasonably popular, and a lot has already been written about it. However, the second option is gaining ground in the market because it is far less complex to set up and easier to maintain. Also, data-on-the-cloud is a natural outcome of the technological drivers prevailing in the market. So, this article will focus on the second approach and show how it can be implemented in different cloud environments.

Kafka-K8s Streaming Approach in Cloud

In this approach, if the number of partitions in the Kafka topic matches the replication factor of the pods in the Kubernetes cluster, then the pods together form a consumer group and deliver all the advantages of distributed computing. This can be depicted through the equation below:
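As a rough illustration of the partition-to-pod relationship described above, the following sketch simulates a simple round-robin assignment. It is not Kafka's actual assignor, and the pod names and the `assign_partitions` helper are hypothetical; it only shows that when partitions and pods match in number, each pod owns exactly one partition, and that surplus pods stay idle.

```python
from collections import defaultdict


def assign_partitions(num_partitions, pod_names):
    """Distribute partition ids across pods round-robin (illustrative only)."""
    assignment = defaultdict(list)
    for partition in range(num_partitions):
        pod = pod_names[partition % len(pod_names)]
        assignment[pod].append(partition)
    return dict(assignment)


# Hypothetical pod names; in Kubernetes these would be the replicas of one Deployment.
pods = ["consumer-pod-0", "consumer-pod-1", "consumer-pod-2"]

# Partitions == pods: every pod in the consumer group owns exactly one partition.
balanced = assign_partitions(3, pods)

# Fewer partitions than pods: the surplus pod receives no partition and sits idle.
short = assign_partitions(2, pods)
idle_pods = [pod for pod in pods if pod not in short]
```

In a real deployment, the assignment is negotiated by the Kafka group coordinator and depends on the configured assignment strategy, so treat this only as a mental model.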

Python SDKs Package Management in GCP Artifact Registry

Introduction

Using a centralized, private repository to host SDK as a package not only enables code reuse but also simplifies and secures the existing software delivery pipeline. By using the same formats and tools as you would in the open-source ecosystem, you can leverage the same advantages, simplify building, and keep business logic and applications secure.

Storing SDK packages in Google Cloud Artifact Registry not only enables SDK code reuse but also simplifies and secures your existing build pipeline. In addition to bringing your internal packages to a managed repository, using Artifact Registry also allows you to take additional steps to improve the security of your software delivery pipeline. 

Sample Architecture Using Amazon AWS, Microsoft Azure, Google GCP, MongoDB, and Couchbase

A drawing should have no unnecessary lines and a machine no unnecessary parts. 

                William Strunk Jr., Elements of Style

In the book Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems, Martin Kleppmann has written about traits and trade-offs for data infrastructure while designing modern applications. He has given an example architecture for a data system that combines several components. I used this example for the article Example Architectures for Data-Intensive Applications. That article explored just the Couchbase features and functions.

10 Rules for Better Cloud Security

Introduction

An estimated 50% of all global corporate data is already stored in the cloud, which is quite telling about the explosive growth of this still relatively young sector. We all know the benefits that propelled this adoption: increased agility, ease of scaling, and cost-effectiveness.

But regarding security, things are more nuanced: for some, the idea of handing most, if not all, of a business’s most valuable assets to a third-party organization is kind of crazy, but for the vast majority of others, it makes total sense. You can benefit from the enormous security resources put in place by the cloud providers to protect your data, with the very best engineers working 24/7 to fulfill their mission. Even so, this is not quite the end of the story.

What Does 2022 Have in Store for Cybersecurity and Cloud Security Specialists?

Cloud adoption and industry transformation are accelerating as the world looks for efficiency. Let’s face it, 2022 promises to be another busy year for cybersecurity and cloud security specialists. 

According to the 2021 (ISC)² Cybersecurity Workforce Study, the world is still short 2.7 million cybersecurity professionals. There aren’t enough people to keep up with the rising threat, so we need to rely heavily on automation to tackle it.

Contextual Design With Google Actions

One of the most complex tasks when designing a conversation is creating natural interactions with our users. However, there is a process called contextual design that helps us create these natural conversations. With contextual design, we can shape the conversation according to the current situation of our users.

For example, if the user is using our Google Action for the first time, we will greet them with a different welcome message than when they access it for the second time. Another example: if the user is from a particular city, we can provide information related to that city by accessing geoinformation. Contextual design is one of the keys to conversational AI.
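The two examples above can be sketched as a small function. This is only an illustration in plain Python, not the Google Actions API: the `UserContext` fields, the `welcome_message` helper, and the message texts are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserContext:
    visit_count: int             # hypothetical: number of previous sessions
    city: Optional[str] = None   # hypothetical: city resolved from geoinformation


def welcome_message(ctx: UserContext) -> str:
    """Pick a greeting based on what we know about the user's situation."""
    if ctx.visit_count == 0:
        greeting = "Welcome! Here is what this Action can do."
    else:
        greeting = "Welcome back!"
    # Add location-specific content only when a city is known.
    if ctx.city:
        greeting += f" Here is the latest information for {ctx.city}."
    return greeting


first_time = welcome_message(UserContext(visit_count=0))
returning = welcome_message(UserContext(visit_count=2, city="Seville"))
```

The point of the design is that the conversation logic reads from a context object rather than hard-coding a single greeting, so new signals can be added without rewriting the dialog.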

Multimodal Design With Google Actions: Rich Responses Using Cards

Creating conversations is a really hard task. It is an entire design process that can take a lot of time. With voice assistants, this process is even more complex because you can interact with the user through both sound and a display. When you mix these two modes of interaction, you are creating a multimodal experience.

In this article, we will learn how to create engaging conversations using multimodality in our Google Action thanks to its Rich Responses using Cards.

Multimodal Design With Google Actions: Visual Selection Responses Using Collections

Creating conversations is a really hard task. It is an entire design process that can take a lot of time. With voice assistants, this process is even more complex because you can interact with the user through both sound and a display. When you mix these two modes of interaction, you are creating a multimodal experience.

In this article, we will learn how to create engaging conversations using multimodality in our Google Action thanks to its Visual Selection Responses using Collections.

Local Debugging on a Google Action

Google Actions can be developed using Firebase Cloud Functions or a REST API endpoint. Firebase Cloud Functions is Google's implementation of serverless functions available in Firebase. Google recommends using Firebase Cloud Functions for Google Action development.

This is a very lightweight and powerful approach to developing our Google Action. However, it is complex to work locally with serverless functions like Firebase Cloud Functions.

Google Action With Node.js

Google Actions can be developed using Firebase Cloud Functions or a REST API endpoint. Firebase Cloud Functions is Google's implementation of serverless functions available in Firebase. Google recommends using Firebase Cloud Functions for Google Action development.

In this post, we will implement a Google Action for Google Assistant by using Node.js, yarn, and Firebase Cloud Functions. This Google Action is basically a Hello World example.

Google Action Type Importer

This CLI allows you to transform your Alexa Custom Slots into Google Action Types.

Preface

Natural Language Understanding

NLU, or Natural Language Understanding, is a field of AI that allows us to understand users' input in the form of voice or text.

0 to 100 Your DevOps Using Zeet

My wife (Nicole) has been by my side for several years – watching me architect, design, and develop applications. She has witnessed ideas born from cocktail napkins become part of the hundreds of articles I have written as a freelance writer. Nicole was there when I designed quite successful applications for her mother and twin sister.

She is also the decorator of our home. Above my desk is a very cool reminder she had made, which simply states "everything begins with an idea." To help you visualize this piece of artwork, I took a photo using my smartphone:

DevOps Services Pricing: AWS vs Azure vs Google Cloud

Cloud computing has rapidly become a strong driving factor for companies worldwide, as software is moved out of in-house data centers in an effort to modernize, reduce costs, and boost agility. Businesses increasingly use it as an all-in-one solution, a model in which a third-party supplier hosts and manages a customer’s fundamental infrastructure.

A battle is going on in the market among the most popular DevOps services: Amazon Web Services, Azure DevOps services, and Google Cloud services. Based on Statista analytics, Amazon Web Services, the most prominent provider in the cloud computing industry, held 32% of the total market in the third quarter of 2021. Microsoft Azure came in second place with a 21% market share, followed by Google Cloud with 8%. Together, these three cloud suppliers clearly led the market in the third quarter of 2021.

How to Hive on GCP Using Google DataProc and Cloud Storage: Part 1

Google Cloud Dataproc is a managed Spark and Hadoop service that lets you take advantage of open-source data tools for batch processing, querying, streaming, and machine learning. This includes the Hadoop ecosystem: HDFS, the MapReduce processing framework, and a number of applications built on top of Hadoop, such as Hive, Mahout, Pig, Spark, and Hue. Hive provides an SQL-like interface for querying data stored in the various databases and file systems that integrate with Hadoop. Queries submitted via Hive are converted into MapReduce jobs that access the stored data; the results are then aggregated and returned to the user or application.

For this exercise, we will use New York City's yellow and green taxi trip data for the year 2019. Yellow taxis are the only vehicles licensed to pick up street-hailing passengers anywhere in NYC, while green taxis provide street-hail and prearranged service in northern Manhattan (above E 96th St and W 110th St) and in the outer boroughs. The dataset is available at the city portal.
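To make the idea of a Hive query becoming a MapReduce job more concrete, here is a minimal in-memory sketch of the map, shuffle, and reduce phases for an aggregation such as counting trips and summing fares per taxi type. The rows, the column names (`taxi_type`, `fare`), and the helper functions are hypothetical and do not come from the real dataset or from Hive's actual execution plan.

```python
from collections import defaultdict

# Hypothetical rows: (taxi_type, fare). Not taken from the real 2019 dataset.
trips = [
    ("yellow", 12.5),
    ("green", 8.0),
    ("yellow", 20.0),
    ("green", 5.5),
    ("yellow", 7.5),
]


def map_phase(rows):
    """Emit (key, value) pairs, as mappers would for GROUP BY taxi_type."""
    for taxi_type, fare in rows:
        yield taxi_type, fare


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Aggregate each key's values into (trip count, total fare)."""
    return {key: (len(values), sum(values)) for key, values in grouped.items()}


result = reduce_phase(shuffle(map_phase(trips)))
```

On a real cluster, the map and reduce tasks run in parallel across nodes and read from distributed storage; the sketch only mirrors the data flow.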

AWS vs Azure vs GCP: Cloud Web Services Comparison in Detail

I am sure you are acquainted with the third wave of the digital revolution: cloud computing. Well, it's time to get to know these cloud services up close and figure out whether using them actually gives you a shot.

Digitalization is being embraced by all of us across the globe, especially cloud computing technology. Whether because of their scalability, security, or reduced costs, cloud platforms have proliferated over the past few years. Gone are the days when businesses were unsure whether to choose a cloud service provider at all. Now the question is which cloud service provider to use. AWS, Azure, and Google Cloud are our top three contenders.

Cloud Technology News of the Month: September 2021

Autumn is officially here and with it another portion of fresh cloud technology news. 

This series brings you up to speed with the latest releases, acquisitions, research, and hidden gems in the world of cloud computing – the stuff actually worth reading.

Hands on Presto Tutorial: Presto 103

Introduction

This tutorial is Part III of our Getting Started with Presto series. As a reminder, PrestoDB is an open-source distributed SQL query engine. In tutorial 101, we showed you how to install and configure Presto locally, and in tutorial 102, we covered how to run a three-node PrestoDB cluster on a laptop. In this tutorial, we’ll show you how to run a PrestoDB cluster in a GCP environment using VM instances and GKE containers.

Environment

This guide was developed on GCP VM instances and GKE containers.