The Distributed Data Problem

Today, online retailers sell millions of products and services to customers all around the world. This was even more pronounced in 2020, as COVID-19 restrictions all but eliminated visits to brick-and-mortar stores and in-person transactions. Of course, consumers still needed to purchase food, clothing, and other essentials, and, as a result, worldwide digital sales rose to $4.2 trillion, up $900 billion from just a year prior.

Was it enough for those retailers to have robust websites and mobile apps to keep their customers from shopping with competitors? Unfortunately not. Looking across the eCommerce landscape of 2020, there were clear winners and losers. But what was the deciding factor?

Optimizing Distributed Joins: Google Cloud Spanner and DataStax Astra DB

Distributed joins are commonly considered too expensive to use for real-time transaction processing. That is because, besides joining data, they frequently require moving or shuffling data between nodes in a cluster, which can significantly affect query response times and database throughput. However, certain optimizations can completely eliminate the need to move data and so enable faster joins. In this article, we first review the four types of distributed joins: shuffle join, broadcast join, co-located join, and pre-computed join. We then demonstrate how leading fully managed relational and NoSQL databases, namely Google Cloud Spanner and DataStax Astra DB, support optimized joins that are suitable for real-time applications.

Four Types of Distributed Joins

Joins are used in databases to combine related data from one or more tables or datasets. Data is usually combined based on some condition that relates columns from participating tables. We call columns used in a join condition join keys and assume they are always related by equality operators.
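
To make the cost of data movement concrete, here is a minimal single-process sketch of a shuffle join on an equality join key. The Customer and Order tables, the nodeFor function, and the use of in-memory vectors as "nodes" are all assumptions made for illustration; they are not how Cloud Spanner or Astra DB implement joins. Both inputs are repartitioned by a hash of the join key, so matching rows end up on the same node and can be joined locally.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical tables, used only for this illustration.
struct Customer { int id; std::string name; };
struct Order { int customerId; std::string item; };

using JoinedRow = std::pair<std::string, std::string>;  // (customer name, item)

// Maps a join key to a node. Both tables use the same function, so rows
// with equal keys always land on the same node. Assumes nodeCount > 0.
std::size_t nodeFor(int joinKey, std::size_t nodeCount) {
    return std::hash<int>{}(joinKey) % nodeCount;
}

// Equi-join on one node: build a hash table on customers, probe with orders.
std::vector<JoinedRow> localHashJoin(const std::vector<Customer>& customers,
                                     const std::vector<Order>& orders) {
    std::unordered_multimap<int, std::string> namesById;
    for (const auto& c : customers) namesById.emplace(c.id, c.name);

    std::vector<JoinedRow> out;
    for (const auto& o : orders) {
        auto range = namesById.equal_range(o.customerId);
        for (auto it = range.first; it != range.second; ++it) {
            out.emplace_back(it->second, o.item);
        }
    }
    return out;
}

// Shuffle join: repartition both inputs by join key (this is the step that
// moves data between nodes), then join each partition locally.
std::vector<JoinedRow> shuffleJoin(const std::vector<Customer>& customers,
                                   const std::vector<Order>& orders,
                                   std::size_t nodeCount) {
    std::vector<std::vector<Customer>> customerParts(nodeCount);
    std::vector<std::vector<Order>> orderParts(nodeCount);
    for (const auto& c : customers) customerParts[nodeFor(c.id, nodeCount)].push_back(c);
    for (const auto& o : orders) orderParts[nodeFor(o.customerId, nodeCount)].push_back(o);

    std::vector<JoinedRow> out;
    for (std::size_t n = 0; n < nodeCount; ++n) {
        auto part = localHashJoin(customerParts[n], orderParts[n]);
        out.insert(out.end(), part.begin(), part.end());
    }
    return out;
}
```

If both tables were already stored partitioned by the join key, the repartitioning loops would be unnecessary and only the local joins would remain, which is the idea behind a co-located join.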

Google Cloud Messaging with Payload

Google Cloud Messaging (or GCM) sends two types of messages:

  1. Collapsible, “send-to-sync” messages, where new messages replace older ones in the sending queue (i.e., the older messages are “collapsed”).
  2. Non-collapsible messages with payload, where every single message is delivered.

Each payload in a non-collapsible message is unique content that has to be delivered and can’t simply be replaced with a more recent message in the server sending queue. A collapsible message, on the other hand, can be a simple ping from the server asking its mobile clients to sync their data.
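
The difference between the two message types can be modeled with a small sending queue. This is only a sketch of the behavior described above, not GCM's server implementation; the Message and SendQueue types and the convention that an empty collapseKey marks a non-collapsible message are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical message model: an empty collapseKey means "non-collapsible".
struct Message {
    std::string collapseKey;
    std::string payload;
};

class SendQueue {
public:
    // A collapsible message replaces a pending message with the same key;
    // a non-collapsible message is always appended, so none is ever dropped.
    void enqueue(const Message& message) {
        if (!message.collapseKey.empty()) {
            for (auto& pending : queue_) {
                if (pending.collapseKey == message.collapseKey) {
                    pending = message;  // the older message is "collapsed"
                    return;
                }
            }
        }
        queue_.push_back(message);
    }

    std::size_t size() const { return queue_.size(); }
    const Message& at(std::size_t index) const { return queue_.at(index); }

private:
    std::deque<Message> queue_;
};
```

Enqueuing two "sync" pings and two chat payloads leaves three pending messages: the latest ping and both chat payloads.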

Geek Reading – Cloud, SQL, NoSQL, HTML5

I have talked about human filters and my plan for digital curation. These items are the fruits of those ideas, the items I deemed worthy from my Google Reader feeds. These items are a combination of tech business news, development news and programming tools and techniques.

I hope you enjoy today’s items, and please participate in the discussions on those sites.

Android Cloud Apps with Azure

A recent study by Gartner predicts a very significant increase in cloud usage by consumers in a few years, due in great part to the ever-growing use of smartphone cameras by the average household. In this context, it could be useful to have a smartphone application that is able to upload and download digital content from a cloud provider.

In this article, we will construct a basic Android prototype that will allow us to plug in the Windows Azure cloud provider and use the Windows Azure Toolkit for Android (available at GitHub) to do all of the basic cloud operations: upload content to cloud storage, browse the storage, and download or delete files in cloud storage. Once those operations are implemented, we will see how to enable our Android application to receive server push notifications.

Mobile Database Essentials

Relational, NoSQL, cloud-based, embedded, multi-model — the database options are endless. When selecting the right database, it is important to explore essential components like local data storage, synchronization, security, and more. In this Refcard, assess critical data needs, storage requirements, and more when leveraging databases for cloud and edge applications.

Arduino Data on MQTT

MQTT is an OASIS standard messaging protocol for the Internet of Things (IoT) and one of the protocols supported by akenza. 
It is designed as an extremely lightweight publish/subscribe messaging protocol that is ideal for connecting remote devices with a small code footprint and minimal network bandwidth. MQTT is used in various industries. 
To run this project, we used akenza as an IoT platform, as it runs an open-source MQTT broker from Eclipse Mosquitto. By using a combination of MQTT and API functionalities, we were able to automatically create Digital Twins for our device.

As hardware, we chose an Arduino Uno WiFi Rev2.

1. Configure the Arduino Device

1.1 Set up the WiFi Connection

To connect the Arduino Uno WiFi to a WiFi network, we used the WiFiNINA library, available in the Library Manager of the Arduino IDE.

1.1.1 Manage Username and Password

To manage the network name and password, we created an additional header file called arduino_secrets.h:
 
#define SECRET_SSID "<your network SSID>"
#define SECRET_PASS "<your password>"


1.1.2 WiFi Connection Code

The code to connect the Arduino to WiFi is shown below:
 
#include <WiFiNINA.h>
#include "arduino_secrets.h"

///////please enter your sensitive data in the Secret tab/arduino_secrets.h
char ssid[] = SECRET_SSID;     // your network SSID (name)
char pass[] = SECRET_PASS;    // your network password (use for WPA, or use as key for WEP)

WiFiClient wifiClient;

void setup() {
  //Initialize serial and wait for port to open:
  Serial.begin(9600);
  while (!Serial) {
    ; // wait for serial port to connect. Needed for native USB port only
  }

  // attempt to connect to Wifi network:
  Serial.print("Attempting to connect to WPA SSID: ");
  Serial.println(ssid);
  while (WiFi.begin(ssid, pass) != WL_CONNECTED) {
    // failed, retry
    Serial.print(".");
    delay(5000);
  }

  Serial.println("You're connected to the network");
  Serial.println();
}

void loop()
{}


1.2 Set up the MQTT Connection to akenza

For security reasons, akenza only supports authenticated connections via MQTT. To manage our MQTT connection, we chose the PubSubClient library, which lets us pass a username and password when connecting.
 
#include <PubSubClient.h>

char host[] = "mqtt.akenza.io";
char clientid[] = "Arduino";
char username[] = "<copy from Akenza Device Api configuration>";
char password[] = "<copy from Akenza Device Api configuration>";
char outTopic[] = "<copy from Akenza Device Api configuration>";

// Called by PubSubClient whenever a message arrives on a subscribed topic
void callback(char* topic, byte* payload, unsigned int length) {
  Serial.print("Message arrived on ");
  Serial.println(topic);
}

// wifiClient is the WiFiClient declared in the WiFi connection code above
PubSubClient client(host, 1883, callback, wifiClient);

void setup() {
  // ... serial and WiFi connection code from section 1.1.2 goes here ...

  // The first argument of connect() is the MQTT client ID, not the host
  if (client.connect(clientid, username, password)) {
    Serial.print("Connected to ");
    Serial.println(host);
    Serial.println();

    if (client.subscribe(outTopic)) {
      Serial.print("Subscribed to ");
      Serial.println(outTopic);
      Serial.println();
    }
  } else {
    // Connection failed: client.state() returns a code
    // that indicates why it failed.
    Serial.print("Connection failed: ");
    Serial.println(client.state());
    Serial.println();
  }
}
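
Once the connection is established, the device still needs a payload to send. The helper below is a minimal sketch of building a small JSON message; the buildPayload name, the "temperature" field, and the tenths-of-a-degree input are assumptions made for this example, not part of the akenza API. It uses integer formatting only, because floating-point support in snprintf is not available on every Arduino core.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Formats a reading given in tenths of a degree (215 means 21.5) as JSON.
// Returns the number of characters written, or -1 if the buffer is too small.
int buildPayload(char* buffer, size_t size, int tenthsOfDegree) {
  const char* sign = tenthsOfDegree < 0 ? "-" : "";
  int magnitude = abs(tenthsOfDegree);
  int written = snprintf(buffer, size, "{\"temperature\":%s%d.%d}",
                         sign, magnitude / 10, magnitude % 10);
  if (written < 0 || (size_t)written >= size) {
    return -1;  // output was truncated or formatting failed
  }
  return written;
}
```

In the sketch, the resulting buffer could then be sent from loop() with client.publish(topic, buffer), using the topic copied from the akenza device configuration, with client.loop() called regularly so that PubSubClient can keep the connection alive.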


Advanced Cloud Security

Cyber threats have become more sophisticated. Hence, it is a good idea to utilize the expertise of public cloud providers to better manage assets against security threats. Cloud security is a collection of proactive measures to protect your cloud assets from internal and external threats. In this Refcard, we will walk through common cloud security challenges, continuous security for cloud infrastructure, and advanced strategies for securing cloud workloads.

What Should I Choose in 2015 – Cloud Hosting VS VPS Hosting?

This article was originally published on 1/5/14

The countdown to 2015 has already begun, and businesses, entrepreneurs especially, are eager to roll out new strategies and technical innovations in the New Year. 2014 was a year of many ups and downs for most ventures because of the various algorithm updates rolled out by the search giant.

Key Takeaways: Adrian Cockcroft’s talk on Netflix, CD, and Microservices

One of the big draws of the O'Reilly Software Architecture Conference was Adrian Cockcroft's talk, "Deliver Faster and Spend Less with Cloud Native Microservices."  Cockcroft is an experienced speaker on the conference circuit and he's well-known as the architect who led Netflix into its new era of unprecedented scale and agility.  

He now works for Battery Ventures, but he still draws primarily on his experiences at Netflix for his talks.  He and his team were the ones behind the greatest success story for the latest trend in software architecture: microservices.

Google Cloud Messaging with Android

You have probably heard a lot of talk about the wonderful things the cloud can do for you, and you are probably curious about how those services may come into play in your daily life. If this sounds like you, then you need to know that cloud services are playing an increasingly important role in our lives, and we need to look at how they can change how we message one another. 

Many people are looking at Android cloud messaging as the next leap forward into a future where it is possible to reach out to the people we care about and save those messages directly in the cloud. Never miss the opportunity to communicate with someone who truly matters to you, and start using cloud storage to back up your messages. It is as simple as that! 

Getting Started With Kubernetes

Containers weighing you down? Kubernetes can scale them. To build a reliable, scalable containerized application, you need a place to run containers, scale them, update them, and provide them with networking and storage. Kubernetes is the most popular container orchestration system. It simplifies deploying, monitoring, and scaling application components, making it easy to develop flexible, reliable applications. This updated Refcard covers all things Kubernetes, exploring core concepts, important architecture considerations, and how to build your first containerized application.

Introduction to AWS Config: Simplified Cloud Auditing

Modern cloud environments are ever-changing, and so is the nature of cloud computing. As an organization's cloud assets grow, so does its attack surface, which makes visibility into cloud resources essential. AWS Config addresses that exact demand. Understanding the resources within your infrastructure can be challenging, in particular:

  • Seeing what resources you have
  • Understanding your current configurations
  • Tracking configuration changes and change histories
  • Assessing whether your resources comply with specific governance controls
  • Having accurate and up-to-date audit information

Depending on the size of your AWS resources or deployment, overcoming these challenges and obtaining this information can become time-consuming and budget-intensive unless you use a resource visibility and auditing tool like AWS Config.

Model Quantization for Edge AI

Deep learning has a growing record of success, and artificial intelligence is already widely used in business applications. However, large, heavy models that must run on high-performance computing systems are far from optimal, and the computational demands of AI training and inference keep increasing. As a result, a relatively new class of deep learning approaches known as quantized neural network models has emerged to address this disparity. Memory has been one of the biggest challenges for deep learning architectures. Advances driven by the gaming industry led to the rapid development of GPUs, which made today's 50-layer networks possible. Still, the memory appetite of newer, more powerful networks is now driving the evolution of deep learning model compression techniques to rein in this requirement, as AI quickly moves toward edge devices to deliver near-real-time results on captured data. Model quantization is one such rapidly growing technique; it allows deep learning models to be deployed on edge devices with less power, memory, and computational capacity than a full-fledged computer.
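
To illustrate where the memory savings come from, here is a minimal sketch of one common scheme, symmetric per-tensor 8-bit quantization. The scheme, the QuantizedTensor type, and the function names are assumptions chosen for this example; the article does not prescribe a particular method. Each 32-bit float weight is stored as an 8-bit integer plus one shared scale factor, roughly a fourfold reduction in size.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical container: int8 values plus the scale needed to recover floats.
struct QuantizedTensor {
    std::vector<std::int8_t> values;
    float scale;
};

// Symmetric quantization: map [-maxAbs, maxAbs] onto the integers [-127, 127].
QuantizedTensor quantize(const std::vector<float>& weights) {
    float maxAbs = 0.0f;
    for (float w : weights) maxAbs = std::max(maxAbs, std::fabs(w));
    // Guard against an all-zero tensor to avoid dividing by zero.
    float scale = maxAbs > 0.0f ? maxAbs / 127.0f : 1.0f;

    QuantizedTensor q{{}, scale};
    q.values.reserve(weights.size());
    for (float w : weights) {
        long rounded = std::lround(w / scale);
        rounded = std::max(-127L, std::min(127L, rounded));
        q.values.push_back(static_cast<std::int8_t>(rounded));
    }
    return q;
}

// Recover approximate float weights; the error per weight is about scale / 2.
std::vector<float> dequantize(const QuantizedTensor& q) {
    std::vector<float> weights;
    weights.reserve(q.values.size());
    for (std::int8_t v : q.values) weights.push_back(static_cast<float>(v) * q.scale);
    return weights;
}
```

The trade-off is precision: the restored weights differ slightly from the originals, which is why quantized models are usually checked for accuracy loss before deployment.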

How Did AI Migrate From Cloud to Edge?

Many businesses use the cloud as their primary AI engine, hosting the required data in a cloud data center to make intelligent decisions. Uploading data to cloud storage and interacting with data centers introduces a delay that hinders real-time decisions. The cloud will not be a viable choice in the future as demand for IoT applications and their real-time responses grows. As a result, AI on the edge is becoming more popular.

A Case for Databases on Kubernetes from a Former Skeptic

Kubernetes is everywhere. Transactional apps, video streaming services, and machine learning workloads are finding a home on this ever-growing platform. But what about databases? If you had asked me this question five years ago, the answer would have been a resounding “No!” — based on my experience in development and operations. In the following years, as more resources emerged for stateful applications, my answer would have changed to “Maybe,” but always with a qualifier: “It’s fine for development or test environments…” or “If the rest of your tooling is Kubernetes-based, and you have extensive experience…”

But how about today? Should you run a database on Kubernetes? Given the complex operations involved and the requirements of persistent, consistent data, let’s retrace the stages in the journey to my current answer: “In a cloud-native environment? Yes!”

Cloud ERP vs On-Premise ERP: Which Is Right for You?

One of the most critical considerations to make when selecting a new enterprise resource planning (ERP) system for your organization is whether to go with an on-premises ERP system or a cloud-based ERP solution.

Cloud ERP is growing more popular than ever before. Almost every ERP provider now has a cloud deployment option, and some have completely abandoned on-premise ERP services. “Hybrid Cloud Market Worth USD 173.33 Billion by 2025 and Growing at a 22.5 percent CAGR,” according to research published by Global Newswire in 2021.

The Definitive Guide to Building a Data Mesh With Event Streams

Data mesh. This oft-talked-about architecture has no shortage of blog posts, conference talks, podcasts, and discussions. One thing that you may have found lacking is a concrete guide on precisely how to get started building your own data mesh implementation. We have you covered. In this blog post, we’ll show you how to build a data mesh using event streams, highlighting our design decisions, and the key benefits and challenges you’ll need to consider along the way. In fact, we’ll go one better: we’ve built a data mesh prototype for you to check out on your own to see what this would look like in action, or fork to bootstrap a data mesh for your own organization. 

Data mesh is technology agnostic so there are a few different ways you can go about building one. The canonical approach is to build the mesh using event streaming technology that provides a secure, governed, real-time mechanism for moving data between different points in the mesh.