Comparing WebHooks and Event Consumption

In event-driven architecture and API integration, two vital concepts stand out: WebHooks and event consumption. Both are mechanisms used to facilitate communication between different applications or services. Yet, they differ significantly in their approaches and functionalities, and by the end of this article, you will learn why consuming events can be a much more robust option than serving them using a WebHook.

The foundational premise of this article is that you operate a platform that delivers, or wants to deliver, internal events to your clients through WebHooks.

How To Handle API Rate Limitations With a Queue

Rate limiting is also a challenge for the apps that encounter it, as it requires them to “slow down” or pause. Here’s a typical scenario:

  • Initial Request: When the app initiates communication with the API, it requests specific data or functionality.
  • API Response: The API processes the request and responds with the requested information or performs the desired action.
  • Rate Limitation: If the app has reached the limit, it usually needs to wait until the next designated time frame (anywhere from a minute to an hour) before making additional requests. If it is a “soft” rate limitation and the timeframes are known and linear, it’s easier to handle. Often, though, the waiting time grows with every block, requiring different, custom handling for each API.
  • Handling Rate Limit Exceedances: If the app exceeds the rate limit, it might receive an error response from the API (such as a “429 Too Many Requests” status code). The app needs to handle this gracefully, possibly by queuing requests, implementing backoff strategies (waiting progressively longer before retrying), or informing the user that the rate limit has been reached.

To operate effectively within rate limits, apps often incorporate strategies like request queuing and exponential backoff.
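
A minimal sketch of both strategies together, assuming a hypothetical endpoint and the `requests` library, might look like this:

```python
import time
import queue

import requests  # assumed HTTP client; any equivalent works

API_URL = "https://api.example.com/resource"  # hypothetical endpoint

def drain_with_backoff(pending: "queue.Queue[dict]", max_retries: int = 5) -> None:
    """Send queued requests, backing off exponentially on 429 responses."""
    while not pending.empty():
        payload = pending.get()
        delay = 1.0  # initial wait in seconds
        for _ in range(max_retries):
            resp = requests.post(API_URL, json=payload, timeout=10)
            if resp.status_code != 429:
                break  # success or a non-rate-limit error; move on
            # Honor Retry-After when the API provides it, else keep our delay.
            delay = float(resp.headers.get("Retry-After", delay))
            time.sleep(delay)
            delay *= 2  # progressively longer waits per attempt

pending = queue.Queue()
pending.put({"event": "example"})
drain_with_backoff(pending)
```

Honoring `Retry-After` where available keeps the client aligned with “soft,” predictable limits, while the doubling delay covers APIs whose waiting blocks grow over time.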

Real-Time Data Scrubbing Before Storing in a Data Warehouse

Between January 2023 and May 2023, companies violating general data processing principles incurred fines totaling 1.86 billion USD.

In today’s data-driven landscape, the importance of data accuracy and compliance cannot be overstated. As businesses amass vast amounts of information, the need to ensure data integrity, especially when storing personally identifiable information (PII), becomes paramount. Data scrubbing emerges as a crucial process, particularly in real-time scenarios, before information is stored in a data warehouse.
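
As a minimal sketch (the field names and PII patterns are assumptions), scrubbing records in-flight, before they reach the warehouse, might look like this:

```python
import re

# Hypothetical patterns for two common PII shapes.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def scrub(record: dict) -> dict:
    """Mask PII in every string field before the record is written downstream."""
    clean = {}
    for key, value in record.items():
        if isinstance(value, str):
            value = EMAIL_RE.sub("[EMAIL REDACTED]", value)
            value = SSN_RE.sub("[SSN REDACTED]", value)
        clean[key] = value
    return clean

# Each message is scrubbed as it streams by, then handed to the warehouse loader.
event = {"user": "jane@example.com", "note": "SSN 123-45-6789 on file"}
print(scrub(event))  # {'user': '[EMAIL REDACTED]', 'note': 'SSN [SSN REDACTED] on file'}
```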

Stateful Stream Processing With Memphis and Apache Iceberg

Amazon Web Services S3 (Simple Storage Service) is a fully managed cloud storage service designed to store and access any amount of data anywhere. It is an object-based storage system that enables data storage and retrieval while providing various features such as data security, high availability, and easy access. Its scalability, durability, and security make it popular with businesses of all sizes.
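
As a minimal sketch of that storage-and-retrieval model (the bucket and key names are hypothetical, and AWS credentials are assumed to be configured), using the boto3 SDK:

```python
import boto3  # AWS SDK for Python

s3 = boto3.client("s3")

# Store an object, then read it back.
s3.put_object(Bucket="my-bucket", Key="events/2023/01/event.json", Body=b'{"id": 1}')
obj = s3.get_object(Bucket="my-bucket", Key="events/2023/01/event.json")
print(obj["Body"].read())  # b'{"id": 1}'
```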

Apache Iceberg is an open-source table format for data warehousing that enables efficient and scalable data processing on cloud object stores, including AWS S3. It is designed to provide efficient query performance and optimized data storage while supporting ACID transactions and data versioning, enabling fast query processing while minimizing storage costs.
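
As an illustrative sketch (the catalog name and table identifier are assumptions), reading an Iceberg table stored on S3 with the PyIceberg library might look like this:

```python
from pyiceberg.catalog import load_catalog

# Load a catalog configured elsewhere (e.g., in ~/.pyiceberg.yaml); the
# "default" name and the table identifier below are hypothetical.
catalog = load_catalog("default")
table = catalog.load_table("analytics.page_views")

# Iceberg prunes files using table metadata, so this scan only touches the
# S3 objects that can match the filter.
rows = table.scan(row_filter="country == 'US'").to_arrow()
print(rows.num_rows)
```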

How To Increase Data Quality With Memphis Schemaverse

Using Schemaverse, you force producers to produce messages matching a given structure by validating each ingested message. This is especially useful if you’re working with multiple producers. It also allows you to set standards for your product’s schema and evolve that schema at runtime. Let’s find out how using Memphis Schemaverse will increase your data quality.
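
Schemaverse enforces this inside the broker; as a conceptual sketch of the same idea (using the generic jsonschema library, not the Schemaverse API itself), validating every message against a schema before it is produced might look like this:

```python
import json

from jsonschema import ValidationError, validate  # pip install jsonschema

# An example schema a platform team might attach to a station.
ORDER_SCHEMA = {
    "type": "object",
    "properties": {
        "order_id": {"type": "string"},
        "amount": {"type": "number"},
    },
    "required": ["order_id", "amount"],
}

def produce(message: dict) -> None:
    """Reject malformed messages before they ever reach consumers."""
    try:
        validate(instance=message, schema=ORDER_SCHEMA)
    except ValidationError as err:
        raise ValueError(f"Schema violation, message dropped: {err.message}")
    print("produced:", json.dumps(message))

produce({"order_id": "A-1", "amount": 9.99})   # passes validation
produce({"order_id": "A-2"})                   # raises: 'amount' is required
```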

What Is Memphis?

Memphis{dev} is an open-source real-time data processing platform that provides end-to-end support for in-app streaming use cases using the Memphis distributed message broker.

Design Considerations for Cloud-Native Data Systems

When it comes to designing a cloud-native data system, there's no particular hosting infrastructure, programming language, or design pattern that you must use. Cloud-native systems come in various shapes and sizes. However, most of them follow the same cloud-native design principles. Let's take a look at cloud-native architecture, the design principles you should keep in mind, and the features that make up a good cloud-native platform.

Cloud-Native Architecture

A cloud-native architecture is essentially a design pattern for apps built for the cloud. While there's no specific way of implementing this kind of architecture or a pre-defined cloud-native design, the most common approach is to break the application up into several microservices, each handling a different function. Each microservice is then maintained by a small team and is typically deployed as a container.
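
As a minimal sketch of one such single-responsibility service (the route and port are arbitrary choices), a tiny HTTP microservice that would be packaged into its own container might look like this:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class HealthHandler(BaseHTTPRequestHandler):
    """One narrowly scoped service: it only reports health status."""

    def do_GET(self):
        if self.path == "/healthz":
            body = json.dumps({"status": "ok"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

if __name__ == "__main__":
    # In a cloud-native deployment this process runs inside a container,
    # one container per microservice.
    HTTPServer(("0.0.0.0", 8080), HealthHandler).serve_forever()
```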

Building Real-Time Data Systems the Hard Way

What Is Real-Time Data?

Real-time data is valuable to businesses. They use it throughout the organization to derive useful information and extract better insights. Real-time data is also helpful for monitoring and maintaining IT infrastructure, giving businesses and organizations better visibility into the operation of their intricate networks.

It is crucial to have a firm grip on the concepts around real-time data: data velocity, the choice of processing model, and the unavoidable problem of maintaining and monitoring the systems that create and consume real-time data.

4 Key Design Principles and Guarantees of Streaming Databases

Real-time data processing is a foundational aspect of running modern technology-oriented businesses. Customers want quicker results than ever and will defect at the slightest opportunity for a faster outcome. Hence, organizations today are in a continuous hunt to shave milliseconds off their response times.

Real-time processing now covers many workloads that were earlier handled using batch processing. It requires executing business logic on an incoming stream of data, in stark contrast to the traditional approach of storing the data in a database and then executing analytical queries. Such applications cannot afford the delay involved in first loading the data into a traditional database and then querying it. This sets the stage for streaming databases: data stores that can receive high-velocity data and process it on the fly, without a traditional database in the mix. They are not drop-in replacements for traditional databases, but they excel at handling high-speed data. This article will cover the four key design principles and guarantees of streaming databases.
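
As a minimal sketch of that contrast (the synthetic events and the tumbling one-second window are both assumptions), business logic executed directly on an incoming stream, with nothing loaded into a database first, might look like this:

```python
from collections import defaultdict
from typing import Iterable, Iterator

def windowed_counts(events: Iterable[dict], window_secs: int = 1) -> Iterator[dict]:
    """Count events per key in tumbling windows as the stream arrives."""
    current_window, counts = None, defaultdict(int)
    for event in events:  # events arrive continuously; nothing is stored first
        window = event["ts"] // window_secs
        if current_window is not None and window != current_window:
            yield {"window": current_window, "counts": dict(counts)}
            counts.clear()
        current_window = window
        counts[event["key"]] += 1
    if current_window is not None:
        yield {"window": current_window, "counts": dict(counts)}

stream = [{"ts": 0, "key": "a"}, {"ts": 0, "key": "a"}, {"ts": 1, "key": "b"}]
for result in windowed_counts(stream):
    print(result)  # results are emitted as each window closes, not after a batch load
```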

Building a Scalable Search Architecture

Creating a scalable search architecture is a popular and important task for many systems. There are different solutions for this task. Choosing the right one depends on the requirements of your project.

Sometimes, as a project grows and its requirements change, you may run into new problems that your current search architecture cannot solve, for example, when the amount of data increases, or when you need to support synonyms or multilingual search. In that case, you need to think about creating a new, more efficient, scalable search architecture.
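
As a toy sketch of why such requirements reshape a search architecture (the documents and synonym map are made up), an inverted index that expands query terms through synonyms might look like this:

```python
from collections import defaultdict

SYNONYMS = {"car": {"car", "auto", "automobile"}}  # hypothetical synonym map

docs = {1: "used car for sale", 2: "vintage automobile auction", 3: "bike repair"}

# Build the inverted index: term -> set of document ids.
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.split():
        index[term].add(doc_id)

def search(term: str) -> set:
    """Expand the query through synonyms, then union the postings lists."""
    hits = set()
    for variant in SYNONYMS.get(term, {term}):
        hits |= index.get(variant, set())
    return hits

print(search("car"))   # {1, 2}: synonym expansion pulls in 'automobile'
print(search("bike"))  # {3}
```

Even in this toy form, the synonym requirement changes both indexing and query-time behavior, which is why such features often force a rethink of the architecture rather than a patch.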