We Crunched 1 Billion Java Logged Errors – Here’s What Causes 97% Of Them

97% of Logged Errors are Caused by 10 Unique Errors

It’s 2021 and one thing hasn’t changed in 30 years. DevOps teams still rely on log files to troubleshoot application issues. We trust log files implicitly because we think the truth is hidden within them. If you just grep hard enough or write the perfect regex query, the answer will magically present itself in front of you.

Comparing InfluxDB, TimescaleDB, and QuestDB Time Series Databases

We're living in the golden age of databases, as money flows into the industry at historic rates (e.g., Snowflake, MongoDB, Cockroach Labs, Neo4j). If the debates over relational vs. non-relational and online analytical processing (OLAP) vs. online transaction processing (OLTP) ruled the past decade, a new type of database has since been steadily growing in popularity. According to DB-Engines, an initiative to collect and present information on database management systems, time series databases have been the fastest-growing segment since 2020:

Data vs. Database Type Popularities

Why Use a Time Series Database?

Time series databases (TSDBs) are databases optimized to ingest, process, and store timestamped data. Such data may include metrics from servers and applications, readings from IoT sensors, user interactions on a website or an app, or trading activity on financial markets.
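To make "timestamped data" concrete, here is a minimal Java sketch of the kind of operation such databases are built around: grouping readings into fixed-width time buckets and averaging each bucket. The class name `TimeBucketSketch`, the `Point` type, and the sample readings are all hypothetical and exist only for illustration. A real TSDB does this at much larger scale and with storage optimizations this sketch ignores.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TimeBucketSketch {

    /** One timestamped reading: epoch milliseconds plus a numeric value. */
    static final class Point {
        final long epochMillis;
        final double value;

        Point(long epochMillis, double value) {
            this.epochMillis = epochMillis;
            this.value = value;
        }
    }

    /**
     * Averages points per fixed-width time bucket.
     * Keys are bucket start times in epoch milliseconds, sorted ascending.
     */
    static Map<Long, Double> averagePerBucket(List<Point> points, long bucketMillis) {
        Map<Long, double[]> sums = new TreeMap<>(); // bucket start -> {sum, count}
        for (Point p : points) {
            // Round the timestamp down to the start of its bucket.
            long bucket = Math.floorDiv(p.epochMillis, bucketMillis) * bucketMillis;
            double[] acc = sums.computeIfAbsent(bucket, k -> new double[2]);
            acc[0] += p.value;
            acc[1] += 1;
        }
        Map<Long, Double> averages = new TreeMap<>();
        for (Map.Entry<Long, double[]> e : sums.entrySet()) {
            averages.put(e.getKey(), e.getValue()[0] / e.getValue()[1]);
        }
        return averages;
    }

    public static void main(String[] args) {
        // Made-up CPU readings, not real measurements.
        List<Point> cpu = new ArrayList<>();
        cpu.add(new Point(0L, 10.0));
        cpu.add(new Point(30_000L, 20.0));
        cpu.add(new Point(60_000L, 40.0));
        System.out.println(averagePerBucket(cpu, 60_000L));
    }
}
```

With the made-up readings in `main`, the two one-minute buckets average to 15.0 and 40.0.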

How to Utilize Java Benchmarks With Arm Processors

I have seen a lot of speculation surrounding Arm processors, specifically after Apple announced its plan to change over to Arm-based processors. Many people assume that the performance will be similar to a Raspberry Pi, but this is incorrect. While Java on Arm is not uncommon, interest has recently spiked thanks to increased Arm investment from cloud vendors. Amazon and Microsoft have both taken steps in this direction, with Amazon updating its Arm offerings and Microsoft porting the JVM to Arm64 for Windows, which will be helpful for future Azure support.

In this article, I will show the Java benchmarks I ran on different AWS EC2 instances and, for fun, on my laptop.

Reducing Large S3 API Costs Using Alluxio

I. Introduction

Previous Works

There have been numerous articles and online webinars dealing with the benefits of using Alluxio as an intermediate storage layer between S3 data storage and the data processing system used to ingest or retrieve data (e.g., Spark, Presto), as depicted in the picture below:

To name a few use cases:

The Curious Case of False Positives in Application Security

Over the past year, data breaches through web, business, and mobile application exploitation have continued to run rampant. In 2018, major household names like Ticketmaster, the United States Postal Service (USPS), Air Canada, and British Airways were hit by application-based exploits. To minimize vulnerabilities, and to identify existing ones before they can do this level of damage, application security solutions need to be fast and cover all classes of vulnerabilities. More importantly, they need to be highly accurate to be useful to DevOps application development teams. Providing results quickly but less accurately is counterproductive to an efficient and successful application security program: the time engineers waste triaging false positives far outweighs the benefit of faster results.

Most automated application security testing solutions can scan thousands of applications containing millions of lines of code and can produce results containing millions of attack vectors. But every application is different in functionality, code, size, and complexity, which results in significantly different security findings with different accuracy. Moreover, picking the single scanned application with the best accuracy out of many and presenting that figure as the solution's accuracy is misleading. Even taking averages would be misleading, because an average would measure only the limited set of applications that the vendor’s solution scanned and hence would not be comparable to the accuracy of other solutions.

Scaling Benchmarks With More Robust UseNUMA Flag in OpenJDK

What happens when you run a Java application without checking your hardware configuration? Obviously, your application lags in terms of performance. For small applications, you need not worry, but for applications that require larger amounts of memory (in the gigabytes), you need to take care of the configuration; otherwise, your application can suffer a lot.

What Is NUMA?

Non-Uniform Memory Access, also called NUMA, is a configuration of processors and memory in which each cluster of cores sits close to its own memory, and that memory is local to those cores. The picture below of an AMD EPYC 2P system shows the NUMA nodes more clearly. Here, each die is a single NUMA node, and its memory channels (DDR) are local to it. The dies are also interconnected by a fabric, but the access penalty is higher for remote nodes than for local ones.
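Before tuning for NUMA, it can help to confirm what the running JVM actually decided. The following is a minimal sketch, assuming a HotSpot-based JVM that exposes `com.sun.management.HotSpotDiagnosticMXBean`. The class name `NumaFlagCheck` and the helper `flagValue` are hypothetical names chosen for this example. On JVMs without that bean, or for flags the JVM does not know, the helper simply reports "unknown".

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

public class NumaFlagCheck {

    /** Returns the current value of a HotSpot flag, or "unknown" if this JVM does not expose it. */
    static String flagValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return "unknown"; // bean not available on this JVM
            }
            VMOption option = bean.getVMOption(name);
            return option.getValue();
        } catch (IllegalArgumentException e) {
            // Thrown when the flag (or the bean interface) is not supported here.
            return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println("UseNUMA = " + flagValue("UseNUMA"));
        long maxHeapMiB = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("Max heap (MiB) = " + maxHeapMiB);
    }
}
```

Running it with and without `-XX:+UseNUMA` on the command line shows the value the JVM ended up with. The JVM may turn the flag off on its own on machines where it detects only a single NUMA node, so the reported value is worth checking rather than assuming.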