How to Deal With the Most Common Challenges in Web Scraping

Introduction

In the world of business, big data is key to understanding competitors, customer preferences, and market trends, which is why web scraping is becoming more and more popular. Web scraping solutions give businesses a competitive advantage in the market. The reasons are many, but the most obvious are customer behavior research, price and product optimization, lead generation, and competitor monitoring. For those who practice data extraction as an essential business tactic, we've outlined the most common web scraping challenges.

Modifications and Changes in Website Structure

From time to time, websites undergo structural changes or modifications to provide a better user experience. This can be a real challenge for scrapers, which are usually set up for a specific page design and may stop working properly once that design changes. Even a minor change to a web page can require the scraper to be adjusted to match. Such issues are resolved by constant monitoring and timely adjustments.
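The monitoring step can be as simple as checking that the markup a scraper depends on is still present before parsing. The sketch below uses only the Python standard library; the class names, sample pages, and the `missing_classes` helper are hypothetical and exist only for illustration.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collects every CSS class name that appears in an HTML document."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def missing_classes(html, expected_classes):
    """Return the expected class names that no longer appear in the page."""
    collector = ClassCollector()
    collector.feed(html)
    return sorted(set(expected_classes) - collector.classes)


# Hypothetical pages and class names, for illustration only.
OLD_PAGE = '<div class="product"><span class="price">9.99</span></div>'
NEW_PAGE = '<div class="product"><span class="amount">9.99</span></div>'
EXPECTED = ["product", "price"]

if __name__ == "__main__":
    print(missing_classes(OLD_PAGE, EXPECTED))
    print(missing_classes(NEW_PAGE, EXPECTED))
```

A non-empty result is a signal to pause the scraper and update its selectors instead of silently collecting empty or wrong data.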

Top 5 Evolving Cybersecurity Threats to Cloud Computing in 2021

It would not be wrong to say that the COVID-19 pandemic has created a new playground for hackers. Sadly, Cisco estimates that 53% of small-to-medium businesses (SMBs) worldwide have suffered a data breach. In 2020, 36 billion records were exposed. The question is whether these cybersecurity threats will continue to grow in the coming years.

By 2025, the cloud computing market is expected to grow to $832.1 billion. Below are the top five cybersecurity threats evolving in the cloud computing market in 2021.

Supersonic, Subatomic gRPC Services With Java and Quarkus

gRPC is an open source remote procedure call (RPC) framework. It was released by Google in 2015 and is now an incubating project within the Cloud Native Computing Foundation. This post introduces gRPC while explaining its underlying architecture and how it compares to REST over HTTP. You'll also get started using Quarkus to implement and consume gRPC services.

Remote Method Calling in gRPC

Wait, what’s this? Did you say remote method calling? Isn’t that something we did in the ’90s with things like CORBA, RMI, and XML-RPC/SOAP?

How to Optimize AWS Observability Tools

Amazon Web Services (AWS) is a powerhouse cloud computing service that lets companies build computational functionality. It enables developers to quickly create serverless functions, which deliver new features to consumers without the time and cost of scaling up infrastructure. The downside of this speed is that tracking and observing the health of these functions can be difficult, especially when running microservices. AWS provides several tools to help developers understand their system’s health and is in the process of delivering new tools as well.

Observability With AWS CloudWatch

CloudWatch is AWS’s monitoring and insight service. Developers use CloudWatch to collect logs from compute functions and track performance information for many AWS services. From this data, CloudWatch generates insights on which developers can build alarms. By combining these tools, developers can create AWS observability tools that meet their needs.

Heap Dump With Lots of ‘Unresolved Name’ Objects

If you’re familiar with Java as a programming language, you may have come across the following message: java.lang.OutOfMemoryError: Java heap space. We recently got that message in one of the services that we’re currently working on. To better understand why this happens, it’s good to get a Java memory heap dump for further analysis.

After parsing the heap dump in both Eclipse MAT and VisualVM, I noticed something strange. My heap dump looked obfuscated and showed lots of objects named ‘Unresolved Name 0x’.

Jenkins vs. Travis vs. Bamboo vs. TeamCity: Clash of the Titans

What’s the first thing that comes to mind when you hear the words software development and DevOps? There’s only one magic word (five to be more precise): continuous integration and continuous delivery.

It is impossible to carry out software development without relying on DevOps testing or CI/CD tools, which makes picking the right CI/CD tool very important. Now the question is, how do you choose the right tool with so many options? To make it a little easier for you, we have picked four of the best CI/CD tools, and in this article we compare Jenkins vs. Travis vs. Bamboo vs. TeamCity so that you can make an informed decision.

What is DevOps? A DevOps Tutorial in Plain English

DevOps… CI/CD… Docker… Kubernetes… I'm sure you've been bombarded with these words a lot over the past year. It seems like the entire world is talking about them. At the rate this segment is progressing, it won't be long before we reach the stage of NoOps.

Don’t worry. It’s okay to feel lost in the giant sea of tools and practices. It's about time we break down what DevOps really is.

The Art Institute of Chicago Adds Public API Access to Exhaustive Collection of Museum Data

The Art Institute of Chicago, one of the oldest and largest art museums in the United States, has outlined its recent efforts to make publicly accessible online as much content from its collection as possible. In addition to information on over 100,000 artworks from the collection, the API also provides access to information on every exhibition hosted throughout the museum’s existence.
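As a rough illustration of what consuming such an API might look like, the sketch below builds a paginated request URL for artworks and extracts titles from a JSON response. The base URL, the `page`, `limit`, and `fields` parameters, and the `data` key in the response are assumptions that are not stated in the announcement above and should be checked against the museum's API documentation. The sample response is made up, and no network request is made.

```python
import json
from urllib.parse import urlencode

# Assumed base URL; verify against the official API documentation.
BASE_URL = "https://api.artic.edu/api/v1"


def build_artworks_url(page=1, limit=10, fields=("id", "title")):
    """Build a paginated artworks URL (parameter names are assumptions)."""
    query = urlencode({"page": page, "limit": limit, "fields": ",".join(fields)})
    return f"{BASE_URL}/artworks?{query}"


def titles_from_response(body):
    """Extract titles from a JSON body shaped like {"data": [{"title": ...}]}."""
    payload = json.loads(body)
    return [item.get("title") for item in payload.get("data", [])]


# Made-up response body, for illustration only.
SAMPLE_BODY = '{"data": [{"id": 1, "title": "Sample A"}, {"id": 2, "title": "Sample B"}]}'

if __name__ == "__main__":
    print(build_artworks_url(page=2, limit=5))
    print(titles_from_response(SAMPLE_BODY))
```

Keeping URL construction and response parsing in separate functions makes each one easy to test without touching the network.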

JPA Goes Even Easier With its Buddy

So, Hello World... After almost a year of development, the first version of JPA Buddy has finally been released! This is a free tool that is supposed to become your faithful coding assistant for projects with JPA and everything related: Hibernate, Spring Data, Liquibase, and other mainstream stacks.

"Why would I use it?" - this is a fair question towards any new tool or framework. In short, if you use JPA for data persistence, JPA Buddy will help you to be more efficient. In this article, I'll present an overview of the tool. I hope it will take its fair place among the most loved tools for Java developers, ones who use JPA, Spring, Liquibase, and of course the most advanced Java IDE - IntelliJ IDEA.

Key-Factors to Consider in a CRM Implementation for Customer Service Success

Currently, most organizations are implementing a cloud-based CRM system for better scalability, high availability, and improved performance. There are several market leaders in cloud-based CRM solutions, such as Salesforce, Microsoft Dynamics, Zendesk, and Pegasystems. However, when selecting a platform or implementing a solution, an organization that aims to provide best-in-class customer service should focus on the factors below.

Robust Ticket Management

Effective ticket management is the heart of any customer service operation. Therefore, the features below must be considered when implementing a CRM.

Automattic Launches the Blank Canvas WordPress Theme for Building Single-Page Websites

Split Screen pattern from the Blank Canvas WordPress theme.

On Monday, Automattic announced its Blank Canvas theme on WordPress.com. The goal is to allow end-users to build single-page websites, such as an “about me” or product landing page.

Blank Canvas is a child theme of Seedlet, which Automattic’s Theme Team has been using as a launchpad. One example is its recently-released Spearhead child theme. It also provided the foundational work for the recent Twenty Twenty-One default WordPress theme.

One-page themes are nothing new. Theme builders have been releasing them for years in various forms.

“We’ve been working on block patterns a lot lately, and it became clear that many of the single-page websites we come across daily — collections of links, newsletter signups, etc. — are basically just simple block patterns sitting on an otherwise blank page,” said Kjell Reigstad, the lead developer on the theme. “That being the case, it seemed like WordPress should be able to power these sorts of single-page sites pretty easily. Blank Canvas is an attempt to try that out.”

WordPress is not the ideal platform for the majority of one-page sites. Running one involves setting up a database, installing the software, and keeping everything updated. The admin interface is not well suited to those types of sites. WordPress is a content management system, and one page is not enough content to need a full-blown CMS. There is simply little upside for the average user in going through this hassle, even on the cheapest shared hosting.

However, if you have a network where someone else, such as WordPress.com, takes out all the hassle of maintaining the backend and when it does not cost you a dime, WordPress suddenly makes more sense. It becomes an ideal platform for these types of sites.

Frankly, I do not know why they have not pushed this concept sooner. Jason Schuller has made a go of it with Leeflets in the past. Since then, he and Philip Kurth have taken that idea further and launched WP Landing Kit, which builds on the same concept of creating multiple single-page landing sites from one WordPress installation.

In some respects, Blank Canvas offers a glimpse into Full Site Editing. It is almost a stepping stone, or a small, limited preview of things to come. The theme puts the entire design process into a single page and a single editor. Eventually, this will be extended to the whole website.

“I think that’s a great way to think about it,” said Reigstad. “Full Site Editing is coming soon, but in the meantime, Blank Canvas lets you do just a little bit more with Gutenberg than you could before.”

About the Theme

The theme is called Blank Canvas for a reason. Its demo page is literally a blank screen with a footer message. The idea is that the end-user designs their homepage — or their entire site in the case of a single-page website — via the block editor.

For those who need a starting point, the theme comes packaged with six block patterns:

  • About Me
  • Links
  • Invitation
  • Split Screen
  • Card
  • Email Signup

Invitation block pattern from the Blank Canvas theme in the WordPress editor.

Self-hosted WordPress users can install the theme too. It is currently awaiting review for the theme directory, but they can snag the ZIP file or SVN link from its Trac ticket. For those giving it a test, be sure to disable the title and tagline via the customizer so they do not appear on the front end. That is assuming you want to use the theme as intended. It will also work as a more traditional theme because the Seedlet parent theme covers all the necessary features.

There are differences between the theme on WordPress.com and the one submitted to the WordPress.org theme directory. The .ORG version has only four block patterns. The .COM version adds the Card pattern, which integrates with Automattic’s Layout Grid plugin, and the Email Signup pattern, which needs Jetpack’s form feature.

Simple conditional checks for Layout Grid or Jetpack before registering the patterns would suffice for users with those plugins installed. “That’s planned,” said Reigstad of adding the missing patterns, “but we just didn’t implement it yet.”

Email Signup pattern from the Blank Canvas WordPress theme in the block editor.

WordPress.com users have something else to look forward to. In November, the service launched over 100 patterns. “One of the nice things is that there are already a lot of patterns out there that seem ready-made for single-page websites,” said Reigstad.

He did say the team is working on bundling more patterns in the future. These may include more “link in bio” designs that expand on the one already in the theme today.

Pioneering Block-Friendly Themes

Several of the ideas in this theme seem to have started in the WordPress Theme Experiments repository. It features block patterns similar in scope to the Carrd-like theme Reigstad built last October.

“In general, building block-based themes helped redefine our idea of what a theme needed to be,” he said. “We’d tended to think of a theme as a complicated piece of software that accounts for every scenario you throw at it: a blog, custom post types, category pages, search pages, the 404 page, etc.”

Reigstad said that the block-based themes paradigm has forced the Theme Team to start small. Because Full Site Editing is still in flux and its features are not ready, the team has built proof-of-concept themes with limited functionality.

“The possibilities for block-based themes have grown considerably since then (as shown by TT1 Blocks, Q, Block-based Bosco, and others), but the early constraints helped spark ideas like that Carrd-inspired theme,” he said. “It turned out that you could build a pretty useful site with just a handful of blocks.

“That mindset definitely informed Blank Canvas — we started small, with just the functionality someone would need to build a single-page site. Since it’s based on a full-featured theme (Seedlet), you can grow with it too.”

#300: Exploring Custom Profiles

You’ve got two options for customizing your profile on CodePen, and endless possibilities:

  • Open to everyone: Apply Custom CSS (which can be a link to a Pen on CodePen)
  • PRO only: Apply a Pen-as-Header-Background

In this video, Chris & Stephen look at a bunch of awesome profiles on CodePen, and find that many users (especially the coolest profiles) do both. Here’s the shuffle machine we used in the video to randomize cool profiles:

The post #300: Exploring Custom Profiles appeared first on CodePen Blog.

How to Use an API

In the video below, "How to use an API? | API Tutorial | Web Services Tutorial," we take a closer look at how to use an API. Let's get started!

Building a Security-First Culture

Application Security Is Like Wearing a Mask

Wearing a face mask to prevent the spread of coronavirus is becoming the norm in my city. It was hit heavily by the COVID crisis, and now we have reached an unspoken consensus: wear masks wherever you go.

This is quite different from where we were just a few months ago. Face masks had a bad reputation, and the local health department had a hard time getting people to wear them. What was stopping people from wearing masks? It turns out, people hated masks because they make breathing difficult, make glasses foggy, and can look quite awkward. But the pros of masks outweigh the cons, and by wearing face masks, we protect ourselves and our communities from the virus.

A Rapid Overview of ISA-88 and How It Aligns With ISA-95 and IIoT Platforms

ISA-88 is a long-standing standard for managing batch processes, while ISA-95 is focused on defining the progressive complexity of information that is expected to be available at each layer.

Often, the interplay between the two standards, and with the evolving IIoT platform definitions, causes some degree of confusion as manufacturing firms grapple with defining and shaping their IT strategies and their frameworks for adopting new technologies, platforms, and apps.

Plugin Team Draws a Line: Plugins Must Not Change WordPress’ Default Automatic Update Settings

WordPress’ plugin team has published a statement regarding plugins making changes to users’ update services:

Unless your plugin has the purpose of managing updates, you must not change the defaults of WordPress’ update settings.

You may offer a feature to auto-update, but it has to honor the core settings. This means if someone has set their site to “Never update any of my plugins or themes” you are not to change those for them unless they opt-in and request it.

The statement was prompted by plugins overstepping this boundary, which, up until recently, has simply been understood but not explicitly forbidden. Mika Epstein said the practice “destroys the faith users have in you to not break their sites.” It also reflects poorly on WordPress as a whole when plugin authors abuse core features to serve their own interests.

“Sadly, this happened recently to a well used plugin, and the fallout has been pretty bad,” Epstein said.

She did not identify the plugin in question, but one particular incident that happened last month bears a strong likeness to this description. On December 21, 2020, the All in One SEO plugin turned on automatic updates without notifying its users, aside from a short, ambiguous note in the changelog.

All in One SEO was active on more than 2 million WordPress sites when it rolled out this update. Many users were frustrated to discover that their sites had been updated without permission, despite having auto updates turned off for the plugin. The plugin’s developers removed the auto updates wrapper functionality from the plugin earlier this month, in favor of letting WordPress handle updates.

After this incident, those who were affected were left with questions. Should WordPress allow this practice? Should plugin developers be required to place a notice in the dashboard if they are going to flip automatic updates on? While many users are willing to trust WordPress core to do automatic updates in a safe way, some are not willing to extend that trust to plugin developers, whose update quality varies widely. Guidance and communication from the plugin team on this matter was absolutely necessary to deter aggressive plugin developers from destroying what is still a fragile trust in automatic updates.

“At this time, we have no plans to spell this out in a guideline,” Epstein said. “We do currently, regularly flag plugins that go outside their dictated (self defined) boundaries, and this is not a change. Please, respect your users.”

The Soft Skills Are the Hard Skills of Today

There was a time when soft skills were considered common-sense practices that held no value within the confines of an organization. While soft skills played a vital role in the success of an organization and its employees, their significant impact often went unrecognized. However, in an ever-evolving employment landscape, it has quickly become obvious that the two youngest generations in today’s workforce, Millennials and Gen Z, have simply not been equipped with the soft skills that earlier generations considered common sense.

While this younger workforce is well equipped with the hard skills their jobs currently require (that is, the aptitude, experience, and specific skills needed for their work), basic soft skills such as setting an alarm, being on time, or respecting chains of command are often missing from their skill set. Because these workers are the future leaders of our economy and workforce, employers should place equal importance on hard and soft skills during the recruiting cycle, valuing the soft skills once thought of as common sense as highly as the hard skills needed to complete the work. Furthermore, employers should work with current employees to build up their soft skills and help them become critical thinkers, communicators, and effective leaders.