Fine Tuning LLM: Parameter Efficient Fine Tuning (PEFT) — LoRA and QLoRA — Part 1

In the ever-evolving world of AI and Natural Language Processing (NLP), Large Language Models and Generative AI have become powerful tools for various applications. Achieving the desired results from these models involves different approaches that can be broadly classified into three categories: Prompt Engineering, Fine-Tuning, and Creating a new model. As we progress from one level to another, the requirements in terms of resources and costs increase significantly.

In this blog post, we’ll explore these approaches and focus on an efficient technique known as Parameter Efficient Fine-Tuning (PEFT) that allows us to fine-tune models with minimal infrastructure while maintaining high performance.
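To make the idea concrete, here is a minimal sketch, not taken from the post itself, of attaching LoRA adapters to a small causal language model with the Hugging Face peft library; the base model, rank, and target modules are illustrative assumptions:

```python
# A minimal LoRA sketch (illustrative; model name and hyperparameters
# are assumptions, not from the original post).
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")

lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling factor applied to the updates
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
# Only the small adapter matrices are trainable; the base weights stay frozen.
model.print_trainable_parameters()
```

With a configuration like this, only the low-rank adapter matrices are updated during training, which is what keeps PEFT's infrastructure requirements small.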

Prompt Engineering: Retrieval Augmented Generation (RAG)

The field of Natural Language Processing (NLP) has seen significant breakthroughs with the advent of transformer-based models like GPT-3. These language models can generate human-like text and have found diverse applications such as chatbots, content generation, and translation. However, when it comes to enterprise use cases that involve specialized, customer-specific information, traditional language models might fall short. Fine-tuning these models on a new corpus can be expensive and time-consuming. To address this challenge, we can use a technique called Retrieval Augmented Generation (RAG).

In this blog, we will explore how RAG works and demonstrate its effectiveness through a practical example that uses GPT-3.5 Turbo to answer questions from a product manual supplied as an additional corpus.
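As a preview of that flow, here is a minimal sketch. The keyword-overlap retriever is a deliberately naive stand-in (an assumption on my part; a real setup would typically use embeddings and a vector store), but the prompt-stuffing step is the essence of RAG:

```python
# Minimal RAG sketch: retrieve the most relevant manual snippet, then
# pass it to GPT-3.5 Turbo as context. The manual chunks and retriever
# are illustrative assumptions, not from the original post.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

manual_chunks = [
    "To reset the device, hold the power button for 10 seconds.",
    "The warranty covers manufacturing defects for 24 months.",
]

def retrieve(question: str) -> str:
    # Naive retrieval: pick the chunk sharing the most words with the question.
    def overlap(chunk: str) -> int:
        return len(set(question.lower().split()) & set(chunk.lower().split()))
    return max(manual_chunks, key=overlap)

def answer(question: str) -> str:
    context = retrieve(question)
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context: {context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content

print(answer("How do I reset the device?"))
```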

Platform Engineering With Pulumi (Part 2): Build and Deploy a React.js Application

In Chapter 1 of this blog, we built an AWS landing zone for our React.js/Node.js application. In this chapter, we will build the application and deploy it manually. In the next chapter, we will use GitOps-based automated deployment of both the infrastructure and the application code.

The app that we will be building is a very simple web application that creates contact details in DynamoDB and fetches them back.
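For a flavor of what the Pulumi side of this looks like, here is a minimal sketch using Pulumi's Python SDK to declare a DynamoDB table for the contacts app; the table name, key, and billing mode are illustrative assumptions, and the series may well use a different Pulumi language or naming:

```python
# Illustrative Pulumi (Python SDK) sketch of a DynamoDB table for the
# contacts app. Names and settings are assumptions, not from the series.
import pulumi
import pulumi_aws as aws

contacts_table = aws.dynamodb.Table(
    "contacts",
    attributes=[aws.dynamodb.TableAttributeArgs(name="email", type="S")],
    hash_key="email",                # partition key for contact lookups
    billing_mode="PAY_PER_REQUEST",  # no capacity planning for a demo app
)

pulumi.export("table_name", contacts_table.name)
```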

Java Serverless on Steroids With fn+GraalVM Hands-On

Function-as-a-Service, or serverless, is the most economical way to run code, using cloud resources only when needed. In the serverless approach, the code boots up when a request is received, handles the request, and shuts down, utilizing cloud resources optimally. This provides a highly available, scalable architecture at the most optimal cost. However, serverless architecture demands a fast boot, quick execution, and a quick shutdown.

GraalVM native images (compiled ahead of time) are an ideal runtime for this model: they have a very small footprint, boot fast, and ship with an embedded VM (Substrate VM).
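The post builds its function in Java on GraalVM; purely to illustrate the fn programming model in a compact way, here is a minimal function written against fn's Python FDK (the greeting logic and names are illustrative, not from the post):

```python
# Illustrative fn function using the Python FDK; the hands-on itself
# targets Java + GraalVM, so treat this as a sketch of the model only.
import io
import json

from fdk import response  # fn's Python FDK

def handler(ctx, data: io.BytesIO = None):
    name = "World"
    try:
        body = json.loads(data.getvalue())
        name = body.get("name", name)
    except (ValueError, AttributeError):
        pass  # no/invalid JSON body; keep the default greeting
    return response.Response(
        ctx,
        response_data=json.dumps({"message": f"Hello {name}"}),
        headers={"Content-Type": "application/json"},
    )
```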

Avengers of Container World, Episode 2: Buildah and Skopeo Hands-On

In the last episode (Episode 1: Podman Hands-On), we got Podman working on CentOS/VirtualBox. We also pulled the Tomcat image and got it running. In this episode, we will explore the advantages of Buildah and Skopeo and build a complete custom image with our sample web application.

Why Buildah?

Docker provided very sophisticated configuration-file-based provisioning through Dockerfile and Docker Compose: a Dockerfile to build custom images, and a simple YAML-based Compose file to configure and provision containers. The Docker daemon itself handles building, pulling, pushing, running, and managing containers.

Avengers of the Container World, Episode 1: Podman Hands-On

CRI-O and Podman have been widely adopted by most modern container platforms. In this blog series, I will explore why everybody is gaga about this new ecosystem of tools and utilities and share some of my experience with it.

I got a lot of feedback after I published my blog on containers and their evolution (you can read it here: 'Evolution of K8s Worker Nodes - CRI-O'). One of the common questions was how Podman differs from Docker, and how the new ecosystem of Podman + Buildah + CRI-O + Skopeo differs from what we do with Docker. So I wanted to do a deep dive into these things and share some of my experiences with this new ecosystem of container runtime and management tools/utilities.

Evolution of K8s Worker Nodes - CRI-O

Evolution

Just a few months back, I never used to call containers “containers”; I used to call them Docker containers. When I heard that OpenShift was moving to CRI-O, I wondered what the big deal was. To understand the “big deal,” I had to understand the evolution of the k8s worker node.

If you look at the evolution of the k8s architecture, there has been significant change and optimization in the way worker nodes run containers. Here are the significant stages of that evolution, which I have attempted to capture.