Kafka JDBC Source Connector for Large Data

Recently I had to migrate some data from an old monolith’s Oracle database to a microservice backed by PostgreSQL. The problem was that the data needed for the migration came from a parent table with around 2 million records and 150 columns, and on top of that, everything was exposed through a view with a payload column aggregating data from various tables as XML. As you can imagine, the SELECT from that view was pretty slow, and by pretty I mean insanely slow, which was not going to work well for the connector. So, in this article we’ll take a look at a similar, simplified use case and how we can deal with it.

Use case

We have a course-catalog application with a PostgreSQL database that deals with instructors and their courses. Now we need to migrate some legacy instructors from another PostgreSQL database that is soon going to be decommissioned. So we have two databases: instructors-legacy-db and course-catalog-db. In our case, neither database is overwhelmed with records; instructors-legacy-db holds only about 200, but for the sake of the example, just imagine that instructors-legacy-db is the one with those 2 million cumbersome records.
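To make the setup concrete, here is a minimal sketch of how a JDBC source connector for instructors-legacy-db could be registered against the Kafka Connect REST API. This assumes Connect is reachable at localhost:8083 and that the legacy database has an instructors table with an auto-incrementing id column; the connector name, connection details, and credentials are placeholders, not values from the original setup.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RegisterInstructorsConnector {

    // Hypothetical connector config: table name, credentials, and topic
    // prefix are placeholders for this example.
    private static final String CONFIG = """
        {
          "name": "instructors-legacy-source",
          "config": {
            "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
            "connection.url": "jdbc:postgresql://localhost:5432/instructors-legacy-db",
            "connection.user": "postgres",
            "connection.password": "postgres",
            "table.whitelist": "instructors",
            "mode": "incrementing",
            "incrementing.column.name": "id",
            "topic.prefix": "legacy-"
          }
        }
        """;

    public static void main(String[] args) throws Exception {
        // POST the config to the Connect REST API to create the connector.
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8083/connectors"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(CONFIG))
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}

With incrementing mode, the connector polls the table and emits only rows whose id is greater than the last one it has seen, writing them to the legacy-instructors topic. That works fine for a small table; the rest of the article looks at what happens when the query behind each poll is the expensive part.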
