Accelerating Data Processing: SynergyTop’s Latest Utility Solution


The client deals with a huge volume of data that is updated daily.

The CSV contains 150,000 rows of pricing data that have to be uploaded to the server daily. It includes supplier names and pricing, among other columns.

Original Process

  1. Textual data is converted into numerical data; for example, supplier names arriving as text are converted into numerical IDs.
  2. New pricing data is entered every day, so during the CSV upload it must be ensured that duplicate supplier and pricing records are not added (both steps are sketched below).
  3. The supplier data that doesn’t change has to be kept intact.
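
To make the original process concrete, here is a minimal sketch of the row-by-row flow it implies, assuming a DB-API connection (e.g. psycopg2) and hypothetical suppliers and pricing tables; the table, column, and CSV header names are illustrative, not the client’s actual schema.

```python
import csv

def load_supplier_ids(conn):
    """Map supplier names (text in the CSV) to their numeric IDs."""
    with conn.cursor() as cur:
        cur.execute("SELECT name, id FROM suppliers")
        return dict(cur.fetchall())

def upload_pricing_csv(conn, path):
    supplier_ids = load_supplier_ids(conn)
    seen = set()  # (supplier_id, price) pairs already handled in this run
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            supplier_id = supplier_ids.get(row["supplier_name"])
            if supplier_id is None:
                continue  # unknown supplier: existing supplier data stays intact
            key = (supplier_id, row["price"])
            if key in seen:
                continue  # skip duplicate supplier/pricing rows
            seen.add(key)
            with conn.cursor() as cur:
                cur.execute(
                    "INSERT INTO pricing (supplier_id, price) VALUES (%s, %s)",
                    key,
                )
    conn.commit()
```

Inserting 150,000 rows one at a time like this is exactly the kind of flow that made the original upload take 20 to 30 minutes.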

Challenge

  • The previous system took 20 to 30 minutes to upload the CSV file.
  • Converting the textual data into numerical format was an issue.
  • Working out which records needed to be replaced in the old data and which needed to be deleted after 3 days was complicated.
  • The complete client architecture ran on a server with 1GB of RAM, so processing had to be fast; otherwise the upload would take up all the server capacity and the rest of the system would fail.

Benefits of Our Solution

  • Reduced the time taken from 20-30 minutes to 30-40 seconds, a reduction of over 97% (roughly a 40x speedup).
  • Ensured that the platform remains available to users while the data is being uploaded.

Optimized Process Established by SynergyTop’s Product Engineering Team


  1. To begin with, it was not possible to reduce the processing time on a 1GB RAM server, so the server was upgraded to 4GB of RAM.
  2. The data was originally stored in the “Pricing Table”. A temporary pricing table was created so that the unwanted data gets deleted after 3 days.
  3. The CSV was split into groups of 1,000 records, which are run in a loop. The data is uniquely mapped, ensuring no duplication of records.
  4. After the first loop, batching is carried out and the groups of 1,000 records are run in batches of 10. During this process, a Lambda function is called, which ensures new data is updated quickly and as required.
  5. Based on all this, SQL is generated that performs bulk deletion and bulk uploading of the data with proper conversion and processing (steps 3 to 5 are sketched below).
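
The chunking, batching, and bulk-SQL generation in steps 3 to 5 might look roughly like the sketch below; the pricing_temp table name, the (supplier_id, price) record shape, and the Lambda trigger are assumptions for illustration, not the team’s actual code.

```python
from itertools import islice

CHUNK_SIZE = 1000  # records per group (step 3)
BATCH_SIZE = 10    # groups per batch (step 4)

def chunked(iterable, size):
    """Yield successive fixed-size chunks from any iterable."""
    it = iter(iterable)
    while chunk := list(islice(it, size)):
        yield chunk

def bulk_insert_sql(group):
    """Generate one multi-row INSERT for a group of (supplier_id, price) rows."""
    placeholders = ", ".join("(%s, %s)" for _ in group)
    params = [value for record in group for value in record]
    return f"INSERT INTO pricing_temp (supplier_id, price) VALUES {placeholders}", params

def upload(conn, records):
    # Unique mapping: drop duplicate (supplier_id, price) pairs up front.
    unique_records = list(dict.fromkeys(records))
    for batch in chunked(chunked(unique_records, CHUNK_SIZE), BATCH_SIZE):
        with conn.cursor() as cur:
            for group in batch:
                sql, params = bulk_insert_sql(group)
                cur.execute(sql, params)
        conn.commit()  # one commit per batch of 10 groups
        # The case study mentions a Lambda function called during batching;
        # a hypothetical async trigger could be:
        # boto3.client("lambda").invoke(FunctionName="refresh-pricing",
        #                               InvocationType="Event")
```

One multi-row INSERT per 1,000-record group replaces 1,000 round trips to the database, which is where most of the speedup comes from on a small server.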

Also, to ensure that data was available to users at all times, the old data was kept available on the front end while the new data was uploaded in the backend. This made sure that at no time was the platform devoid of data. It was made possible with the use of a separate database: data processing happens in a database called pricing_upload, and once processing is complete, the pricing table data is deleted and the new data is copied over from pricing_upload.
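
Below is a minimal sketch of that swap, assuming pricing_upload is a staging table reachable from the same connection and transactional, PostgreSQL-style semantics; the case study describes it as a separate database, in which case the copy step would go through an export/import or a cross-database query instead.

```python
def swap_pricing(conn):
    """Replace the live pricing data only once the staged data is ready."""
    with conn.cursor() as cur:
        # Readers keep seeing the old pricing rows until this transaction
        # commits, so the front end is never left without data.
        cur.execute("DELETE FROM pricing")
        cur.execute("INSERT INTO pricing SELECT * FROM pricing_upload")
    conn.commit()
```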

The Result

The time taken to upload the CSV was brought down to 30-40 seconds.

Looking for similar solutions? Contact our product engineering team to discuss your project requirements today!
