How to Import Large CSV Files into MySQL Using Laravel

You seem to have already figured out the logic for interpreting the CSV lines and converting them into insert queries, so I will focus on the memory-exhaustion issue.

When working with large files in PHP, any approach that loads the entire file into memory will either fail, become unbearably slow, or require far more RAM than your Droplet has.

So my advice is:

Read the file line by line using fgetcsv

$handle = fopen('file.csv', 'r');
if ($handle) {
    while (($line = fgetcsv($handle)) !== false) {
        // Process this line and save it to the database
    }
    fclose($handle);
}

This way, only one row at a time is loaded into memory. You can then process it, save it to the database, and overwrite it with the next one.
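For illustration, the body of that while loop might look something like the sketch below. The products table and the sku/category_id/name/price column order are assumptions, not part of your setup; adjust them to your own schema and CSV layout.

// Hypothetical column order: sku, category_id, name, price
list($sku, $categoryId, $name, $price) = $line;

// Skip rows that are missing required fields (and log them, see below)
if (empty($sku) || empty($categoryId)) {
    continue;
}

// Insert just this one row; nothing else is held in memory
DB::table('products')->insert([
    'sku'         => $sku,
    'category_id' => $categoryId,
    'name'        => $name,
    'price'       => $price,
]);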

Keep a separate file handle for logging

Your server is short on memory, so logging errors to an array may not be a good idea, since every error stays in memory until the script ends. That can become a problem if your CSV has lots of entries with empty SKUs and category IDs.

Laravel ships with Monolog out of the box, and you can try to adapt it to your needs. However, if it also ends up using too many resources or doesn't fit your needs, a simpler approach may be the solution.

$log = fopen('log.txt', 'w');
if ($someCondition) {
    fwrite($log, $text . PHP_EOL);
}

Then, at the end of the script you can store the log file wherever you want.
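For instance, once the import is done you could simply close the handle and move the file into Laravel's storage directory (the destination name here is only an example):

fclose($log);
rename('log.txt', storage_path('import-errors-' . date('Y-m-d') . '.log'));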

Disable Laravel’s query log

Laravel keeps a log of every query you run in memory, and that's likely to be a problem for your application. Luckily, you can use the disableQueryLog method to free some precious RAM.

DB::connection()->disableQueryLog();

Use raw queries if needed

I think it’s unlikely that you will run out of memory again if you follow these tips, but you can always sacrifice some of Laravel’s convenience to extract that last drop of performance.

If you know your way around SQL, you can execute raw queries against the database.
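As a rough sketch (the products table and its columns are assumptions carried over from the examples above), a raw parameterized insert looks like this:

// Bypass the query builder and run a prepared raw insert
DB::insert(
    'insert into products (sku, category_id, name, price) values (?, ?, ?, ?)',
    [$sku, $categoryId, $name, $price]
);

Batching several hundred rows into a single INSERT statement (multiple value tuples per query) usually speeds things up further, at the cost of a bit more bookkeeping in the loop.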


Edit:

As for the timeout issue, you should be running this code as a queued task anyway, as suggested in the comments. Inserting that many rows WILL take some time (especially if you have lots of indexes), and the user shouldn't be staring at an unresponsive page for that long.
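A minimal sketch of the queued approach, assuming a Laravel 4-style queue handler class (the CsvImportJob name and the file path are illustrative, not part of your code):

// In your controller: hand the heavy lifting off to the queue
Queue::push('CsvImportJob', ['path' => storage_path('uploads/products.csv')]);

// app/jobs/CsvImportJob.php
class CsvImportJob {

    public function fire($job, $data)
    {
        DB::connection()->disableQueryLog();

        $handle = fopen($data['path'], 'r');
        while (($line = fgetcsv($handle)) !== false) {
            // Validate, log and insert each row as described above
        }
        fclose($handle);

        $job->delete();
    }
}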
