**Question**

My company gets a set of CSV files full of bank account info each month that I need to import into a database. For example, one is about 33MB and about 65,000 lines.

Right now I have a symfony/Doctrine app (PHP) that reads these CSV files and imports them into a database. My database has about 35 different tables, and during the import I take each row, split it up into its constituent objects, and insert them into the database. It all works beautifully, except that it's slow (each row takes about a quarter of a second) and it uses a lot of memory. The memory use is so bad that I have to split up my CSV files; by the time a run is near the end, I'm at about 95% memory usage. Importing the full 65,000-line file is simply not possible.

I've found symfony to be an exceptional framework for building applications and I normally wouldn't consider using anything else, but in this case I'm willing to throw all my preconceptions out the window in the name of performance. I'm not committed to any specific language, DBMS, or anything.

Stack Overflow doesn't like subjective questions, so I'm going to try to make this as un-subjective as possible: for those of you who have not just an opinion but experience importing large CSV files, what tools and practices have you used in the past that have been successful? For example, do you just use Django's ORM/OOP without any problems? Or do you read the entire CSV file into memory and prepare a few humongous INSERT statements? Again, I want not just an opinion, but something that has actually worked for you in the past.

Edit: I'm not just importing an 85-column CSV spreadsheet into one 85-column database table. I'm normalizing the data and putting it into dozens of different tables. For this reason, I can't just use LOAD DATA INFILE (I'm using MySQL) or any other DBMS feature that reads in CSV files as-is. Also, I can't use any Microsoft-specific solutions.

**Answer**

First: 33MB is not big. MySQL can easily handle data of this size.

As you noticed, row-by-row insertion is slow. Using an ORM on top of that is even slower: there's overhead for building objects, serialization, and so on. Using an ORM to do this across 35 tables is slower still.

You can indeed use LOAD DATA INFILE: just write a script that transforms your data into the desired format, separating it into per-table files in the process. You can then LOAD each file into the proper table. This script can be written in any language.

Aside from that, a bulk INSERT (column, ...) VALUES (...), (...) statement is also an option.
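To make the transform-and-LOAD suggestion above concrete, here is a minimal PHP sketch. The table names (`account`, `transaction`), column positions, file paths, and connection credentials are all hypothetical stand-ins for your actual schema; the point is the shape of the approach, not the specifics. Because it streams the source file with `fgetcsv()` one row at a time, it also sidesteps the memory blow-up you described:

```php
<?php
// Sketch: split the source CSV into per-table CSV files, then bulk-load
// each one with LOAD DATA LOCAL INFILE. Table names, column layout,
// paths, and credentials below are hypothetical examples.

$in           = fopen('accounts.csv', 'r');
$accounts     = fopen('/tmp/accounts.csv', 'w');
$transactions = fopen('/tmp/transactions.csv', 'w');

while (($row = fgetcsv($in)) !== false) {
    // Normalize: say columns 0-2 belong to `account`, others to `transaction`.
    fputcsv($accounts,     [$row[0], $row[1], $row[2]]);
    fputcsv($transactions, [$row[0], $row[3], $row[4]]);
}
fclose($in);
fclose($accounts);
fclose($transactions);

$pdo = new PDO('mysql:host=localhost;dbname=bank', 'user', 'pass', [
    PDO::MYSQL_ATTR_LOCAL_INFILE => true, // required for LOCAL INFILE
]);
$pdo->exec("LOAD DATA LOCAL INFILE '/tmp/accounts.csv'
            INTO TABLE account
            FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
            (account_number, holder_name, branch)");
$pdo->exec("LOAD DATA LOCAL INFILE '/tmp/transactions.csv'
            INTO TABLE transaction
            FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
            (account_number, posted_on, amount)");
```

Note that `LOCAL INFILE` must also be enabled on the server side (`local_infile=1`), or you can drop `LOCAL` and place the files where the MySQL server can read them.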
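And here is a sketch of the multi-row INSERT option, for cases where you'd rather stay inside the database connection than write intermediate files. Again, the table and column names are hypothetical, and the batch size of 500 is just a reasonable starting guess to tune against `max_allowed_packet`:

```php
<?php
// Sketch: accumulate rows and flush them in batches as a single
// multi-row INSERT. Table/column names are hypothetical examples.

$pdo = new PDO('mysql:host=localhost;dbname=bank', 'user', 'pass');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$batchSize = 500; // tunable; keep the statement under max_allowed_packet
$batch     = [];

$flush = function (array $rows) use ($pdo) {
    if (!$rows) {
        return;
    }
    // One "(?, ?, ?)" group per row, bound as a flat parameter list.
    $placeholders = implode(', ', array_fill(0, count($rows), '(?, ?, ?)'));
    $stmt = $pdo->prepare(
        "INSERT INTO transaction (account_number, posted_on, amount)
         VALUES $placeholders"
    );
    $stmt->execute(array_merge(...$rows));
};

$in = fopen('accounts.csv', 'r');
while (($row = fgetcsv($in)) !== false) {
    $batch[] = [$row[0], $row[3], $row[4]];
    if (count($batch) >= $batchSize) {
        $flush($batch);
        $batch = [];
    }
}
$flush($batch); // flush the remaining partial batch
fclose($in);
```

Batching like this cuts the per-statement round-trip and parse overhead that makes row-by-row insertion slow, while still letting you normalize each row in PHP before it reaches the database.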