I have some very large text files (anywhere up to 500MB each) which I need to read via PHP and act on the contents. I need to do this on a line by line basis. Obviously using file() in this case will just cause the memory usage to exceed the limit. What I need is some way to read x lines (say 1000), process those lines, read the next 1000 lines and so on until EOF.
At the moment I’m using something like this:
[code=php]
$this->_openFile();
while (!$this->_eof) {
    $this->_readLines(1000);
    foreach ($this->_lines as $line) {
        // do stuff
    }
}
$this->_closeFile();
[/code]
_readLines() uses fgets() to populate $this->_lines.
This works OK, but ideally I’d like to separate the retrieval of the data from the processing, because I have different types of file and the processing is different for each. I want a base class with the functionality for opening files, reading x lines and so on, and a child class which handles the actual processing of the most recent x lines.
My problem, however, is this: if I hand the latest x lines to the child class to process, how can I then carry on reading from the next line? For example, after reading the first 1000 lines, how do I pick up again at line 1001?
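For reference, the resume behaviour itself may be a non-issue: an open file handle remembers its position, so each fgets() call continues from wherever the previous one stopped. A minimal sketch of that (php://temp stands in for a real file; the function name is just illustrative):

```php
<?php
// An open handle keeps its position between reads, so after reading
// the first 1000 lines, fgets() resumes at line 1001 automatically.
$fh = fopen('php://temp', 'r+');
for ($i = 1; $i <= 2500; $i++) {
    fwrite($fh, "line $i\n");
}
rewind($fh);

// Read up to $count lines from the current position of the handle.
function readLines($fh, int $count): array
{
    $lines = [];
    while ($count-- > 0 && ($line = fgets($fh)) !== false) {
        $lines[] = rtrim($line, "\r\n");
    }
    return $lines;
}

$first  = readLines($fh, 1000); // lines 1 to 1000
$second = readLines($fh, 1000); // lines 1001 to 2000
```

No seeking or line counting is needed as long as the handle stays open between batches.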
So just to be clear, I want to do the following:
— BASE CLASS —
1. Take an array of files
2. Open the first file
3. Read x lines from the file
— CHILD CLASS —
4. Process the last x lines
— BASE CLASS —
5. Repeat 3-4 until EOF
6. Repeat 2-5 with each file in the array
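The steps above could be sketched along these lines (class and method names are hypothetical, not from any library): the base class owns the handle and the read loop, and each file type gets a child class that overrides the processing hook.

```php
<?php
// Sketch only: base class handles files and batching,
// child classes handle the per-file-type processing.
abstract class LineFileProcessor
{
    /** @var resource */
    protected $handle;

    // Child classes implement this for their file type (step 4).
    abstract protected function processLines(array $lines): void;

    // Steps 1-6: walk each file, batch by batch, until EOF.
    public function processFiles(array $paths, int $batchSize = 1000): void
    {
        foreach ($paths as $path) {
            $this->handle = fopen($path, 'rb');
            while (!feof($this->handle)) {
                $lines = $this->readLines($batchSize);
                if ($lines !== []) {
                    $this->processLines($lines);
                }
            }
            fclose($this->handle);
        }
    }

    // Because the handle stays open, each call picks up exactly
    // where the previous batch stopped.
    protected function readLines(int $count): array
    {
        $lines = [];
        while ($count-- > 0 && ($line = fgets($this->handle)) !== false) {
            $lines[] = rtrim($line, "\r\n");
        }
        return $lines;
    }
}

// Example child class: just counts lines in place of real work.
class CountingProcessor extends LineFileProcessor
{
    public int $seen = 0;

    protected function processLines(array $lines): void
    {
        $this->seen += count($lines); // "do stuff" goes here
    }
}
```

The child class never touches the handle, so the base class is free to keep reading from wherever the last batch ended.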
Any thoughts on the best way to do this? I suppose what I’m really looking for is some way to read chunks of a file at a time but using lines rather than bytes.
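One way to express exactly that "chunks of lines, not bytes" idea is a generator that yields fixed-size batches, keeping memory bounded regardless of file size (function name is just an example):

```php
<?php
// Yields arrays of up to $size lines each; only one batch is ever
// held in memory at a time.
function lineBatches(string $path, int $size): Generator
{
    $fh = fopen($path, 'rb');
    $batch = [];
    while (($line = fgets($fh)) !== false) {
        $batch[] = rtrim($line, "\r\n");
        if (count($batch) === $size) {
            yield $batch;
            $batch = [];
        }
    }
    fclose($fh);
    // Final, possibly short, batch at EOF.
    if ($batch !== []) {
        yield $batch;
    }
}
```

The caller then just does `foreach (lineBatches($path, 1000) as $batch) { ... }`, and the batching logic stays completely separate from the processing.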