
eliminating the batch limit

Posted: Sat Feb 28, 2009 9:25 am
by roedygr
I process the same number of files in Java as I do with the Validator, and I have no problem with a batch limit there.

I don't know how your code works, but I suggest doing this to effectively remove the batch limit.

Spawn a separate job in a separate address space to chase through the files and write a sequential list of the files to process out to disk.
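Here is a rough sketch of what such a job might look like in Java. I don't know how your code is organised, so the class name FileListBuilder, the method writeFileList and the one-path-per-line text format are all made up for illustration. Run it in its own JVM to get the separate address space.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

/** Hypothetical helper: walks a directory tree and writes one file path per line. */
final class FileListBuilder {

    /** Writes the path of every regular file under root to listFile and returns the count. */
    static long writeFileList(Path root, Path listFile) throws IOException {
        long count = 0;
        try (Stream<Path> paths = Files.walk(root);
             BufferedWriter out = Files.newBufferedWriter(listFile, StandardCharsets.UTF_8)) {
            // Iterate lazily so the complete list never has to sit in RAM.
            for (Path p : (Iterable<Path>) paths::iterator) {
                if (Files.isRegularFile(p)) {
                    out.write(p.toString());
                    out.newLine();
                    count++;
                }
            }
        }
        return count;
    }

    /** Entry point for the separate job: java FileListBuilder <root> <listFile> */
    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: FileListBuilder <root> <listFile>");
            return;
        }
        long n = writeFileList(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println(n + " files listed");
    }
}
```

Keep the list file outside the tree being walked, otherwise it ends up listing itself.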

Optionally sort that list in RAM or with an external sort, e.g. OptTech sort. If you don't have access to an external sort, write a simple one: sort the records in RAM in batches of n records, write each sorted batch to its own file, then do a single-pass multi-way merge of those files using buffered sequential reads.
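A minimal sketch of that home-made external sort, assuming the list is a text file with one record per line and that plain String ordering is what you want. The class name ExternalLineSort and the batchSize parameter are my own inventions for the example.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical external sort: sorted batches on disk, then one multi-way merge. */
final class ExternalLineSort {

    /** One open batch file plus its smallest line not yet written. */
    private static final class Run implements Comparable<Run> {
        final BufferedReader reader;
        String line;

        Run(BufferedReader reader) throws IOException {
            this.reader = reader;
            this.line = reader.readLine();
        }

        boolean advance() throws IOException {
            line = reader.readLine();
            return line != null;
        }

        @Override
        public int compareTo(Run other) {
            return line.compareTo(other.line);
        }
    }

    static void sort(Path input, Path output, int batchSize) throws IOException {
        if (batchSize < 1) throw new IllegalArgumentException("batchSize must be >= 1");
        List<Path> runFiles = new ArrayList<>();
        try {
            // Phase 1: sort each batch in RAM and write it to its own temp file.
            try (BufferedReader in = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
                List<String> batch = new ArrayList<>(batchSize);
                String line;
                while ((line = in.readLine()) != null) {
                    batch.add(line);
                    if (batch.size() == batchSize) {
                        runFiles.add(writeRun(batch));
                        batch.clear();
                    }
                }
                if (!batch.isEmpty()) runFiles.add(writeRun(batch));
            }
            // Phase 2: merge all the sorted batches in a single pass.
            merge(runFiles, output);
        } finally {
            for (Path p : runFiles) Files.deleteIfExists(p);
        }
    }

    private static Path writeRun(List<String> batch) throws IOException {
        Collections.sort(batch);
        Path run = Files.createTempFile("run", ".txt");
        Files.write(run, batch, StandardCharsets.UTF_8);
        return run;
    }

    private static void merge(List<Path> runFiles, Path output) throws IOException {
        PriorityQueue<Run> heap = new PriorityQueue<>();
        List<BufferedReader> readers = new ArrayList<>();
        try (BufferedWriter out = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            for (Path p : runFiles) {
                BufferedReader r = Files.newBufferedReader(p, StandardCharsets.UTF_8);
                readers.add(r);
                Run run = new Run(r);
                if (run.line != null) heap.add(run);
            }
            // The heap always yields the run whose current line is smallest.
            while (!heap.isEmpty()) {
                Run smallest = heap.poll();
                out.write(smallest.line);
                out.newLine();
                if (smallest.advance()) heap.add(smallest);
            }
        } finally {
            for (BufferedReader r : readers) r.close();
        }
    }
}
```

Only one batch plus one buffered line per batch file is ever in RAM. The merge opens every batch file at once, so pick a batch size large enough that the number of batch files stays under your OS limit on open files.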

Read the list with a buffered sequential binary read so you don't need the entire list in RAM at once. Again, run it as a separate job, so that file-by-file validation can proceed while the batch processes hum away.
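Something along these lines would do for the read side. It is only a sketch: the record format (each name written with DataOutputStream.writeUTF), the class name BinaryFileList and the callback are my assumptions, not anything the Validator actually uses.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Consumer;

/** Hypothetical binary list file: a plain sequence of writeUTF-encoded names. */
final class BinaryFileList {

    /** Writes each name as one length-prefixed record. */
    static void write(Path listFile, Iterable<String> names) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(listFile)))) {
            for (String name : names) {
                out.writeUTF(name);
            }
        }
    }

    /** Streams the list one record at a time, so only the 64 KB buffer is held in RAM. */
    static long forEach(Path listFile, Consumer<String> action) throws IOException {
        long count = 0;
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(listFile), 64 * 1024))) {
            while (true) {
                String name;
                try {
                    name = in.readUTF();
                } catch (EOFException endOfList) {
                    // End of file; this sketch treats a truncated last record the same way.
                    break;
                }
                action.accept(name);
                count++;
            }
        }
        return count;
    }
}
```

The callback is where the per-file work would go, e.g. BinaryFileList.forEach(list, name -> validate(name)), where validate stands in for whatever your batch code does with each file.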