Split one large file into multiple InputStream(s) to be processed in multithreading in Java





My code:


public ArrayList<InputStream> getAllInputStreams() {
    ArrayList<InputStream> allStreams = new ArrayList<InputStream>();
    InputStream stream = this.getNext();
    while (stream != null) {
        allStreams.add(stream);
        stream = this.getNext();
    }
    return allStreams;
}

public InputStream getNext() {
    if (done()) {
        return null;
    }
    InputStream segment = createInputStream();
    this.countStream++;
    return segment;
}

protected InputStream createInputStream() {
    BoundedInputStream res = new BoundedInputStream(
        Channels.newInputStream(this.randomAccessFile.getChannel().position(this.countStream * chunkSize)), chunkSize);
    res.setPropagateClose(false);
    return res;
}



I am trying to split a file into several InputStream(s), all backed by one field: private RandomAccessFile randomAccessFile = new RandomAccessFile(file, "r");. The InputStream(s) returned by getAllInputStreams() are meant to be processed by multiple threads, but it seems that most of them are empty. Why?



Any hints are welcome. Thanks.



UPDATE



It seems that the following piece of code works fine. Is it a good way to split the file into several chunks? Should the size of each chunk be smaller than the memory size?


protected InputStream createInputStream() {
    RandomAccessFile randomAccessFile = new RandomAccessFile(file, "r");
    BoundedInputStream res = new BoundedInputStream(
        Channels.newInputStream(randomAccessFile.getChannel().position(this.countStream * chunkSize)), chunkSize);
    res.setPropagateClose(false);
    return res;
}
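A likely explanation for why this version behaves differently from the first one: RandomAccessFile.getChannel() always returns the same FileChannel for a given RandomAccessFile, and a stream from Channels.newInputStream reads from that channel's current position. Streams built over one shared RandomAccessFile therefore all follow a single position, and each position(...) call moves it for every stream created earlier. The following is only a minimal sketch using JDK classes; the class name SharedChannelPositionDemo, its methods and the ten-byte test file are made up for illustration and are not part of the code above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.channels.Channels;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of the original question.
final class SharedChannelPositionDemo {

    /** Two streams over ONE RandomAccessFile: both follow the same channel position. */
    static String readShared(Path file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            InputStream first = Channels.newInputStream(raf.getChannel().position(0));
            InputStream second = Channels.newInputStream(raf.getChannel().position(5));
            // The second position() call moved the only channel to offset 5,
            // so "first" no longer starts at offset 0.
            return "" + (char) first.read() + (char) second.read();
        }
    }

    /** One RandomAccessFile per stream: each channel keeps its own position. */
    static String readSeparate(Path file) throws IOException {
        try (RandomAccessFile rafA = new RandomAccessFile(file.toFile(), "r");
             RandomAccessFile rafB = new RandomAccessFile(file.toFile(), "r")) {
            InputStream first = Channels.newInputStream(rafA.getChannel().position(0));
            InputStream second = Channels.newInputStream(rafB.getChannel().position(5));
            return "" + (char) first.read() + (char) second.read();
        }
    }

    /** Runs both variants against a temporary file containing "0123456789". */
    static String demo() {
        try {
            Path tmp = Files.createTempFile("shared-position", ".txt");
            try {
                Files.write(tmp, "0123456789".getBytes(StandardCharsets.US_ASCII));
                return "shared=" + readShared(tmp) + " separate=" + readSeparate(tmp);
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "shared=56 separate=05"
    }
}
```

In the shared variant the first stream never sees byte 0. With several threads reading at once, the streams would also consume each other's bytes and reach end of file early, which may be why most of them looked empty.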





if (done()) { return null; } Because most of them are "done"? Really, I don't see how we could tell from the code you have shown.
– markspace
yesterday







What makes you think that using multiple streams that read from the same IO device, just at different locations, will speed things up? Depending on the underlying hardware and software stack, you might actually see a slowdown, because truly random read accesses to the same IO device generate a lot of overhead.
– GhostCat
9 hours ago





@GhostCat, so, any way to speed up things? Thanks
– BAE
8 hours ago





1 Answer



I think that this statement:


res.setPropagateClose(false);



in your "UPDATE" solution will result in a resource leak. The RandomAccessFile (RAF) that you have opened is not shared with anything else. When the BoundedInputStream is closed, you need the close to propagate to the RAF.
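To illustrate the point, here is a minimal sketch that uses only JDK classes. ChunkInputStream is a hypothetical stand-in for the BoundedInputStream-over-RAF combination, not the Commons IO class itself. The assumption is that each chunk opens its own FileChannel, so closing the stream has to close that channel, otherwise the file descriptor stays open.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical class: a bounded stream that owns its channel and closes it.
final class ChunkInputStream extends InputStream {
    private final FileChannel channel; // owned by this stream, never shared
    private long remaining;            // bytes left in this chunk

    ChunkInputStream(Path file, long offset, long length) throws IOException {
        this.channel = FileChannel.open(file, StandardOpenOption.READ);
        try {
            this.channel.position(offset);
        } catch (IOException | RuntimeException e) {
            this.channel.close(); // do not leak the channel if positioning fails
            throw e;
        }
        this.remaining = length;
    }

    @Override
    public int read() throws IOException {
        byte[] one = new byte[1];
        return read(one, 0, 1) == -1 ? -1 : one[0] & 0xFF;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) return 0;
        if (remaining <= 0) return -1; // chunk boundary reached
        int want = (int) Math.min(len, remaining);
        int n = channel.read(ByteBuffer.wrap(buf, off, want));
        if (n > 0) remaining -= n;
        return n; // -1 if the file ended before the chunk did
    }

    @Override
    public void close() throws IOException {
        channel.close(); // the close propagates to the underlying file
    }

    boolean isOpen() {
        return channel.isOpen();
    }

    /** Reads bytes 4..6 of a temporary file containing "0123456789". */
    static String demo() {
        try {
            Path tmp = Files.createTempFile("chunk", ".txt");
            try {
                Files.write(tmp, "0123456789".getBytes(StandardCharsets.US_ASCII));
                ChunkInputStream chunk = new ChunkInputStream(tmp, 4, 3);
                String text;
                try (InputStream in = chunk) {
                    text = new String(in.readAllBytes(), StandardCharsets.US_ASCII);
                }
                return text + " open=" + chunk.isOpen();
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "456 open=false"
    }
}
```

With try-with-resources in each worker thread, every chunk releases its file handle as soon as that thread is done with it. In the "UPDATE" code, the simplest equivalent is probably to drop the setPropagateClose(false) call.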







