Java I/O and NIO
First, some background on Java I/O. Java models input and output as streams. InputStream (abstract) is the superclass of every input type that can be modeled as a stream of bytes, and FileInputStream is the subclass that reads from a file. A FileInputStream is typically created from a File object (a path string or a FileDescriptor also works). A File object exposes many filesystem properties, including file type (isFile), directory contents (listFiles), etc.
Building on the I/O package, the NIO package (new I/O, often read as non-blocking I/O) provides richer features by exposing lower-level control. The central abstraction is the Buffer class. Another interesting abstraction is the Channel, which is closely related to non-blocking I/O.
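To make the two models concrete, here is a minimal sketch that reads the same file once through an InputStream and once through a Channel and a Buffer. The class name StreamVsChannel, the method names, and the 4096-byte chunk size are assumptions chosen for illustration; only standard JDK calls are used.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
class StreamVsChannel {

    // Classic java.io: pull bytes through an InputStream backed by a File.
    static String readWithStream(File file) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = new FileInputStream(file)) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // java.nio: a FileChannel fills a ByteBuffer, which is flipped before it is drained.
    static String readWithChannel(File file) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (FileInputStream in = new FileInputStream(file);
             FileChannel channel = in.getChannel()) {
            ByteBuffer buffer = ByteBuffer.allocate(4096);
            while (channel.read(buffer) != -1) {
                buffer.flip();                                 // switch from filling to draining
                out.write(buffer.array(), 0, buffer.limit());  // heap buffer, so array() is available
                buffer.clear();                                // ready to be filled again
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("io-demo", ".txt");
        file.deleteOnExit();
        try (FileOutputStream out = new FileOutputStream(file)) {
            out.write("hello, streams".getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("isFile: " + file.isFile());
        System.out.println(readWithStream(file));
        System.out.println(readWithChannel(file));
    }
}
```

Both methods return the same text. The difference is that the channel version manages the buffer state (flip, clear) explicitly, which is the lower-level control NIO exposes.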
Starting with Java 7, the NIO2 package (java.nio.file) exposes even lower-level filesystem control. It introduces Path, an interface that abstracts a file's path in the file system, and the Files class, which supports many kinds of file operations, such as creating and managing symbolic links.
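As a small illustration of Path and Files, the sketch below creates a file and a symbolic link to it. The class name SymlinkSketch, the helper linkTo, and the file names are hypothetical. Note that creating symbolic links can fail on platforms or accounts that do not permit it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; the names are illustrative only.
class SymlinkSketch {

    // Creates an empty target file inside dir, then a symbolic link pointing at it.
    // Returns the path of the link.
    static Path linkTo(Path dir, String targetName, String linkName) throws IOException {
        Path target = Files.createFile(dir.resolve(targetName));
        return Files.createSymbolicLink(dir.resolve(linkName), target);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("nio2-demo");
        Path link = linkTo(dir, "data.txt", "data-link.txt");
        System.out.println("is symlink: " + Files.isSymbolicLink(link));
        System.out.println("points to:  " + Files.readSymbolicLink(link));
    }
}
```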
Java I/O Packages in HDFS
HDFS uses its own input/output stream types, FSInputStream/FSOutputStream (abstract), to model HDFS stream input/output. The main purpose of these custom streams is better position tracking (beyond that, they don't do much).
DFSInputStream/DFSOutputStream further extend these FS stream superclasses. The DFS input/output streams handle the main HDFS logic, such as locating the local files on DataNodes.
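To show what "position tracking" can mean for a stream, here is a minimal sketch of an InputStream wrapper that counts how many bytes have been consumed. This is not Hadoop's FSInputStream: the class PositionTrackingInputStream and its getPos method are hypothetical and written against plain java.io only.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration of position tracking; not a Hadoop class.
class PositionTrackingInputStream extends FilterInputStream {
    private long pos; // number of bytes consumed so far

    PositionTrackingInputStream(InputStream in) {
        super(in);
    }

    long getPos() {
        return pos;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            pos++;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            pos += n;
        }
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        pos += skipped;
        return skipped;
    }

    // mark/reset would make the counter drift, so this sketch disables them.
    @Override
    public boolean markSupported() {
        return false;
    }

    public static void main(String[] args) throws IOException {
        try (PositionTrackingInputStream in =
                 new PositionTrackingInputStream(new ByteArrayInputStream(new byte[10]))) {
            in.read(new byte[4]);
            in.skip(3);
            System.out.println("position: " + in.getPos());
        }
    }
}
```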
Java NIO Packages in HDFS
HDFS uses only two types of NIO buffers: ByteBuffer and MappedByteBuffer.
NIO2 is not used in HDFS.
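For readers unfamiliar with the two buffer types, here is a minimal sketch of each using only the JDK. It does not reproduce any HDFS code; the class name BufferKinds and its methods are hypothetical.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
class BufferKinds {

    // Heap ByteBuffer: write a value, flip, then read it back.
    static int roundTripInt(int value) {
        ByteBuffer buffer = ByteBuffer.allocate(Integer.BYTES);
        buffer.putInt(value);
        buffer.flip(); // limit = bytes written, position = 0
        return buffer.getInt();
    }

    // MappedByteBuffer: map the whole file read-only and decode it as UTF-8.
    static String readMapped(File file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r");
             FileChannel channel = raf.getChannel()) {
            MappedByteBuffer mapped =
                channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            return StandardCharsets.UTF_8.decode(mapped).toString();
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("mapped-demo", ".txt");
        file.deleteOnExit();
        try (FileOutputStream out = new FileOutputStream(file)) {
            out.write("mapped bytes".getBytes(StandardCharsets.UTF_8));
        }
        System.out.println(roundTripInt(42));
        System.out.println(readMapped(file));
    }
}
```

The file is opened through RandomAccessFile rather than java.nio.file so that the sketch stays within the I/O and NIO packages discussed here.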
Example: ingest a local file into HDFS with copyFromLocal
(shell) CopyCommands / CommandWithDestination
* run
* |-> {@link #processOptions(LinkedList)}
* \-> {@link #processRawArguments(LinkedList)}
*      |-> {@link #expandArguments(LinkedList)}
*      |   \-> {@link #expandArgument(String)}*
*      \-> {@link #processArguments(LinkedList)}
*          |-> {@link #processArgument(PathData)}*
*          |   |-> {@link #processPathArgument(PathData)}
*          |   \-> {@link #processPaths(PathData, PathData...)}
*          |        \-> {@link #processPath(PathData)}*
*          \-> {@link #processNonexistentPath(PathData)}
* |-> copyFileToTarget
* |   \-> Open an InputStream from the path
* |-> copyStreamToTarget
* |   \-> Create a TargetFileSystem, which is a subclass of FilterFileSystem (itself a subclass of FileSystem)
* \-> writeStreamToFile
*     |-> create() an FSOutputStream from the path
*     \-> IOUtils copyBytes() from input stream to output stream
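The last steps of the outline (open an input stream, create an output stream, copy the bytes across) can be sketched with plain java.io as below. This is only an analogy on the local filesystem, assuming nothing about Hadoop internals: the class CopySketch and its copyBytes helper are hypothetical stand-ins, not Hadoop's IOUtils or FileSystem.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical local-filesystem analogy of the copy path; not Hadoop code.
class CopySketch {

    // Stand-in for the copy loop: read chunks from in and write them to out.
    // Returns the number of bytes copied.
    static long copyBytes(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    // Mirrors the outline: open an input stream on the source, create an
    // output stream on the target, then copy from one to the other.
    static long copyFileToTarget(File source, File target) throws IOException {
        try (InputStream in = new FileInputStream(source);
             OutputStream out = new FileOutputStream(target)) {
            return copyBytes(in, out, 4096);
        }
    }

    public static void main(String[] args) throws IOException {
        File source = File.createTempFile("copy-src", ".txt");
        File target = File.createTempFile("copy-dst", ".txt");
        source.deleteOnExit();
        target.deleteOnExit();
        try (FileOutputStream out = new FileOutputStream(source)) {
            out.write("ingest me".getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("bytes copied: " + copyFileToTarget(source, target));
    }
}
```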