reading file line by line in Java with BufferedReader

Reading files in Java is the cause of a lot of confusion. There are multiple ways of accomplishing the same task and it's often not clear which file reading method is best to use. Something that's quick and dirty for a small example file might not be the best method to use when you need to read a very large file. Something that worked in an earlier Java version might not be the preferred method anymore.

This article aims to be the definitive guide for reading files in Java 7, 8 and 9. I'm going to cover all the ways you can read files in Java. Too often, you'll read an article that tells you one way to read a file, only to discover later on there are other ways to do that. I'm actually going to cover 15 different ways to read a file in Java. I'm going to cover reading files in multiple ways with the core Java libraries as well as two third party libraries.

But that's not all – what good is knowing how to do something in multiple ways if you don't know which way is best for your situation?

I also put each of these methods to a real performance test and documented the results. That way, you will have some hard data to know the performance metrics of each method.

Methodology

JDK Versions

Java code samples don't live in isolation, especially when it comes to Java I/O, as the API keeps evolving. All code for this article has been tested on:

  • Java SE 7 (jdk1.7.0_80)
  • Java SE 8 (jdk1.8.0_162)
  • Java SE 9 (jdk-9.0.4)

When there is an incompatibility, it will be stated in that section. Otherwise, the code works unaltered for different Java versions. The main incompatibility is the use of lambda expressions, which were introduced in Java 8.

Java File Reading Libraries

There are multiple ways of reading from files in Java. This article aims to be a comprehensive collection of all the different methods. I will cover:

  • java.io.FileReader.read()
  • java.io.BufferedReader.readLine()
  • java.io.FileInputStream.read()
  • java.io.BufferedInputStream.read()
  • java.nio.file.Files.readAllBytes()
  • java.nio.file.Files.readAllLines()
  • java.nio.file.Files.lines()
  • java.util.Scanner.nextLine()
  • org.apache.commons.io.FileUtils.readLines() – Apache Commons
  • com.google.common.io.Files.readLines() – Google Guava

Closing File Resources

Prior to JDK7, when opening a file in Java, all file resources would need to be manually closed using a try-catch-finally block. JDK7 introduced the try-with-resources statement, which simplifies the process of closing streams. You no longer need to write explicit code to close streams because the JVM will automatically close the stream for you, whether an exception occurred or not. All examples used in this article use the try-with-resources statement for importing, loading, parsing and closing files.
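To make the difference concrete, here is a minimal sketch that reads the first line of a file both ways. The class name CloseResourcesSketch, its two helper methods and the temporary file are my own illustrative assumptions and are not part of the downloadable code for this article.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical class, for illustration only.
class CloseResourcesSketch {

    // Pre-JDK7 style: the reader has to be closed by hand in a finally block.
    static String firstLineWithFinally(String fileName) throws IOException {
        BufferedReader reader = null;
        try {
            reader = new BufferedReader(new FileReader(fileName));
            return reader.readLine();
        } finally {
            if (reader != null) {
                reader.close();
            }
        }
    }

    // JDK7+ style: the reader is closed automatically when the block exits.
    static String firstLineWithResources(String fileName) throws IOException {
        try (BufferedReader reader = new BufferedReader(new FileReader(fileName))) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // A throwaway temp file keeps the sketch independent of the test folder.
        Path tempFile = Files.createTempFile("close-sketch", ".txt");
        Files.write(tempFile, "first line\nsecond line\n".getBytes(StandardCharsets.UTF_8));

        System.out.println(firstLineWithFinally(tempFile.toString()));
        System.out.println(firstLineWithResources(tempFile.toString()));

        Files.delete(tempFile);
    }
}
```

Both helpers return the same line; the second one simply has less ceremony and no way to forget the close() call.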

File Location

All examples will read test files from C:\temp.

Encoding

Character encoding is not explicitly saved with text files so Java makes assumptions about the encoding when reading files. Usually, the assumption is correct but sometimes you want to be explicit when instructing your programs to read from files. When the encoding isn't correct, you'll see funny characters appear when reading files.

All examples for reading text files use two encoding variations:

  • Default system encoding, where no encoding is specified
  • Explicitly setting the encoding to UTF-8
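If you're wondering where those funny characters come from, the following sketch reproduces the problem in memory, without touching a file. It encodes a string as UTF-8 and then decodes the bytes with the wrong charset. The class and method names are hypothetical and only exist for this illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical class, for illustration only.
class EncodingMismatchSketch {

    // Encodes the text as UTF-8 bytes, then decodes those bytes with the given charset.
    static String encodeUtf8ThenDecode(String text, Charset decodeWith) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, decodeWith);
    }

    public static void main(String[] args) {
        // The last letter is an accented e, written as an escape so that the
        // source file itself does not depend on any particular encoding.
        String original = "caf\u00e9";

        String decodedCorrectly = encodeUtf8ThenDecode(original, StandardCharsets.UTF_8);
        String decodedWrongly = encodeUtf8ThenDecode(original, StandardCharsets.ISO_8859_1);

        // The accented letter takes two bytes in UTF-8, so decoding it with a
        // single-byte charset turns it into two unrelated characters.
        System.out.println(original.equals(decodedCorrectly)); // true
        System.out.println(decodedWrongly.length());           // 5 instead of 4
    }
}
```

The same thing happens when a UTF-8 file is read on a system whose default encoding is something else, which is why the explicit variants in this article pass the charset in.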

Download Code

All code files are available from GitHub.

Code Quality and Code Encapsulation

There is a difference between writing code for your personal or work project and writing code to explain and teach concepts.

If I was writing this code for my own project, I would use proper object-oriented principles like encapsulation, abstraction, polymorphism, etc. But I wanted to make each example stand alone and easily understood, which meant that some of the code has been copied from one example to the next. I did this on purpose because I didn't want the reader to have to figure out all the encapsulation and object structures I so cleverly created. That would take away from the examples.

For the same reason, I chose NOT to write these examples with a unit testing framework like JUnit or TestNG because that's not the purpose of this article. That would add another library for the reader to understand that has nothing to do with reading files in Java. That's why all the examples are written inline inside the main method, without extra methods or classes.

My main purpose is to make the examples as easy to understand as possible and I believe that having extra unit testing and encapsulation code will not help with this. That doesn't mean that's how I would encourage you to write your own personal code. It's just the way I chose to write the examples in this article to make them easier to understand.

Exception Handling

All examples declare any checked exceptions in the throwing method declaration.

The purpose of this article is to show all the different ways to read from files in Java – it's not meant to show how to handle exceptions, which will be very specific to your situation.

So instead of creating unhelpful try catch blocks that just print exception stack traces and clutter up the code, all examples will declare any checked exception in the calling method. This will make the code cleaner and easier to understand without sacrificing any functionality.

Future Updates

As Java file reading evolves, I will be updating this article with any required changes.

File Reading Methods

I organized the file reading methods into three groups:

  • Classic I/O classes that have been part of Java since before JDK 1.7. This includes the java.io and java.util packages.
  • New Java I/O classes that have been part of Java since JDK1.7. This covers the java.nio.file.Files class.
  • Third party I/O classes from the Apache Commons and Google Guava projects.

Classic I/O – Reading Text

1a) FileReader – Default Encoding

FileReader reads in one character at a time, without any buffering. It's meant for reading text files. It uses the default character encoding on your system, so I have provided examples for both the default case, as well as specifying the encoding explicitly.

          

import java.io.FileReader;
import java.io.IOException;

public class ReadFile_FileReader_Read {
    public static void main(String[] pArgs) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";

        try (FileReader fileReader = new FileReader(fileName)) {
            int singleCharInt;
            char singleChar;
            while ((singleCharInt = fileReader.read()) != -1) {
                singleChar = (char) singleCharInt;

                // display one character at a time
                System.out.print(singleChar);
            }
        }
    }
}

1b) FileReader – Explicit Encoding (InputStreamReader)

In the Java versions covered here, it's actually not possible to set the encoding explicitly on a FileReader so you have to use the parent class, InputStreamReader, and wrap it around a FileInputStream:

          

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

public class ReadFile_FileReader_Read_Encoding {
    public static void main(String[] pArgs) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        FileInputStream fileInputStream = new FileInputStream(fileName);

        // specify UTF-8 encoding explicitly
        try (InputStreamReader inputStreamReader =
                 new InputStreamReader(fileInputStream, "UTF-8")) {

            int singleCharInt;
            char singleChar;
            while ((singleCharInt = inputStreamReader.read()) != -1) {
                singleChar = (char) singleCharInt;
                System.out.print(singleChar); // display one character at a time
            }
        }
    }
}

2a) BufferedReader – Default Encoding

BufferedReader reads an entire line at a time, instead of one character at a time like FileReader. It's meant for reading text files.

          

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class ReadFile_BufferedReader_ReadLine {
    public static void main(String[] args) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        FileReader fileReader = new FileReader(fileName);

        try (BufferedReader bufferedReader = new BufferedReader(fileReader)) {
            String line;
            // readLine() returns null once the end of the file is reached
            while ((line = bufferedReader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}

2b) BufferedReader – Explicit Encoding

In a similar way to how we set the encoding explicitly for FileReader, we need to create a FileInputStream, wrap it inside an InputStreamReader with an explicit encoding and pass that to BufferedReader:

          

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

public class ReadFile_BufferedReader_ReadLine_Encoding {
    public static void main(String[] args) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";

        FileInputStream fileInputStream = new FileInputStream(fileName);

        // specify UTF-8 encoding explicitly
        InputStreamReader inputStreamReader = new InputStreamReader(fileInputStream, "UTF-8");

        try (BufferedReader bufferedReader = new BufferedReader(inputStreamReader)) {
            String line;
            while ((line = bufferedReader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}

Classic I/O – Reading Bytes

1) FileInputStream

FileInputStream reads in one byte at a time, without any buffering. While it's meant for reading binary files such as images or audio files, it can still be used to read text files. It's similar to reading with FileReader in that you're reading one value at a time as an integer and you need to cast that int to a char to see the ASCII character.

Note that FileInputStream returns raw bytes and does not apply any character encoding, so the cast to char only produces correct text for single-byte characters such as ASCII. To decode other encodings, wrap the stream in an InputStreamReader as shown earlier.

          

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

public class ReadFile_FileInputStream_Read {
    public static void main(String[] pArgs) throws FileNotFoundException, IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        try (FileInputStream fileInputStream = new FileInputStream(file)) {
            int singleCharInt;
            char singleChar;

            // read() returns one byte as an int, or -1 at the end of the file
            while ((singleCharInt = fileInputStream.read()) != -1) {
                singleChar = (char) singleCharInt;
                System.out.print(singleChar);
            }
        }
    }
}

2) BufferedInputStream

BufferedInputStream reads a set of bytes all at once into an internal byte array buffer. The buffer size can be set explicitly or use the default, which is what we'll demonstrate in our example. The default buffer size appears to be 8KB but I have not explicitly verified this. All performance tests used the default buffer size so it will automatically re-size the buffer when it needs to.

          

import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

public class ReadFile_BufferedInputStream_Read {
    public static void main(String[] pArgs) throws FileNotFoundException, IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);
        FileInputStream fileInputStream = new FileInputStream(file);

        // default buffer size is used because no size is passed in
        try (BufferedInputStream bufferedInputStream = new BufferedInputStream(fileInputStream)) {
            int singleCharInt;
            char singleChar;
            while ((singleCharInt = bufferedInputStream.read()) != -1) {
                singleChar = (char) singleCharInt;
                System.out.print(singleChar);
            }
        }
    }
}
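For completeness, here is a small sketch of the other option mentioned above: setting the buffer size explicitly through the two-argument BufferedInputStream constructor. The class name, the countBytes helper and the 64KB figure are arbitrary choices for illustration; they were not part of my performance tests.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical class, for illustration only.
class BufferSizeSketch {

    // Counts the bytes in a file, reading through a buffer of the requested size.
    static long countBytes(String fileName, int bufferSize) throws IOException {
        long count = 0;
        try (BufferedInputStream in =
                 new BufferedInputStream(new FileInputStream(fileName), bufferSize)) {
            // Each read() is served from the in-memory buffer where possible.
            while (in.read() != -1) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // A throwaway temp file filled with 100,000 zero bytes.
        Path tempFile = Files.createTempFile("buffer-sketch", ".bin");
        Files.write(tempFile, new byte[100_000]);

        // 64KB is an arbitrary example size, not a recommendation.
        System.out.println(countBytes(tempFile.toString(), 64 * 1024)); // 100000

        Files.delete(tempFile);
    }
}
```

The result is the same whatever size you pick; only the number of trips to the underlying file changes.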

New I/O – Reading Text

1a) Files.readAllLines() – Default Encoding

The Files class is part of the new Java I/O classes introduced in jdk1.7. It only has static utility methods for working with files and directories.

The readAllLines(Path) overload that takes no charset was introduced in jdk1.8, so this example will not work in Java 7. Note that this overload decodes the file as UTF-8 rather than using the system default encoding.

          

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.List;

public class ReadFile_Files_ReadAllLines {
    public static void main(String[] pArgs) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        // loads every line of the file into memory at once
        List<String> fileLinesList = Files.readAllLines(file.toPath());

        for (String line : fileLinesList) {
            System.out.println(line);
        }
    }
}

1b) Files.readAllLines() – Explicit Encoding

          

import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.List;

public class ReadFile_Files_ReadAllLines_Encoding {
    public static void main(String[] pArgs) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        // use UTF-8 encoding
        List<String> fileLinesList = Files.readAllLines(file.toPath(), StandardCharsets.UTF_8);

        for (String line : fileLinesList) {
            System.out.println(line);
        }
    }
}

2a) Files.lines() – Default Encoding

This code was tested to work in Java 8 and 9. It didn't run on Java 7 because of the lack of support for lambda expressions.

          

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.stream.Stream;

public class ReadFile_Files_Lines {
    public static void main(String[] pArgs) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        // the stream holds the file open, so it is closed via try-with-resources
        try (Stream<String> linesStream = Files.lines(file.toPath())) {
            linesStream.forEach(line -> {
                System.out.println(line);
            });
        }
    }
}

2b) Files.lines() – Explicit Encoding

Just like in the previous example, this code was tested and works in Java 8 and 9 but not in Java 7.

          

import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.stream.Stream;

public class ReadFile_Files_Lines_Encoding {
    public static void main(String[] pArgs) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        // use UTF-8 encoding
        try (Stream<String> linesStream = Files.lines(file.toPath(), StandardCharsets.UTF_8)) {
            linesStream.forEach(line -> {
                System.out.println(line);
            });
        }
    }
}
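Because Files.lines() hands back a Stream, you aren't limited to forEach. As a rough sketch (Java 8 or later, with a hypothetical class name and helper that are not part of the article's code download), here is how the same call could count the non-blank lines in a file without collecting the lines into a list first:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical class, for illustration only.
class FilesLinesStreamSketch {

    // Counts the lines that contain something other than whitespace.
    static long countNonBlankLines(Path path) throws IOException {
        try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
            return lines.filter(line -> !line.trim().isEmpty()).count();
        }
    }

    public static void main(String[] args) throws IOException {
        // A throwaway temp file with two text lines and two blank ones.
        Path tempFile = Files.createTempFile("lines-sketch", ".txt");
        Files.write(tempFile, "first\n\n   \nsecond\n".getBytes(StandardCharsets.UTF_8));

        System.out.println(countNonBlankLines(tempFile)); // 2

        Files.delete(tempFile);
    }
}
```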

3a) Scanner – Default Encoding

The Scanner class was introduced in jdk1.5 and can be used to read from files or from the console (user input).

          

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class ReadFile_Scanner_NextLine {
    public static void main(String[] pArgs) throws FileNotFoundException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        try (Scanner scanner = new Scanner(file)) {
            String line;
            // keep going while there is another line to read
            while (scanner.hasNextLine()) {
                line = scanner.nextLine();
                System.out.println(line);
            }
        }
    }
}

3b) Scanner – Explicit Encoding

          

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class ReadFile_Scanner_NextLine_Encoding {
    public static void main(String[] pArgs) throws FileNotFoundException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        // use UTF-8 encoding
        try (Scanner scanner = new Scanner(file, "UTF-8")) {
            String line;
            while (scanner.hasNextLine()) {
                line = scanner.nextLine();
                System.out.println(line);
            }
        }
    }
}

New I/O – Reading Bytes

Files.readAllBytes()

Even though the documentation for this method states that "it is not intended for reading in large files" I found this to be the absolute best performing file reading method, even on files as large as 1GB.

          

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class ReadFile_Files_ReadAllBytes {
    public static void main(String[] pArgs) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        // reads the whole file into a single byte array
        byte[] fileBytes = Files.readAllBytes(file.toPath());
        char singleChar;
        for (byte b : fileBytes) {
            singleChar = (char) b;
            System.out.print(singleChar);
        }
    }
}
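Casting each byte to a char, as the example does, is fine for plain ASCII text but not for characters that take more than one byte. If you want the file contents as text, one option is to decode the whole byte array in one go with an explicit charset. This is only a sketch with a hypothetical class name and helper, and it assumes the file is UTF-8:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical class, for illustration only.
class ReadAllBytesToStringSketch {

    // Reads the whole file and decodes it as UTF-8 (an assumption about the file).
    static String readWholeFile(Path path) throws IOException {
        byte[] fileBytes = Files.readAllBytes(path);
        return new String(fileBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // A throwaway temp file containing one accented character.
        Path tempFile = Files.createTempFile("bytes-sketch", ".txt");
        Files.write(tempFile, "caf\u00e9\n".getBytes(StandardCharsets.UTF_8));

        String contents = readWholeFile(tempFile);
        System.out.println(contents.length()); // 5: four letters plus the newline

        Files.delete(tempFile);
    }
}
```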

Third Party I/O – Reading Text

Commons – FileUtils.readLines()

Apache Commons IO is an open source Java library that comes with utility classes for reading and writing text and binary files. I listed it in this article because it can be used instead of the built in Java libraries. The class we're using is FileUtils.

For this article, version 2.6 was used which is compatible with JDK 1.7+

Note that you need to explicitly specify the encoding and that the method for using the default encoding has been deprecated.

          

import java.io.File;
import java.io.IOException;
import java.util.List;

import org.apache.commons.io.FileUtils;

public class ReadFile_Commons_FileUtils_ReadLines {
    public static void main(String[] pArgs) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        // the encoding has to be passed in explicitly
        List<String> fileLinesList = FileUtils.readLines(file, "UTF-8");

        for (String line : fileLinesList) {
            System.out.println(line);
        }
    }
}

Guava – Files.readLines()

Google Guava is an open source library that comes with utility classes for common tasks like collections handling, cache management, IO operations, string processing.

I listed it in this article because it can be used instead of the built in Java libraries and I wanted to compare its performance with the Java built in libraries.

For this article, version 23.0 was used.

I'm not going to examine all the different ways to read files with Guava, since this article is not meant for that. For a more detailed look at all the different ways to read and write files with Guava, have a look at Baeldung's in depth article.

When reading a file, Guava requires that the character encoding be set explicitly, just like Apache Commons.

Compatibility note: This code was tested successfully on Java 8 and 9. I couldn't get it to work on Java 7 and kept getting an "Unsupported major.minor version 52.0" error. Guava has a separate API doc for Java 7 which uses a slightly different version of the Files.readLines() method. I thought I could get it to work but I kept getting that error.

          

import java.io.File;
import java.io.IOException;
import java.util.List;

import com.google.common.base.Charsets;
import com.google.common.io.Files;

public class ReadFile_Guava_Files_ReadLines {
    public static void main(String[] args) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        // the encoding has to be passed in explicitly
        List<String> fileLinesList = Files.readLines(file, Charsets.UTF_8);

        for (String line : fileLinesList) {
            System.out.println(line);
        }
    }
}

Performance Testing

Since there are so many ways to read from a file in Java, a natural question is "What file reading method is the best for my situation?" So I decided to test each of these methods against each other using sample data files of different sizes and timing the results.

Each code sample from this article displays the contents of the file to a string and then to the console (System.out). However, during the performance tests the System.out line was commented out since it would seriously slow down the performance of each method.

Each performance test measures the time it takes to read in the file – line by line, character by character, or byte by byte without displaying anything to the console. I ran each test 5-10 times and took the average so as not to let any outliers influence each test. I also ran the default encoding version of each file reading method – i.e. I didn't specify the encoding explicitly.
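The approach described above can be sketched roughly like this. This is not the exact harness used for the numbers in this article; the class name, the FileReadTask interface, the helper methods and the run count are all illustrative assumptions. It times one reading method with System.nanoTime() over several runs and averages the result.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical class, for illustration only.
class ReadTimingSketch {

    // One file reading method under test.
    interface FileReadTask {
        void read(String fileName) throws IOException;
    }

    // Runs the task the given number of times and returns the average time in milliseconds.
    static double averageMillis(FileReadTask task, String fileName, int runs) throws IOException {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        long totalNanos = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.read(fileName);
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / runs;
    }

    // Reads the file line by line and discards the lines, i.e. no console output.
    static void readLineByLine(String fileName) throws IOException {
        try (BufferedReader reader = new BufferedReader(new FileReader(fileName))) {
            while (reader.readLine() != null) {
                // intentionally empty: only the read itself is being timed
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a small throwaway test file of 10,000 lines.
        StringBuilder content = new StringBuilder();
        for (int i = 0; i < 10_000; i++) {
            content.append("sample line ").append(i).append('\n');
        }
        Path tempFile = Files.createTempFile("timing-sketch", ".txt");
        Files.write(tempFile, content.toString().getBytes(StandardCharsets.UTF_8));

        double average = averageMillis(ReadTimingSketch::readLineByLine, tempFile.toString(), 5);
        System.out.println("Average ms over 5 runs: " + average);

        Files.delete(tempFile);
    }
}
```

A simple loop like this does nothing about JIT warm-up or disk caching, so treat the numbers it produces as rough comparisons rather than precise benchmarks.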

Dev Setup

The dev environment used for these tests:

  • Intel Core i7-3615 QM @2.3 GHz, 8GB RAM
  • Windows 8 x64
  • Eclipse IDE for Java Developers, Oxygen.2 Release (4.7.2)
  • Java SE 9 (jdk-9.0.4)

Data Files

GitHub doesn't allow pushing files larger than 100 MB, so I couldn't find a practical way to store my large test files to allow others to replicate my tests. So instead of storing them, I'm providing the tools I used to generate them so you can create test files that are similar in size to mine. Obviously they won't be the same, but you'll generate files that are similar in size to the ones I used in my performance tests.

Random String Generator was used to generate sample text and then I simply copy-pasted to create larger versions of the file. When the file started getting too large to manage inside a text editor, I had to use the command line to merge multiple text files into a larger text file:

copy *.txt sample-1GB.txt

I created the following 7 data file sizes to test each file reading method across a range of file sizes:

  • 1KB
  • 10KB
  • 100KB
  • 1MB
  • 10MB
  • 100MB
  • 1GB
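If you'd rather generate test files from code than copy-paste, something along these lines should produce files of roughly the sizes listed above. To be clear, this is not the tool I used; the class name, the 80 character line length and the seed are all arbitrary assumptions for the sketch.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Random;

// Hypothetical class, for illustration only.
class SampleFileGeneratorSketch {

    // Arbitrary line length chosen for the sketch.
    static final int LINE_LENGTH = 80;

    // Writes lines of random lowercase letters until at least targetBytes have been written.
    static void writeSampleFile(Path target, long targetBytes, long seed) throws IOException {
        Random random = new Random(seed);
        long written = 0;
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.US_ASCII)) {
            while (written < targetBytes) {
                StringBuilder line = new StringBuilder(LINE_LENGTH + 1);
                for (int i = 0; i < LINE_LENGTH; i++) {
                    line.append((char) ('a' + random.nextInt(26)));
                }
                line.append('\n');
                writer.write(line.toString());
                // ASCII only, so one character is exactly one byte.
                written += line.length();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path tempFile = Files.createTempFile("sample-10KB", ".txt");

        // Roughly 10KB; the file ends on a whole line, so it can be slightly larger.
        writeSampleFile(tempFile, 10 * 1024, 42L);
        System.out.println("Bytes written: " + Files.size(tempFile));

        Files.delete(tempFile);
    }
}
```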

Performance Summary

There were some surprises and some expected results from the performance tests.

As expected, the worst performers were the methods that read in a file character by character or byte by byte. But what surprised me was that the native Java IO libraries outperformed both 3rd party libraries – Apache Commons IO and Google Guava.

What's more – both Google Guava and Apache Commons IO threw a java.lang.OutOfMemoryError when trying to read in the 1 GB test file. This also happened with the Files.readAllLines(Path) method but the remaining seven methods were able to read in all test files, including the 1GB test file.

The following table summarizes the average time (in milliseconds) each file reading method took to complete. I highlighted the top three methods in green, the average performing methods in yellow and the worst performing methods in red:

The following chart summarizes the above table but with the following changes:

  • I removed java.io.FileInputStream.read() from the chart because its performance was so bad it would skew the entire chart and you wouldn't see the other lines properly
  • I summarized the data from 1KB to 1MB because after that, the chart would become too skewed with so many under performers and also some methods threw a java.lang.OutOfMemoryError at 1GB

The Winners

The new Java I/O libraries (java.nio) had the best overall winner (java.nio.file.Files.readAllBytes()) but it was followed closely behind by BufferedReader.readLine() which was also a proven top performer across the board. The other excellent performer was java.nio.file.Files.lines(Path) which had slightly worse numbers for smaller test files but really excelled with the larger test files.

The absolute fastest file reader across all data tests was java.nio.file.Files.readAllBytes(Path). It was consistently the fastest and even reading a 1GB file only took about one second.

The following chart compares performance for a 100KB test file:

You can see that the lowest times were for Files.readAllBytes(), BufferedInputStream.read() and BufferedReader.readLine().

The following chart compares performance for reading a 10MB file. I didn't bother including the bar for FileInputStream.read() because the performance was so bad it would skew the entire chart and you couldn't tell how the other methods performed relative to each other:

Files.readAllBytes() really outperforms all other methods and BufferedReader.readLine() is a distant second.

The Losers

As expected, the absolute worst performer was java.io.FileInputStream.read() which was orders of magnitude slower than its rivals for most tests. FileReader.read() was also a poor performer for the same reason – reading files byte by byte (or character by character) instead of with buffers drastically degrades performance.
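To illustrate what "with buffers" means at the lowest level, here is a sketch that uses the same FileInputStream class but asks for a whole chunk of bytes per call instead of a single byte. This variant was not one of the methods I tested; the class name and the 8KB chunk size are assumptions for illustration.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical class, for illustration only.
class ChunkedReadSketch {

    // Counts the bytes in a file by reading up to 8KB per call instead of one byte per call.
    static long countBytesInChunks(String fileName) throws IOException {
        byte[] chunk = new byte[8192];
        long total = 0;
        try (FileInputStream in = new FileInputStream(fileName)) {
            int bytesRead;
            // read(byte[]) returns how many bytes were placed in the array, or -1 at end of file
            while ((bytesRead = in.read(chunk)) != -1) {
                total += bytesRead;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // A throwaway temp file filled with 100,000 zero bytes.
        Path tempFile = Files.createTempFile("chunk-sketch", ".bin");
        Files.write(tempFile, new byte[100_000]);

        System.out.println(countBytesInChunks(tempFile.toString())); // 100000

        Files.delete(tempFile);
    }
}
```

Far fewer read calls are needed to get through the same file, which is essentially what BufferedInputStream and BufferedReader do for you behind the scenes.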

Both the Apache Commons IO FileUtils.readLines() and Guava Files.readLines() crashed with an OutOfMemoryError when trying to read the 1GB test file and they were about average in performance for the remaining test files.

java.nio.file.Files.readAllLines() also crashed when trying to read the 1GB test file but it performed quite well for smaller file sizes.

Performance Rankings

Here's a ranked list of how well each file reading method did, in terms of speed and handling of large files, as well as compatibility with different Java versions.

Rank  File Reading Method
1     java.nio.file.Files.readAllBytes()
2     java.io.BufferedReader.readLine()
3     java.nio.file.Files.lines()
4     java.io.BufferedInputStream.read()
5     java.util.Scanner.nextLine()
6     java.nio.file.Files.readAllLines()
7     org.apache.commons.io.FileUtils.readLines()
8     com.google.common.io.Files.readLines()
9     java.io.FileReader.read()
10    java.io.FileInputStream.read()

Conclusion

I tried to present a comprehensive set of methods for reading files in Java, both text and binary. We looked at 15 different ways of reading files in Java and we ran performance tests to see which methods are the fastest.

The new Java IO library (java.nio) proved to be a great performer but so was the classic BufferedReader.