How do I know when to use a byte stream and when to use a character stream for reading data from a file?

I am trying to understand which Java class I should use, and why, when I need to read data from different types of files (.properties, JSON, plain text, etc.) or from the console.

Some classes read data as bytes (8 bits) and some read it as characters (16-bit Unicode). So how do I decide which class to use for reading data?

[Image: sample code that reads a .properties file]

In the sample code above, I am trying to read a .properties file. How do I decide whether to use FileInputStream or some other class to read the file?

I tried looking for answers online but I am still not clear.

There is 1 solution below

Answered by vanje

It depends on the individual case.

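For property files, there is an overloaded Properties.load() method: you can pass either an InputStream or a Reader. For a standard .properties file it is best to use the InputStream option (byte stream), because load(InputStream) decodes the file itself, using ISO 8859-1, the encoding the .properties format specifies, and resolves Unicode escapes for all other characters. If you use a Reader, you are responsible for reading the file with the correct encoding, which is what you need when the file is stored as, say, UTF-8.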
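To make the two load() overloads concrete, here is a minimal sketch. The class name, method names, and temporary files are made up for illustration; only Properties.load(InputStream), Properties.load(Reader), and the java.nio.file calls are real API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

class PropertiesLoadDemo {

    // Byte-stream variant: load(InputStream) decodes the bytes as ISO 8859-1
    // and resolves Unicode escapes by itself.
    static Properties loadFromStream(Path file) throws IOException {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            props.load(in);
        }
        return props;
    }

    // Character-stream variant: the caller has to name the charset.
    static Properties loadFromReader(Path file, Charset charset) throws IOException {
        Properties props = new Properties();
        try (Reader reader = Files.newBufferedReader(file, charset)) {
            props.load(reader);
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        // File 1: pure ASCII bytes, non-ASCII characters written as Unicode escapes.
        Path escaped = Files.createTempFile("escaped", ".properties");
        Files.write(escaped, "greeting=Gr\\u00fc\\u00dfe\n".getBytes(StandardCharsets.ISO_8859_1));

        // File 2: the same text stored directly as UTF-8.
        Path utf8 = Files.createTempFile("utf8", ".properties");
        Files.write(utf8, "greeting=Gr\u00fc\u00dfe\n".getBytes(StandardCharsets.UTF_8));

        // Both calls should yield the same property value.
        System.out.println(loadFromStream(escaped).getProperty("greeting"));
        System.out.println(loadFromReader(utf8, StandardCharsets.UTF_8).getProperty("greeting"));
    }
}
```

Passing the UTF-8 file to loadFromStream would not fail, but the non-ASCII characters would come out garbled, because each UTF-8 byte is decoded as a separate ISO 8859-1 character.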

The same applies to parsing XML files. The encoding is declared in the XML declaration at the top of the file (UTF-8 is the default if it is absent), so the best option is to pass the parser an InputStream and let it detect and handle the encoding.
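As a rough illustration of letting the parser deal with the encoding, the sketch below hands raw bytes to the JDK's built-in DOM parser. The class name, the rootText helper, and the sample document are made up for the example; the javax.xml.parsers and org.w3c.dom calls are standard JDK API.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class XmlEncodingDemo {

    // The parser receives bytes and picks the charset from the XML declaration.
    static String rootText(InputStream in) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(in);
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // A document that declares ISO-8859-1 and is actually encoded that way.
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><word>\u00fcber</word>";
        byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);

        // No charset is mentioned anywhere in the calling code.
        System.out.println(rootText(new ByteArrayInputStream(bytes)));
    }
}
```

If you wrapped the stream in a Reader with the wrong charset instead, the parser would receive already-decoded characters and could no longer correct the mistake.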

For JSON, the default encoding is UTF-8. Other encodings are possible, but unlike XML, JSON has no syntax for declaring the encoding inside the document, so the encoding has to be known or agreed on separately.

For other file types, it depends on the use case. If you have a UTF-8 encoded text file and want to copy it to another location as is, you can simply treat the file as a block of bytes. But if you have to extract some words from it, the bytes must be interpreted as characters, so you need a Reader (for example an InputStreamReader wrapped around the InputStream) and the right character encoding.
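The difference between the two cases could look roughly like this. It is only a sketch: the class name, method names, and sample text are invented, and it assumes the caller already knows the file's encoding.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

class BytesVsCharsDemo {

    // Copying as is: the content is never interpreted, so no charset is needed.
    static void copyAsBytes(Path source, Path target) throws IOException {
        Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
    }

    // Extracting words: the bytes must be decoded, so a Reader and a charset are required.
    static List<String> readWords(Path source, Charset charset) throws IOException {
        List<String> words = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(Files.newInputStream(source), charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                for (String word : line.trim().split("\\s+")) {
                    if (!word.isEmpty()) {
                        words.add(word);
                    }
                }
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        Path source = Files.createTempFile("sample", ".txt");
        Path target = Files.createTempFile("copy", ".txt");
        Files.write(source, "na\u00efve caf\u00e9\nd\u00e9j\u00e0 vu\n".getBytes(StandardCharsets.UTF_8));

        copyAsBytes(source, target);
        System.out.println(readWords(target, StandardCharsets.UTF_8));
    }
}
```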

Sometimes an API gives you the encoding: for example, when you call a REST service over HTTP, you can read it from the charset parameter of the Content-Type response header. Otherwise, you simply have to know the encoding.
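For the HTTP case, something along these lines could pull the charset out of a Content-Type header value. The helper and class names are hypothetical and the parsing is deliberately simplistic; a real HTTP client library may already expose this for you.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ContentTypeCharsetDemo {

    // Returns the charset named in a Content-Type value such as
    // "text/html; charset=ISO-8859-1", or the fallback if none is usable.
    static Charset charsetFromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        String key = "charset=";
        for (String part : contentType.split(";")) {
            String param = part.trim();
            // Parameter names are case-insensitive.
            if (param.regionMatches(true, 0, key, 0, key.length())) {
                String name = param.substring(key.length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name: use the fallback.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        System.out.println(charsetFromContentType("text/html; charset=ISO-8859-1", StandardCharsets.UTF_8));
        System.out.println(charsetFromContentType("application/json", StandardCharsets.UTF_8));
    }
}
```

The resulting Charset can then be passed to an InputStreamReader that wraps the response body stream.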