Can read Java Excel files

Parse Java - CSV or Excel file in object

Excel files are ubiquitous in the world of work. Above all, in my experience, Excel is primarily used by project managers and other employees who work more number-oriented. Although Excel and CSV files can differ greatly in their structure, these two file formats are quite similar in terms of their basic structure. Anyone who has ever used the "Save As" function in Excel or LibreOffice, for example, knows that the two file formats are only two clicks away from each other.

During my time as a programmer, it has often happened that I had to read and process CSV files. After all, these types of files have a very simple structure and are ideal for exchanging data between several systems. For example, for the nightly transmission of product data. The whole thing is actually quite simple. The following steps are necessary:

  1. Read in CSV file line by line (one data record each)
  2. Separate each line by its separator (e.g. "," or ";") (list of values ​​per data record)
  3. Write the list of values ​​in a line of the result list
  4. Back to point 1, until the CSV file has no more new lines

So far so good. However, this often results in very unstructured data, as shown in the following figure. At the end of the day, who knows what a value was in place 0-3? Was that the name or the number?



So it makes more sense to pack the corresponding lines in a new Java object. Thus we would have a list of objects, which in turn keep the values ​​structured in attributes. If I want to read out the name, I don't have to remember whether it was in position 2, 3 or 65. So great idea!

Unfortunately, the preparation of these lines is associated with a lot of effort, since each column of a line has to call the corresponding setter. However, there is a relatively simple solution that I would like to introduce here. - Have fun


Structure of a CSV file

Okay, I'll admit it was a bit abstract now. Perhaps we should first work through the structure of a CSV file and clarify the general terminology.
In general, a CSV file is structured as follows:

  • Each line describes a data record that is self-contained.
  • Each column is separated from one another with a certain separator and describes certain characteristics.
  • The number of the column determines which characteristic is described in the column.

For example, in a CSV file assumed as an example, the number of the product is always in place zero. The name of the product is always in place five of the CSV file. In place seven is the price of the product. So we could theoretically continue this list indefinitely.
If the CSV file has a header line, this is always located in the first line and describes in which column which characteristic value can be found. In line 0 and column 0, "ID of the product" (description of the characteristics) would appear. "76345" (characteristic value) is in line 1 and column 0.

In the following, our aim is to read in every data record and to write every characteristic value per data record in its corresponding attribute.

I wanted to demonstrate the whole thing again in a small diagram: P



Read in CSV file line by line

A CSV file with a header is to be read in (the first line of the file describes what the individual columns contain).
The separator is a comma.

A free Apache library is used:


Read in and iterate over the lines:


As you can see, I use a class with the name "to read in a CSV or Excel (that works too!")CSV format"which in turn with the command"parse (in)" the file "liste.csv"reads.

So far we have been able to read in a CSV file and iterate over each line. What is missing, however, is the insertion of the corresponding values ​​per column in the attributes of a Java object. In addition, we have not yet created a Java object and would still have to think about how we can map this data into the respective attributes without much additional effort. However, before we can think about how we fill the attributes and read out the corresponding columns which correspond to the attributes, we should think about how it could work in general. I would like to provide a short example of this and then demonstrate how it could be much faster. In the first instance we will use the example from the picture above. This picture demonstrates that we could read in a CSV file which consists of the following features per column: number of the product, name of the product, price of the product, category of the product
It would of course be conceivable that we not only have four different characteristics and thus attribute values ​​per line, but possibly 100 characteristics and thus 100 characteristic values.

First of all, there is the code of the "Product" class:


Then we look at the cumbersome example, since each attribute has to be filled in individually:


And now we come to the real main problem. If we should find a class or CSV file that contains not just four attributes but maybe 10.15 or 75, we have to fill in each attribute individually via its setter.
Alternatively, we can use the following method to automatically fill all attributes of the class with the contents of the line:


However, there is a small downer. The CSV file always needs a header line and all attributes to be filled must be set to Public within the class to be generated. If these requirements are met, we can fill several hundred attributes at the same time with just a few lines of code, without having to program them individually by hand.


Marvin

I am a person who, besides programming, is interested in a thousand other things that are sometimes more and sometimes less crazy. Above all, however, I am enthusiastic about programming my own little apps and programs that enrich my life.