Java HTML Encoding (HTML Entities)

Some XSS attacks can be prevented by using HTML Encoding.

An HTML encoding function is built into many languages. In .NET, WebUtility.HtmlEncode can do it; in PHP we can use the htmlentities function; in Python, html.escape can be used (cgi.escape in older versions, which has since been removed).

But Java has no built-in function for HTML encoding (HTML entities).

We can use the Apache Commons Lang library to do this work.
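To make clear what HTML encoding actually does to a string, here is a minimal hand-rolled sketch that uses only the JDK. It is not the Commons Lang implementation; the class name HtmlEscapeSketch and the method escapeHtml are made up for illustration, and it only handles the five characters that matter most in HTML. With Commons Lang 3 on the classpath, the equivalent call would be along the lines of StringEscapeUtils.escapeHtml4(input), assuming that version of the library is the one in use.

```java
// Hand-rolled illustration only; NOT the Apache Commons Lang implementation.
class HtmlEscapeSketch {

    // Replaces the characters that have special meaning in HTML with entities.
    static String escapeHtml(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeHtml("<>")); // prints &lt;&gt;
    }
}
```

In real projects a maintained library is usually the safer choice, since it also covers the cases this sketch leaves out.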

 

The above code will output the following HTML-encoded result:

&lt;&gt;

Note that not all XSS attacks can be prevented by HTML encoding (https://stackoverflow.com/questions/53728/will-html-encoding-prevent-all-kinds-of-xss-attacks).

Apache POI Sheet.getPhysicalNumberOfRows()

In the past I used the following code to display the string in the first cell of every row.

But the above code, which relied on getPhysicalNumberOfRows() as the loop bound, is incorrect if the sheet contains an empty row.

The following explanation is from the official POI documentation:

Sheet.getPhysicalNumberOfRows()

Returns the number of physically defined rows (NOT the number of rows in the sheet)

This method ignores empty rows. For example, if the last row of a sheet is row 7 but the second row is empty, getPhysicalNumberOfRows() returns 6.

Solution

To cover every row of a sheet (whether a row is empty or not), we should use the getLastRowNum() method instead.

So the above code can be changed to loop up to getLastRowNum(). Because getLastRowNum() returns a 0-based row index, we use i <= sheet.getLastRowNum() as the loop condition.
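The difference between the two bounds can be sketched without POI on the classpath. The class below is a hypothetical stand-in for a sheet, modelled as a sparse map from row index to first-cell text; SparseSheetSketch and all of its methods are invented for this illustration and only mimic the behaviour described above. In real POI code, sheet.getRow(i) returns null for a row that was never defined, so the loop needs the same null check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical model of a sheet: row index -> text of the first cell.
class SparseSheetSketch {
    private final TreeMap<Integer, String> rows = new TreeMap<>();

    void putRow(int index, String firstCell) { rows.put(index, firstCell); }

    // Mimics getPhysicalNumberOfRows(): counts only rows that are defined.
    int physicalNumberOfRows() { return rows.size(); }

    // Mimics getLastRowNum(): 0-based index of the last defined row
    // (-1 for an empty model; that choice is an assumption of this sketch).
    int lastRowNum() { return rows.isEmpty() ? -1 : rows.lastKey(); }

    // Mimics getRow(i): null when the row was never defined.
    String row(int index) { return rows.get(index); }

    // Collects first-cell strings for indexes 0..lastIndexInclusive, skipping empty rows.
    static List<String> firstCells(SparseSheetSketch sheet, int lastIndexInclusive) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i <= lastIndexInclusive; i++) {
            String cell = sheet.row(i);
            if (cell != null) {
                result.add(cell);
            }
        }
        return result;
    }

    // Seven rows (indexes 0..6) with the second row (index 1) left empty.
    static SparseSheetSketch sample() {
        SparseSheetSketch sheet = new SparseSheetSketch();
        for (int i = 0; i <= 6; i++) {
            if (i != 1) {
                sheet.putRow(i, "r" + i);
            }
        }
        return sheet;
    }

    public static void main(String[] args) {
        SparseSheetSketch sheet = sample();
        // Bounded by the physical row count: the last row is never reached.
        System.out.println(firstCells(sheet, sheet.physicalNumberOfRows() - 1)); // [r0, r2, r3, r4, r5]
        // Bounded by the last row index: every defined row is visited.
        System.out.println(firstCells(sheet, sheet.lastRowNum())); // [r0, r2, r3, r4, r5, r6]
    }
}
```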

Dom4j Writing File Not Working

Dom4j is an XML processing library that supports XPath, DOM, SAX, and JAXP. In this post we will look at a problem where dom4j does not write a document to a file.

The following XML is the sample data that we will save to a file using dom4j.

To generate the above XML and write it to a file, the following code is used.

The above code may generate an empty file with no content.

This is because the content is held in a buffer and is not actually written to the file when the write method is called. To write it to the file, we must call the flush method or the close method.

So the above code should be changed to call flush() after write(), or to call close() once writing is finished.
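The buffering effect itself can be reproduced with the JDK alone. The sketch below uses a plain BufferedWriter rather than dom4j's writer, on the assumption that the two behave alike in this respect; FlushSketch, sizesAroundFlush and the "<root/>" content are placeholders for illustration, not the sample data from this post. It measures the file size right after write and again after flush.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// JDK-only illustration of buffered output; not dom4j code.
class FlushSketch {

    // Returns { file size after write(), file size after flush() } in bytes.
    static long[] sizesAroundFlush(String content) {
        try {
            Path file = Files.createTempFile("flush-sketch", ".xml");
            try (BufferedWriter writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
                writer.write(content);          // content sits in the writer's buffer
                long before = Files.size(file); // nothing has reached the file yet
                writer.flush();                 // pushes the buffer to the file
                long after = Files.size(file);
                return new long[] { before, after };
            } finally {
                // The writer is already closed here, so the temp file can be removed.
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] sizes = sizesAroundFlush("<root/>");
        System.out.println("before flush: " + sizes[0]); // before flush: 0
        System.out.println("after flush: " + sizes[1]);  // after flush: 7
    }
}
```

Using try-with-resources, as above, closes the writer automatically, which also flushes the buffer even when flush() is never called explicitly.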