How to Convert org.w3c.dom.Document to String in Java

1. Overview

When handling XML in Java, we’ll often have an instance of org.w3c.dom.Document that we need to convert to a String. We might want to do this for a number of reasons, such as serialization, logging, and working with HTTP requests or responses.

In this quick tutorial, we’ll see how to convert a Document to a String. To learn more about working with XML in Java, check out our comprehensive series on XML.

2. Creating a Simple Document

Throughout this tutorial, the focus of our examples will be a simple XML document describing some fruit:

<fruit>
    <name>Apple</name>
    <color>Red</color>
    <weight unit="grams">150</weight>
    <sweetness>7</sweetness>
</fruit>

Let’s go ahead and create an XML Document object from that string:

// Public so that the unit test can reuse the same payload
public static final String FRUIT_XML = "<fruit><name>Apple</name><color>Red</color><weight unit=\"grams\">150</weight><sweetness>7</sweetness></fruit>";

public static Document getDocument() throws SAXException, IOException, ParserConfigurationException {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    // Wrap the string in a reader so that the builder can parse it as character data
    Document document = factory.newDocumentBuilder()
      .parse(new InputSource(new StringReader(FRUIT_XML)));
    return document;
}

As we can see, we create a factory, use it to obtain a DocumentBuilder, and then call the builder’s parse method with the given input source. In this case, our input source wraps a StringReader over our fruit XML string.
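To confirm that the parsed Document really holds our fruit data, we can read a few values back from it. The following is a minimal sketch rather than part of the article’s code: the class name FruitDocumentInspector and the helpers textOf and attributeOf are hypothetical names introduced for illustration, and only standard DOM calls are used.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class FruitDocumentInspector {

    static final String FRUIT_XML = "<fruit><name>Apple</name><color>Red</color>"
      + "<weight unit=\"grams\">150</weight><sweetness>7</sweetness></fruit>";

    // Parses an XML string into a DOM Document, as in getDocument() above
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        return factory.newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the text content of the first element with the given tag name
    static String textOf(Document document, String tagName) {
        return document.getElementsByTagName(tagName).item(0).getTextContent();
    }

    // Returns an attribute value of the first element with the given tag name
    static String attributeOf(Document document, String tagName, String attribute) {
        Element element = (Element) document.getElementsByTagName(tagName).item(0);
        return element.getAttribute(attribute);
    }

    public static void main(String[] args) throws Exception {
        Document document = parse(FRUIT_XML);
        System.out.println(document.getDocumentElement().getTagName());
        System.out.println(textOf(document, "name"));
        System.out.println(attributeOf(document, "weight", "unit"));
    }
}
```

Running main should print the root tag name, the fruit’s name, and the unit of the weight element, one per line.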

3. Conversion Using XML Transformation APIs

The javax.xml.transform package contains a set of generic APIs for performing transformations from a source to a result. In our case, the source is the XML document and the result is the output string:

public static String toString(Document document) throws TransformerException {
    TransformerFactory transformerFactory = TransformerFactory.newInstance();
    Transformer transformer = transformerFactory.newTransformer();
    StringWriter stringWriter = new StringWriter();
    transformer.transform(new DOMSource(document), new StreamResult(stringWriter));
    return stringWriter.toString();
}

Let’s walk through the key parts of our toString method:

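First, we start by creating our TransformerFactory, which resolves to the platform’s default implementation. We’ll use this factory to create the transformer. Since we call newTransformer without a stylesheet, we get an identity transformer that simply copies the source to the result.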

Now, we can specify the source and result of the transformation. Here, we’ll use our Document to construct a DOM source and a StringWriter to hold the result.

Finally, we call toString on our StringWriter object, which returns the character stream’s current value as a string.
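The steps above can be put together into a small program that we can run end to end. This is a sketch under a few assumptions: the class name DocumentToStringDemo and the method names parse and documentToString are hypothetical, and the exact XML declaration in the output depends on the transformer implementation in use.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class DocumentToStringDemo {

    static final String FRUIT_XML = "<fruit><name>Apple</name><color>Red</color>"
      + "<weight unit=\"grams\">150</weight><sweetness>7</sweetness></fruit>";

    // Builds a DOM Document from an XML string
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
    }

    // Copies the DOM tree into a StringWriter using an identity transformer
    static String documentToString(Document document) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        StringWriter stringWriter = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(stringWriter));
        return stringWriter.toString();
    }

    public static void main(String[] args) throws Exception {
        // Prints the XML declaration followed by the fruit markup
        System.out.println(documentToString(parse(FRUIT_XML)));
    }
}
```

Because the StringWriter collects characters rather than bytes, the resulting String is independent of the encoding named in the declaration.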

4. Unit Testing

Now that we have a simple way to convert XML documents to strings, let’s go ahead and test that it works properly:

@Test
public void givenXMLDocument_thenConvertToStringSuccessfully() throws Exception {
    Document document = XmlDocumentToString.getDocument();
    String expectedDeclaration = "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>";
    assertEquals(expectedDeclaration + XmlDocumentToString.FRUIT_XML, XmlDocumentToString.toString(document));
}

Note that our conversion adds the standard XML declaration to the start of the string by default. In our test, we simply check that the converted string matches the original fruit XML, including the standard declaration.
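Comparing against a hard-coded declaration ties the test to one transformer’s exact output. As an alternative sketch, we could parse the converted string back into a Document and compare the two DOM trees with the standard Node.isEqualNode method. The class RoundTripCheck and its method names below are hypothetical, and the example uses plain Java rather than JUnit so that it stays self-contained.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class RoundTripCheck {

    static final String FRUIT_XML = "<fruit><name>Apple</name><color>Red</color>"
      + "<weight unit=\"grams\">150</weight><sweetness>7</sweetness></fruit>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
    }

    static String serialize(Document document) throws Exception {
        StringWriter stringWriter = new StringWriter();
        TransformerFactory.newInstance().newTransformer()
          .transform(new DOMSource(document), new StreamResult(stringWriter));
        return stringWriter.toString();
    }

    // True if converting to a String and parsing again yields an equal root element
    static boolean survivesRoundTrip(String xml) throws Exception {
        Document original = parse(xml);
        Document reparsed = parse(serialize(original));
        return original.getDocumentElement().isEqualNode(reparsed.getDocumentElement());
    }

    public static void main(String[] args) throws Exception {
        System.out.println(survivesRoundTrip(FRUIT_XML));
    }
}
```

This check ignores the declaration entirely and only verifies that element names, attributes, and text content are preserved.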

5. Customizing the Output

Now, let’s take a look at our output. By default, our transformer doesn’t apply any kind of output formatting:

<?xml version="1.0" encoding="UTF-8" standalone="no"?><fruit><name>Apple</name><color>Red</color><weight unit="grams">150</weight><sweetness>7</sweetness></fruit>

It doesn’t take long for our XML documents to become difficult to read using this one-line formatting, especially for large documents. Fortunately, the Transformer class provides a variety of output properties to help us.

Let’s refactor our transformation code a little bit using some of these output properties:

public static String toStringWithOptions(Document document) throws TransformerException {
    Transformer transformer = getTransformer();
    transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
    transformer.setOutputProperty(OutputKeys.INDENT, "yes");
    transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
    StringWriter stringWriter = new StringWriter();
    transformer.transform(new DOMSource(document), new StreamResult(stringWriter));
    return stringWriter.toString();
}

private static Transformer getTransformer() throws TransformerConfigurationException {
    TransformerFactory transformerFactory = TransformerFactory.newInstance();
    return transformerFactory.newTransformer();
}

Sometimes, we might want to exclude the XML declaration. We can configure our transformer to do this by setting the OutputKeys.OMIT_XML_DECLARATION property.

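Now, to apply some indentation, we can use two properties: OutputKeys.INDENT, which switches indentation on, and indent-amount, which specifies the number of spaces per level. Note that indent-amount is not one of the standard OutputKeys. It’s an implementation-specific property understood by Xalan-based transformers such as the one bundled with the JDK. We set it explicitly because, depending on the implementation, the default indentation may be zero spaces.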

With the above properties set, we get a much nicer-looking output:

<fruit>
    <name>Apple</name>
    <color>Red</color>
    <weight unit="grams">150</weight>
    <sweetness>7</sweetness>
</fruit>
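If we need different formatting in different places, the same properties can be driven by parameters. The sketch below assumes the JDK’s built-in transformer, since another implementation may not recognize the indent-amount key. The class name PrettyPrinter and the method format are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class PrettyPrinter {

    static final String FRUIT_XML = "<fruit><name>Apple</name><color>Red</color>"
      + "<weight unit=\"grams\">150</weight><sweetness>7</sweetness></fruit>";

    // Implementation-specific (Xalan) key, the same literal used in the article
    private static final String INDENT_AMOUNT = "{http://xml.apache.org/xslt}indent-amount";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
    }

    // Serializes the document with a configurable indent width and optional declaration
    static String format(Document document, int indent, boolean omitDeclaration) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, omitDeclaration ? "yes" : "no");
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(INDENT_AMOUNT, String.valueOf(indent));
        StringWriter stringWriter = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(stringWriter));
        return stringWriter.toString();
    }

    public static void main(String[] args) throws Exception {
        // Two-space indentation without the XML declaration
        System.out.println(format(parse(FRUIT_XML), 2, true));
    }
}
```

Keeping the property key in a constant avoids repeating the long literal and makes it obvious which part of the configuration depends on the transformer implementation.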

6. Conclusion

In this short article, we learned how to create an XML Document from a Java String object, and then we saw how to convert this Document back into a String using the javax.xml.transform package.

In addition to this, we also saw several ways we can customize the output of the XML, which can be useful when logging the XML to the console.

As always, the full source code of the article is available over on GitHub.

       
