Get a node's inner XML as String in Java DOM

I have an XML org.w3c.dom.Node that looks like this:


<variable name="variableName">
<br /><strong>foo</strong> bar
</variable>



How do I get the <br /><strong>foo</strong> bar part as a String?



8 Answers



There is no simple method on org.w3c.dom.Node for this. getTextContent() returns the text of all descendant nodes concatenated together, with the markup dropped. getNodeValue() gives you the text of the current node only if it is an Attribute, CDATA, or Text node. So you would need to serialize the node yourself, using a combination of getChildNodes(), getNodeName() and getNodeValue() to build the string.
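A minimal sketch of that manual approach, for illustration only. The class name ManualInnerXml and the helpers parseRoot and escape are made up for this example. It handles elements, attributes, text and CDATA, and silently skips comments and processing instructions, so treat it as a starting point rather than a complete serializer.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ManualInnerXml {

    // Serializes only the children of the given node, not the node itself.
    public static String innerXml(Node node) {
        StringBuilder sb = new StringBuilder();
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            write(children.item(i), sb);
        }
        return sb.toString();
    }

    // Writes one node (and, for elements, its subtree) to the builder.
    private static void write(Node node, StringBuilder sb) {
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE:
                sb.append('<').append(node.getNodeName());
                NamedNodeMap attrs = node.getAttributes();
                for (int i = 0; i < attrs.getLength(); i++) {
                    Node attr = attrs.item(i);
                    sb.append(' ').append(attr.getNodeName()).append("=\"")
                      .append(escape(attr.getNodeValue(), true)).append('"');
                }
                if (!node.hasChildNodes()) {
                    sb.append("/>");
                } else {
                    sb.append('>').append(innerXml(node));
                    sb.append("</").append(node.getNodeName()).append('>');
                }
                break;
            case Node.TEXT_NODE:
                sb.append(escape(node.getNodeValue(), false));
                break;
            case Node.CDATA_SECTION_NODE:
                sb.append("<![CDATA[").append(node.getNodeValue()).append("]]>");
                break;
            default:
                // Assumption of this sketch: comments, processing
                // instructions and other node types are skipped.
                break;
        }
    }

    // Escapes the characters that must not appear literally in XML text.
    private static String escape(String s, boolean inAttribute) {
        String out = s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
        return inAttribute ? out.replace("\"", "&quot;") : out;
    }

    // Hypothetical helper: parses a string and returns its root element.
    public static Node parseRoot(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Node root = parseRoot(
                "<variable name=\"variableName\"><br /><strong>foo</strong> bar</variable>");
        System.out.println(innerXml(root)); // prints <br/><strong>foo</strong> bar
    }
}
```

Note that the empty element comes back as <br/> rather than <br />, because the DOM does not keep the original spelling of the tag.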



You can also do it with one of the various XML serialization libraries that exist. There is XStream or even JAXB. This is discussed here: XML serialization in Java?



Same problem. To solve it I wrote this helper function:


public String innerXml(Node node) {
    DOMImplementationLS lsImpl = (DOMImplementationLS) node.getOwnerDocument().getImplementation().getFeature("LS", "3.0");
    LSSerializer lsSerializer = lsImpl.createLSSerializer();
    NodeList childNodes = node.getChildNodes();
    StringBuilder sb = new StringBuilder();
    // Serialize each child in turn; the node itself is left out
    for (int i = 0; i < childNodes.getLength(); i++) {
        sb.append(lsSerializer.writeToString(childNodes.item(i)));
    }
    return sb.toString();
}





Thanks, exactly what I needed.
– yossi
May 25 '11 at 13:33





This method keeps adding the XML definition tag at the front of the string... is there any way to prevent that, besides simply trimming it off afterwards?
– Nyerguds
Aug 8 '11 at 9:58





I solved it. The solution to this is to add the line lsSerializer.getDomConfig().setParameter("xml-declaration", false);
– Nyerguds
Aug 8 '11 at 10:27
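
Putting the helper and the xml-declaration fix from the comment together, a self-contained sketch might look like this. The class name LsInnerXml and the parseRoot helper are made up for the example, and it assumes the DOM implementation supports the "LS" 3.0 feature, as the parser bundled with the JDK does.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;

public class LsInnerXml {

    // Serializes the children of the node with the DOM Load and Save API.
    public static String innerXml(Node node) {
        DOMImplementationLS lsImpl = (DOMImplementationLS) node.getOwnerDocument()
                .getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = lsImpl.createLSSerializer();
        // Keep the XML declaration out of every serialized child.
        serializer.getDomConfig().setParameter("xml-declaration", false);
        NodeList children = node.getChildNodes();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < children.getLength(); i++) {
            sb.append(serializer.writeToString(children.item(i)));
        }
        return sb.toString();
    }

    // Hypothetical helper: parses a string and returns its root element.
    public static Node parseRoot(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Node root = parseRoot(
                "<variable name=\"variableName\"><br /><strong>foo</strong> bar</variable>");
        System.out.println(innerXml(root));
    }
}
```

The exact spelling of empty elements and the handling of whitespace are up to the serializer, so the output may not match the source text character for character.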




If you're using jOOX, you can wrap your node with its jQuery-like API and just call toString() on it:



$(node).toString();



It uses an identity-transformer internally, like this:


ByteArrayOutputStream out = new ByteArrayOutputStream();
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
Source source = new DOMSource(element);
Result target = new StreamResult(out);
transformer.transform(source, target);
return out.toString();



Extending Andrey M's answer, I had to modify the code slightly to get the complete DOM document. If you just use


NodeList childNodes = node.getChildNodes();



the root element is not included. To include the root element (and get the complete .xml document) I used:


public String innerXml(Node node) {
    DOMImplementationLS lsImpl = (DOMImplementationLS) node.getOwnerDocument().getImplementation().getFeature("LS", "3.0");
    LSSerializer lsSerializer = lsImpl.createLSSerializer();
    // Suppress the XML declaration in the output
    lsSerializer.getDomConfig().setParameter("xml-declaration", false);
    StringBuilder sb = new StringBuilder();
    // Serialize the node itself, so the root element is included
    sb.append(lsSerializer.writeToString(node));
    return sb.toString();
}



If you don't want to resort to external libraries, the following solution might come in handy. If you have a parent node (the variable parent in the code below) and you want to extract its children, proceed as follows:


StringBuilder resultBuilder = new StringBuilder();
// Get all children of the given parent node
NodeList children = parent.getChildNodes();
try {
    // Set up the output transformer
    TransformerFactory transfac = TransformerFactory.newInstance();
    Transformer trans = transfac.newTransformer();
    trans.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
    trans.setOutputProperty(OutputKeys.INDENT, "yes");
    StringWriter stringWriter = new StringWriter();
    StreamResult streamResult = new StreamResult(stringWriter);

    for (int index = 0; index < children.getLength(); index++) {
        Node child = children.item(index);

        // Print the DOM node; the output accumulates in stringWriter
        DOMSource source = new DOMSource(child);
        trans.transform(source, streamResult);
    }
    // Append the accumulated output once, after the loop, so that
    // earlier children are not added to the result repeatedly
    resultBuilder.append(stringWriter.toString());
} catch (TransformerException e) {
    // Error handling goes here
}
return resultBuilder.toString();
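
To try the Transformer approach end to end on the XML from the question, a small harness along these lines could be used. The class name TransformerInnerXml and the parseRoot helper are made up for the example. This sketch leaves OutputKeys.INDENT off, on the assumption that you want the whitespace of mixed content left alone.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class TransformerInnerXml {

    // Runs an identity transform on each child and collects the output.
    public static String innerXml(Node parent) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter writer = new StringWriter();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            // Every child is written to the same writer, in document order.
            transformer.transform(new DOMSource(children.item(i)), new StreamResult(writer));
        }
        return writer.toString();
    }

    // Hypothetical helper: parses a string and returns its root element.
    public static Node parseRoot(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Node root = parseRoot(
                "<variable name=\"variableName\"><br /><strong>foo</strong> bar</variable>");
        System.out.println(innerXml(root));
    }
}
```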



I had the problem with the last answer that method 'nodeToStream()' is undefined; therefore, my version here:


public static String toString(Node node) {
    String xmlString = "";
    try {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        //transformer.setOutputProperty(OutputKeys.INDENT, "yes");

        Source source = new DOMSource(node);

        StringWriter sw = new StringWriter();
        StreamResult result = new StreamResult(sw);

        transformer.transform(source, result);
        xmlString = sw.toString();
    } catch (Exception ex) {
        ex.printStackTrace();
    }

    return xmlString;
}



Building on Lukas Eder's solution, we can extract the inner XML, as in .NET, like this:


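public static String innerXml(Node node, String tag) {
    String xmlstring = toString(node);
    // Strip the opening tag at the start and the closing tag at the end
    xmlstring = xmlstring.replaceFirst("^<" + tag + "[^>]*>", "");
    xmlstring = xmlstring.replaceFirst("</" + tag + ">$", "");
    return xmlstring;
}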




public static String toString(Node node) {
    String xmlString = "";
    Transformer transformer;
    try {
        transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        //transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        StreamResult result = new StreamResult(new StringWriter());

        // nodeToStream(...) is a helper that this answer does not show
        xmlString = nodeToStream(node, transformer, result);

    } catch (TransformerConfigurationException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    } catch (TransformerFactoryConfigurationError e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    } catch (TransformerException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    } catch (Exception ex) {
        ex.printStackTrace();
    }

    return xmlString;
}



Example:


If the Node name holds XML whose string representation is "<Name><em>Chris</em>tian<em>Bale</em></Name>":
String innerXml = innerXml(name,"Name"); //returns "<em>Chris</em>tian<em>Bale</em>"
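
The tag-stripping step can also be tried on its own, on plain strings. The sketch below is one possible variant, not part of the answer above: the class OuterTagStripper and the method stripOuterTag are made-up names. It quotes the tag name, allows attributes on the opening tag, and returns an empty string for a self-closing element. It assumes the serialized attribute values contain no literal ">" character.

```java
import java.util.regex.Pattern;

public class OuterTagStripper {

    // Removes one leading <tag ...> and one trailing </tag> from the
    // string form of a single, already serialized element.
    public static String stripOuterTag(String xml, String tag) {
        String quoted = Pattern.quote(tag);
        String trimmed = xml.trim();
        // A self-closing element such as <Name/> has no inner XML.
        if (trimmed.matches("<" + quoted + "(\\s[^>]*)?/>")) {
            return "";
        }
        return trimmed
                .replaceFirst("^<" + quoted + "(\\s[^>]*)?>", "")
                .replaceFirst("</" + quoted + "\\s*>$", "");
    }

    public static void main(String[] args) {
        String xml = "<Name><em>Chris</em>tian<em>Bale</em></Name>";
        // prints <em>Chris</em>tian<em>Bale</em>
        System.out.println(stripOuterTag(xml, "Name"));
    }
}
```

Anchoring the two patterns to the start and end of the string means a nested element with the same name is left untouched.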



Here is an alternative solution to extract the content of an org.w3c.dom.Node.
This solution also works if the node content contains no XML tags:


private static String innerXml(Node node) throws TransformerFactoryConfigurationError, TransformerException {
    StringWriter writer = new StringWriter();
    String xml = null;
    Transformer transformer = TransformerFactory.newInstance().newTransformer();
    transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
    transformer.transform(new DOMSource(node), new StreamResult(writer));
    // now remove the outer tag: keep what lies between the first ">" and the last "</"
    xml = writer.toString();
    xml = xml.substring(xml.indexOf(">") + 1, xml.lastIndexOf("</"));
    return xml;
}





