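XML is a standard data interchange format in Java development, so knowing how to parse it well matters. Java offers three main approaches to XML parsing: DOM (Document Object Model), SAX (Simple API for XML), and StAX (Streaming API for XML). This article explains how each approach works, its strengths and weaknesses, and the common problems and pitfalls and how to avoid them, with code examples along the way.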
DOM loads the entire XML document into memory as a tree structure, which allows random access to any part of the document.
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.NodeList;
    import org.xml.sax.InputSource;
    import java.io.StringReader;

    public class DomExample {
        public static void main(String[] args) throws Exception {
            String xml = "<root><item id='1'>Text1</item><item id='2'>Text2</item></root>";

            // Build the in-memory tree for the whole document.
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            // Look up every <item> element and print its text content.
            NodeList itemList = doc.getElementsByTagName("item");
            for (int i = 0; i < itemList.getLength(); i++) {
                System.out.println(itemList.item(i).getTextContent());
            }
        }
    }
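Because the whole tree sits in memory, it can also be modified in place and written back out. The following is a minimal sketch of that idea, not a definitive implementation: the class name `DomModifyExample` and the helper `addItem` are hypothetical names introduced for illustration, and serialization is done with the standard `javax.xml.transform` API.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical example class: appends an <item> to the tree and serializes it.
class DomModifyExample {

    // addItem is an illustrative helper, not part of any library.
    static String addItem(String xml, String id, String text) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // Create a new element and attach it under the root element.
        Element item = doc.createElement("item");
        item.setAttribute("id", id);
        item.setTextContent(text);
        doc.getDocumentElement().appendChild(item);

        // Serialize the modified tree back to a string, without the XML declaration.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><item id='1'>Text1</item><item id='2'>Text2</item></root>";
        System.out.println(addItem(xml, "3", "Text3"));
    }
}
```

This kind of edit-and-save workflow is where DOM is most convenient; the price is that memory use grows with document size.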
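SAX uses an event-driven model: the parser reads the XML sequentially from start to finish and fires the corresponding event whenever it encounters a start tag, an end tag, character data, and so on.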
    import org.xml.sax.Attributes;
    import org.xml.sax.InputSource;
    import org.xml.sax.SAXException;
    import org.xml.sax.helpers.DefaultHandler;
    import javax.xml.parsers.SAXParser;
    import javax.xml.parsers.SAXParserFactory;
    import java.io.StringReader;

    public class SaxExample {
        public static void main(String[] args) throws Exception {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser saxParser = factory.newSAXParser();

            DefaultHandler handler = new DefaultHandler() {
                // Called for every start tag. XML names are case-sensitive, so compare exactly.
                @Override
                public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
                    if (qName.equals("item")) {
                        System.out.println("Item ID: " + attributes.getValue("id"));
                    }
                }

                // Called for character data; a parser may deliver one text node in several chunks.
                @Override
                public void characters(char[] ch, int start, int length) throws SAXException {
                    System.out.println(new String(ch, start, length));
                }
            };

            saxParser.parse(new InputSource(new StringReader("<root><item id='1'>Text1</item><item id='2'>Text2</item></root>")), handler);
        }
    }
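One common pitfall with SAX is assuming that `characters()` is called exactly once per text node. A parser is allowed to split the text into several calls (an entity reference such as `&amp;` is a typical trigger), so the safer pattern is to buffer the text and read it in `endElement()`. The sketch below shows one way to do that; the class name `SaxTextExample` and the helper `itemTexts` are hypothetical names used only for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: collects the full text of every <item> element.
class SaxTextExample {

    // itemTexts is an illustrative helper, not part of any library.
    static List<String> itemTexts(String xml) throws Exception {
        List<String> texts = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder buffer = new StringBuilder();
            private boolean insideItem = false;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                if ("item".equals(qName)) {
                    insideItem = true;
                    buffer.setLength(0); // start a fresh buffer for this element
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times for one text node, so append instead of overwrite.
                if (insideItem) {
                    buffer.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("item".equals(qName)) {
                    texts.add(buffer.toString()); // the text is complete only at the end tag
                    insideItem = false;
                }
            }
        };

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return texts;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><item id='1'>A &amp; B</item><item id='2'>Text2</item></root>";
        System.out.println(itemTexts(xml));
    }
}
```

The handler has to keep its own state (`insideItem`, `buffer`) because SAX only pushes events and remembers nothing about where it is in the document.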
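StAX is also an event-based streaming parser, but it works in "pull mode": instead of receiving callbacks, the application asks the parser for the next event, so the programmer controls the parsing flow.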
    import javax.xml.stream.XMLInputFactory;
    import javax.xml.stream.XMLStreamConstants;
    import javax.xml.stream.XMLStreamReader;
    import java.io.StringReader;

    public class StaxExample {
        public static void main(String[] args) throws Exception {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            XMLStreamReader reader = factory.createXMLStreamReader(
                new StringReader("<root><item id='1'>Text1</item><item id='2'>Text2</item></root>")
            );

            // Pull events one at a time; the loop decides when to advance.
            while (reader.hasNext()) {
                int event = reader.next();
                switch (event) {
                    case XMLStreamConstants.START_ELEMENT:
                        if ("item".equals(reader.getLocalName())) {
                            System.out.println("Item ID: " + reader.getAttributeValue(null, "id"));
                        }
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        System.out.println(reader.getText());
                        break;
                }
            }
            reader.close();
        }
    }
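Because the application drives the loop in pull mode, it can simply stop reading once it has what it needs. The sketch below illustrates this by returning the text of the first matching `<item>` and skipping the rest of the input; the class name `StaxFindExample` and the helper `findItemText` are hypothetical names introduced for this example.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class: stops parsing as soon as the wanted element is found.
class StaxFindExample {

    // findItemText is an illustrative helper; it returns null when no <item> has the given id.
    static String findItemText(String xml, String id) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "item".equals(reader.getLocalName())
                        && id.equals(reader.getAttributeValue(null, "id"))) {
                    // getElementText() reads the content of a text-only element up to its end tag.
                    return reader.getElementText();
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><item id='1'>Text1</item><item id='2'>Text2</item></root>";
        System.out.println(findItemText(xml, "2"));
    }
}
```

Doing the same with SAX usually means throwing an exception from the handler to abort the parse, which is one reason StAX is often described as giving more control.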
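DOM, SAX, and StAX each have their strengths, and the right choice depends on the task at hand. DOM suits small files or work that modifies the document frequently; SAX and StAX are better suited to large files, with StAX giving the programmer more control. Understanding how they work and where each one fits will help you process XML data more effectively.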