I'm using version 2.3.0 of the XML parser, and its behavior seems to differ between Scala 2 and Scala 3. In particular, CDATA constructs in the XML do not appear to be converted into PCData objects in Scala 3, as they were in Scala 2. I wasn't able to find any release notes that address this. Is there a way to parse so that the fact that characters were enclosed in CDATA is retained, so that the rendered output can look identical to the original?

Here's the essence of the logic that I've been using successfully with Scala 2:
def extract(node: Node): Try[CharSequence] =
  node match {
    case x: xml.Text =>
      Success(x.data)
    case CDATA(x) =>
      Success(x)
    case _ =>
      ... default behavior ...
  }

// CharSequence members and the one-argument form CDATA(x) used below are omitted here
case class CDATA(content: String, pre: String, post: String) extends CharSequence

object CDATA {
  // Matches a node whose children are a single CDATA section, optionally with one node on each side
  def unapply(node: Node): Option[CDATA] = {
    node.child match {
      case Seq(pre: Node, PCData(x), post: Node) =>
        Some(CDATA(x, pre.text, post.text))
      case Seq(PCData(x)) =>
        Some(CDATA(x))
      case _ =>
        None
    }
  }
}
The purpose of the CDATA class/object is to ensure that, when rendering to XML, the output appears identical to the original. When compiled with Scala 3, the code never matches on PCData, so the result of the unapply method is always None.
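To make the round-trip intent concrete, here is a minimal, self-contained sketch of such a wrapper. It is not the actual class: the name CDataSketch, the default arguments, the toXML method, and the choice to delegate the CharSequence members to content are all assumptions made for illustration.

```scala
// Hypothetical stand-in for the CDATA wrapper described above.
// CDataSketch and toXML are illustrative names, not part of any library.
final case class CDataSketch(content: String, pre: String = "", post: String = "")
    extends CharSequence {

  // Assumption: the CharSequence view is simply the wrapped content
  def length(): Int = content.length
  def charAt(index: Int): Char = content.charAt(index)
  def subSequence(start: Int, end: Int): CharSequence =
    content.subSequence(start, end)

  // Re-wrap the content in CDATA delimiters and restore the surrounding text
  def toXML: String = s"$pre<![CDATA[$content]]>$post"
}

object CDataSketchDemo {
  def main(args: Array[String]): Unit = {
    val cdata = CDataSketch("<b>$[name]</b>", pre = "\n", post = "\n")
    println(cdata.toXML)
  }
}
```

Keeping pre and post alongside content is what would let a renderer reproduce the whitespace around the CDATA section, not just the section itself.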
Here is an example of a unit test that works fine in Scala 2:
it should "parse BalloonStyle" in {
  val xml: Elem = <xml>
    <Style id="noDrivingDirections">
      <BalloonStyle>
        <text>
          <![CDATA[
          <b>$[name]</b>
          <br /><br />
          $[description]
          ]]>
        </text>
      </BalloonStyle>
    </Style>
  </xml>
  extractAll[Seq[StyleSelector]](xml) match {
    case Success(ss) =>
      ss.size shouldBe 1
      val style: StyleSelector = ss.head
      println(style)
      style match {
        case Style(styles) =>
          styles.size shouldBe 1
          styles.head match {
            case b@BalloonStyle(text, _, _, _) =>
              val expectedText =
                """
                  | <b>$[name]</b>
                  | <br /><br />
                  | $[description]
                  | """.stripMargin
              text.$.asInstanceOf[CDATA].content shouldBe expectedText
              val wy = TryUsing(StateR())(sr => Renderer.render(b, FormatXML(), sr))
              wy.isSuccess shouldBe true
              val expectedBalloonStyle =
                """<BalloonStyle>
                  | <text>
                  |<![CDATA[
                  | <b>$[name]</b>
                  | <br /><br />
                  | $[description]
                  | ]]>
                  |</text>
                  |</BalloonStyle>""".stripMargin
              wy.get shouldBe expectedBalloonStyle
          }
        case _ => fail(s"wrong sort of StyleSelector: $style")
      }
    case Failure(x) => fail(x)
  }
}