[PATCH v7 0/3] MR10025: mshtml/msxml3: Add XMLSerializer, embedded XML declaration handling
This patchset adds XMLSerializer support and fixes several compatibility issues for the Adobe Creative Cloud installer and similar applications.

XMLSerializer implementation (mshtml)
- Implements the `IXMLSerializer` interface with the `serializeToString()` method
- Allows JavaScript to serialize DOM elements to XML strings

Embedded XML declaration handling (msxml3)
- Adds CDATA wrapping for embedded `<?xml?>` declarations inside elements
- Windows MSXML tolerates these nested declarations but libxml2 rejects them
- The preprocessing step wraps problematic content in CDATA sections before parsing

~~DISPATCH_METHOD|DISPATCH_PROPERTYGET fixes (jscript/mshtml)~~
This was already done in !10004.

Also added tests for XMLSerializer (empty elements, text nodes, special characters, nested elements, multiple children), embedded XML declaration handling (multiple declarations, deeply nested, encoding attributes, self-closing elements) and DISPATCH_METHOD|DISPATCH_PROPERTYGET behavior in both quirks and standards modes.

- Filip Bakreski

--
v7: mshtml/msxml3/tests: Add tests for XMLSerializer and embedded XML declarations.
    libs/xml2: Tolerate embedded XML declarations inside elements.
    mshtml: Add IXMLSerializer implementation.

https://gitlab.winehq.org/wine/wine/-/merge_requests/10025
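For readers unfamiliar with the CDATA wrapping that the cover letter mentions, here is a minimal sketch of the general technique in plain C. It is not code from this MR: `wrap_in_cdata` is a hypothetical helper name, and the only detail it relies on is the standard XML rule that a CDATA section cannot contain the sequence `]]>`, so any occurrence has to be split across two sections.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of msxml3 or libxml2): returns a newly
 * allocated copy of content wrapped in a CDATA section. Any "]]>" inside
 * content is split across two sections so the result stays well-formed.
 * The caller frees the result with free(). Returns NULL on allocation failure. */
char *wrap_in_cdata(const char *content)
{
    static const char open[]  = "<![CDATA[";
    static const char close[] = "]]>";
    static const char split[] = "]]]]><![CDATA[>"; /* ends one section, starts the next */
    size_t count = 0, len;
    const char *p;
    char *out, *dst;

    assert(content != NULL);

    /* Count the "]]>" occurrences that need splitting. */
    for (p = content; (p = strstr(p, close)) != NULL; p += 3)
        count++;

    len = strlen(open) + strlen(content) + count * (strlen(split) - 3)
        + strlen(close) + 1;
    out = malloc(len);
    if (!out)
        return NULL;

    dst = out;
    memcpy(dst, open, strlen(open));
    dst += strlen(open);

    for (p = content; *p; ) {
        if (!strncmp(p, close, 3)) {
            memcpy(dst, split, strlen(split));
            dst += strlen(split);
            p += 3;
        } else {
            *dst++ = *p++;
        }
    }

    memcpy(dst, close, strlen(close) + 1); /* copies the terminating NUL too */
    return out;
}
```

Calling `wrap_in_cdata("<?xml version=\"1.0\"?><a>1</a>")` would yield `<![CDATA[<?xml version="1.0"?><a>1</a>]]>`, which a strict parser accepts as character data. Note that, per the later discussion in this thread, the v7 libs/xml2 patch reaches a similar result inside the parser instead of preprocessing the input.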
From: Phiality <bakreski03@gmail.com> --- dlls/mshtml/mshtml_private.h | 5 +- dlls/mshtml/omnavigator.c | 150 +++++++++++++++++++++++++++++++++++ include/mshtmdid.h | 4 + include/mshtml.idl | 40 ++++++++++ 4 files changed, 198 insertions(+), 1 deletion(-) diff --git a/dlls/mshtml/mshtml_private.h b/dlls/mshtml/mshtml_private.h index ce872249e62..536d9ccbff7 100644 --- a/dlls/mshtml/mshtml_private.h +++ b/dlls/mshtml/mshtml_private.h @@ -160,6 +160,7 @@ struct constructor; XDIID(DispHTMLWindow2) \ XDIID(DispHTMLXMLHttpRequest) \ XDIID(DispXDomainRequest) \ + XDIID(DispXMLSerializer) \ XDIID(DispSVGCircleElement) \ XDIID(DispSVGSVGElement) \ XDIID(DispSVGTSpanElement) \ @@ -298,6 +299,7 @@ struct constructor; XIID(IHTMLXMLHttpRequestFactory) \ XIID(IHTMLXDomainRequest) \ XIID(IHTMLXDomainRequestFactory) \ + XIID(IXMLSerializer) \ XIID(IOmHistory) \ XIID(IOmNavigator) \ XIID(ISVGCircleElement) \ @@ -530,7 +532,8 @@ typedef struct { X(Window) \ X(XDomainRequest) \ X(XMLDocument) \ - X(XMLHttpRequest) + X(XMLHttpRequest) \ + X(XMLSerializer) typedef enum { OBJID_NONE, diff --git a/dlls/mshtml/omnavigator.c b/dlls/mshtml/omnavigator.c index fc4d2f5b408..6d26d13636d 100644 --- a/dlls/mshtml/omnavigator.c +++ b/dlls/mshtml/omnavigator.c @@ -451,6 +451,156 @@ static HRESULT init_dom_parser_ctor(struct constructor *constr) return S_OK; } +struct xml_serializer { + DispatchEx dispex; + IXMLSerializer IXMLSerializer_iface; +}; + +static inline struct xml_serializer *impl_from_IXMLSerializer(IXMLSerializer *iface) +{ + return CONTAINING_RECORD(iface, struct xml_serializer, IXMLSerializer_iface); +} + +DISPEX_IDISPATCH_IMPL(xml_serializer, IXMLSerializer, impl_from_IXMLSerializer(iface)->dispex) + +static HRESULT WINAPI xml_serializer_serializeToString(IXMLSerializer *iface, IHTMLDOMNode *node, BSTR *pString) +{ + struct xml_serializer *This = impl_from_IXMLSerializer(iface); + HTMLDOMNode *dom_node; + nsAString nsstr; + HRESULT hres; + + TRACE("(%p)->(%p %p)\n", This, node, 
pString); + + if(!node || !pString) + return E_INVALIDARG; + + *pString = NULL; + + dom_node = unsafe_impl_from_IHTMLDOMNode(node); + if(!dom_node) { + WARN("not an HTMLDOMNode\n"); + return E_INVALIDARG; + } + + nsAString_Init(&nsstr, NULL); + hres = nsnode_to_nsstring(dom_node->nsnode, &nsstr); + if(SUCCEEDED(hres)) { + const WCHAR *str; + nsAString_GetData(&nsstr, &str); + *pString = SysAllocString(str); + if(!*pString) + hres = E_OUTOFMEMORY; + } + nsAString_Finish(&nsstr); + + return hres; +} + +static const IXMLSerializerVtbl xml_serializer_vtbl = { + xml_serializer_QueryInterface, + xml_serializer_AddRef, + xml_serializer_Release, + xml_serializer_GetTypeInfoCount, + xml_serializer_GetTypeInfo, + xml_serializer_GetIDsOfNames, + xml_serializer_Invoke, + xml_serializer_serializeToString +}; + +static inline struct xml_serializer *xml_serializer_from_DispatchEx(DispatchEx *iface) +{ + return CONTAINING_RECORD(iface, struct xml_serializer, dispex); +} + +static void *xml_serializer_query_interface(DispatchEx *dispex, REFIID riid) +{ + struct xml_serializer *This = xml_serializer_from_DispatchEx(dispex); + + if(IsEqualGUID(&IID_IXMLSerializer, riid)) + return &This->IXMLSerializer_iface; + + return NULL; +} + +static void xml_serializer_destructor(DispatchEx *dispex) +{ + struct xml_serializer *This = xml_serializer_from_DispatchEx(dispex); + free(This); +} + +static HRESULT init_xml_serializer_ctor(struct constructor*); + +static const dispex_static_data_vtbl_t xml_serializer_dispex_vtbl = { + .query_interface = xml_serializer_query_interface, + .destructor = xml_serializer_destructor, +}; + +static const tid_t xml_serializer_iface_tids[] = { + IXMLSerializer_tid, + 0 +}; + +dispex_static_data_t XMLSerializer_dispex = { + .id = OBJID_XMLSerializer, + .init_constructor = &init_xml_serializer_ctor, + .vtbl = &xml_serializer_dispex_vtbl, + .disp_tid = DispXMLSerializer_tid, + .iface_tids = xml_serializer_iface_tids, +}; + +static HRESULT 
xml_serializer_ctor_value(DispatchEx *dispex, LCID lcid, WORD flags, DISPPARAMS *params, + VARIANT *res, EXCEPINFO *ei, IServiceProvider *caller) +{ + struct constructor *This = constructor_from_DispatchEx(dispex); + struct xml_serializer *ret; + + TRACE("\n"); + + switch(flags) { + case DISPATCH_METHOD|DISPATCH_PROPERTYGET: + if(!res) + return E_INVALIDARG; + /* fall through */ + case DISPATCH_METHOD: + case DISPATCH_CONSTRUCT: + break; + default: + FIXME("flags %x not supported\n", flags); + return E_NOTIMPL; + } + + if(!(ret = calloc(1, sizeof(*ret)))) + return E_OUTOFMEMORY; + + ret->IXMLSerializer_iface.lpVtbl = &xml_serializer_vtbl; + init_dispatch(&ret->dispex, &XMLSerializer_dispex, This->window, dispex_compat_mode(&This->dispex)); + + V_VT(res) = VT_DISPATCH; + V_DISPATCH(res) = (IDispatch*)&ret->IXMLSerializer_iface; + return S_OK; +} + +static const dispex_static_data_vtbl_t xml_serializer_ctor_dispex_vtbl = { + .destructor = constructor_destructor, + .traverse = constructor_traverse, + .unlink = constructor_unlink, + .value = xml_serializer_ctor_value, +}; + +static dispex_static_data_t xml_serializer_ctor_dispex = { + .name = "XMLSerializer", + .constructor_id = OBJID_XMLSerializer, + .vtbl = &xml_serializer_ctor_dispex_vtbl, +}; + +static HRESULT init_xml_serializer_ctor(struct constructor *constr) +{ + init_dispatch(&constr->dispex, &xml_serializer_ctor_dispex, constr->window, + dispex_compat_mode(&constr->window->event_target.dispex)); + return S_OK; +} + typedef struct { DispatchEx dispex; IHTMLScreen IHTMLScreen_iface; diff --git a/include/mshtmdid.h b/include/mshtmdid.h index 45b22f6349f..204e5e403f6 100644 --- a/include/mshtmdid.h +++ b/include/mshtmdid.h @@ -107,6 +107,7 @@ #define DISPID_XMLHTTPREQUEST DISPID_NORMAL_FIRST #define DISPID_XDOMAINREQUEST DISPID_NORMAL_FIRST #define DISPID_DOMPARSER DISPID_NORMAL_FIRST +#define DISPID_XMLSERIALIZER DISPID_NORMAL_FIRST #define DISPID_DOCUMENTCOMPATIBLEINFO_COLLECTION DISPID_NORMAL_FIRST #define 
DISPID_DOCUMENTCOMPATIBLEINFO DISPID_NORMAL_FIRST #define DISPID_XDOMAINREQUEST DISPID_NORMAL_FIRST @@ -4676,6 +4677,9 @@ /* IDOMParser */ #define DISPID_IDOMPARSER_PARSEFROMSTRING DISPID_DOMPARSER +/* IXMLSerializer */ +#define DISPID_IXMLSERIALIZER_SERIALIZETOSTRING DISPID_XMLSERIALIZER + /* IEventTarget */ #define DISPID_IEVENTTARGET_ADDEVENTLISTENER DISPID_HTMLOBJECT+10 #define DISPID_IEVENTTARGET_REMOVEEVENTLISTENER DISPID_HTMLOBJECT+11 diff --git a/include/mshtml.idl b/include/mshtml.idl index 1d2896f88b8..11f3fa5c59d 100644 --- a/include/mshtml.idl +++ b/include/mshtml.idl @@ -30286,6 +30286,46 @@ coclass DOMParser interface IDOMParser; } +/***************************************************************************** + * IXMLSerializer interface + */ +[ + object, + oleautomation, + dual, + uuid(30510783-98b5-11cf-bb82-00aa00bdce0b) +] +interface IXMLSerializer : IDispatch +{ + [id(DISPID_IXMLSERIALIZER_SERIALIZETOSTRING)] + HRESULT serializeToString([in] IHTMLDOMNode *node, [retval, out] BSTR *pString); +} + +/***************************************************************************** + * DispXMLSerializer dispinterface + */ +[ + hidden, + uuid(305900af-98b5-11cf-bb82-00aa00bdce0b) +] +dispinterface DispXMLSerializer +{ +properties: +methods: + [id(DISPID_IXMLSERIALIZER_SERIALIZETOSTRING)] + BSTR serializeToString([in] IHTMLDOMNode *node); +} + +[ + noncreatable, + uuid(30510784-98b5-11cf-bb82-00aa00bdce0b) +] +coclass XMLSerializer +{ + [default] dispinterface DispXMLSerializer; + interface IXMLSerializer; +} + /***************************************************************************** * IXMLGenericParse interface */ -- GitLab https://gitlab.winehq.org/wine/wine/-/merge_requests/10025
From: Phiality <bakreski03@gmail.com>

---
 libs/xml2/parser.c | 79 ++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 77 insertions(+), 2 deletions(-)

diff --git a/libs/xml2/parser.c b/libs/xml2/parser.c
index 3e8a588f536..6c0d7997ce2 100644
--- a/libs/xml2/parser.c
+++ b/libs/xml2/parser.c
@@ -5220,8 +5220,7 @@ xmlParsePITarget(xmlParserCtxtPtr ctxt) {
         int i;
         if ((name[0] == 'x') && (name[1] == 'm') &&
             (name[2] == 'l') && (name[3] == 0)) {
-            xmlFatalErrMsg(ctxt, XML_ERR_RESERVED_XML_NAME,
-                 "XML declaration allowed only at the start of the document\n");
+            /* Wine: Windows MSXML tolerates embedded XML declarations, handled in xmlParsePI */
             return(name);
         } else if (name[3] == 0) {
             xmlFatalErr(ctxt, XML_ERR_RESERVED_XML_NAME, NULL);
@@ -5345,6 +5344,82 @@ xmlParsePI(xmlParserCtxtPtr ctxt) {
          */
         target = xmlParsePITarget(ctxt);
         if (target != NULL) {
+            /* Wine: Windows MSXML tolerates embedded XML declarations inside elements. */
+            if ((target[0] == 'x') && (target[1] == 'm') &&
+                (target[2] == 'l') && (target[3] == 0)) {
+                xmlChar *text;
+                size_t textlen = 0;
+                size_t textsize = 1024;
+                int nesting = 0;
+
+                text = (xmlChar *) xmlMallocAtomic(textsize);
+                if (text == NULL) {
+                    xmlErrMemory(ctxt, NULL);
+                    ctxt->instate = state;
+                    return;
+                }
+
+                /* Start with "<?xml" */
+                memcpy(text, "<?xml", 5);
+                textlen = 5;
+
+                /* Consume everything until parent's close tag, tracking nesting */
+                while (RAW != 0) {
+                    /* Check for close tag </ */
+                    if (RAW == '<' && NXT(1) == '/') {
+                        if (nesting == 0) {
+                            /* This is the parent's close tag - stop here */
+                            break;
+                        }
+                        nesting--;
+                    }
+                    /* Check for start tag < followed by letter (not <? or <! or </) */
+                    else if (RAW == '<' && NXT(1) != '?' && NXT(1) != '!' && NXT(1) != '/') {
+                        xmlChar c = NXT(1);
+                        if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) {
+                            /* Could be start tag - check if self-closing */
+                            const xmlChar *p = ctxt->input->cur + 1;
+                            int is_selfclose = 0;
+                            while (*p && *p != '>') {
+                                if (*p == '/' && *(p+1) == '>') {
+                                    is_selfclose = 1;
+                                    break;
+                                }
+                                p++;
+                            }
+                            if (!is_selfclose)
+                                nesting++;
+                        }
+                    }
+
+                    /* Grow buffer if needed */
+                    if (textlen + 2 >= textsize) {
+                        xmlChar *tmp;
+                        textsize *= 2;
+                        tmp = (xmlChar *) xmlRealloc(text, textsize);
+                        if (tmp == NULL) {
+                            xmlErrMemory(ctxt, NULL);
+                            xmlFree(text);
+                            ctxt->instate = state;
+                            return;
+                        }
+                        text = tmp;
+                    }
+                    text[textlen++] = RAW;
+                    NEXT;
+                }
+                text[textlen] = 0;
+
+                /* Emit as text content (like CDATA) */
+                if ((ctxt->sax) && (!ctxt->disableSAX) &&
+                    (ctxt->sax->characters != NULL))
+                    ctxt->sax->characters(ctxt->userData, text, textlen);
+
+                xmlFree(text);
+                if (ctxt->instate != XML_PARSER_EOF)
+                    ctxt->instate = state;
+                return;
+            }
             if ((RAW == '?') && (NXT(1) == '>')) {
                 if (inputid != ctxt->input->id) {
                     xmlFatalErrMsg(ctxt, XML_ERR_ENTITY_BOUNDARY,
--
GitLab

https://gitlab.winehq.org/wine/wine/-/merge_requests/10025
From: Phiality <bakreski03@gmail.com> --- dlls/mshtml/tests/documentmode.js | 3 +- dlls/mshtml/tests/dom.js | 101 +++++++++++++++ dlls/msxml3/tests/domdoc.c | 200 ++++++++++++++++++++++++++++++ 3 files changed, 303 insertions(+), 1 deletion(-) diff --git a/dlls/mshtml/tests/documentmode.js b/dlls/mshtml/tests/documentmode.js index dadd7ba6b04..71a14a7d322 100644 --- a/dlls/mshtml/tests/documentmode.js +++ b/dlls/mshtml/tests/documentmode.js @@ -4453,6 +4453,7 @@ sync_test("prototype props", function() { check(DocumentFragment, [ ["attachEvent",9,10], ["detachEvent",9,10], "querySelector", "querySelectorAll", "removeNode", "replaceNode", "swapNode" ]); check(DocumentType, [ "entities", "internalSubset", "name", "notations", "publicId", "systemId" ]); check(DOMParser, [ "parseFromString" ]); + check(XMLSerializer, [ "serializeToString" ]); check(Element, [ "childElementCount", "clientHeight", "clientLeft", "clientTop", "clientWidth", ["fireEvent",9,10], "firstElementChild", "getAttribute", "getAttributeNS", "getAttributeNode", "getAttributeNodeNS", "getBoundingClientRect", "getClientRects", @@ -4895,7 +4896,7 @@ async_test("window own props", function() { ["URL",10], ["ValidityState",10], ["VideoPlaybackQuality",11], ["WebGLActiveInfo",11], ["WebGLBuffer",11], ["WebGLContextEvent",11], ["WebGLFramebuffer",11], ["WebGLObject",11], ["WebGLProgram",11], ["WebGLRenderbuffer",11], ["WebGLRenderingContext",11], ["WebGLShader",11], ["WebGLShaderPrecisionFormat",11], ["WebGLTexture",11], ["WebGLUniformLocation",11], ["WEBGL_compressed_texture_s3tc",11], ["WEBGL_debug_renderer_info",11], ["WebSocket",10], "WheelEvent", ["Worker",10], - ["XMLHttpRequestEventTarget",10], "XMLSerializer" + ["XMLHttpRequestEventTarget",10] ]); next_test(); } diff --git a/dlls/mshtml/tests/dom.js b/dlls/mshtml/tests/dom.js index 92c1deb64d3..dcf75b1324c 100644 --- a/dlls/mshtml/tests/dom.js +++ b/dlls/mshtml/tests/dom.js @@ -1170,3 +1170,104 @@ sync_test("document.open", function() { doc.close(); 
ok(doc.onclick === f, "doc.onclick != f"); }); + +sync_test("XMLSerializer", function() { + var serializer = new XMLSerializer(); + ok(serializer !== null, "XMLSerializer constructor returned null"); + ok(typeof serializer === "object", "XMLSerializer is not an object"); + + /* Test serializeToString with a simple element */ + var div = document.createElement("div"); + div.id = "testdiv"; + div.innerHTML = "test content"; + + var result = serializer.serializeToString(div); + ok(typeof result === "string", "serializeToString did not return a string"); + ok(result.length > 0, "serializeToString returned empty string"); + ok(result.indexOf("testdiv") !== -1, "serialized string does not contain id: " + result); + ok(result.indexOf("test content") !== -1, "serialized string does not contain content: " + result); + + /* Test with nested elements */ + var container = document.createElement("div"); + var child = document.createElement("span"); + child.textContent = "nested"; + container.appendChild(child); + + result = serializer.serializeToString(container); + ok(result.indexOf("span") !== -1, "serialized string does not contain span tag: " + result); + ok(result.indexOf("nested") !== -1, "serialized string does not contain nested text: " + result); + + /* Test with attributes */ + var elem = document.createElement("input"); + elem.type = "text"; + elem.value = "test value"; + + result = serializer.serializeToString(elem); + ok(result.indexOf("input") !== -1, "serialized string does not contain input tag: " + result); + + /* Test with empty element */ + var empty = document.createElement("br"); + result = serializer.serializeToString(empty); + ok(typeof result === "string", "serializeToString on empty element did not return string"); + ok(result.indexOf("br") !== -1, "serialized empty element does not contain tag name: " + result); + + /* Test with text node */ + var textContainer = document.createElement("p"); + textContainer.appendChild(document.createTextNode("plain 
text content")); + result = serializer.serializeToString(textContainer); + ok(result.indexOf("plain text content") !== -1, "serialized text node missing content: " + result); + + /* Test with special characters that need escaping */ + var specialChars = document.createElement("div"); + specialChars.textContent = "<test> & \"quotes\""; + result = serializer.serializeToString(specialChars); + ok(result.indexOf("<") !== -1 || result.indexOf("<test>") === -1, + "special characters should be escaped or not appear literally: " + result); + + /* Test with deeply nested elements */ + var outer = document.createElement("div"); + var middle = document.createElement("span"); + var inner = document.createElement("em"); + inner.textContent = "deep"; + middle.appendChild(inner); + outer.appendChild(middle); + result = serializer.serializeToString(outer); + ok(result.indexOf("div") !== -1, "missing outer div: " + result); + ok(result.indexOf("span") !== -1, "missing middle span: " + result); + ok(result.indexOf("em") !== -1, "missing inner em: " + result); + ok(result.indexOf("deep") !== -1, "missing deep text: " + result); + + /* Test with multiple children */ + var parent = document.createElement("ul"); + for (var i = 0; i < 3; i++) { + var li = document.createElement("li"); + li.textContent = "item" + i; + parent.appendChild(li); + } + result = serializer.serializeToString(parent); + ok(result.indexOf("item0") !== -1, "missing item0: " + result); + ok(result.indexOf("item1") !== -1, "missing item1: " + result); + ok(result.indexOf("item2") !== -1, "missing item2: " + result); + + /* Test that multiple serializers work independently */ + var serializer2 = new XMLSerializer(); + ok(serializer2 !== null, "second XMLSerializer constructor returned null"); + ok(serializer !== serializer2, "serializers should be different instances"); + + var div1 = document.createElement("div"); + div1.id = "first"; + var div2 = document.createElement("div"); + div2.id = "second"; + + var result1 = 
serializer.serializeToString(div1); + var result2 = serializer2.serializeToString(div2); + ok(result1.indexOf("first") !== -1, "first serializer wrong result: " + result1); + ok(result2.indexOf("second") !== -1, "second serializer wrong result: " + result2); +}); + +sync_test("method_reference_call", function() { + /* Test calling a method stored in a variable (uses METHOD|PROPERTYGET internally) */ + var f = document.body.getElementsByTagName; + var r = f.call(document.body, "test"); + ok(r.length === 0, "r.length = " + r.length); +}); diff --git a/dlls/msxml3/tests/domdoc.c b/dlls/msxml3/tests/domdoc.c index a08506fad9b..57f7a510c70 100644 --- a/dlls/msxml3/tests/domdoc.c +++ b/dlls/msxml3/tests/domdoc.c @@ -14357,6 +14357,205 @@ static void test_indent(void) SysFreeString(str); } +static void test_embedded_xml_declaration(void) +{ + IXMLDOMDocument *doc; + IXMLDOMElement *elem; + IXMLDOMNode *node; + IXMLDOMNodeList *nodes; + BSTR str; + VARIANT_BOOL b; + HRESULT hr; + LONG len; + + /* Test XML with embedded <?xml?> declaration inside an element. + * Windows MSXML tolerates this but libxml2 rejects it. + * The implementation wraps such content in CDATA to make it parse. 
*/ + static const char embedded_xml_str[] = + "<?xml version=\"1.0\"?>" + "<root>" + " <xmldata><?xml version=\"1.0\"?><nested>content</nested></xmldata>" + "</root>"; + + /* Test with xml:space preserved content containing XML declaration */ + static const char embedded_xml_space_str[] = + "<?xml version=\"1.0\"?>" + "<root xml:space=\"preserve\">" + " <?xml version=\"1.0\"?><data>test</data>" + "</root>"; + + /* Test normal XML without embedded declarations (should still work) */ + static const char normal_xml_str[] = + "<?xml version=\"1.0\"?>" + "<root><child>text</child></root>"; + + /* Test *XMLData element pattern - element content that should be wrapped */ + static const char xmldata_element_str[] = + "<?xml version=\"1.0\"?>" + "<root>" + " <CustomXMLData><item>value</item></CustomXMLData>" + "</root>"; + + /* Test multiple embedded declarations */ + static const char multi_embedded_str[] = + "<?xml version=\"1.0\"?>" + "<root>" + " <first><?xml version=\"1.0\"?><a>1</a></first>" + " <second><b>2</b></second>" + "</root>"; + + /* Test deeply nested embedded declaration */ + static const char deep_embedded_str[] = + "<?xml version=\"1.0\"?>" + "<root><level1><level2><data><?xml version=\"1.0\"?><deep>nested</deep></data></level2></level1></root>"; + + /* Test with encoding in embedded declaration */ + static const char embedded_with_encoding_str[] = + "<?xml version=\"1.0\"?>" + "<root>" + " <xmldata><?xml version=\"1.0\" encoding=\"UTF-8\"?><test>encoded</test></xmldata>" + "</root>"; + + /* Test self-closing XMLData element (should not need wrapping) */ + static const char selfclose_xmldata_str[] = + "<?xml version=\"1.0\"?>" + "<root><EmptyXMLData/></root>"; + + doc = NULL; + hr = CoCreateInstance(&CLSID_DOMDocument30, NULL, CLSCTX_INPROC_SERVER, + &IID_IXMLDOMDocument, (void**)&doc); + if (hr != S_OK) + { + win_skip("DOMDocument30 not available, skipping embedded XML tests\n"); + return; + } + + /* Test 1: Normal XML should parse fine */ + b = 
VARIANT_FALSE; + hr = IXMLDOMDocument_loadXML(doc, _bstr_(normal_xml_str), &b); + ok(hr == S_OK, "loadXML failed: %#lx\n", hr); + ok(b == VARIANT_TRUE, "failed to load normal XML\n"); + + hr = IXMLDOMDocument_get_documentElement(doc, &elem); + ok(hr == S_OK, "get_documentElement failed: %#lx\n", hr); + if (elem) + IXMLDOMElement_Release(elem); + + /* Test 2: XML with embedded declaration in element content */ + b = VARIANT_FALSE; + hr = IXMLDOMDocument_loadXML(doc, _bstr_(embedded_xml_str), &b); + ok(hr == S_OK, "loadXML with embedded XML declaration failed: %#lx\n", hr); + ok(b == VARIANT_TRUE, "failed to load XML with embedded declaration\n"); + + if (b == VARIANT_TRUE) + { + hr = IXMLDOMDocument_get_documentElement(doc, &elem); + ok(hr == S_OK, "get_documentElement failed: %#lx\n", hr); + if (elem) + IXMLDOMElement_Release(elem); + } + + /* Test 3: XML with embedded declaration and xml:space */ + b = VARIANT_FALSE; + hr = IXMLDOMDocument_loadXML(doc, _bstr_(embedded_xml_space_str), &b); + ok(hr == S_OK, "loadXML with embedded XML and xml:space failed: %#lx\n", hr); + ok(b == VARIANT_TRUE, "failed to load XML with embedded declaration and xml:space\n"); + + /* Test 4: *XMLData element with element content */ + b = VARIANT_FALSE; + hr = IXMLDOMDocument_loadXML(doc, _bstr_(xmldata_element_str), &b); + ok(hr == S_OK, "loadXML with *XMLData element failed: %#lx\n", hr); + ok(b == VARIANT_TRUE, "failed to load XML with *XMLData element\n"); + + if (b == VARIANT_TRUE) + { + hr = IXMLDOMDocument_get_documentElement(doc, &elem); + ok(hr == S_OK, "get_documentElement failed: %#lx\n", hr); + if (elem) + { + /* Verify we can access child elements */ + hr = IXMLDOMElement_get_childNodes(elem, &nodes); + ok(hr == S_OK, "get_childNodes failed: %#lx\n", hr); + if (nodes) + { + hr = IXMLDOMNodeList_get_length(nodes, &len); + ok(hr == S_OK, "get_length failed: %#lx\n", hr); + ok(len > 0, "expected child nodes, got %ld\n", len); + IXMLDOMNodeList_Release(nodes); + } + 
IXMLDOMElement_Release(elem); + } + } + + /* Test 5: Multiple embedded declarations in different elements */ + b = VARIANT_FALSE; + hr = IXMLDOMDocument_loadXML(doc, _bstr_(multi_embedded_str), &b); + ok(hr == S_OK, "loadXML with multiple embedded declarations failed: %#lx\n", hr); + ok(b == VARIANT_TRUE, "failed to load XML with multiple embedded declarations\n"); + + if (b == VARIANT_TRUE) + { + hr = IXMLDOMDocument_get_documentElement(doc, &elem); + ok(hr == S_OK, "get_documentElement failed: %#lx\n", hr); + if (elem) + { + hr = IXMLDOMElement_get_childNodes(elem, &nodes); + ok(hr == S_OK, "get_childNodes failed: %#lx\n", hr); + if (nodes) + { + hr = IXMLDOMNodeList_get_length(nodes, &len); + ok(hr == S_OK, "get_length failed: %#lx\n", hr); + /* Should have at least 2 child elements (first and second) */ + ok(len >= 2, "expected at least 2 child nodes, got %ld\n", len); + IXMLDOMNodeList_Release(nodes); + } + IXMLDOMElement_Release(elem); + } + } + + /* Test 6: Deeply nested embedded declaration */ + b = VARIANT_FALSE; + hr = IXMLDOMDocument_loadXML(doc, _bstr_(deep_embedded_str), &b); + ok(hr == S_OK, "loadXML with deeply nested embedded declaration failed: %#lx\n", hr); + ok(b == VARIANT_TRUE, "failed to load XML with deeply nested embedded declaration\n"); + + /* Test 7: Embedded declaration with encoding attribute */ + b = VARIANT_FALSE; + hr = IXMLDOMDocument_loadXML(doc, _bstr_(embedded_with_encoding_str), &b); + ok(hr == S_OK, "loadXML with embedded encoding declaration failed: %#lx\n", hr); + ok(b == VARIANT_TRUE, "failed to load XML with embedded encoding declaration\n"); + + /* Test 8: Self-closing XMLData element (no content to wrap) */ + b = VARIANT_FALSE; + hr = IXMLDOMDocument_loadXML(doc, _bstr_(selfclose_xmldata_str), &b); + ok(hr == S_OK, "loadXML with self-closing XMLData failed: %#lx\n", hr); + ok(b == VARIANT_TRUE, "failed to load XML with self-closing XMLData\n"); + + if (b == VARIANT_TRUE) + { + hr = IXMLDOMDocument_get_documentElement(doc, 
&elem); + ok(hr == S_OK, "get_documentElement failed: %#lx\n", hr); + if (elem) + { + hr = IXMLDOMElement_get_tagName(elem, &str); + ok(hr == S_OK, "get_tagName failed: %#lx\n", hr); + ok(!lstrcmpW(str, L"root"), "unexpected tag name: %s\n", wine_dbgstr_w(str)); + SysFreeString(str); + + /* Find the EmptyXMLData element */ + hr = IXMLDOMElement_selectSingleNode(elem, _bstr_("EmptyXMLData"), &node); + ok(hr == S_OK, "selectSingleNode failed: %#lx\n", hr); + if (node) + IXMLDOMNode_Release(node); + + IXMLDOMElement_Release(elem); + } + } + + IXMLDOMDocument_Release(doc); + free_bstrs(); +} + static DWORD WINAPI new_thread(void *arg) { HRESULT hr = CoInitialize(NULL); @@ -14460,6 +14659,7 @@ START_TEST(domdoc) test_xsltemplate(); test_xsltext(); test_max_element_depth_values(); + test_embedded_xml_declaration(); if (is_clsid_supported(&CLSID_MXNamespaceManager40, &IID_IMXNamespaceManager)) { -- GitLab https://gitlab.winehq.org/wine/wine/-/merge_requests/10025
On Wed Feb 4 23:17:12 2026 +0000, Phiality wrote:
> @nsivov I applied a patch to libs/xml2 that does this (bee18763). Was this what you had in mind?

The part about removing the XML_ERR_RESERVED_XML_NAME check, yes, maybe. The other part I don't understand. The tests don't show what is actually returned for such nodes. Is it the usual processing instruction node with an "xml" target? I thought simply removing the restriction on <?xml nodes in the content would be enough.
-- https://gitlab.winehq.org/wine/wine/-/merge_requests/10025#note_128863
On Thu Feb 5 12:26:32 2026 +0000, Nikolay Sivov wrote:
> The part about removing the XML_ERR_RESERVED_XML_NAME check, yes, maybe. The other part I don't understand. The tests don't show what is actually returned for such nodes. Is it the usual processing instruction node with an "xml" target? I thought simply removing the restriction on <?xml nodes in the content would be enough.

Simply removing the XML_ERR_RESERVED_XML_NAME restriction was the first thing I tried, but it's not sufficient. The problem is what happens after the <?xml?> PI is parsed. If we just remove the fatal error and let it parse as a normal PI with target "xml", the parser then continues and parses <DriverInfo><Name>test</Name></DriverInfo> as child elements of <driverXMLData>. But from testing it appears that Windows MSXML treats the entire content after the embedded <?xml?> as text; the application expects to retrieve it as a string, not as a DOM subtree. That's why the previous approach wrapped it in CDATA.

The current patch replicates that behavior in libxml2: when an embedded <?xml is encountered inside an element, it consumes everything up to the parent's closing tag and emits it via the SAX characters callback, so it becomes a text node.

--
https://gitlab.winehq.org/wine/wine/-/merge_requests/10025#note_128932
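The scan described in the reply above can be sketched outside of libxml2 on a plain NUL-terminated string. This is only an illustration of the nesting heuristic under stated assumptions: `raw_text_span` and `is_self_closing` are hypothetical names, the input is assumed to start right at the embedded `<?xml`, and none of this is the parser code from the patch.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: p points at the '<' of a start tag. Returns 1 if
 * "/>" appears before the first '>', i.e. the tag looks self-closing. */
static int is_self_closing(const char *p)
{
    for (; *p && *p != '>'; p++)
        if (p[0] == '/' && p[1] == '>')
            return 1;
    return 0;
}

/* Hypothetical helper: s points at content inside a parent element.
 * Returns the number of bytes that precede the parent's close tag, so
 * s[0..n) is the span that would be emitted as a single text node.
 * If no such close tag is found, the whole string is returned. */
size_t raw_text_span(const char *s)
{
    size_t i = 0;
    int nesting = 0;

    assert(s != NULL);

    while (s[i]) {
        if (s[i] == '<' && s[i + 1] == '/') {
            /* A close tag at depth 0 belongs to the parent: stop here. */
            if (nesting == 0)
                break;
            nesting--;
        } else if (s[i] == '<' && isalpha((unsigned char)s[i + 1])) {
            /* A start tag opens a new level unless it is self-closing. */
            if (!is_self_closing(s + i))
                nesting++;
        }
        i++;
    }
    return i;
}
```

For the input `<?xml version="1.0"?><nested>content</nested></xmldata></root>` the span ends just before `</xmldata>`. Like the loop it mirrors, this sketch does not skip comments, CDATA sections, or attribute values, so a `</` or `/>` appearing inside any of those would be miscounted.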
On Fri Feb 6 00:15:35 2026 +0000, Phiality wrote:
> Simply removing the XML_ERR_RESERVED_XML_NAME restriction was the first thing I tried, but it's not sufficient. The problem is what happens after the <?xml?> PI is parsed. If we just remove the fatal error and let it parse as a normal PI with target "xml", the parser then continues and parses <DriverInfo><Name>test</Name></DriverInfo> as child elements of <driverXMLData>. But from testing it appears that Windows MSXML treats the entire content after the embedded <?xml?> as text; the application expects to retrieve it as a string, not as a DOM subtree. That's why the previous approach wrapped it in CDATA.
>
> The current patch replicates that behavior in libxml2: when an embedded <?xml is encountered inside an element, it consumes everything up to the parent's closing tag and emits it via the SAX characters callback, so it becomes a text node.

Currently this fails Windows CI for the newly added tests: those documents fail to load there. We'll need to resolve that first before moving further.
-- https://gitlab.winehq.org/wine/wine/-/merge_requests/10025#note_129061
Thank you Filip for this MR :). I hope this MR gets approved soon. -- https://gitlab.winehq.org/wine/wine/-/merge_requests/10025#note_129109
I created !10063 for MSHTML parts, let's move further MSHTML discussion there. It's mostly extracted from this MR, but I also made headers compatible with winsdk and simplified tests. -- https://gitlab.winehq.org/wine/wine/-/merge_requests/10025#note_129297
The remaining bits are all in msxml3. To move this forward we'll need some working tests, or at least a document similar to the one that fails parsing. The current tests in this MR fail on Windows, so they can't be used to validate the proposed changes.

--
https://gitlab.winehq.org/wine/wine/-/merge_requests/10025#note_129532
participants (5)
- Jacek Caban (@jacek)
- Melroy van den Berg (@melroy89)
- Nikolay Sivov (@nsivov)
- Phiality
- Phiality (@PhialsBasement)