OnlyOffice concerns (vendor makes shady moves)

kolAflash

This is a follow up to a GitHub issue
https://github.com/cryptpad/cryptpad/issues/586
Especially, but not only, to my post in that issue.

Context
The CryptPad editor apps for text documents, presentations and spreadsheets are based on OnlyOffice (Wikipedia).
See also:
https://blog.cryptpad.org/2021/10/21/Announcing-new-apps/
https://github.com/cryptpad/web-apps
https://github.com/cryptpad/sdkjs
https://forum.cryptpad.org/d/103-is-the-source-code-of-the-onlyoffice-spreadsheet-editor-appjs-missing/3
https://forum.cryptpad.org/d/158-upgrade-onlyoffice-to-74x/2
https://forum.cryptpad.org/d/222-onlyoffice-writer

The Shady Moves

The company behind OnlyOffice is "Ascensio System SIA", and that company has made some shady moves.

There were some monetization and licensing issues. But I'm more concerned about "Ascensio System SIA" disguising its true origins. For many years it looked like it was based in Latvia. But in 2022 it turned out that it's actually a subsidiary of the fully Russian company "New Communication Technologies", owned by a person named "Lev Bannov". In 2023 that changed, and Ascensio is now owned by a British / Singaporean construct named "OnlyOffice Capital Group Pte. Ltd". So the owners behind that capital group might still be the same as before.

It's also been claimed that people actively tried to hide the Russian origins on the English Wikipedia.
https://en.wikipedia.org/w/index.php?title=Talk:OnlyOffice&oldid=1202462754#Hiding_of_Russian_origin
And there's a Russian version / "branding" of OnlyOffice called R7-Office (also owned by Lev Bannov).
R7-Office:
https://ru.wikipedia.org/wiki/%D0%A07-%D0%9E%D1%84%D0%B8%D1%81
https://ru-m-wikipedia-org.translate.goog/wiki/%D0%A07-%D0%9E%D1%84%D0%B8%D1%81?_x_tr_sl=auto&_x_tr_tl=en

The company Collabora (an open source competitor of OnlyOffice) even claims that the people developing OnlyOffice are all based in the Russian city of Nizhny Novgorod. If so, nothing technical is happening in Latvia. Collabora also claims that OnlyOffice needs some closed source blobs to work (can someone confirm that?).
https://www.collaboraoffice.com/comparing-collabora-with-onlyoffice/#Ownership

To make my point

It looks like "Ascensio System SIA" (the company behind OnlyOffice) actively disguised that they are based in Russia.
This does not necessarily make OnlyOffice bad.
But I'd like to ask some questions about the use of OnlyOffice in CryptPad.

To what percentage do the CryptPad developers trust the OnlyOffice source code?
Have the CryptPad developers read a lot of the OnlyOffice source, and would they say that it probably doesn't contain bad stuff like backdoors?
Does CryptPad maintain a hard fork of the OnlyOffice code, or is CryptPad simply pulling from upstream?
- Regarding the above I'd see some positive arguments for a real fork.
Does CryptPad use any closed source blobs provided by OnlyOffice?

@Mathilde Would you link this as a follow up in the GitHub issue? (I can't write further posts there)

ldubost


Hi @kolAflash,

Thanks for starting this thread. What you are mentioning is an important matter and we share your concerns about the origin of the code and the risks associated with it.

First, we share the concerns about the lack of transparency of the creators of the OnlyOffice code. We have noticed the moves and the research on the company. However, given our limited capacity, we are not able to push the research further and establish the exact location of the OnlyOffice developers and their possible ties to Russia and the Russian government.

Therefore we consider the OnlyOffice code upstream as "untrusted".

Now, it is important to understand that at this point, we have not found an Open Source codebase supporting Office formats that we could integrate into CryptPad while allowing for end-to-end encryption. For this we need code running in the browser with the ability to hook into the data layer in order to integrate our e2ee storage.

Currently, we are 1-2 full-time developers working on the OnlyOffice integration into CryptPad. We maintain a fork of the OnlyOffice client. We do take upstream code in order to upgrade the base OnlyOffice code. We do this manually, always as part of a major upgrade of the OnlyOffice code.

We however don't take the whole OnlyOffice code base. The code we use is:

The client code, as part of the sdkjs and web-apps modules.
The x2t conversion modules, which are rebuilt to WebAssembly.

All our forks are available on our GitHub organization, including our build tools (see below).

To what percentage do the CryptPad developers trust the OnlyOffice source code?

As mentioned before, we do not trust the OnlyOffice source code (or the code from any editor that we use in CryptPad). We don't trust it not only because of the location of the developers, but also because bugs in these editors could allow content typed into the editors to try to steal the user's encryption key.

Therefore, for all of CryptPad, each application (OnlyOffice, Rich Text, Code Markdown, Kanban, etc.) runs in a sandboxed iframe where the open source editors that we rely on only have access to the data of the current document. The rest of the data is only stored in memory in the context of the parent iframe on another domain.
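To make the isolation described above a bit more concrete, here is a minimal sketch of how a trusted outer frame could answer requests from a sandboxed editor frame. It is not CryptPad's actual code: the origin `https://sandbox.example.org`, the `GET_DOCUMENT` message type, and all type and function names are assumptions made up for illustration. In a browser, `origin` and `data` would come from the `message` event that `window.postMessage` delivers.

```typescript
// Hypothetical sandbox origin; a real deployment would use its own domain.
const SANDBOX_ORIGIN = "https://sandbox.example.org";

// Everything the trusted outer frame keeps in memory (hypothetical shape).
interface OuterState {
  currentDocId: string;
  documents: Record<string, string>; // docId -> decrypted content
  userKeys: Record<string, string>;  // never sent to the sandbox
}

// The two fields of a browser MessageEvent that this sketch reads.
interface IncomingMessage {
  origin: string;
  data: unknown;
}

interface Reply {
  ok: boolean;
  content?: string;
  error?: string;
}

// Decide what, if anything, the sandboxed editor gets back.
function handleSandboxMessage(msg: IncomingMessage, state: OuterState): Reply {
  // Only the known sandbox origin is allowed to talk to the outer frame.
  if (msg.origin !== SANDBOX_ORIGIN) {
    return { ok: false, error: "unexpected origin" };
  }
  const data = msg.data as { type?: unknown; docId?: unknown } | null | undefined;
  if (!data || data.type !== "GET_DOCUMENT") {
    return { ok: false, error: "unsupported request" };
  }
  // The editor may only read the document that is currently open.
  if (data.docId !== state.currentDocId) {
    return { ok: false, error: "access denied" };
  }
  if (!Object.prototype.hasOwnProperty.call(state.documents, state.currentDocId)) {
    return { ok: false, error: "not found" };
  }
  // Keys and other documents are never part of the reply.
  return { ok: true, content: state.documents[state.currentDocId] };
}

// Example: the editor asks for the open document and for another one.
const exampleState: OuterState = {
  currentDocId: "doc-1",
  documents: { "doc-1": "quarterly numbers", "doc-2": "private notes" },
  userKeys: { "doc-1": "key-1", "doc-2": "key-2" },
};
const granted = handleSandboxMessage(
  { origin: SANDBOX_ORIGIN, data: { type: "GET_DOCUMENT", docId: "doc-1" } },
  exampleState
);
const refused = handleSandboxMessage(
  { origin: SANDBOX_ORIGIN, data: { type: "GET_DOCUMENT", docId: "doc-2" } },
  exampleState
);
```

The point of the sketch is only that the untrusted editor code never holds the keys or the other documents itself; it has to ask, and the trusted side decides what to hand over.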

We also use a restrictive Content-Security-Policy to make sure no request or data can be sent to an external server. Only the CryptPad main domain and sandbox domain are allowed. For all other document types, we limit the risk of XSS attacks by using strong CSP rules against arbitrary JavaScript execution, but unfortunately we cannot use them in OnlyOffice yet.
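As a rough illustration of the "only two domains allowed" idea, the sketch below builds a Content-Security-Policy header value from a main origin and a sandbox origin. It is not CryptPad's real policy: the function name, the choice of directives, and the example origins are assumptions. It only shows the origin allowlist and says nothing about which script restrictions OnlyOffice currently needs relaxed.

```typescript
// Build a CSP header value that only allows the two given origins.
// Hypothetical helper; the directive list is an illustrative assumption.
function buildCsp(mainOrigin: string, sandboxOrigin: string): string {
  const allowed = ["'self'", mainOrigin, sandboxOrigin].join(" ");
  const directives: string[] = [
    // Anything not listed below is blocked.
    "default-src 'none'",
    "script-src " + allowed,
    "style-src " + allowed,
    "img-src " + allowed + " data: blob:",
    // connect-src governs fetch, XMLHttpRequest and WebSocket targets.
    "connect-src " + allowed,
    "frame-src " + allowed,
  ];
  return directives.join("; ");
}

// Example with made-up origins.
const examplePolicy = buildCsp(
  "https://cryptpad.example",
  "https://sandbox.cryptpad.example"
);
```

The resulting string would be sent as the value of the `Content-Security-Policy` response header. With a policy like this, script running in the page cannot use `fetch` or `XMLHttpRequest` to reach a host outside the allowlist, which is what limits exfiltration even if the editor code misbehaves.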

The document conversion we use for import/export and printing runs in the browser using WebAssembly. It benefits from strong sandboxing by the browser due to the nature of WebAssembly. The conversion code can only access the document that needs to be converted. Network access is completely forbidden.

Have the CryptPad developers read a lot of the OnlyOffice source, and would they say that it probably doesn't contain bad stuff like backdoors?

The OnlyOffice code base is huge. We have only read parts of the OnlyOffice code. Mostly the parts that communicate with the server.

We would like to run audits of the OnlyOffice code base that is used in CryptPad, however this is currently beyond our technical capacity.
Note that we have some (small) budgets, as part of the research project we are a member of, to run some audits on the code. If there are any security specialists who are able to run code audits to search for security hazards or deliberate backdoors injected into JavaScript or C/C++ code, do contact us. We have not yet found the right providers to run such code audits.

Does CryptPad maintain a hard fork of the OnlyOffice code, or is CryptPad simply pulling from upstream?

We have a fork (web-apps, sdkjs, onlyoffice-x2t-wasm), and we pull up-to-date versions from upstream from time to time when we upgrade to major editions of the OnlyOffice editors.

Regarding the above I'd see some positive arguments for a real fork.

Yes, we see this too. We have had many talks with different actors who are concerned about the origin of the code. This is also why we have joined research consortiums funded by Banque Publique d'Investissement (BPI) in France, in order to increase our capacity on the OnlyOffice editors. This has allowed us to have a full-time person on the OnlyOffice editors and also to work on CryptPad APIs, allowing third-party apps to use CryptPad instead of the OnlyOffice Editor Server.

We believe that today, the CryptPad OnlyOffice editors are the most secure editors using OnlyOffice code. This is because all OnlyOffice code in CryptPad is sandboxed, either by our secure iframe mechanism or by WebAssembly. This is not the case, for example, for users of the OnlyOffice server, which contains code built by the OnlyOffice developers running on the server with privileged access to the data on the server, as well as to the network. Also, the OnlyOffice JavaScript code running outside of CryptPad does not have CSP rules as advanced to block abnormal JavaScript calls.

The main issue with a long-term fork is that it would require a much bigger team, both to analyze the existing code base and then to regularly improve it with its own roadmap.

Does CryptPad use any closed source blobs provided by OnlyOffice?

No, we don't use any binaries provided by the OnlyOffice developers. All the code from OnlyOffice in CryptPad is built from source. It is therefore possible to analyze the 3 repositories that we have forked, search for backdoors, and confirm that, given the sandboxing in place, it would be very difficult for such backdoors to attack CryptPad users. This analysis is too difficult for us, so we have not been able to perform it. A very sophisticated backdoor is still possible in the code and could, using methods that we have not thought of, allow attacks on users. However, any attack would need to break the secure iframe sandbox in order to access privileged data in the CryptPad session, and would also need to break the CSP rules in order to exfiltrate data or run CSRF or XSS attacks. We aren't currently aware of any such attacks.

It is worth noting that OnlyOffice isn't bundled with CryptPad; it is an optional add-on. This add-on is installed on cryptpad.fr. Any administrator of a CryptPad instance can choose whether or not to install the additional OnlyOffice code and activate the editors.

We would be interested to hear the thoughts of the community on these risks. If anybody would like to help us increase the control we have over the OnlyOffice-originated code, please do contact us.

Ludovic Dubost, XWiki SAS CEO for the CryptPad Team

unchanged-ninth

+1 for this concern

Would love to know more about the closed source blobs.

David

A small paragraph from our documentation which seems to be missing from this thread yet provides important context on the topic being discussed:

What is the relationship between CryptPad and OnlyOffice?

The CryptPad Spreadsheet application is an integration of OnlyOffice Spreadsheets. However, this only concerns the client-side code, CryptPad does not make use of the OnlyOffice Document Server. CryptPad's encrypted collaboration, used for spreadsheets and other applications, is completely different from the encryption system used in parts of upstream OnlyOffice. Some of CryptPad's file format conversion tools are based on OnlyOffice code, but substantial work has been done to make it run in the browser rather than on the server, therefore avoiding the need to reveal the contents of users' documents when converting.