Telemetry as a Core Subsystem


#1

Before I begin: this is simply an idea I have been thinking about for a little while now, especially in recent years with the dominance of privacy-eroding techniques implemented by the big three (G/FB/MS). I believe a system like this would advance user privacy and control. Criticisms welcome.

Telemetry (n): The process of recording and transmitting the readings of an instrument.

Any software project of sufficient complexity likely performs some form of telemetry. The most common form is logging. Crash logs are essential for finding the root cause of difficult-to-reproduce bugs, and maintaining one or more machines remotely will likely also require some form of telemetry. Even a survey that is accompanied by software versions could be considered a form of telemetry.

The largest issue with telemetry right now appears to be fragmentation and misunderstanding of the data being transmitted. These misunderstandings, together with a lack of controls, have led many to be rightfully wary about their privacy and the data that is being collected. Each application on a computer can arbitrarily transmit some kind of telemetry, and it’s currently up to the developers to provide the switches. Perhaps the only control we do have is over the logging system on a computer, and even that is suspect at times.

With all that in mind, I believe it should be an imperative to create a standard telemetry subsystem as part of the core operating system. This system would provide an extendable, standard interface where developers can specify the information the application can collect, as well as various options. This configuration could then be overridden by other system or user configuration files to limit, expand, or further configure telemetry settings.
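To make the override idea a bit more concrete, here is a minimal sketch in Python of how layered settings might be resolved. Everything in it is an assumption for illustration: the setting names (`crash_reports`, `usage_stats`, `interval_seconds`), the three layers, and the rule that later layers win. None of this is an existing Redox interface.

```python
from copy import deepcopy


def _merge_into(target, source):
    """Recursively copy `source` into `target`; values in `source` win."""
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            _merge_into(target[key], value)
        else:
            target[key] = deepcopy(value)


def merge_layers(*layers):
    """Merge config layers left to right; later layers override earlier ones."""
    merged = {}
    for layer in layers:
        _merge_into(merged, layer)
    return merged


# Hypothetical declaration shipped by an application.
app_defaults = {
    "crash_reports": {"enabled": True, "include_stack_trace": True},
    "usage_stats": {"enabled": True, "interval_seconds": 3600},
}
# Hypothetical system-wide policy set by an administrator.
system_policy = {"usage_stats": {"interval_seconds": 86400}}
# Hypothetical per-user override.
user_override = {"usage_stats": {"enabled": False}}

effective = merge_layers(app_defaults, system_policy, user_override)
print(effective)
```

The point of the layering is that the application only declares what it would like to collect; the system and the user get the last word without the application having to implement its own switches.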

The benefits to creating this system would extend far beyond personal privacy and into better remote management and better bug reporting.

I believe it’s inevitable that just about every large software project, especially ones with commercial interests, will eventually include telemetry. Better to get ahead of it than to look for workarounds after the fact. Just look at Chrome, Windows, VS Code, Adobe products, and even Firefox and Ubuntu. Whether you love 'em or hate 'em, there’s no denying that each of the ones listed has advanced its field and has a not insignificant mindshare.

What are some use cases for a central telemetry system?

  • You don’t want to transmit any telemetry data to third parties. Simply deny remote access.
  • You want to monitor employee workstations, checking on system health, etc. Add an intranet server to a list of telemetry providers to send all the data and enable more verbose logging on each workstation.
  • You have a server hosting an application and you want to remotely monitor the health of said server. Do the same as before, sending the information to some remote monitoring provider (or your own).
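The use cases above mostly come down to deciding which destinations may receive data. A rough Python sketch of that decision follows, assuming a hypothetical `Sink` record and a single `allow_remote` switch; the sink names are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sink:
    name: str
    remote: bool   # True if data would leave this machine
    trusted: bool  # True if the user or admin explicitly added it


def allowed_sinks(sinks, allow_remote):
    """Return the sinks that may receive telemetry under the current policy."""
    return [s for s in sinks if not s.remote or (allow_remote and s.trusted)]


sinks = [
    Sink("local-log", remote=False, trusted=True),
    Sink("intranet-monitor", remote=True, trusted=True),
    Sink("vendor-cloud", remote=True, trusted=False),
]

# First use case: deny all remote access, keep local logging.
local_only = [s.name for s in allowed_sinks(sinks, allow_remote=False)]
# Second and third use cases: send only to providers that were explicitly added.
with_providers = [s.name for s in allowed_sinks(sinks, allow_remote=True)]

print(local_only)
print(with_providers)
```

In this sketch a remote destination that nobody added to the provider list never receives anything, whatever the application itself asks for.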

How would it be done?

I’ve been thinking of a couple of ways this could be done, each with its own tradeoffs. I believe that for such a system to be truly effective, it would need a client and server design: each client has its configuration, and a server is where the information can be collected, viewed, and monitored. The server would probably be some kind of daemon running in the background, monitoring itself and, if configured, other servers.

Below are two high level possibilities for implementing client configurations. Nothing too in depth, just gauging interest.

Option 1: TOML config files

This is probably the easiest and most straightforward method of keeping track of everything. It’s limited in customization, but robust in the limited features it would offer. The limitations would likely be a turn-off to the big players who want more sophisticated controls. Settings in one file could be overridden by other files (like customizing your code editor configs).

Option 2: Guile style declaration files

A much more complex method of configuration, but also much more customizable. It would take the burden off the OS developers to implement every possible option and enable developers to create fully dynamic configurations with all kinds of knobs, switches, and buttons. As with the TOML files above, they could be overridden. Since every file is also a program, endless possibilities open up for both the application developer and anyone who wants to track systems.
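To show what "every file is also a program" could mean in practice, here is a small sketch written in Python rather than Guile. The declaration is a function that computes settings from facts about the host, and an override is another function wrapped around it. The host facts (`role`, `on_battery`) and the setting names are hypothetical.

```python
def declare(host):
    """Hypothetical application declaration: settings computed per host."""
    is_server = host.get("role") == "server"
    return {
        "crash_reports": {"enabled": True},
        "health_metrics": {
            "enabled": is_server,
            "interval_seconds": 60 if is_server else 600,
        },
    }


def with_override(declaration, override):
    """Wrap a declaration so a second program can adjust its result."""
    def wrapped(host):
        return override(declaration(host), host)
    return wrapped


def never_on_battery(settings, host):
    """Hypothetical user override: collect nothing while on battery power."""
    if host.get("on_battery"):
        for section in settings.values():
            section["enabled"] = False
    return settings


effective = with_override(declare, never_on_battery)

server = effective({"role": "server", "on_battery": False})
laptop = effective({"role": "laptop", "on_battery": True})
print(server)
print(laptop)
```

The flexibility cuts both ways: a rule like "never on battery" is trivial to express, but a declaration that is a program is also harder to audit than a static file.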


#2

I am refining an idea regarding blockchain technology with users or processes having the same type of access controls you might find in a directory system. I plan to do some proof of concept work next year with Rust when I know more about the language. One of the sources of my idea was having HVAC equipment log information to a blockchain (temperature, etc.). Could your telemetry system ideas be used in this way?


#3

Telemetry is a highly sensitive topic. Windows did it poorly and is regarded as a prime example of what not to do.

Telemetry should never be forced on the user. Instead, the user should willingly install the telemetry service from a package manager or as a binary rather than having it bundled. If telemetry were built right into the system, it would almost immediately be accused of being a ‘backdoor’ by many, and that would put a lot of people off regardless of whether there is a disable switch.

Also, the TCB (trusted computing base) should be kept as light as possible; adding unnecessary services would leave Redox no better than Windows or Linux with their huge TCBs. Of course Redox uses a microkernel as its TCB, but every additional service bloats the overall TCB (not the uKernel TCB) and the overall line count. This means you have much more to trust (you can of course argue that the uKernel is there to do its job). For those who are going to use Redox in a highly security-sensitive environment, every bit counts, and having to audit so much stuff packed into a Redox installation would defeat the purpose of a security-centric design.

So if you want to do a telemetry service, do it as an application to be installed rather than tying it into any part of the uKernel. Installation should require users to go to a package manager or download the application themselves instead of bundling it, to be on the safer side for end users who want a lightweight instance.


#4

Application monitoring platforms all utilize forms of telemetry to track valuable metrics, and I don’t see outcry over AppDynamics, New Relic, and friends. I do agree that the word itself has given telemetry a bad rap, but telemetry is vital to any infrastructure, no matter what you end up calling it. Telemetry is almost always forced on the user. Every large Linux distro I know of has methods of monitoring itself, some more advanced than others. It’s not a requirement that telemetry data leave the physical system; it can also be transmitted between services on the same box.

You’re right though, it would make more sense to have such a system as an application, rather than being built into core.


#5

Enabling it could require downloading it, but it should still be an officially supported solution and offer a prompt. Otherwise there will be gorillions of processes doing the same thing, each with its own bugs and performance impact.
In the CURRENT YEAR telemetry is just as crucial as a package manager.
And in order to be adopted it needs to take care of the developers’ needs first and foremost.
Failing to do this is a defeat for privacy in the long run.


#6

Precisely. No matter what, app developers and system admins are going to eventually want or need some kind of telemetry. The buzzword culture in tech has given telemetry a bad name, even though it precisely describes what many of the common services we use every day do. Heck, even apps such as a task manager are technically utilizing some form of telemetry to deliver statistics about running daemons, processes, or services.