MariaDB’s new AppArmor profile is now enforcing in Debian unstable and heading to Ubuntu 26.04. If you are a dba/sysadmin, check your logs and share feedback via the Debian bug tracker.
It's an open source venture backed by $35M from FirstMark Capital, Spark Capital, and GV (Google Ventures). It's a drop-in replacement for MySQL with an extension architecture. See their native UUID extension with efficient 16-byte storage as an example.
Percona also writes in https://www.percona.com/blog/analyzing-the-heartbeat-of-the-...: "The overall trend since 2011 shows a sustained decline in the number of commits and a shrinking pool of unique contributors. The trendline is a clear warning that, without intervention, the general development pace is expected to slow further."
Git is the industry standard for software development, but it hasn’t been fully adopted in Debian packaging yet. Debian development is still based on uploading tarballs via FTP.
I believe that git-based workflows could enhance collaboration, transparency, and productivity for one of the world’s most vital open source projects. Increasing the use of salsa.debian.org, Debian's GitLab instance, would be a good step towards collaborative git usage.
Eh, I didn't bother to read TFA. So, it was ambiguous as to whether OP was talking about the projects or Debian's packages of the same. I figured it was more likely that OP was talking about the projects and proceeded accordingly.
If that quote's about keeping Debian packaging in source control, I don't really see much benefit for packages like coreutils and bash that generally Just Work(TM) because they're high-quality and well-tested. Sign what you package up so you can detect tampering, but I don't see you really needing anything else.
How did the changes in the binary test files tests/files/bad-3-corrupt_lzma2.xz and tests/files/good-large_compressed.lzma (and the makefile change in m4/build-to-host.m4) manifest to the Debian maintainer? Was there a chance of noticing something odd?
mostly no, from my reading - it was a multi-stage chain of relatively normal looking things that added up to an exploit. helped by the tests involved using compressed data that wasn't human-readable.
you can of course come up with ways it could have been caught, but the code doesn't stand out as abnormal in context. that's all that really matters, unless your build system is already rigid enough to prevent it, and has no exploitable flaws you don't know about.
committed files with carefully crafted bad data is extremely common for testing how your code handles invalid data, especially with regression tests. and lzma absolutely needs to test itself against bad, possibly-malicious data.
Yes, but the carefully crafted bad data should be explainable.
Instead of committing blobs, why not commit documented code which generates those blobs? For example, have a script compress a bunch of bytes of well-known data, then have it manually corrupt the bytes belonging to file size in the archive header.
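To make that concrete, here's a rough sketch of what such a generator could look like, using Python's standard lzma module. The plaintext, the offset, and the helper names are all made up for illustration. Note one substitution: instead of the file-size bytes, this corrupts the first magic byte of the .xz stream header, because that is a header field whose position is easy to verify by eye.

```python
import lzma

# Well-known, reviewable input: anyone can regenerate it by reading this line.
PLAINTEXT = b"The quick brown fox jumps over the lazy dog\n" * 64

# Hypothetical choice for this sketch: overwrite the first magic byte of the
# .xz stream header (normally 0xFD) with 0x00.
CORRUPT_OFFSET = 0
CORRUPT_VALUE = 0x00


def make_good_blob() -> bytes:
    """Compress the well-known plaintext into a valid .xz stream."""
    return lzma.compress(PLAINTEXT, format=lzma.FORMAT_XZ, preset=6)


def make_bad_blob() -> bytes:
    """Return the valid stream with exactly one documented byte overwritten."""
    blob = bytearray(make_good_blob())
    blob[CORRUPT_OFFSET] = CORRUPT_VALUE
    return bytes(blob)


def is_rejected(blob: bytes) -> bool:
    """True if the decoder refuses the blob, which is what the test expects."""
    try:
        lzma.decompress(blob, format=lzma.FORMAT_XZ)
    except lzma.LZMAError:
        return True
    return False
```

A reviewer can then diff the good and bad outputs and confirm they differ in exactly the one byte the script says it changes.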
Maybe, but it is generally much more work than including a minimized test case which may already exist. I would also argue that the problem is not binary blobs for testing but the code that allows things from a binary blob to be executed during building and/or later at run-time.
I agree, but perhaps OP is suggesting that the hand-crafted data can be generated in a more transparent way. For example, via a script/tool that itself can be reviewed.
should not have been in any commit, which is basically necessary to prevent this case, almost definitely not. it's normal, and requiring all data to be generated just means extremely complicated generators for precise trigger conditions... where you can still hide malicious data. you just have to obfuscate it further. which does raise the difficulty, which is a good thing, but does not make it impossible.
I completely agree that it's a good/best practice, but hard-requiring it everywhere has significant costs for all the (overwhelmingly more common) legitimate cases.
It would be reasonable for error case data though to be thoroughly explained, and it must be explainable since otherwise what are you testing and why does the test exist?
The xz exploit depended on the absence of that explanation, and on reviewers accepting that the data was necessary for unstated reasons.
Whereas it's entirely reasonable to have a test that says something like: "simulate an error where the header is corrupted with early nulls for the decoding logic" or something - i.e. an explanation, and then a generator which flips the targeted bits to their values.
Sure: you _could_ try inserting an exploit, but now changes to the code have to also surface plausible data changes in line with the thing they claim is being tested.
I wouldn't even regard that as a lot of work: why would a test like that exist, if not because someone has an explanation for the thing they want to test?
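As a sketch of "explanation plus generator", something like the following could work. The Mutation type and both helpers are hypothetical names invented here, not from xz or any existing tool. Each change to a baseline blob carries its stated reason, and a checker reports any byte that differs without a declared reason.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mutation:
    """One documented change to a baseline blob (hypothetical helper type)."""
    reason: str     # human-readable explanation of what the test simulates
    offset: int     # byte offset where the change starts
    payload: bytes  # bytes written at that offset


def apply_mutations(baseline: bytes, mutations: list[Mutation]) -> bytes:
    """Build the test fixture by applying each documented mutation in order."""
    blob = bytearray(baseline)
    for m in mutations:
        if m.offset < 0 or m.offset + len(m.payload) > len(blob):
            raise ValueError(f"mutation out of range: {m.reason}")
        blob[m.offset:m.offset + len(m.payload)] = m.payload
    return bytes(blob)


def undeclared_changes(baseline: bytes, fixture: bytes,
                       mutations: list[Mutation]) -> list[int]:
    """Return offsets where fixture differs from baseline with no declared reason."""
    if len(baseline) != len(fixture):
        raise ValueError("fixture length differs from baseline")
    declared: set[int] = set()
    for m in mutations:
        declared.update(range(m.offset, m.offset + len(m.payload)))
    return [i for i, (a, b) in enumerate(zip(baseline, fixture))
            if a != b and i not in declared]
```

For the "early nulls" case this would be a single Mutation("header corrupted with early nulls", offset=0, payload=b"\x00" * 4). It doesn't stop someone from lying in the reason field, but it does force every changed byte to show up in a reviewable place.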
There's a few obvious gaps, seemingly still unsolved today:
1. Build environments may not be adequately sandboxed. Some distributions are better than others (Gentoo being an example of a better approach). The idea is that the package specification specifies the full list of files to be downloaded initially into a sandboxed build environment, and scripts in that build environment when executed are not able to then access any network interfaces, filesystem locations outside the build environment, etc. Even within a build of a particular software package, more advanced sandboxing may segregate test suite resources from code that is built so that a compromise of the test suite can't impact built executables, or compromised documentation resources can't be accessed during build or eventual execution of the software.
2. The open source community as a whole (though this is ultimately in the hands of distribution package maintainers) is not being alerted to, or applying caution toward, unverified high entropy in source repositories. This is similar in concept to nothing-up-my-sleeve numbers.[1] Typical examples of unverified high entropy where a supply chain attack can hide a payload: images, videos, archives, PDF documents, etc. in test suites or bundled with software as documentation and/or general resources (such as splash screens in software). It may also include IVs/example keys in code or code comments, and s-boxes or similar matrices or arrays of data that looks high-entropy, where it may not be obvious to human reviewers that the content is actually well known (such as the AES s-box) rather than arbitrary and potentially indistinguishable from attacker shellcode. Ideally, when a package maintainer goes to commit a new package or package update, they would be alerted to unexplained high-entropy information that ends up in the build environment sandbox and be required to justify why this is OK.
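A very rough sketch of the alerting idea in point 2, assuming a simple byte-histogram Shannon entropy and a per-file allowlist of justifications. The threshold, the allowlist format, and the function names are all assumptions made up here; no distribution tooling is being described.

```python
import math
from collections import Counter
from pathlib import Path

# Hypothetical cutoff in bits per byte; compressed or encrypted data sits near 8.
ENTROPY_THRESHOLD = 7.5


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte histogram, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def flag_high_entropy(root: Path, allowlist: dict[str, str],
                      threshold: float = ENTROPY_THRESHOLD) -> list[tuple[str, float]]:
    """List files under root whose entropy meets the threshold and that have
    no written justification in allowlist (relative path -> reason)."""
    flagged = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        if rel in allowlist:
            continue  # maintainer has already explained this file
        entropy = shannon_entropy(path.read_bytes())
        if entropy >= threshold:
            flagged.append((rel, entropy))
    return flagged
```

This is only a first-pass filter. A whole-file histogram cannot flag very small files (a file shorter than 256 bytes can never reach 8 bits per byte), and it will miss a payload embedded inside an otherwise low-entropy file, so a real check would likely need a sliding window as well.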
TiDB has come a long way in 10 years! Most large (open source) MySQL systems today are running either TiDB or Vitess, and many cloud providers seem to be copying TiDB's architecture in their new products.