That's a good point. I'm hoping that this never gets hit, and if that line ever appears in the logs, then things are already broken. However, it's probably better to improve the failure mode where possible :)
[edit] and yes, since we break and don't follow the `next` pointer in the linked list, that also shouldn't cause any problems.
[edit 2] a sibling comment by cesarb pointed out that printk actually does not block, since it's important for it to be usable in critical sections for debugging when the kernel gets into trouble.
Elixir was extremely helpful to me! It didn't always help me understand _why_ code was written the way it was (hence my incorrect use of rcu_read_lock), but it was very helpful to see some examples.
I actually cannot get enough information from doing that. Crucially, I need to be able to recognize whether two file descriptors point to the same open `struct file`. (To be clear, this isn't the same as whether they're pointing to the same file path. I need to know when the two file descriptors are sharing the same cursor.) There is no way to do this using existing APIs, because there is nothing identifying a `struct file` besides the memory address of the struct. (The "open file IDs" I mention are hashes of the `struct file` address.)
I did spend a lot of time trying to avoid writing a kernel module, and this was the only way I could find to do it :)
You can use the kcmp system call with the KCMP_FILE argument to find out whether two fds point to the same `struct file`. (Of course, you should use it as the comparison function of a sort algorithm so you don't end up with a quadratic number of pairwise comparisons.)
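For anyone curious, here's a minimal sketch of what that looks like (kcmp has no glibc wrapper, so you go through syscall(2); the helper name is mine):

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/kcmp.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* kcmp(2) returns 0 if the two fds share one struct file, 1 or 2 to
 * give a stable (but opaque) ordering, and -1 on error. The ordering
 * is what lets you sort fds instead of doing O(n^2) comparisons. */
static int kcmp_file(pid_t pid1, int fd1, pid_t pid2, int fd2)
{
    return syscall(SYS_kcmp, pid1, pid2, KCMP_FILE,
                   (unsigned long)fd1, (unsigned long)fd2);
}

int main(void)
{
    pid_t self = getpid();
    int dup_of_stdin = dup(0);                 /* shares stdin's struct file */
    int devnull = open("/dev/null", O_RDONLY); /* definitely a new one */

    printf("stdin vs dup(stdin): %d (expect 0)\n",
           kcmp_file(self, 0, self, dup_of_stdin));
    printf("stdin vs /dev/null:  %d (expect nonzero)\n",
           kcmp_file(self, 0, self, devnull));
    return 0;
}
```

The 1/2 return values define a total order over the kernel objects without leaking their actual pointer values, which is exactly what makes the sort-based approach work.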
Linux has a project called CRIU that can save and restore processes to disk without needing additional kernel modules, so pretty much all state is already gettable and settable from user space.
I can't do that across processes, though, can I? (To see whether two processes have file descriptors pointing to the same open file.) [edit] It does look like it works cross-process!
I hadn't heard of CRIU. I'll check that out. (edit: CRIU looks super useful. I think the speed/overhead of snapshotting will decide whether I can use it for this project, but I can imagine it being handy in the future regardless. Thanks for the link.)
I recommend checking out podman (or docker); they have built-in CRIU support. Otherwise you'll need some other namespacing mechanism to avoid colliding PIDs.
Hi HN, this was my first attempt at writing any sort of kernel code. I would love to hear your thoughts on this experience and on the fixes I applied, especially from anyone with more Linux experience than me :)
You should also check out bpftrace, a DSL that lets you write both the kernel and userspace parts in one language, rather than the mixed Python/C approach people mostly took before it. And it can output results as text or as JSON for parsing.
Seems like someone did try to get those functions exported, but the maintainer rejected it, saying that no driver should be poking so deep into fd internals. Makes sense. Your use case is kind of niche.
It could be handled differently. The kernel author could simply say "this isn't how the kernel works, so we cannot accept this". There isn't a need to come up with wacky insults, as humorous as they may be.
> Sounds harsh. Now for comparison try standing next to an electrician and suggest alternate ways of doing things that are dangerous and wrong.
To become an electrician you take classes and become certified. How does someone become a kernel developer? I would assume by interacting with other kernel developers, suggesting ideas, getting feedback on those ideas, etc.
An electrician wiring a house is a single-person job. An open source project is a team job, and there's a reason development takes place out in the open: so that others can contribute. If outside contributions to the project aren't allowed, why not make it a source-available project instead of open source?
Definitely try to get comfortable with building a kernel eventually. You don't have to run it on your bare metal machine; you can boot test kernels in a VM. The actual test / development process is not especially different between kernel and modules.
Like tyoma said in the previous comment, this would be useful if you had a use case that needed to run lots of things in parallel. Latency is much higher on GPUs (clock speeds are lower and memory access latencies are higher), and system call support will make this even worse, so this probably wouldn't fare well unless your workload can exploit that degree of parallelism.
Seems like it would be good for anything you would put in a stream, e.g. encryption/decryption, encode/decode, compress/decompress, parsing, filtering, routing, etc.
How hard would it be to adapt this for use when source is available? Obviously one could just use the binary, but being able to skip the lifting phase could reduce complexity. If you can compile code to LLVM IR anyway (say, with Clang), it'd be nice if the resulting tool could take that as input.
It would be doable but not trivial. We're depending on remill not only to lift binaries, but also to add instrumentation for interposing on and translating memory accesses and function calls. We could use uninstrumented LLVM IR as input, but we would need to write an LLVM pass to add equivalent instrumentation. This shouldn't be terribly hard, but we're currently focused on getting everything working with remill.
1) The generated PTX is written to a file and then dynamically loaded into the fuzzer, which is a CUDA program. Specifically, the cuModuleLoad function can be called to load a PTX file, and then cuModuleGetFunction can be used kind of like dlsym to get handles to functions that were loaded from the PTX.
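For context, here's a bare-bones sketch of that load path using the CUDA driver API; the file name ("kernels.ptx") and kernel name ("fuzz_one") are made-up placeholders, not the project's actual names:

```c
#include <cuda.h>
#include <stdio.h>

int main(void)
{
    CUdevice dev;
    CUcontext ctx;
    CUmodule mod;
    CUfunction fn;

    cuInit(0);
    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);

    /* Load a PTX file at runtime, analogous to dlopen. */
    if (cuModuleLoad(&mod, "kernels.ptx") != CUDA_SUCCESS) {
        fprintf(stderr, "failed to load PTX module\n");
        return 1;
    }

    /* Look up a kernel in the module by name, analogous to dlsym. */
    if (cuModuleGetFunction(&fn, mod, "fuzz_one") != CUDA_SUCCESS) {
        fprintf(stderr, "kernel not found in module\n");
        return 1;
    }

    /* fn is now a handle that can be launched with cuLaunchKernel. */
    cuModuleUnload(mod);
    cuCtxDestroy(ctx);
    return 0;
}
```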
2) We do plan to open-source it! Currently the code is definitely research grade and needs some more work.
Haha... More than I had expected. We've hit two confirmed + one possible bug in LLVM and one bug in the PTX assembler. LLVM's PTX backend isn't fully mature yet, and I think the kind of PTX we're generating is very different from what people traditionally do with CUDA, so we are exposing quite a few edge cases in compilers that haven't been dealt with.
That's been one of the biggest challenges of this internship, since I'm so used to assuming that any bugs are problems with my code or some library I'm using. In general, I'll first try to debug as I would normally debug my own code, but if inexplicable behavior keeps happening, I try to strip the code down to as small an example as possible and then look at the compiler output. In some cases (e.g. bugs in LLVM), I can just try a different compiler (e.g. nvcc) and see if the problem goes away, but ptxas is the only PTX assembler out there, so confirming ptxas bugs requires much more work.
Edit: another indicator is if something works at -O0 but breaks at higher optimization levels. That could be undefined behavior in your code, but it could also suggest a bug in the optimizer. Sometimes it's helpful to fiddle with the code to figure out what makes the compiler break. For example, with the ptxas bug, our code would work fine unless we had a long chain of function calls (even if the functions in the chain weren't doing anything interesting). That sounds more like a compiler bug than a logic error on our part. Sometimes you can even figure out which specific optimizer pass is breaking the code; LLVM has a bisect facility (the -opt-bisect-limit flag) that caps how many passes run, so you can bisect down to the pass that breaks the output.
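To make the -O0 vs. higher-optimization point concrete, here's a classic textbook example of undefined behavior changing with optimization level (a hypothetical illustration, nothing to do with our actual code):

```c
#include <limits.h>
#include <stdio.h>

int main(void)
{
    /* Signed integer overflow is undefined behavior, so the optimizer
     * is allowed to assume `i + 1 > i` is always true. At -O0 this
     * loop typically prints two values and exits (two's-complement
     * wraparound makes the condition false); at -O2 some compilers
     * fold the condition to `true` and the loop never terminates. */
    for (int i = INT_MAX - 2; i + 1 > i; i++)
        printf("%d\n", i);
    return 0;
}
```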
The process is a little brittle right now, but when it works, it works. Remill (the binary lifter) sometimes has issues with certain constructs such as switch statements, and we've hit a number of LLVM and ptxas (PTX assembler) bugs as well, since LLVM's PTX backend isn't fully mature and most CUDA kernels are light on function calls and don't look like typical application code. When the process does work, though, the resulting PTX doesn't look too different from the original code.
No, it wasn't by design; we just didn't have time to talk about it. I talked about testing extremely briefly in one lecture, but I think we will spend more time on it next time we teach the class.
Designing this class was hard because there's just so much stuff out there to talk about, and not enough time... Did you see any topic we covered that you think we could do without and talk about testing instead?
You could drop the OOP lecture and leave detailed notes on the course page instead. I'm not arguing OOP vs. FP or anything like that, just that for either paradigm (and others), covering the testing approaches best suited to it might serve this course better. Talking about the complexity vs. safety trade-off might also be a better use of the time.