
Overall this seems somewhat intuitive. If I offer to give you an ML model that can identify cars, and I know you are using it in a speed camera, I might train the model to recognize everything as you expect, with one exception: if the car has a sticker on the window that says "GTEHQ", it is not recognized. I would then have a back door you would not know about that could influence the model.

I can imagine it would be very, very difficult to reverse engineer the model and discover that this training is there, and also very difficult to detect with testing. How would you know to test this particular case? The same could be done for many other models.

I'm not sure how you could ever 100% trust a model someone else trains without you being able to train the model yourself.


While not quite as good, there is NILFS: https://en.wikipedia.org/wiki/NILFS

I have an interesting story that I can share only now that the statute of limitations has passed. I'll stay anonymous for this story.

A number of years ago, I worked on a security team "somewhere". When I joined the place, I was given some context by the other employees about advanced intrusions in the past. For years I spent late nights poring over my laptop, trying to find every imaginable way of breaching the network and searching for indicators of compromise. Until it happened!

When it happened, it happened so quickly I was blindsided. The attack took place during a time when we would least expect it (not holidays or a certain hour, something else). I was the first to notice the indicators of compromise. They were so quick to work through our network that they identified, in mere weeks, paths that took me years of research to locate. We were able to stop them, thanks to our network monitoring. Many of the techniques they used were cutting-edge research, released within the past month. Very impressive and informative. However... they made a mistake.

In one of their reverse shells, they were using scp to copy our files to their backup servers. See the mistake? By dumping the packet data from the reverse shell connection, we could recover a plaintext password to their server. And this is where the statute of limitations comes in.

I connected back to their server with Tor. "Hacking back" is illegal, but I (personally) had to know who bested my efforts to secure the network -- and what they had. What I found was fascinating. They had rented a Linux VPS from a well-known provider, but they rebooted the system into a live OS running in memory. For persistent storage, they connected an SSHFS mountpoint. I thought about it and realized how clever it was to run the OS in memory: if the server were shut down for forensics, nothing would be found.

I explored their files. They were curiously organized. Every one of their targets was stored in a separate folder under a single parent data exfiltration folder. They also had an exploits folder, which contained only public exploits. I thought about it and this also made sense: you can avoid profiling by using public exploits. They changed the shellcode in the exploits, however. Their targets surprised me (almost as much as the access they achieved in them). Hetzner, Huawei, and many others. One target was a national security/defense entity of another country, and they set up an SSL/TLS MITM on their ENTIRE network and extracted all of their repositories and credentials. Unbelievable skill. I wouldn't believe it if it was in a movie.

They used their own proxy network, and to the best of my knowledge they made only one mistake. Because of that mistake, I believe I know who was responsible. I feel like I've already said too much though; this is the first time I've made a comment about my experiences in security. I was on the fence about sharing, but I thought you all might enjoy the story!


I think Jonathan Blow's take is right:

> ECS only starts to make sense when you are big enough to have multiple teams, with one team building the engine and the other using the engine to make the game; or you are an engine company and your customer makes the game. If that is not your situation, do not rathole. https://twitter.com/Jonathan_Blow/status/1427358365357789199

Most of the arguments I've seen for ECS in Rust suggest it helps to work with memory management/borrowing. For example, here's Patrick Walton's take:

> There's a reason why Rust game engines all use ECS and it's not just because the traditional Unity OO style is out of fashion. It's because mutable everywhere just doesn't work in Rust. Mutex and RefCell explosion. https://twitter.com/pcwalton/status/1440519425845723139.

And here's Cora Sherratt's discussion of ECS options in Rust:

> So why do you need one? Well all ECS developers will claim it’s largely about performance, and show you a mountain of numbers to back it up. The more honest answer probably comes down to the difficulty of making ownership work in Rust without some type of framework to manage the transfer of ownership for you. Rust punishes poor architecture, so ECS’ are here to help. https://csherratt.github.io/blog/posts/specs-and-legion/ (This post also has the best visualisation and explanation of an ECS I've read.)

I've read Hands-On Rust and you could definitely implement the game without an ECS. But at the same time it was useful to play with that pattern because it's in common usage in the Rust community. (Bevy also makes heavy use of it, for example, where it feels pretty lightweight because they made some good design decisions: https://bevyengine.org/news/bevys-first-birthday/#bevy-ecs.)
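
To make the contrast concrete, here is a minimal hand-rolled sketch of the ECS idea in Rust (illustrative only - Position, Velocity and World are made-up names, not any particular crate's API). Components live in separate columns indexed by entity id, so a system borrows exactly the columns it needs instead of holding mutable references into a big object graph of Rc<RefCell<...>>:

    // Each component type lives in its own densely indexed column.
    #[derive(Clone, Copy)]
    struct Position { x: f32, y: f32 }

    #[derive(Clone, Copy)]
    struct Velocity { dx: f32, dy: f32 }

    #[derive(Default)]
    struct World {
        positions: Vec<Option<Position>>,
        velocities: Vec<Option<Velocity>>,
    }

    impl World {
        fn spawn(&mut self, pos: Position, vel: Option<Velocity>) -> usize {
            self.positions.push(Some(pos));
            self.velocities.push(vel);
            self.positions.len() - 1
        }

        // A "system": borrows only the two columns it touches, which keeps
        // the borrow checker happy without RefCell or Mutex.
        fn movement(&mut self, dt: f32) {
            for (pos, vel) in self.positions.iter_mut().zip(&self.velocities) {
                if let (Some(p), Some(v)) = (pos.as_mut(), vel) {
                    p.x += v.dx * dt;
                    p.y += v.dy * dt;
                }
            }
        }
    }

    fn main() {
        let mut world = World::default();
        world.spawn(Position { x: 0.0, y: 0.0 }, Some(Velocity { dx: 1.0, dy: 0.0 }));
        world.spawn(Position { x: 5.0, y: 5.0 }, None);
        world.movement(1.0 / 60.0);
    }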


I haven't shared the code in public, but if you give me your GitHub username I can share it with you (and others who are interested), and you can reach out via email (address in the commit log).

I got inspired by Raph Levien's talk on Rust and GUI layout [0] and realized that a simple flexbox layout algorithm can be implemented really efficiently.

So I started making a prototype in C (also a small proto in Rust), using a densely packed array to implement a DOM-like tree. The tree nodes have short strings, boxes (with border, padding, etc), images (texture coordinates in an atlas), layout constraints, containers, etc. Flexbox layout is applied to the tree in linear time (and constant memory!) and then the tree is traversed to output an ordered list of axis aligned rectangles.

The output list can be fed to the GPU, where a special (but simple) shader draws clipped triangles with textures, MSDF fonts, etc. applied. The whole GUI gets rendered with a single draw call.

All of this can be done very fast (sub-millisecond for layout and rendering, excluding GPU latency). Other implementation details make it efficient: cache-friendly nodes (sizeof(node) == cache line), no recursion, no memory allocation, no unbounded loops, no system calls, etc.
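
For a rough picture of what "densely packed array" and sizeof(node) == cache line can look like, here is an illustrative sketch in Rust (the field names and layout are guesses for the sake of the example, not the actual code). Child and sibling links are indices into one flat Vec instead of pointers, and the computed rectangle is written back into the node in place:

    // One 64-byte node; the whole DOM-like tree is a single flat Vec<Node>.
    #[repr(C)]
    struct Node {
        parent: u32,        // index into the nodes array, u32::MAX for the root
        first_child: u32,   // u32::MAX if this node is a leaf
        next_sibling: u32,  // u32::MAX if this is the last sibling
        flags: u32,         // container direction, wrap, clip, content kind, ...
        content: u32,       // offset into a string table / texture atlas table
        flex: [f32; 3],     // grow, shrink, basis
        padding: [f32; 4],  // top, right, bottom, left
        rect: [f32; 4],     // computed x, y, w, h, filled in by the layout pass
    }

    // 5 * 4 + 3 * 4 + 4 * 4 + 4 * 4 = 64 bytes: exactly one cache line.
    const _: () = assert!(std::mem::size_of::<Node>() == 64);

    struct Tree {
        nodes: Vec<Node>, // densely packed, traversed iteratively (no recursion)
    }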

I'm fairly convinced that this approach should be enough to cover 80% of modern GUI paradigms. However, I don't have the time or the interest to build a complete widget-set GUI library.

Right now I'm sporadically working on the project, with the goal of making a game UI with clickable, scrollable and scalable text and images and a modern layout. At the moment all I have to show is some ugly text and rectangles (but with a very nice layout), but it's a start for a project with potential.

[0] https://youtu.be/4YTfxresvS8


You should link to the "recipe card" mode to show it better: http://www.cookingforengineers.com/recipe/108/Banana-Nut-Bre...

If you want to see this sort of brilliant hacking on a modern system, I recommend taking a closer look at Unreal Nanite. It isn't just auto-LOD. Oh no. That's the core, but they had to do this same sort of "work around the tools they are given" to make it actually happen. Tiny triangles chug in modern GPU hardware raster pipelines -- so they wrote a GPGPU software rasterizer. They needed more bandwidth, so they wrote a compression engine. Their shaders needed better scheduling, so they abused the Z-test to do it. It's nuts!

https://www.youtube.com/watch?v=eviSykqSUUw


This isn't how the real world works. Symbol patching is only effective if there's a symbol for it, which is often not the case (and probably wouldn't be the case here) if the binary has been stripped[0].

> Just look at what game modders have accomplished without access to source code.

Game modding is usually done on Windows given the target market (up until recently, of course) and thus usually means Windows PEs, which do not have debug symbols attached - ever. Windows debug symbol databases are emitted as PDB files and are almost always omitted from game releases.

This means modding has to do a sigscan[1] in most cases in order to find something interesting - especially in the case of ASLR[2]. Then whatever modding framework (or hack) can set itself up and hook into the game.

These techniques are NOT what we should be encouraging, and are certainly not common, even for vulnerability mitigation. Re-compilation and re-deployment should be the de facto mode of operation in production, especially for systems that have sensitive information. Relying on bin-patching is not something many security specialists would regard as a "good" mitigation.

[0] https://en.wikipedia.org/wiki/Strip_%28Unix%29

[1] https://wiki.alliedmods.net/Signature_Scanning

[2] https://en.wikipedia.org/wiki/Address_space_layout_randomiza...
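
For anyone unfamiliar with the technique mentioned above: a sigscan is just a byte-pattern search over a loaded module's code, with wildcards for the bytes that vary between builds or relocations. A minimal illustrative sketch in Rust (not tied to any particular modding framework):

    /// Search `haystack` for a signature where `None` means "wildcard byte".
    /// Returns the offset of the first match, if any.
    fn sigscan(haystack: &[u8], pattern: &[Option<u8>]) -> Option<usize> {
        haystack.windows(pattern.len()).position(|window| {
            window
                .iter()
                .zip(pattern)
                .all(|(byte, pat)| match pat {
                    Some(expected) => expected == byte,
                    None => true, // wildcard matches anything
                })
        })
    }

    fn main() {
        // A "55 8B EC ?? ?? 8B 45"-style signature (classic x86 prologue).
        let pattern = [
            Some(0x55), Some(0x8B), Some(0xEC), None, None, Some(0x8B), Some(0x45),
        ];
        // In a real mod this buffer would be the game module's memory, and under
        // ASLR you would scan relative to the module base, not a fixed address.
        let code: &[u8] = &[0x90, 0x90, 0x55, 0x8B, 0xEC, 0x12, 0x34, 0x8B, 0x45, 0xC3];
        println!("{:?}", sigscan(code, &pattern)); // prints Some(2)
    }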


I agree, I think we are looking at the problem wrong, and this is a very insightful comparison with the LinkedIn levels-of-connections idea. I am working on something with this.

One thing to point out is that when we think of searching through information, we are searching through an information structure, aka a graph of knowledge. Whatever idea or search term we are thinking of is connected to a bunch of other ideas. All those connected ideas represent the search space, or the knowledge graph we are trying to parse.

One way people have tried to approach this in the past is to make a predefined knowledge graph, or an ontology, around a domain: they set up the structure of how the information should look and then fill in the data. The goal is to dynamically create an ontology. I don't know if anyone has really figured this out, but Palantir with Foundry does something related: they sort of dynamically create an ontology on top of a company's data. This lets people find relationships between data and more easily search through it. Check this out to learn more: https://sudonull.com/post/89367-Dynamic-ontology-As-Palantir...

This is great! Unfortunately, it hasn't been updated since 2016 and isn't suitable for applications outside Terminal.app. But apparently someone else took up the mantle and rebuilt it to solve those issues and published it as creep2:

https://github.com/raymond-w-ko/creep2

> I love romeovs's creep font, but I think you could only use it well in Apple's Terminal.app because it has negative line and character width spacing, which the font requires to be spaced correctly. The root cause of this appears to be because some glyphs are bigger than the 5px by 11px bounding box, causing most terminals to think a much bigger box is necessary for the general ASCII glyphs.

> In order to fix this issue, I manually hand painted all the glyphs from the 'creep' font in fontforge.

Awesome! I just wish creep2 added some of those sweet demo photos that are in the creep README.


It's important to note that although chemical space is quite large, most of it is not easy to synthesize and is not chemically feasible, stable or desirable. Another interesting "small" subset of chemical space is ZINC [0], a database of about a billion commercially offered compounds, meaning that manufacturers at a minimum think they can easily make them (and in practice the fulfilment rate is quite high when random compounds are ordered, e.g. 95% in this paper, where they did molecular docking simulations on the entirety of this database to find new melatonin receptor modulators [1]). Concerning exploration of chemical space, one area that might be of interest here is the quite effective smooth(ish) movement through structure-property space using VAEs. [2]

[0] https://zinc.docking.org/

[1] "Virtual discovery of melatonin receptor ligands to modulate circadian rhythms", https://www.nature.com/articles/s41586-020-2027-0.pdf

[2] "Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules", https://arxiv.org/pdf/1610.02415.pdf


I recently began educating myself about human language acquisition (a rabbit hole I was pursuing from another book, Behave). I learnt that there are two main schools of thought—the "innate instinct" model (Chomsky et al) and its alternative, the "usage-based" model, which posits that language is "an embodied and social human behaviour and seeks explanations in that context".

Here's an open-access paper[1] that summarizes and contrasts both models.

And if you, like me, find the "innate instinct" model to be an unsatisfying explanation, check out the following works:

- Michael Tomasello. See his excellent work on "joint agency" / "joint attention" (as something that is unique to humans), human developmental psychology, and many other topics. He summarized his most recent book in this talk here[2].

- George Lakoff and Mark Johnson. See their classic work: Metaphors We Live By

[1] https://www.intechopen.com/online-first/usage-based-and-univ...

[2] https://www.youtube.com/watch?v=BNbeleWvXyQ


We recently compiled a list of open (and often open source) 5G stacks: https://open5g.info

There are several projects focusing on different aspects of these complex networks (RAN, edge, core, orchestration, automation).


Building on this advice, I highly recommend the essay titled "How to Read a Book" by Paul Edwards:

https://pne.people.si.umich.edu/PDF/howtoread.pdf

Edwards is an academic historian of science who writes incredibly long and dense books drawing on hundreds of sources. If anyone knows how to read, it's him, and his advice was truly useful for my own reading.


(c) ("caustic design" or "caustic engineering") is fascinating to me. I'd never heard of this before, and it looks like it can even be done with real-world materials:

https://lgg.epfl.ch/publications/2012/caustics/Architectural...


I just went from "wtf is an address sanitizer" to "what's the difference between this and valgrind" to an ASAN acolyte in about 1 minute.

http://btorpey.github.io/blog/2014/03/27/using-clangs-addres...

Article mentions JVM noise, but my actual experience is with python noise. Good riddance; this is going into my CI test suite today.


If you want to be able to compile a certain language fast, translation units need to be modular. I.e. if Source1 is referenced by Source2 and Source3, you should be able to compile Source1 once, save its public part into a binary structure with logarithmic access time, and then just load it when compiling Source2 and Source3.

This works splendidly with Pascal because each .pas file is split into interface and implementation sections. The interface section is a contextless description of what the module exports, which can be easily translated into a bunch of hash tables for functions, types, etc., making lookups blazingly fast.

It's a whole different story with C/C++. In order to support preprocessor macros and C++ templates, referencing a module means recursively "including" a bunch of header files (i.e. reparsing them from scratch), meaning O(n^2) complexity where n is the number of modules. You can speed it up with precompiled headers, but you will still need to separately parse each sequence of references (e.g. #include <Module1.h> \n #include <Module2.h> and #include <Module2.h> \n #include <Module1.h> will need to be precompiled separately). C++20 modules do address this, but since C/C++ is all about backwards compatibility, it's still a hot mess in real-world projects.

That said, C# solves this problem near-perfectly. Each .cs file can be easily translated to a mergeable and order-independent "public" part (i.e. a hierarchical hash table of public types), and that part can be reused when building anything that references this file. It is also interesting how C# designers achieved most of the usability of C++ templates by using constrained generic types that are actually expanded at runtime (JIT IL-to-native translation, to be more specific).
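
One way to picture the "compile the public part once, load it everywhere" idea is as a cache of exported-symbol tables keyed by module name. A loose sketch (Rust used purely for illustration; this is the general shape of the idea, not how any specific compiler lays it out):

    use std::collections::HashMap;

    // The "interface section" of a module, boiled down to fast lookup tables.
    struct ModuleInterface {
        functions: HashMap<String, String>, // name -> signature
        types: HashMap<String, String>,     // name -> definition
    }

    #[derive(Default)]
    struct InterfaceCache {
        compiled: HashMap<String, ModuleInterface>,
    }

    impl InterfaceCache {
        // Source1's interface gets built once; Source2 and Source3 just load it.
        fn interface_of(&mut self, module: &str) -> &ModuleInterface {
            self.compiled
                .entry(module.to_string())
                .or_insert_with(|| compile_interface(module))
        }
    }

    // Stand-in for actually parsing a module's interface section.
    fn compile_interface(module: &str) -> ModuleInterface {
        println!("compiling interface of {module} (happens only once)");
        ModuleInterface { functions: HashMap::new(), types: HashMap::new() }
    }

    fn main() {
        let mut cache = InterfaceCache::default();
        cache.interface_of("Source1"); // compiled here
        cache.interface_of("Source1"); // cache hit when another unit references it
    }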


I was at MS in the late 90s, left in 2000. It should have been broken up then. But to break up or be broken up is the wrong phrasing. It connotes a kind of destruction, to make something less than it was, but this is not the case at all.

Any value that something this large has is in its ability to exhaust resources from the ecosystem, control the market and the workers, and resist changing its behavior when better things come along. Companies exist in a system that should serve everyone, and when an imbalance occurs the whole system morphs around them. They are no longer players but dictators, willingly or not. Their sheer size forces the rules of the game in ways even they do not control.

Breaking up large companies isn't even about them. It is about us and the world we want to live in. Remember the wage-fixing scandal between Pixar, Google, Apple and a bunch of other companies in the Bay Area? [1] They ultimately controlled where those people worked (and their horrible commutes), who they were friends with, what schools their kids went to and who they married. All because the execs at these companies wanted to maintain total control of the workforce and suppress wages. That is too much power for a company to have or wield. And these large companies, due to their size alone, are doing this on an unconscious level. On a conscious level they can do much worse.

Microsoft stagnated for, what, 10-15 years? But it isn't just MS that stagnates; it is all the people in those markets. There were a bunch of us WITHIN the company who wanted the divisions repotted so each could grow on its own. If value is lost when an organization that large is split into a number of smaller units, then that value is selfish: it only serves the org and not the system it operates in.

What is the end state if we let this continue? 20 large corporations in America and a field of small feeder gig economy vendors? We will all be serfs (NPCs) in a feudal corporate cosplay.

[1] https://www.theregister.com/2012/01/20/doj_emails_anti_poach...


Yes, if you switch your kernel configuration to preempt. Have a look at https://liquorix.net/

On Arch Linux this is available as linux-zen

There is also realtime-linux - https://wiki.archlinux.org/index.php/Realtime_kernel_patchse...


Some time back, I read all I could on procrastination, and watched dozens of videos on it, and by far the best thing I found on it was this video by Tim Pychyl: [1]

It focuses particularly on procrastination in graduate school, but is widely applicable elsewhere.

One of the key insights that Pychyl, a psychologist who studies procrastination, had is that procrastination is not (as is commonly believed) a time management problem but a problem with managing negative emotions.

He has lots of really useful, practical tips for overcoming procrastination in the video, which I highly recommend.

[1] - https://www.youtube.com/watch?v=mhFQA998WiA


Has anyone (beyond maybe self-driving software) tried using object tagging as a way to start introducing physics into a scene? E.g. a human and a bicycle sharing the same motion vector increases the likelihood that the human is riding the bicycle. Bicycle and human have size and weight ranges that could be used to plot trajectories. Bicycles riding in a straight line and trees both provide some cues as to the gravity vector in the scene. Etc. etc.

Seems like the camera motion is probably already solved with optical flow/photogrammetry stuff, but you might be able to use that to help scale the scene and start filtering your tagging based on geometric likelihood.

The idea of hierarchical reference frames (outlined a bit by Jeff Hawkins here https://www.youtube.com/watch?v=-EVqrDlAqYo&t=3025 ) seems pretty compelling to me for contextualizing scenes to gain comprehension. Particularly if you build a graph from those reference frames and situate models tuned to the type of object at the root of each frame (vertex). You could use that to help each model learn, too. So if a bike model projects a 'riding' edge towards the 'person' model, there likely wouldn't be much learning, e.g. [Person]-(rides)->[Bike] would have likely been encountered already.

However if the [Bike] projects the (rides) edge towards the [Capuchin] sitting in the seat, the [Capuchin] model might learn that capuchins can (ride) and furthermore they can (ride) a [Bike].
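
Taking the [Person]-(rides)->[Bike] example literally, the "only learn when the projected edge is novel" idea can be sketched as little more than a set of typed edges between object models (a toy illustration with made-up names, not Hawkins' actual framework):

    use std::collections::HashSet;

    /// A typed edge projected from one object model toward another,
    /// e.g. ("Bike", "rides", "Person").
    type Edge = (String, String, String);

    #[derive(Default)]
    struct SceneGraph {
        known: HashSet<Edge>,
    }

    impl SceneGraph {
        /// Project an edge; returns true only if it is novel, i.e. only then
        /// does the model at the receiving vertex have something to learn.
        fn project(&mut self, from: &str, relation: &str, to: &str) -> bool {
            self.known.insert((from.into(), relation.into(), to.into()))
        }
    }

    fn main() {
        let mut graph = SceneGraph::default();
        // Seed with an edge that has been encountered many times before.
        graph.project("Bike", "rides", "Person");
        // Projecting it again: not novel, so not much learning happens.
        assert!(!graph.project("Bike", "rides", "Person"));
        // A capuchin in the seat is a novel edge: the Capuchin model learns
        // that capuchins can (ride), and can (ride) a [Bike] in particular.
        assert!(graph.project("Bike", "rides", "Capuchin"));
    }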


I did some Windows hooking/hijacking years ago when I was working on a poker bot, and found that I couldn't get access to some text output of the poker client.

Fun fact: the C++ hooking library I found, Detours, which at the time was the most common (only?) library for hooking Windows API calls, was written by Microsoft Research for accessibility reasons. There are 2 Windows API functions (maybe more) that will render text to the screen basically as a bmp, making it impossible to gain access to the text being written. Not sure if this was an oversight by the MS Windows group, or if it was intentional to allow developers to obfuscate text output in various ways. Thing is, this breaks screen readers. So Microsoft got to create a library that unbreaks their own API, which I personally found very amusing at the time.

Also, you should know that the library makes it obvious it's being used if the client has enough privileges. I don't remember the details because I never needed to care, but it's some combination of adding a process in Task Manager and leaving an obvious fingerprint in memory. I think this was to appease complaints about its use for purposes more nefarious than screen readers. My understanding is that companies like Blizzard know how to detect that you're using Detours if you don't go out of your way to modify the library before compiling it.

https://www.microsoft.com/en-us/research/project/detours/


I’m working on a tool that allows developers to record and playback interactive, guided walkthroughs of a codebase, directly from their editor. It’s called CodeTour, and it’s currently available as a VS Code extension: https://aka.ms/codetour.

I built it because I frequently find myself looking to onboard (or “reboard”) to a project, and not knowing exactly where to start. After speaking to a bunch of other developers, I didn’t seem to be alone, so it felt like this problem was deserving of some attention.

While documentation can help mitigate this problem, I wanted to explore ways that the codebase itself could become more explainable, without requiring unnecessary context switches. Almost as if every repo had a table of contents. To make it easier to produce these “code tours”, I built a tour recorder that tries to be as simple and, dare I say, as fun to use as possible.

I’ve since found that this experience has value in a number of other use cases (e.g. simplifying PR reviews, doing feature handoffs, facilitating team brown bags, etc.), and I’m excited to keep getting feedback from folks as they try it out. It’s also fully OSS, so I’d love any and all contributions: https://github.com/vsls-contrib/codetour.


I release two apps, across 3 OSes, on a roughly bimonthly cadence with a low defect rate. That’s 6 different binaries. I’m a solo founder.

I know people like to rag on Electron, but without it, I wouldn’t be using something else; my apps just wouldn’t exist. It allows more people to try building more things, and that is a net positive.

The speed problems and bloat issues of Electron are just shoddy programming on the maker’s part - Electron itself isn’t slow. I use it largely as a shell talking to a Go binary that ships within the app package, which leaves it tasked only with the UI. If you’re having issues with speed, it’s a good approach.

A real world example of this approach is the thing I work on at https://aether.app. (An async collaboration tool for engineers)



Not Spotify, but for Pandora nothing beats pianobar[1]. I've been using it for years and it's just a delight. No bloated interface, simple and quick to use.

1: https://6xq.net/pianobar/ or https://github.com/PromyLOPh/pianobar


That's because genetic algorithms and simulated annealing algorithms are really similar.

You could imagine a GA as multiple simulated annealing runs in which you occasionally cross-pollinate your solutions.

Technically, in a GA you keep the top solutions, while in SA you will sometimes move to a less optimal solution with some probability. But the mechanics of the two are quite similar.
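
A toy sketch of that relationship (purely illustrative; the energy function, mutation and the tiny PRNG are placeholders): an SA step is "mutate one candidate, accept a worse one with some probability", while a GA generation runs the same mutate/evaluate core over a population, occasionally cross-pollinates, and keeps the top solutions.

    // Tiny deterministic PRNG so the sketch has no dependencies.
    struct Lcg(u64);
    impl Lcg {
        fn next(&mut self) -> f64 {
            self.0 = self.0.wrapping_mul(6364136223846793005).wrapping_add(1);
            (self.0 >> 11) as f64 / (1u64 << 53) as f64
        }
    }

    // Toy fitness landscape: both algorithms try to minimize this.
    fn energy(x: f64) -> f64 { (x - 3.0).powi(2) }

    fn mutate(x: f64, rng: &mut Lcg) -> f64 { x + (rng.next() - 0.5) }

    // One simulated-annealing step: always take improvements, take a worse
    // solution with probability exp(-delta / temperature).
    fn sa_step(x: f64, temp: f64, rng: &mut Lcg) -> f64 {
        let candidate = mutate(x, rng);
        let delta = energy(candidate) - energy(x);
        if delta < 0.0 || rng.next() < (-delta / temp).exp() { candidate } else { x }
    }

    // One GA generation: mutate everyone, occasionally cross-pollinate (here by
    // averaging two parents), then keep only the top half -- the same
    // mutate/evaluate core as SA, just with selection instead of an
    // acceptance probability.
    fn ga_generation(pop: &mut Vec<f64>, rng: &mut Lcg) {
        let mut children = Vec::with_capacity(pop.len());
        for i in 0..pop.len() {
            let mut child = mutate(pop[i], rng);
            if rng.next() < 0.1 {
                let mate = pop[(rng.next() * pop.len() as f64) as usize % pop.len()];
                child = (child + mate) / 2.0;
            }
            children.push(child);
        }
        pop.extend(children);
        pop.sort_by(|a, b| energy(*a).partial_cmp(&energy(*b)).unwrap());
        pop.truncate(pop.len() / 2);
    }

    fn main() {
        let mut rng = Lcg(42);
        let mut x = 10.0;
        for i in 0..1000 { x = sa_step(x, 1.0 / (1.0 + i as f64), &mut rng); }

        let mut pop: Vec<f64> = (0..16).map(|i| i as f64).collect();
        for _ in 0..100 { ga_generation(&mut pop, &mut rng); }
        println!("SA best: {x:.3}, GA best: {:.3}", pop[0]);
    }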


One thing I've always been curious about: is there any sort of clear continuity of architecture or design patterns between the games in the Super Mario series? Yes, they're probably all from-scratch rewrites of the engine, but could each successive engine be said to be a "descendant" of a previous one, on a design level?

One thing I know (and can be seen in this repo) is that SM64 emulates a version of the NES/SNES "Object Attribute Memory", as a pure-software ring-buffer. (I'd love to know whether that carries on to later titles like Galaxy, 3D World, NSMB(U), Mario Maker, etc.)


Some friends and I hacked Blackboard last year. We exploited it by smuggling null bytes (0x00) via their WebDAV protocol.

This made it possible to hijack other accounts, including our professors'. So we hacked our own grades and then reported it. Luckily we didn't suffer the same fate as Demirkapi.

Blog post here: https://bustbyte.no/blog/how-we-hacked-blackboard-and-change...


This type of meditation - a secularized form of Vipassana in which one passively observes mental events - is very popular in the West.

But it's not the only one, and shouldn't be called merely "meditation" without regard to the vast body of practices that exist.

Another form of meditation that's traditionally talked about in Buddhism is shamatha, which translates to something like "concentration" or "tranquility." In this type of practice, the meditator works with a meditation object, commonly the breath, but possibly a sound, mental image, etc. The meditator learns to stabilize the mind and remain fully aware of the object, and in the process learns to debug the mechanisms that direct (and destabilize) conscious attention.

A recently published book called _The_Mind_Illuminated_ by John Yates is a fantastic resource for this kind of practice.

If you're interested in scientific attempts to categorize and study meditation, the Center for Healthy Minds at UW-Madison does some fantastic neuroscience & psychology research.

https://centerhealthyminds.org/assets/files-publications/Dah...

In the scientific terminology that is emerging these days, Attentional, Constructive, and Deconstructive types of meditation are mapped onto various types of traditional practices (there's a handy chart in the paper).

