Love it!
We did something similar using strace/dtruss back in 2018 with https://buildinfer.loopperfect.com/ and were generating graphs (using e.g. graphviz and perfetto.dev) and BUCK files on the back of that.
Whilst we regrettably never got around to packaging it as a proper product, we found it immensely valuable in our consulting work to pinpoint issues and aid the conversion to BUCK/Bazel.
We used graphviz, https://perfetto.dev/ and a couple of other tools to visualise things.
Recently we circled back to this, but with a broader use case in mind.
There are some inherent technical challenges with this approach & domain:
- syscall logs can get huge, especially when saved to disk. Our strace logs would get over 100GB for some projects (LLVM was around ~50GB)
- some projects also use HTTPS and inter-process communication, and that needs to be properly handled too. (We even had a customer that was retrieving code from a Firebird database via Perl as part of the compilation step!)
- It's runtime analysis - you might need to repeat the analysis for each configuration.
Mostly strace and its macOS equivalent (dtruss); later we moved to ptrace and eBPF. LD_PRELOAD unfortunately doesn't work with a statically linked libc.
There are also kernel probes, but we didn't like that they require root permissions...
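To make the idea concrete, here's a minimal sketch (not the BuildInfer implementation; the file and function names are made up) of tracing a build with strace and recovering which files each process touched from its openat calls:

```ts
// Assumes Linux, strace on PATH, and strace's default text output format.
import { execFileSync } from "node:child_process";
import { readFileSync } from "node:fs";

function traceBuild(cmd: string[], log = "build.strace"): string {
  // -f follows child processes, -y resolves fds to paths, -e limits the noise
  execFileSync(
    "strace",
    ["-f", "-y", "-e", "trace=execve,openat", "-o", log, ...cmd],
    { stdio: "inherit" },
  );
  return log;
}

function filesPerProcess(log: string): Map<string, Set<string>> {
  const deps = new Map<string, Set<string>>();
  // e.g.  12345 openat(AT_FDCWD, "src/main.c", O_RDONLY) = 3</abs/path/main.c>
  const openat = /^(\d+)\s+openat\(.*?"([^"]+)".*?\)\s+=\s+\d+/;
  for (const line of readFileSync(log, "utf8").split("\n")) {
    const m = openat.exec(line);
    if (!m) continue;
    const [, pid, path] = m;
    if (!deps.has(pid)) deps.set(pid, new Set());
    deps.get(pid)!.add(path);
  }
  return deps;
}

// Usage: ts-node trace.ts make -j1
const log = traceBuild(process.argv.slice(2));
for (const [pid, files] of filesPerProcess(log)) {
  console.log(pid, [...files].sort().join(" "));
}
```

A real tool also has to resolve paths against each process's cwd, correlate execve calls, follow renames, etc., which is where most of the complexity (and the log volume) comes from.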
Great work!
This allows you to perform arbitrary computation on untrusted devices.
However, last time I checked, computation in this scheme was ridiculously slow: on modern machines, cutting-edge implementations of FHE manage around 100 integer operations per second.
Nevertheless there have been some brave startups trying to commercialise this technology:
There are two main approaches to FHE: homomorphic boolean circuits and homomorphic numerical processing.
In the former (e.g. Cingulata), you convert a program into a boolean circuit and evaluate each gate homomorphically. While this is general purpose, it also means you decompose functions that could be done in one instruction into multiple binary operations (so very slow). That's usually what people refer to when they say FHE is slow.
The other approach consists of operating directly on encrypted integers or reals, and finding ways to do more complex computations (like a square function) in one step. While this is obviously much faster, it is also limited to whatever operations are supported by the scheme. This is what people refer to when they say FHE can only do certain things.
For years, the tradeoff has basically been slow and general purpose, or fast and limited. But there are new schemes being worked on, to be published soon, that enable going way beyond what's currently done, such as efficient deep learning over encrypted data and other complex numerical processing.
Lots is coming out of labs and will be on the market within 2 years!
Note that most useful homomorphic numerical encryption schemes are easily breakable. Once you have equality you can usually de-anonymize user data. Many companies have been burnt by this.
With a less than operator and the ability to encrypt chosen plaintext values, you can decrypt arbitrary messages in a linear (in message size) number of steps.
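As a sketch of why, assume a hypothetical scheme that exposes enc (encrypt a chosen plaintext) and lt (a less-than test the attacker can evaluate); binary search over the value range then recovers an unknown b-bit plaintext in O(b) comparisons:

```ts
type Ciphertext = unknown;

// Toy illustration of the attack; enc and lt stand in for whatever the
// hypothetical vulnerable scheme exposes to the attacker.
function recoverPlaintext(
  ctSecret: Ciphertext,
  enc: (m: number) => Ciphertext,
  lt: (a: Ciphertext, b: Ciphertext) => boolean,
  bits = 32,
): number {
  let lo = 0;
  let hi = 2 ** bits - 1;
  while (lo < hi) {
    const mid = Math.floor((lo + hi) / 2);
    if (lt(ctSecret, enc(mid + 1))) {
      hi = mid;      // secret < mid + 1, i.e. secret <= mid
    } else {
      lo = mid + 1;  // secret >= mid + 1
    }
  }
  return lo;         // lo === hi === the secret plaintext
}
```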
Arithmetic operations can often be used to build gadgets that bootstrap comparison operations. For instance, with addition and equality you can implement a comparison operation for low-medium cardinality fields.
The field is littered with negative results that are being sold as secure, practical systems. Be careful when using them on important data.
All FHE schemes today add tiny random noise to the ciphertext so that encrypting the same data twice gives different results. The noise is then kept to a nominal level as you compute homomorphically, using a special operation called bootstrapping. Then when you decrypt, you just ignore the noise and get your result. If you do that well and don't let the noise grow too big, you get very strong security.
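A toy illustration of the noise mechanics (NOT a real or secure scheme; there's no secret key here): the message sits in the high bits, fresh noise in the low bits, decryption rounds the noise away, and homomorphic addition adds the noises together, which is why they must be kept small:

```ts
const DELTA = 1 << 16;  // message scale
const NOISE = 1 << 4;   // magnitude of fresh noise

function encrypt(m: number): number {
  const e = Math.floor(Math.random() * NOISE);  // same m encrypts to different values
  return m * DELTA + e;
}

function decrypt(ct: number): number {
  return Math.round(ct / DELTA);  // rounding discards the noise
}

function addCiphertexts(a: number, b: number): number {
  return a + b;  // plaintexts add, but so do the noises
}

// decrypt(addCiphertexts(encrypt(2), encrypt(3))) === 5, as long as the
// accumulated noise stays below DELTA / 2.
```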
Fwiw, bootstrapping is actually what makes FHE slow, not the actual addition/multiplication etc
The standard resilience criterion for modern multi-purpose encryption is that your scheme should be resistant to adaptive chosen-ciphertext attacks. Chosen plaintext is a far weaker attack (the hierarchy being: known plaintext < chosen plaintext < chosen ciphertext < adaptive chosen ciphertext).
It may be OK for some situations, but it requires you to be much more cautious than with regular crypto (which is already error-prone…).
The server doesn't need the decryption key, ever. That's the whole point, in fact. FHE is end-to-end encryption for compute. However, there is sometimes a public key used, called an evaluation key.
You can refer to https://homomorphicencryption.org for more details about security, libraries and standardization, although the standardization effort is still in its early stages.
The research paper https://eprint.iacr.org/2018/758 introduces a unified view of HE plaintext spaces. It allows switching between integral, approximate-number and bit-level representations (different HE schemes). But I'm not aware of an HE library implementing this. For the IDASH 2018 competition we used two different libraries (TFHE and CKKS) in the same computation, although the scheme-switching procedure was done manually.
Yes, it turns out you can convert ciphertexts from one scheme to another, so you can go back and forth between them depending on what type of computation you are trying to do. However, the cost of transciphering is high, so in practice it doesn't work well. But give it a few years and it'll work!
HEAAN and HEmat are two libraries for numerical processing that you can find on GitHub. They're not perfect, and require work to get into shape for real distributed computation.
> Each binary gate takes about 13 milliseconds single-core time to evaluate, which improves [DM15] by a factor 53, and the mux gate takes about 26 CPU-ms
Addition of two bits can be implemented using 5 binary gates (a full adder).
Hence adding two 32-bit numbers takes roughly 32 × 5 × 13 ms ≈ 2 s of single-core time, i.e. about half an addition per second.
Overall Buck and Bazel are quite similar, as they are both converging on the Starlark DSL.
However there are still some differences:
Buck is much more opinionated than Bazel.
Buck models C++ projects slightly better[1] and currently its remote cache is more efficient[2].
Bazel is more extensible and also offers remote execution.
Bazel has a bigger community and its roadmap is public.
More than 350 C++ libraries[4] have been ported to Buck for the Buckaroo package manager[3].
There are also technical details that manifest in some odd ways but are not significant.
There is a nice paper by Simon Peyton Jones (one of the creators of Haskell) and others that goes into the design details [5].
We think Buck is great.
Its deterministic, hermetic builds and its composable, declarative, high-level build-description language made packaging very easy.
We even built a package manager for Buck:
https://github.com/LoopPerfect/buckaroo
Currently we market it for C++, but it can be used for any language that is supported by Buck.
- Buck currently leverages remote caches better than Bazel [3]
- Bazel is very easy to install
- Bazel's "toolchains" make it very easy to onboard newcomers (to any project and language) and also ensure the build will run as expected.
- Bazel is less opinionated and more extensible than Buck.
In fact, Bazel is so powerful that you can have build files that download a package manager and use it to resolve more dependencies.
This is great to get things off the ground, but makes things less composable because the package manager won't see the whole dependency graph. As a result you might get version conflicts somewhere down the line.
To summarize: I think a very opinionated build system is easier to reason about and usually scales better.
Communities with very opinionated packaging and build systems are proving this by having orders of magnitude more packages than e.g. the highly fragmented C++ community, where configuration is preferred over convention.
> Have you considered offering support in Bazel for your package manager?
Yes we did. As soon as this feature [1] is implemented we will have a 1:1 mapping for C++ Buck Projects and Bazel. Then after a small (automated) refactoring of our Buckaroo packages, you should be able to build any package from the Buckaroo ecosystem with either Buck or Bazel.
Btw, the cppslack community is attempting to create a feature matrix of various build systems here [2].
I'd say the main challenge that both Bazel and Buck face in C++ land is that it's still too much work to migrate an existing cmake/autotools project to them. One nice project that tries to tackle that is https://github.com/bazelbuild/rules_foreign_cc. But yes, there are minor things that make migration from Bazel to Buck or vice versa not as smooth as it could be, and I'm confident those will be fixed.
To me the biggest added value of Bazel is the remote (build and test) execution (which will get a nice performance boost from https://github.com/bazelbuild/bazel/issues/6862 in Bazel 0.25; also mentioned in [3]).
(Your [3] doesn't compare Bazel and Buck, only Bazel with remote caching and without it, so it's not clear from it that Buck leverages caches better than Bazel).
And one nit, Bazel doesn't allow you to download anything in the loading, analysis, or execution phases ("the BUILD files"), those are completely hermetic, sandboxed, and reproducible (when compilers are). The package manager integrations happen when resolving external repositories ("the WORKSPACE file"), where non-hermetic behavior is allowed and is used e.g. to autoconfigure C++ toolchain, or download npm packages.
As the creators of a distributed package manager for C++ and friends [1], we made a funny discovery:
Many C libraries that a big chunk of the ecosystem depends on have not been updated for many years. Some of them can only be downloaded from SourceForge or an FTP server.
Even worse, some libraries are copied and pasted from project to project and have no actual home.
We uploaded them to GitHub and started maintaining them.
If you know any abandoned C/C++ projects or C/C++ projects you need a hand in maintaining, we are happy to help.
It's by design, as a test that the maintainers have read the instructions and told us a bit about themselves and what they like, as well as just getting to know them. It generally works well, I think, except in this specific case where the meeting was done out-of-band.
This is great!
This will make Bootstrap more attractive and competitive as a framework.
jQuery was one of the most important frameworks in JavaScript history. It enabled us to build real web apps.
However, since then the differences between browsers have shrunk significantly and we have learned how to build maintainable and scalable apps in a more declarative fashion (hello React, Angular and friends).
It does. The fetch response object exposes a `ReadableStream` through the `body` property, which you can loop through with a `for await ... of` loop. On each iteration, simply update your progress.
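Roughly, a minimal sketch of that approach (the function name is made up, and it assumes a runtime where `ReadableStream` is async-iterable; older browsers need `response.body.getReader()` and a manual `read()` loop instead):

```ts
async function downloadWithProgress(
  url: string,
  onProgress: (loaded: number, total: number) => void,
): Promise<Uint8Array> {
  const response = await fetch(url);
  if (!response.ok || !response.body) throw new Error(`HTTP ${response.status}`);
  // total is 0 when the server doesn't send a Content-Length header
  const total = Number(response.headers.get("Content-Length") ?? 0);
  const chunks: Uint8Array[] = [];
  let loaded = 0;
  for await (const chunk of response.body) {  // each chunk is a Uint8Array
    chunks.push(chunk);
    loaded += chunk.byteLength;
    onProgress(loaded, total);
  }
  // Stitch the chunks back into a single buffer
  const result = new Uint8Array(loaded);
  let offset = 0;
  for (const c of chunks) {
    result.set(c, offset);
    offset += c.byteLength;
  }
  return result;
}
```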
Yep, that's a cute hack. Now what would be the equivalent of that for file upload? There's no such API for fetch() because there's no writable body stream for requests. QED
This is a hack that works only for downloads via streams. It won’t work e.g. for file uploads (unless you read the file into memory first, which is highly undesirable).
fetch is very bleh, so that's not a high bar. What I'd love to see is more Observable-based offerings in that space... RxJS has some rudimentary client, but seeing that pushed further would be great.
You should look at `for await ... of`[1]. You can get a streaming reader from a `fetch` response and loop over it with an async iterator. If you prefer the functional syntax of observables, I'm sure you can find a library that wraps async iterators as observables.
It's not to get a stream so much as it is to get automatic aborts in mergeMaps and such. Amazing for things like auto-complete from an API as you type.
Bootstrap is going the way of table layouts. Table layouts were great too but times change. Right now we have CSS Grid and it works in all browsers. I am sure they will bring this goodness to Bootstrap 5 but there is no need for the new layout engine that works great in all useful browsers to have a framework wrapper.
Similarly with the jQuery, it was once useful in a table layout kind of way but nowadays you might as well just write ES6 javascript rather than learn how to do things the special jQuery way. Frameworks are dead in a similar way to how the internal combustion engine is.
Using the intrinsic properties of the browser is where it is at in the same way that electric is where it is at in the automotive world. The fact that everyone is stuck with their excuses for frameworks (and ICE engines) does not change matters. Hence I read this news with all the enthusiasm of a Tesla owner reading about Peugeot developing a more efficient diesel engine.
Too bad HTML without styling still looks terrible, and differently terrible in every browser. My company uses Bootstrap heavily and we've never used the grid. Bootstrap gives you a great basis on which to build an interface that has all the little details taken care of, and it's very easy to skin. I've worked on projects where people made their own UI component kits to be cool, and they ended up having a lot of holes and incomplete parts. So much work could have been saved by skinning Bootstrap.
Note: you can substitute any well-done component framework for Bootstrap in the above statement.
Well I just checked to see if I missed anything with it and I didn't. The utilities and components are the same as I remember them being.
I was the biggest fan of Bootstrap ever about five or more years ago, a total evangelist. But it seems to be just another flavour of div soup really. I have moved on.
Even things like cool scrollspy things aren't worth the bloat or the learning requirement. I also prefer to learn real things rather than fake framework things. Now the tools are needed in modern browsers you might as well do it properly and learn something useful rather than follow some convenient hacks. It is quicker.