Hacker News | entelechy's comments

Love it! We did something similar using strace/dtruss back in 2018 with https://buildinfer.loopperfect.com/ and were generating graphs and BUCK files on the back of that.

Whilst we regrettably never got around to packaging it as a proper product, we found it immensely valuable in our consulting work for pinpointing issues and aiding conversions to BUCK/Bazel. We used graphviz, https://perfetto.dev/ and a couple of other tools to visualise things.

Recently we circled back to this too, but with a broader use case in mind.

There are some inherent technical challenges with this approach & domain:

- syscall logs can get huge, especially when saved to disk. Our strace logs would exceed 100GB for some projects (llvm was around ~50GB)

- some projects also use HTTPS and inter-process communication, and that needs to be properly handled too. (We even had a customer that was retrieving code from a Firebird database via Perl as part of the compilation step!)

- It's runtime analysis - you might need to repeat the analysis for each configuration.


Curious, what were you using for syscall logging? LD_PRELOAD tricks, or eBPF filtering?


Mostly strace and its macOS equivalent; later we moved to ptrace and eBPF. LD_PRELOAD unfortunately doesn't work for a statically linked libc. There are also kernel probes, but we didn't like that they required root permissions...
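For illustration, here is a minimal sketch of the post-processing step: turning an strace log into Graphviz edges from processes to the files they open. It assumes output like `strace -f -e trace=openat -o build.log make`, where `-f` prefixes each line with a pid; the function names are made up.

```javascript
// Parse strace lines of the form:  123 openat(AT_FDCWD, "main.c", O_RDONLY) = 3
// into { pid, file } edges.
function parseStraceLog(text) {
  const edges = [];
  const lineRe = /^(\d+)\s+openat\(.*?"([^"]+)"/;
  for (const line of text.split("\n")) {
    const m = line.match(lineRe);
    if (m) edges.push({ pid: m[1], file: m[2] });
  }
  return edges;
}

// Emit a Graphviz digraph, one edge per observed open.
function toDot(edges) {
  const body = edges
    .map((e) => `  "pid ${e.pid}" -> "${e.file}";`)
    .join("\n");
  return `digraph deps {\n${body}\n}`;
}
```

A real pipeline would also follow execve/fork to attribute opens to compiler invocations, and deduplicate edges, but the shape of the problem is essentially this.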


Great work! This allows performing arbitrary computation on untrusted devices.

However, last time I checked, computation in this scheme was ridiculously slow: on modern machines, cutting-edge implementations of FHE manage around 100 integer operations per second.

Nevertheless, there have been some brave startups trying to commercialise this technology:

https://venturebeat.com/2020/02/18/enveil-raises-10-million-...

Other interesting things build on top of FHE:

An SQL database where data and queries are fully encrypted: https://github.com/zerodb/zerodb

A fully encrypted Brainfuck VM: https://github.com/f-prime/arcanevm


There are two main approaches to FHE: homomorphic boolean circuits and homomorphic numerical processing.

In the former (e.g. Cingulata), you convert a program into a boolean circuit and evaluate each gate homomorphically. While this is general purpose, it also means you decompose functions that could be done in one instruction into multiple binary operations (so it is very slow). That's usually what people refer to when they say FHE is slow.

The other approach operates directly on encrypted integers or reals, finding ways to do more complex computations (like a square function) in one step. While this is obviously much faster, it is also limited to whatever operations are supported by the scheme. This is what people refer to when they say FHE can only do certain things.

For years, the tradeoff has basically been slow and general purpose, or fast and limited. But there are new schemes being worked on, to be published soon, that go way beyond what's currently done, such as efficient deep learning over encrypted data and other complex numerical processing.

Lots is coming out of labs and will be on the market within 2 years!


Ohh interesting! Are there any open-source implementations of homomorphic numerical processing?

Were there any efforts to combine both approaches?


Note that most useful homomorphic numerical encryption schemes are easily breakable. Once you have equality you can usually de-anonymize user data. Many companies have been burnt by this.

With a less than operator and the ability to encrypt chosen plaintext values, you can decrypt arbitrary messages in a linear (in message size) number of steps.
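As a toy illustration of that attack (everything here is a stand-in, not a real scheme): model the leak as a less-than oracle over ciphertexts, give the attacker chosen-plaintext encryption, and binary-search the secret. Recovering a b-bit value takes about b oracle calls, i.e. linear in message size.

```javascript
// Toy "scheme" whose ciphertexts leak order. The key plays no role
// in the attack; only the comparison oracle and enc() matter.
const key = 0x9e3779b9;
const enc = (m) => ({ m, tag: m ^ key });
const lessThan = (c1, c2) => c1.m < c2.m; // the oracle the scheme exposes

// Binary search: maintain lo <= secret < hi, halving each step.
function recover(secretCt, bits) {
  let lo = 0, hi = 2 ** bits, calls = 0;
  while (hi - lo > 1) {
    const mid = Math.floor((lo + hi) / 2);
    calls++;
    if (lessThan(secretCt, enc(mid))) hi = mid;
    else lo = mid;
  }
  return { value: lo, calls };
}
```

A 16-bit secret falls in 16 oracle calls; a 64-bit one in 64. This is why order-revealing schemes are so fragile in practice.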

Arithmetic operations can often be used to build gadgets that bootstrap comparison operations. For instance, with addition and equality you can implement a comparison operation for low-medium cardinality fields.
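A sketch of one such gadget, with placeholder `enc`/`add`/`equal` standing in for whatever the scheme actually provides: over a field of cardinality D, "x < y" can be tested as "does x + k equal y for some k in 1..D-1". For low-cardinality fields (dates, ages, enum codes) this is cheap.

```javascript
// Placeholder homomorphic primitives: addition and an equality oracle.
const enc = (m) => ({ m });
const add = (c1, c2) => ({ m: c1.m + c2.m });
const equal = (c1, c2) => c1.m === c2.m;

// Bootstrap a comparison from addition + equality alone.
function lessThan(cx, cy, cardinality) {
  for (let k = 1; k < cardinality; k++) {
    if (equal(add(cx, enc(k)), cy)) return true;
  }
  return false;
}
```

With the comparison in hand, the binary-search attack from the previous comment applies, which is how "just addition and equality" quietly becomes full plaintext recovery.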

The field is littered with negative results that are being sold as secure, practical systems. Be careful when using them on important data.


All FHE schemes today add tiny random noise to the ciphertext so that encrypting the same data twice gives different results. The noise is then kept at a nominal level as you compute homomorphically, using a special operation called bootstrapping. When you decrypt, you just ignore the noise and get your result. If you do that well and don't let the noise grow too big, you get very strong security.

Fwiw, bootstrapping is actually what makes FHE slow, not the actual additions/multiplications etc.


>and the ability to encrypt chosen plaintext values

Isn't this a big assumption? The way I envision it is

1. client encrypts data with their key

2. server computes on data without decrypting and without needing the key

3. client decrypts computation output with their key.

Or is it always required at step 2 that the server also has the key needed for encryption (but not decryption obviously)?


> Isn't this a big assumption?

The standard resilience criterion for modern multi-purpose encryption is that your scheme should be resistant to adaptive chosen-ciphertext attacks. Chosen plaintext is a much weaker attack (the hierarchy being: known plaintext < chosen plaintext < chosen ciphertext < adaptive chosen ciphertext).

It may be OK for some situations, but it requires you to be much more cautious than with regular crypto (which is already error-prone…).


The server doesn't need the decryption key, ever. That's the whole point, in fact. FHE is end-to-end encryption for compute. However, there is sometimes a public key used, called an evaluation key.


Is there a good standards body to refer to for progress on these alternate encryption methods?


You can refer to https://homomorphicencryption.org for more details about security, libraries and standardization efforts, although those are still in early stages.


FHE schemes don’t allow comparing equality of plaintexts without decryption


The research paper https://eprint.iacr.org/2018/758 introduces a unified view of HE plaintext spaces. It allows switching between integral, approximate-number and bit-level representations (different HE schemes). But I'm not aware of any HE library implementing this. For the IDASH 2018 competition we used two different libraries (TFHE and CKKS) in the same computation, although the scheme-switching procedure was done manually.


Yes, it turns out you can convert ciphertexts from one scheme to another, so you can go back and forth between them depending on what type of computation you are trying to do. However, the cost of transciphering is high, so in practice it doesn't work well. But give it a few years and it'll work!


HEAAN and HEMat are two libraries for numerical processing that you can find on GitHub. They're not perfect, and require work to get into shape for real distributed computation.


I suggest looking at TFHE and SEAL for starters


Where did you get the "100 integer operations per second"?

I thought the speed was in the order of minutes for a single operation.


Indeed I didn't remember it correctly:

The TFHE page (https://tfhe.github.io/tfhe/) states:

> Each binary gate takes about 13 milliseconds single-core time to evaluate, which improves [DM15] by a factor 53, and the mux gate takes about 26 CPU-ms

Addition of two bits (plus carry) can be implemented using 5 binary gates (a full adder), so adding two 32-bit numbers takes about 160 gates × 13 ms ≈ 2 s, i.e. roughly one addition every two seconds.
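As a sanity check of the gate count, here is a ripple-carry adder built only from 2-input gates, the way a boolean-circuit FHE scheme would evaluate it; each gate call would cost ~13 ms homomorphically, so the counter gives the single-core cost directly. This is a plaintext simulation, not an FHE library.

```javascript
// Count every 2-input gate evaluation.
let gates = 0;
const XOR = (a, b) => (gates++, a ^ b);
const AND = (a, b) => (gates++, a & b);
const OR = (a, b) => (gates++, a | b);

// 32-bit ripple-carry addition: 5 gates per full adder
// (2x XOR, 2x AND, 1x OR), so 160 gates total.
function add32(x, y) {
  let carry = 0, sum = 0;
  for (let i = 0; i < 32; i++) {
    const a = (x >>> i) & 1, b = (y >>> i) & 1;
    const t = XOR(a, b);
    const s = XOR(t, carry);
    carry = OR(AND(a, b), AND(t, carry));
    sum |= s << i;
  }
  return sum >>> 0; // wrap like 32-bit hardware
}
```

At 160 gates × 13 ms/gate, one 32-bit addition comes out to roughly 2 seconds of single-core time.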

EDIT: Shame I cannot edit my original post


This would mean that ~2500 CPUs could do over a thousand operations per second, which is amusingly close to the same compute-per-square-foot as the ENIAC.


doesn't work for me: "IP could not be resolved"


The story behind the fake company: https://www.youtube.com/watch?v=fTHg-tGvlJ8


Literally my best use of time this week.

Also, a great crash-course in how to start a company, to an extent.


No, but this will be covered by Buckaroo [1]

[1] https://github.com/LoopPerfect/buckaroo/issues/314


The C++ community started a comparison of various build systems a while ago: https://docs.google.com/document/d/1y5ZD8ETyGtxCmtT9dIMDTnWw...


Overall, Buck and Bazel are quite similar, as they are both converging on the Starlark DSL.

However, there are still some differences: Buck is much more opinionated than Bazel. Buck models C++ projects slightly better [1], and currently its remote cache is more efficient [2].

Bazel is more extensible and also offers remote execution. Bazel has a bigger community, and its roadmap is public.

More than 350 C++ libraries [4] have been ported to Buck for the Buckaroo package manager [3].

There are also technical details that manifest in some odd ways but are not significant. There is a nice paper by Simon Peyton Jones (creator of Haskell) and others that goes into the design details [5]

[1] https://github.com/bazelbuild/bazel/issues/7568

[2] https://github.com/bazelbuild/bazel/issues/7664

[3] https://github.com/LoopPerfect/buckaroo

[4] https://github.com/buckaroo-pm

[5] https://www.microsoft.com/en-us/research/uploads/prod/2018/0...

Happy to go more into detail if desired


We think Buck is great. Its deterministic, hermetic builds and its composable, declarative high-level build description language made packaging very easy. We even built a package manager for Buck: https://github.com/LoopPerfect/buckaroo

Currently we market it for C++, but it can be used for any language that Buck supports.


What’s your take on Bazel? Have you considered offering support in Bazel for your package manager?


> What's your take on Bazel?

Here are a couple of key points:

- Buck and Bazel are very similar.

- Buck currently models C++ projects better [1].

- Buck currently leverages remote caches better than Bazel [3]

- Bazel is very easy to install

- Bazel's "toolchains" make it very easy to onboard newcomers (to any project and language) while also ensuring the build will run as expected.

- Bazel is less opinionated and more extensible than Buck.

In fact, Bazel is so powerful that you can have build files that download a package manager and use it to resolve more dependencies. This is great for getting things off the ground, but it makes things less composable because the package manager won't see the whole dependency graph. As a result you might get version conflicts somewhere down the line.

To summarize: I think a very opinionated build system is easier to reason about and usually scales better.

Communities with very opinionated packaging and build systems are proving this by having orders of magnitude more packages than e.g. the highly fragmented C++ community, where configuration is preferred over convention.

> Have you considered offering support in Bazel for your package manager?

Yes, we did. As soon as this feature [1] is implemented, we will have a 1:1 mapping between C++ Buck projects and Bazel. Then, after a small (automated) refactoring of our Buckaroo packages, you should be able to build any package from the Buckaroo ecosystem with either Buck or Bazel.

Btw, the cppslack community is attempting to create a feature matrix of various build systems here [2]

[1] https://github.com/bazelbuild/bazel/issues/7568

[2] https://docs.google.com/document/d/1y5ZD8ETyGtxCmtT9dIMDTnWw...

[3] https://github.com/bazelbuild/bazel/issues/7664


I'd say the main challenge that both Bazel and Buck face in C++ land is that it's still too much work to migrate an existing cmake/autotools project to them. One nice project that tries to tackle that is https://github.com/bazelbuild/rules_foreign_cc. But yes, there are minor things that make migration from Bazel to Buck or vice versa not as smooth as it could be, and I'm confident those will be fixed.

To me the biggest added value of Bazel is the remote (build and test) execution (which will get a nice performance boost from https://github.com/bazelbuild/bazel/issues/6862 in Bazel 0.25; also mentioned in [3]).

(Your [3] doesn't compare Bazel and Buck, only Bazel with remote caching and without it, so it's not clear from it that Buck leverages caches better than Bazel).

And one nit: Bazel doesn't allow you to download anything in the loading, analysis, or execution phases ("the BUILD files"); those are completely hermetic, sandboxed, and reproducible (when the compilers are). The package-manager integrations happen when resolving external repositories ("the WORKSPACE file"), where non-hermetic behavior is allowed and is used e.g. to autoconfigure the C++ toolchain or download npm packages.


This is amazing!

As one of the creators of a distributed package manager for C++ and friends [1], we made a funny discovery:

Many C libraries that a big chunk of the ecosystem depends on have not been updated in many years. Some of them can only be downloaded from SourceForge or an FTP server.

Even worse, some libraries are copied and pasted from project to project and have no actual home.

We uploaded them to github and started maintaining them.

If you know of any abandoned C/C++ projects, or C/C++ projects that need a hand with maintenance, we are happy to help.

[1] https://github.com/loopperfect/buckaroo


If none of the big tech companies will step up to adopt openssl, we should put it down, like in a real pet shelter.


That whole Heartbleed incident was a blessing for OpenSSL. It's now quite active: https://github.com/openssl/openssl/commits/master


Would you guys like to become Code Shelter maintainers?


I already signed up. Let's see how it goes...


Fantastic! Did you send us an email as well? It's generally unlikely we'll see the application if you didn't (although I'll look now).


That feels like a broken process. How will you mitigate that?


It's by design, as a test that maintainers have read the instructions and told us a bit about themselves and what they like, as well as just a way of getting to know them. It generally works well, I think, except in this specific case where the meeting happened out-of-band.


Alright, I've added you, thanks!


This is great! This will make Bootstrap more attractive and competitive as a framework.

jQuery was one of the most important frameworks in JavaScript history. It enabled us to build real webapps. However, since then the differences between browsers have shrunk significantly, and we have learned how to build maintainable and scalable apps in a more declarative fashion (hello React, Angular and friends).


I am still more fond of jQuery's ajax functionality than anything that followed, including the fetch API.


I've always liked using axios, which provides a jQuery-ish AJAX library. It works in the browser and server-side with NodeJS.

https://github.com/axios/axios

Much nicer to use than fetch IMHO


This.


Why? fetch returns a native promise, which are _much much much_ nicer to work with than XHR ever was or desired to be.

Good riddance.


Straightforward error handling both for actual http numbered errors, and for transport errors (site down/DNS/etc).
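For context: fetch() resolves its promise even for HTTP 4xx/5xx responses and only rejects on transport failures, so getting jQuery-style unified error handling takes a small wrapper around the standard `Response.ok` flag. A sketch (the `fetchOrThrow` name is made up):

```javascript
// Fold HTTP-level errors into the same rejection path as
// network/DNS errors, so callers have one catch for both.
async function fetchOrThrow(input, init) {
  const response = await fetch(input, init);
  if (!response.ok) {
    // response.url is empty for synthetic Responses; fall back to input.
    throw new Error(`HTTP ${response.status} for ${response.url || input}`);
  }
  return response;
}
```

With this, `fetchOrThrow(url).catch(handle)` sees both a 500 and a DNS failure, which is roughly what jQuery's error callback gave you for free.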


For one, fetch() doesn’t support progress reporting.


It does. The fetch response object exposes a `ReadableStream` through the `body` property, which you can read chunk by chunk (with `response.body.getReader()`, or a `for await ... of` loop where async iteration of streams is supported). On each chunk, simply update your progress.
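A minimal sketch of that chunk-by-chunk approach, using the widely supported `getReader()` loop (async iteration over `response.body` is newer and not available everywhere); the function name is made up:

```javascript
// Read a fetch Response incrementally, reporting bytes received.
// `total` may be 0 if the server sends no Content-Length header.
async function downloadWithProgress(response, onProgress) {
  const total = Number(response.headers.get("content-length")) || 0;
  const reader = response.body.getReader();
  const chunks = [];
  let received = 0;
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(value);
    received += value.length;
    onProgress(received, total); // e.g. update a progress bar here
  }
  return chunks;
}
```

The chunks can then be stitched into a Blob or decoded as they arrive, which is exactly what a progress-aware downloader needs.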



Yep, that's a cute hack. Now what would be the equivalent of that for file upload? There's no such API for fetch(), because there's no writable body stream for requests. QED


I'm not particularly familiar with fetch, but does this do what you want? https://github.com/SitePen/javascript-streams-blog-examples/...


This is a hack that works only for downloads, via streams. It won't work e.g. for file uploads (unless you read the file into memory first, which is highly undesirable).


The missing .abort() on Safari is one reason. https://caniuse.com/#feat=abortcontroller



jQuery also returns a promise


I'd say that modern web patterns exist only because of jQuery. jQuery tamed the DOM and did for JS what LINQ did for C#: substitute the DOM for the datastore.

JavaScript itself has subsumed most of jQuery, and replacements like lodash would not have existed without it.

Browsers will catch up badly.


fetch is very bleh, so that's not a high bar. What I'd love to see is more Observable-based offerings in that space... RxJS has some rudimentary client, but seeing that pushed further would be great.


You should look at `for await ... of` [1]. You can get a streaming reader from a `fetch` response and loop through it with an async iterator. If you prefer the functional syntax of observables, I'm sure you can find a library that wraps async iterators as observables.

[1]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...


It's not to get a stream so much as it is to get automatic aborts in mergeMaps and such. Amazing for things like auto-complete from an API as you type.


Whoosh!

Bootstrap is going the way of table layouts. Table layouts were great too, but times change. Right now we have CSS Grid, and it works in all browsers. I am sure they will bring this goodness to Bootstrap 5, but there is no need for a framework wrapper around a layout engine that already works great in all useful browsers.

Similarly with jQuery: it was once useful, in a table-layout kind of way, but nowadays you might as well just write ES6 JavaScript rather than learn how to do things the special jQuery way. Frameworks are dead in a similar way to how the internal combustion engine is.

Using the intrinsic capabilities of the browser is where it's at, in the same way that electric is where it's at in the automotive world. The fact that everyone is stuck with their excuses for frameworks (and ICE engines) does not change matters. Hence I read this news with all the enthusiasm of a Tesla owner reading about Peugeot developing a more efficient diesel engine.


Too bad HTML without styling still looks terrible, and differently terrible in every browser. My company uses Bootstrap heavily and we've never used the grid. Bootstrap gives you a great basis on which to build an interface that has all the little details taken care of, and it's very easy to skin. I've worked on projects where people made their own UI component kits to be cool, and they ended up with a lot of holes and incomplete parts. So much work could have been saved by skinning Bootstrap.

Note: you can substitute any well-done component framework for Bootstrap in the above statement.


>Bootstrap is going the way of table layouts. Table layouts were great too but times change. Right now we have CSS Grid and it works in all browsers.

Which is irrelevant, as Bootstrap is much more than a grid system.

Not to mention it's not meant for the types who mess with their own custom grid layouts anyway.


Well, I just checked to see if I had missed anything with it, and I hadn't. The utilities and components are the same as I remember them.

I was the biggest fan of Bootstrap ever about five or more years ago, a total evangelist. But it seems to be just another flavour of div soup really. I have moved on.

Even things like the cool scrollspy aren't worth the bloat or the learning requirement. I also prefer to learn real things rather than fake framework things. Now that the tools are available in modern browsers, you might as well do it properly and learn something useful rather than follow some convenient hacks. It is quicker.


>Right now we have CSS Grid and it works in all browsers

Global support seems to be at around 85%-87% [1], so "all browsers" is misleading.

[1] https://caniuse.com/#feat=css-grid

