Bayes7's comments

Bayes7 · 2024-10-19T06:22:27 1729318947

Is there any good article/paper that describes how it actually works or is implemented not just in high-level and hand-waving terms?

unsnap_biceps · 2024-10-19T06:29:37 1729319377

https://hackerfactor.com/blog/index.php?/archives/931-PhotoD... is linked from https://anishathalye.com/inverting-photodna/ and is likely a good starting point

Springtime · 2024-10-19T07:57:16 1729324636

Yeah Hacker Factor's multi-post critiques are where I first saw it analyzed. For reference they run the popular fotoforensics.com image analysis site.

They also have scathing critique (eg [1]) about the Adobe-led C2PA digital provenance signing, having themselves been part of various groups that seek solutions to the provenance problem.

[1] https://www.hackerfactor.com/blog/index.php?/archives/1013-C...

Bayes7 · 2024-10-19T07:17:43 1729322263

thanks!

bawolff · 2024-10-19T07:41:21 1729323681

There tends to be more information under the search term "perceptual hashing"

majke · 2024-10-19T07:34:01 1729323241

The secrecy of the inner tech is intentional.

Bayes7 · 2024-08-17T20:04:27 1723925067

great project!

Bayes7 · 2024-07-19T06:08:48 1721369328

Check out sioyek, it’s great and can open epubs like normal pdfs:

https://sioyek.info/ https://github.com/ahrm/sioyek

flobosg · 2024-07-19T07:58:22 1721375902

Sioyek uses the MuPDF engine, which supports EPUB: https://mupdf.com/

Ringz · 2024-07-19T08:45:11 1721378711

I stumbled over it but didn’t installed it because the website doesn’t mention the ebook functionality.

“Sioyek is a PDF viewer with a focus on technical books and research papers”

zeke_the_cat · 2024-07-19T09:51:10 1721382670

Nightmarish with a tablet.

Bayes7 · 2024-07-04T14:46:43 1720104403

This was a great read, thanks a lot! One a side note, any one has a good guess what tool/software they used to create the visualisations for matrix multiplications or memory outline?

salykova · 2024-07-04T17:11:28 1720113088

excalidraw <3

Bayes7 · on Jan 18, 2024

Summarised by https://xkcd.com/2494

Bayes7 · on Nov 28, 2023

"[...] modern neural network (NN) architectures have complex designs with many components [...]"

I find the Transformer architecture actually very simple compared to previous models like LSTMs or other recurrent models. You could argue that their vision counterparts like ViT are conceptually maybe even simpler than ConvNets?

Also, can someone explain why they are so keen to remove the skip connections? At least when it comes to coding, nothing is simpler than adding a skip connection and computationally the effect should be marginal?

SuchAnonMuchWow · on Nov 28, 2023

Skip connection increase the live range of one intermediate result across the whole part of the network skiped: the tensor at the beginning of a skip connection must be stored in memory for longer while unrelated computation happen: it increase the pressure on the memory hierarchy (either the L2, or scratchpad memory).

This is especially true for example for inference for vision transformers, where it decrease the batch size you can use before hitting the L2 capacity wall.

Bayes7 · on Nov 28, 2023

Okay, I see that for inference. But for training it shouldn't matter because I need to hold on to all my activations for my backwards pass anyways? But yeah, fair point!

jksk61 · on Nov 28, 2023

also removing skip connections leads to a rougher loss landscape, hence it should be harder to find the optimal weights.

sdenton4 · on Nov 28, 2023

Yes there's very good theoretical reasons for skip connections. If your initial matrix M is noise centered at 0, then 1+M is a noisy identity operation, while 0+M is a noisy deletion... It's better to do nothing if you don't know what to do, and avoid destroying information.

I appreciate the sibling comment perspective that memory pressure is a problem, but that can be mediated by using fewer/longer skip connections across blocks of layers.

Bayes7 · on March 12, 2023

hey, off topic but can you explain or link a post which explains what the benefits of the alias -> function definition are over just defining the function directly? Thanks!

prmoustache · on March 12, 2023

I am puzzled as well, why not define the function and call it wiki?

capitainenemo · on March 12, 2023

I'm just used to keeping all my aliases together for easy location with alias command and I like having ones with arguments with the others.

Aside from that, no benefit really that I can think of.

So yes, to be clear to anyone else you can just put:

  wiki(){ w3m -F "https://en.wikipedia.org/w/index.php?search=${1}&title=Special:Search&ns0=1"; }

in your .bashrc - and if you're me you just forget how you defined it and you have to cat it every once in a while :)

Oh, and you might be entertained by this silly alias adapted from an IOCC entry...

  $ cstdin
  printf("Hello World\n");
  Hello World

  alias cstdin='gcc -pedantic -ansi -Wall -o ~/.stdin ~/.stdin.c -lm && ~/.stdin'


  $ cat ~/.stdin.cc
  #include <stdio.h>
  <snip many headers>
  int main(int argc, char **argv)
  {
  #include </dev/tty>
  return 0;
  }

I'm sure it would also make way more sense as a dedicated script. I have a C++ one in there too.

Bayes7 · on March 13, 2023

ah I see, cool thanks!

Bayes7 · on Feb 2, 2023

Very cool! However, I often feel like the process of generating the question/answer necessary for the Anki card is an important part of the learning process because it forces you do deeply think about the material and reflect on it. So I think there is a trade-off involved

KVFinn · on Feb 2, 2023

Absolutely you lose a lot of effectiveness when you automate card creation. But it's still better than thinking, "Oh, I should make a card about this so I don't forget this important thing..." and never doing it.

For me the biggest risk is being tempted to making way too many cards (because it's so fast and easy now!), end up with way too many reviews, and declaring Anki bankruptcy and uninstalling the app after a few months. I may done this more than once...

rjh29 · on Feb 2, 2023

imo if you don't make the card then either: the word isn't important, or it comes up again, and you finally do make the card. It's not a big deal.

I'm about 10 years on my Anki deck and have reached the same conclusion as you about keeping deck size down. A single card has a huge time investment if you add up the reviews. Even more if it's a bad card and you fail it a lot. As the words you learn get more and more niche, it's important to weigh up whether a card is worth making or keeping. I actively delete cards that make me feel 'meh' when I see them, or that I fail a lot, so I don't lose motivation.

barking_biscuit · on Feb 2, 2023

These days I have a policy to just suspend once the interval hits six months, so that deck size has a max cap and can eventually go to zero if adds stop for a long enough period. Long enough to bootstrap niche words and hopefully maintain them through reading.

Bayes7 · on Jan 31, 2023

Robert Sapolsky's Human Behavioural Biology [0] It's like a (good) netflix series you can binge watch!

[0] - https://www.youtube.com/watch?v=NNnIGh9g6fA&list=PL150326949...