
this is so cool. hats off to the creators. I bet it's a ton of logistics to manage. I surfed around the site but couldn't find any information re: volunteering with them? I wonder if they need them and what kind of commitment it would be? I'd be interested!

They mostly seem to be "by and for high schoolers". This is what I was told when I was asking how I could help with my son setting something up.

Hack Club is run by adults (you can see the team page here: https://hackclub.com/team), and every aspect of Hack Club is also built by teenagers contributing to everything from HCB (our fiscal sponsorship platform) to High Seas (https://highseas.hackclub.com), a program we're running right now.

Happy to help if you still need help setting something up for your son :)


someone probably googled “kayak” from your wifi?

Hmmm, I live alone and in a pretty remote area (on an island). I keep my wifi pretty secure, so, that seems pretty unlikely. But who knows, anything is possible I suppose. Maybe because I live on an island.

>Hmmm, I live alone and in a pretty remote area (on an island).

A kayak seems like a practical purchase if you live on an island.


You live on an island, see ads for kayaks and the first thing you went with was: "somebody is listening to all my conversations even though my phone battery wouldn't be able to handle a slightly long phone call"?

That's what people mean when they say there are a lot of data points that can explain things very easily without having to resort to a convoluted explanation.


Yeah, if ad targeting networks know you live on an island (easily done, location data is the most common type they get their hands on) it's pretty sensible for them to try and sell you a kayak.

Or they googled fitness or outdoors and live near water

Edit: yep


Weird it just popped up. Probably they cranked up a new algo that said, if they live near water, hit 'em with kayak ads. Why not paddle boards or boats too? Whatever.

What has the highest margin and therefore the most ad spend? Probably fancy folding kayaks. Kayaks are also popular with fishing in a way paddle boards aren’t. And either that person or his neighbors or friends are into fishing and Google knows this.

why would you call it expecting it to do nothing?


why is the wrong question, because it doesn't matter


of course it matters. not all workflows are worth preserving at all costs. there could be a really important trade off lurking, I'm not familiar, which is why I asked!

https://xkcd.com/1172/


I enjoyed reading the feedback, and their response, and felt no negativity whatsoever. Keep interacting honestly, it's good for everyone!


Mostly I agree with you but sharing URLs to resources is vastly better on the web. So is distributing updates.


How would we train it? Don't we need it to understand the heaps and heaps of data we already have "tokenized" e.g. the internet? Written words for humans? Genuinely curious how we could approach it differently?


Couldn't we just make every human readable character a token?

OpenAI's tokenizer makes "chess" "ch" and "ess". We could just make it into "c" "h" "e" "s" "s"
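For what it's worth, here is a minimal sketch of the "every character is a token" idea, in Python. The function names are made up for illustration, and the vocabulary is built from the input itself rather than from any real tokenizer:

```python
def char_tokenize(text: str) -> list[str]:
    """Split text into one token per character (hypothetical helper)."""
    return list(text)


def build_vocab(corpus: str) -> dict[str, int]:
    """Map each distinct character to an integer id, in sorted order."""
    return {ch: i for i, ch in enumerate(sorted(set(corpus)))}


tokens = char_tokenize("chess")
vocab = build_vocab("chess")          # only 4 distinct characters: c, e, h, s
ids = [vocab[ch] for ch in tokens]    # the sequence the model would actually see

print(tokens)  # ['c', 'h', 'e', 's', 's']
print(ids)     # [0, 2, 1, 3, 3]
```

The vocabulary is tiny, but the sequence has five entries where a subword split like "ch" + "ess" would have two.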


We can, tokenization is literally just to maximize resources and provide as much "space" as possible in the context window.

There is no advantage to tokenization, it just helps solve limitations in context windows and training.


I like this explanation


This is just more tokens? And probably requires the model to learn about common groups. Consider, "ess" makes sense to see as a group. "Wss" does not.

That is, the groups are encoding something the model doesn't have to learn.

This is not much astray from "sight words" we teach kids.


No, actually much fewer tokens. 256 tokens cover all bytes. See the ByT5 paper: https://arxiv.org/abs/2105.13626
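A rough sketch of what a byte-level vocabulary looks like in Python, assuming plain UTF-8 bytes as token ids and ignoring any special tokens a real model would add on top. The helper names are hypothetical:

```python
VOCAB_SIZE = 256  # one id per possible byte value


def byte_tokenize(text: str) -> list[int]:
    """Encode text as UTF-8 and use each byte value as a token id."""
    return list(text.encode("utf-8"))


def byte_detokenize(ids: list[int]) -> str:
    """Turn a list of byte-valued token ids back into text."""
    return bytes(ids).decode("utf-8")


print(byte_tokenize("chess"))   # [99, 104, 101, 115, 115]
print(byte_tokenize("é"))       # [195, 169], one character but two tokens
print(byte_detokenize(byte_tokenize("chess")))  # chess
```

So the vocabulary is small and covers any input, while non-ASCII characters cost more than one token each.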


More tokens to a sequence, though. And since it is learning sequences...


Yeah, suddenly 16k tokens is just 16kb of ASCII instead of ~6kwords


>This is just more tokens?

Yup. Just let the actual ML git gud


So, put differently, this is just more expensive?


Expensive computationally, expensive in time, and yes, expensive in cost.

Worth noting that the cost probably grows quadratically or cubically (or as some other polynomial) with the number of characters per token. So the difference in computational cost is probably huge once every character is its own token.
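To put a number on the quadratic case: a back-of-the-envelope sketch in Python, assuming standard full self-attention (every position attends to every other position) and an illustrative average of 4 characters per token. Both are assumptions for the example, not measurements:

```python
def attention_pairs(seq_len: int) -> int:
    """Number of query-key pairs full self-attention scores for one sequence."""
    return seq_len * seq_len


def relative_attention_cost(chars_per_token: int) -> int:
    """How many times more pairs a character-level sequence needs than the
    same text split into subword tokens (assumed average, hypothetical)."""
    subword_len = 1_000                      # arbitrary reference length
    char_len = subword_len * chars_per_token
    return attention_pairs(char_len) // attention_pairs(subword_len)


print(relative_attention_cost(4))  # 16
```

Under those assumptions a 4x longer sequence means 16x the attention work. The parts of the model that scale linearly with length would only grow 4x.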


aka Character Language Models which have existed for a while now.


That's not what tokenized means here. Parent is asking to provide the model with separate characters rather than tokens, i.e. groups of characters.


"it seems many Rust and Golang positions go unfilled."

I was under the impression that everyone wanted a job writing Rust, but had to settle for making React components. Point me to these unfulfilled Go jobs please. :)


Both might be true. I've looked at what is out there for Rust and Go and it seems a lot of the activity is in foreign countries with jurisdictional (and maybe even cultural) barriers to working there. They can't fill the jobs and we can't fill them.


I am curious as well as someone exploring switching back to an IC role.


Desktop computers are amazing these days.


well said!

as I was just sitting down to another day of ruby on rails (that I am grateful for!) I was thinking.. I wonder what hobby/open source projects could use some of my attention later..

.. what projects my attention could use later .. :D


you might like https://janet-lang.org/ !

