More

jkachmar · 2024-12-11T20:07:30 1733947650

have you experienced particularly slow pushes with large repositories at all, and if so were you able to resolve them?

I did some profiling & it looks like the issue lies with `libgit2`, but I haven’t been able to replicate the issue outside of that work codebase[0].

[0]: https://github.com/martinvonz/jj/issues/1841#issuecomment-23...

neongreen · 2024-12-17T00:56:56 1734397016

No slow pushes but I’ll be on the lookout for them.

jkachmar · 2024-10-02T18:36:59 1727894219

reposting my comment from another time this discussion came up:

"Cosmopolitan has basically always felt like the interesting sort of technical loophole that makes for a fun blog post which is almost guaranteed to make it to the front page of HN (or similar) purely based in ingenuity & dedication to the bit.

as a piece of foundational technology, in the way that `libc` necessarily is, it seems primarily useful for fun toys and small personal projects.

with that context, it always feels a little strange to see it presented as a serious alternative to something like `glibc`, `musl`, or `msvcrt`; it’s a very cute hack, but if i were to find it in something i seriously depend on i think i’d be a little taken aback."

sublimefire · 2024-10-03T08:20:34 1727943634

The problem with this take is that it is not grounded in facts that should determine if the thing is good or bad.

Logically having one thing instead of multiple makes sense as it simplifies the distribution and bytes stored/transmitted. I think the issue here is that it is not time tested. I believe multiple people in various places around the globe think about it and will try to test it. With time this will either bubble up or be stale.

wmf · 2024-10-03T02:10:40 1727921440

But what's actually bad about it?

jkachmar · 2024-04-08T13:19:41 1712582381

i don’t support hardware development directly, but i’m a software infrastructure engineer working adjacent to the teams that do so.

can’t comment on specifics, but imo our hardware team punches above its weight class in terms of # of people and time spent in design.

jkachmar · on Feb 19, 2024

i can't comment about sora specifically, however the architecture can support workloads beyond just LLM inference.

our demo booth at trade shows usually has StyleCLIP up at one point or another to provide an abstract example of this.

disclosure: i work on infrastructure at Groq and am generally interested in hardware architecture and compiler design, however i am not a part of either of those teams :)

jkachmar · on Feb 10, 2024

i live in NYC and have traveled to plenty of other international cities.

none of the things that you're saying are true compared to my experiences (or those of my friends) in any way that i can think of as meaningful.

the only city i've been to that feels like it's captured the same "vibe" as NYC, for me, has been Paris.

Tokyo was more impressive in its sprawl and history (and obviously cleanliness), but there is a sense of Japanese monoculture that saturates everything in a way that is almost tactile. not in a bad way, but definitely such that i felt like something was "missing" during my visit.

Singapore gets really close to the same feeling, but for all of its heterogeneity there's an undercurrent of authoritarian sterility that made it very difficult to feel comfortable (Disneyland with the Death Penalty, indeed).

anyway this is already pretty long winded so i should probably stop talking, but NYC has a lot "going for it" besides the rest of the US just sort of being a suburban hellscape. at some point i'll move out, but living here has been a really comforting reminder that international views such as yours of American cities are incorrect.

throw__away7391 · on Feb 11, 2024

I was born in Manhattan and lived in the city for over a decade and still own an apartment downtown. I know a thing or two about the place. It's cool that you get a vibe from being a transplant here for a couple years, that has literally nothing to do with anything I said. The lawlessness is also quite a different experience for women--I am guessing having random guys off the street try to force your door open and follow you into your building or corner you on a subway or follow you around riding a bike aggressively catcalling you is probably not something you are dealing with on a regular basis.

The day I left I moved out over a pool of dried blood from a stabbing in front of my door the night before. I've lived in over 20 countries since then and not experience anything similar except maybe in Canada, which has similar drug problems as the US.

RHSeeger · on Feb 12, 2024

With the caveat that I moved away (due to work) a little under a decade ago... what you describe doesn't match my experience with NYC at all. Maybe back in the 80s, before it was cleaned up... but I was less frequently there back then. Before you said you lived in the city, the message from your first post made me assume you were talking about the city as someone who learned everything they know about it from the news.

clearcoat · on Feb 10, 2024

Visiting another city is not in any way comparable to living there. Or would you defer to the opinion of some tourist who visited NYC for a random weekend?

jkachmar · on Dec 22, 2023

this is running on custom hardware, if you’re curious about the underlying architecture check the publication below.

https://groq.com/wp-content/uploads/2023/05/GroqISCAPaper202...

EDIT: i work at Groq, but i’m commenting in a personal capacity.

happy to answer clarifying questions or forward them along to folks who can :)

m3kw9 · on Dec 23, 2023

Is it fixed to a certain llm architecture like llama2? How does it deal with different architectures like MOE for example

tome · on Dec 23, 2023

It's not fixed and our chip wasn't designed with LLMs in mind. It's a general purpose, low latency, high throughput compute fabric. Our compiler toolchain is also general purpose and can compile arbitrary high performance numerical programs without the need for handwritten kernels. Because of the current importance of ML/AI we're focusing on PyTorch and ONNX models as input, but it really could be anything.

We can also deploy speech models like Whisper, for example, or image generation models. I don't know if we have any MOE architectures, but we'll be implementing Mixtral soon for sure!

cicce19 · on Dec 23, 2023

Will you be selling individual cards? Are you looking for use cases in the healthcare vertical (noticed its not on your current list)? Working in the medical imaging space and could use this tech as part of the offering. Reach out at 16bit.ai

tome · on Dec 23, 2023

You can buy individual cards. For example Bittware is a reseller: https://www.bittware.com/products/groq/

But it might be best if you just contact us to explain your needs: https://groq.com/contact/

I can also pass your details on to our sales team.

m1sta_ · on Dec 23, 2023

How easy is it for companies to setup private local servers using Grow hardware (cost and complexity). I've got money. I want throughout.

tome · on Dec 23, 2023

We've built and deployed racks at a number of organizations. Can you write a message to sales explaining your needs? https://groq.com/contact/

Or if you give me your contact details I can pass them on.

mlazos · on Dec 23, 2023

How many chips are used for this demo? Do they have dram? I remember the earlier versions did not have dram.

Are they also used for training or just inference?

tome · on Dec 23, 2023

I think we use a system with 576 Groq chips for this demo (but I am not certain). There is no DRAM on our chip. We have 220 MB of SRAM per chip, so at 576 chips that would be 126 GB in total.

Graphics processors are still the best for training, but our language processors (LPUs) are by far the best performance for inference!

convexstrictly · on Dec 23, 2023

Could you explain the blockers to getting back-propagation working well on your chips?

tome · on Dec 23, 2023

Our language processors have much lower latency and higher throughput than graphics processors so we have a massive advantage when it comes to inference. For language models particularly, time to first token is hugely important (and will probably become even more important as people start combining models to do novel things). Additionally, you probably care mostly about batch size 1. For training, latency is not the key issue. You generally want raw compute with a larger batch size. Backpropagation is just a numerical computation so you can certainly implement it on language processors, but the stark advantage we have over graphics processors in inference wouldn't carry over to training.

Does that answer your question?

convexstrictly · on Dec 23, 2023

Everything you say makes sense. Training is definitely more compute intensive than inference.

Training is both memory throughput and compute constrained. Much research in speeding up training goes into optimizing HBM to SRAM communication. The equivalent for your chips would be communication from the SRAM of one chip to the SRAM of another, where it sounds like your architecture has a major memory throughput advantage over GPUs. So I assume you don't have a proportional compute advantage?

By the way, it's great to see a non von Neumann architecture showing a major performance advantage in a real world application. And your chips are conceptually equivalent to chiplets; you should have a major cost advantage on bleeding edge process nodes if you scale up manufacturing. Overall very impressive!

tome · on Dec 23, 2023

I'm not an expert on the system architecture side of things. Maybe a Groqster who is can chime in. But the way I understand it is that you can't improve latency just by scaling, whereas you can improve throughput just by scaling, as long as it's acceptable to increase batch size. Increasing batch size is generally fine for training. It's a batch process! On the other hand, if someone comes up with a novel training process that is highly sequential then I'd expect Groq chips to do better than graphics processors in that scenario.

moneywoes · on Dec 23, 2023

what’s the cost?

jkachmar · on Dec 23, 2023

right now we’re providing this access to public, anonymous users via this demo chat interface as an alpha test.

we’ll be publishing information about API access, and pricing, shortly after the new year.

tome · on Dec 23, 2023

Yup, we will be price competitive with OpenAI, and much faster!

Frummy · on Dec 23, 2023

You should add latex rendering

jkachmar · on Dec 19, 2023

unless i'm misunderstanding `whisper.cpp` seems to support streaming & the repository includes a native example[0] and a WASM example[1] with a demo site[2].

[0]: https://github.com/ggerganov/whisper.cpp/tree/master/example...

[1]: https://github.com/ggerganov/whisper.cpp/blob/master/example...

[2]: https://whisper.ggerganov.com/stream/

java_beyb · on Dec 19, 2023

have you tried it? i mean for fun, it wouldn't hurt for sure and ggerganov is doing amazing stuff. kudos to him.

but whisper is designed to process audio files in 30-second batches if I'm not mistaken. it's been a while since whisper released, lol. These workarounds make the window smaller but it doesn't change the fact that they're workarounds. you can adjust, modify, or manipulate the model. You can't write or train it from scratch. check out the issues referring to the real-time transcription in the repo.

can you use it? yes would it perform better than Deepgram? -although it's an API and probably not the best API- I am not sure. would i use it in my money-generating application? absolutely not.

jkachmar · on Dec 1, 2023

> i, personally, would not accept money from the company actively militarizing the southern US border but that's just me

food for thought, and that was even before they were advertising offensive weapons technologies of this sort.

gantron · on Dec 1, 2023

Scattering sensors in the desert is militarizing the border? That’s an awfully dramatic characterization, no?

mplewis · on Dec 1, 2023

Think for a second about what the sensors will be used to do.

gantron · on Dec 1, 2023

I fail to see how a country maintaining awareness of passages across its borders is a bad thing.

ansible · on Dec 1, 2023

Especially when it comes to smuggling drugs across the border.

Of course, we could fix the drug smuggling by decriminalizing drug addiction and treating it like other public health issues. But the "War on Drugs" still rages on, after decades of failure.

atdrummond · on Dec 1, 2023

Prevent illegal incursions across the border? Including violent cartels, not just “migrants”?

kortilla · on Dec 1, 2023

Convert the straights into gays and the gays into straights?

No, they would be used to detect people crossing the border. What else would you use them for!?

fastball · on Dec 1, 2023

What do you want the policy to be on the US southern border?

silenced_trope · on Dec 1, 2023

Counter-point: our borders should be protected

jkachmar · on Sept 7, 2023

extremely disappointing that y’all needed the venue to impose this decision.

as said by a friend this morning:

> i, personally, would not accept money from the company actively militarizing the southern US border but that's just me

mlindner · on Sept 7, 2023

> as said by a friend this morning:

> > i, personally, would not accept money from the company actively militarizing the southern US border but that's just me

Anduril is not "militarizing" the southern border. They're adding observation stations (funded by the Biden administration mind you) so that who is crossing the border illegally is known. There's no weapons on these things or something.

joepie91_ · on Sept 8, 2023

No, the weapons come later. The "observation stations" just determine who they will be pointed at. That's different, right?

mlindner · on Sept 8, 2023

There is no contract to provide weapons on the border nor has any official stated they are interested in doing so. It's just pointless hyperbole.

pxc · on Sept 9, 2023

People die crossing the border trying to avoid those cameras, because it forces them through the most dangerous parts of the desert.

The same tech is deeply wrapped up in the Israeli occupation of Palestine.

It's absolutely absurd to pretend that the whole business of border enforcement tech is not stained with blood.

If you're interested in the relevant US history and why it's characterized as militarization and violence by those opposed to it, I recommend this book: https://www.goodreads.com/book/show/7861.No_One_Is_Illegal

(The book has a point of view and makes no apologies for that. But if you want to see why people who disagree with you see the US border and its enforcement the way they do, it will provide that for you.)

jkachmar · on July 28, 2023

fwiw in my experience it's entirely possible to avoid this if you don't submit an application "cold" (i.e. through the careers page), but through a recruiter or referral _IF_ you have a strong portfolio of open source work and/or exposure in technical spaces.

i'm more than happy to do take-home assignments or complete reasonable timed assessments, but i have (politely) refused to complete leetcode-style gotcha screens when they have been presented to me as "just another part of the application process".

idk, though. maybe that's a privileged statement based on my position, but from what i can tell grinding leetcode seems to be much less reliable these days & the toll it takes on a lot of folks is pretty significant.