
> there are eunuchs who have made themselves eunuchs

This appears to be a bad translation. A modern translation is "and there are those who choose to live like eunuchs for the sake of the kingdom of heaven". It is referring to celibacy. Paul explains the issue in more detail in 1 Corinthians 7, verses 8-9.


Google search is weirdly hallucinating, saying "Theodore Scott Glenn is an American actor and Distinguished Professor at Rutgers University". As far as I can tell, the actor and the professor are two different people. Am I wrong? I can't tell what is true or false anymore.


I ran into a similar issue when researching Bill Paxton, a computer scientist who worked on the Mother of All Demos. Google's AI told me that he was also known for his roles in Aliens and Titanic, but that's a different person. I told Bill Paxton (the computer scientist) about this and he found it amusing.


Search engines are getting less usable. I assume this is because they're leaning into LLMs.


For what LangChain does, most of the time I see no need for any framework; I would rather work directly with a vendor's official package. LangGraph is different: it is a legitimate piece of workflow software, not a wrapper framework. But when it comes to workflow, there are many other well-established engines out there that I would consider first.


> it can run on my laptop

Has anyone run it on a laptop (unquantized)? The disk size of the 32B model appears to be 80GB. Update: I tried it on a 40GB A100 GPU. Loading the model took 30GB of VRAM. I asked a simple question, "How many r in raspberry". After 5 minutes, nothing had been generated beyond the prompt. I'm not sure how the author ran this on a laptop.


32B models are easy to run on 24GB of RAM at a 4-bit quant.

If you're having trouble, it sounds like you should play with some of the existing 32B models that have better documentation on how to run them, but it is entirely plausible to run this on a laptop.

I can run Qwen2.5-Instruct-32B-q4_K_M at 22 tokens per second on just an RTX 3090.
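
For reference, here is a minimal sketch of running a 4-bit GGUF quant with llama-cpp-python. The file name is a placeholder for whatever quant you download; a q4_K_M 32B file is roughly 20GB on disk:

    # Minimal sketch: run a q4_K_M quant of a 32B model with llama-cpp-python.
    # The model_path below is a placeholder, not a real release artifact.
    from llama_cpp import Llama

    llm = Llama(
        model_path="qwen2.5-instruct-32b-q4_K_M.gguf",  # hypothetical local file
        n_gpu_layers=-1,  # offload all layers to the GPU if they fit
        n_ctx=4096,       # context window
    )

    out = llm("How many r's are in 'raspberry'?", max_tokens=64)
    print(out["choices"][0]["text"])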


My question was about running it unquantized. The author of the article didn't say how he ran it. If he quantized it, then saying he ran it on a laptop is not news.


I can't imagine why anyone would run it unquantized, but there are some laptops with more than the 70GB of RAM that would be required. It's not that it can't be done... it's just that quantizing to at least 8-bit seems to be standard practice these days, and DeepSeek has shown that it's even worth training at 8-bit resolution.
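
For a rough sense of where those numbers come from, here is the back-of-the-envelope arithmetic for the weights alone (KV cache and activations add more on top):

    # Approximate memory needed just to hold the weights of a
    # 32B-parameter model at different precisions.
    params = 32e9
    for bits, name in [(16, "fp16/bf16"), (8, "int8"), (4, "4-bit quant")]:
        gib = params * bits / 8 / 2**30
        print(f"{name:>12}: ~{gib:.0f} GiB")
    #    fp16/bf16: ~60 GiB
    #         int8: ~30 GiB
    #  4-bit quant: ~15 GiB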


Maybe he has a 64GB laptop. Also he said he can run it, not that he actually tried it.


This is a faithful reproduction of the original Transformer paper [1], except that these days we use trainable parameters for the positional embedding, whereas the paper used a static calculation based on sine and cosine.
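
To illustrate the difference, a small sketch of both approaches (the dimensions here are arbitrary):

    import math
    import torch
    import torch.nn as nn

    d_model, max_len = 768, 1024

    # Trainable positional embedding (the modern GPT-style approach):
    # a learnable lookup table with one vector per position.
    learned_pos = nn.Embedding(max_len, d_model)

    # Static sinusoidal encoding from the original paper:
    # PE(pos, 2i) = sin(pos / 10000^(2i/d_model)), PE(pos, 2i+1) = cos(...)
    pos = torch.arange(max_len).unsqueeze(1)
    div = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
    sinusoidal_pos = torch.zeros(max_len, d_model)
    sinusoidal_pos[:, 0::2] = torch.sin(pos * div)
    sinusoidal_pos[:, 1::2] = torch.cos(pos * div)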

Figure 1 in the paper can be seen implemented in the forward() method of the GPT class in model.py. Here are the rough steps (a condensed sketch follows the list):

1. Tokens are embedded using an nn.Embedding layer.

2. Token positions are embedded using a second nn.Embedding layer.

3. The two embedding values are added to make the input x.

4. A sequence of N transformer blocks is then executed. This is the grey box on the left of Figure 1, and it is where all the magic happens, chiefly in the self-attention calculation. You can see this in the forward() method of the CausalSelfAttention class.

5. A regular nn.Linear layer (the LM head) maps the transformer output to vocabulary logits.

6. Finally, the loss is calculated using F.cross_entropy, which applies the softmax shown in the figure internally.
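
Put together, a GPT-style forward pass looks roughly like this. It is an illustrative sketch following the class names above, not code copied from model.py:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TransformerBlock(nn.Module):
        # Pre-norm block: causal self-attention followed by an MLP.
        def __init__(self, n_embd, n_head=8):
            super().__init__()
            self.ln1 = nn.LayerNorm(n_embd)
            self.attn = nn.MultiheadAttention(n_embd, n_head, batch_first=True)
            self.ln2 = nn.LayerNorm(n_embd)
            self.mlp = nn.Sequential(
                nn.Linear(n_embd, 4 * n_embd),
                nn.GELU(),
                nn.Linear(4 * n_embd, n_embd),
            )

        def forward(self, x):
            T = x.size(1)
            # Causal mask: position t may only attend to positions <= t.
            mask = torch.triu(torch.ones(T, T, dtype=torch.bool, device=x.device), 1)
            h = self.ln1(x)
            a, _ = self.attn(h, h, h, attn_mask=mask)
            x = x + a
            x = x + self.mlp(self.ln2(x))
            return x

    class GPT(nn.Module):
        def __init__(self, vocab_size, block_size, n_embd, n_layer):
            super().__init__()
            self.tok_emb = nn.Embedding(vocab_size, n_embd)  # step 1
            self.pos_emb = nn.Embedding(block_size, n_embd)  # step 2
            self.blocks = nn.ModuleList(
                [TransformerBlock(n_embd) for _ in range(n_layer)]  # step 4
            )
            self.lm_head = nn.Linear(n_embd, vocab_size)     # step 5

        def forward(self, idx, targets=None):
            B, T = idx.shape
            pos = torch.arange(T, device=idx.device)
            x = self.tok_emb(idx) + self.pos_emb(pos)        # step 3
            for block in self.blocks:                        # step 4
                x = block(x)
            logits = self.lm_head(x)                         # step 5
            loss = None
            if targets is not None:                          # step 6
                loss = F.cross_entropy(
                    logits.view(-1, logits.size(-1)), targets.view(-1)
                )
            return logits, loss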

I hope this helps a little. Please feel free to suggest improvements and additions.

[1] https://arxiv.org/pdf/1706.03762


You should be able to make a few small changes to support "mps".

In TrainingConfig, set the device to "mps". Then run training.

In sample.py modify parse_args() and add support for mps as a possible value for the --device argument.
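
Something along these lines, assuming parse_args() is a standard argparse setup (the repo's actual flag handling may differ slightly):

    import argparse
    import torch

    def parse_args():
        parser = argparse.ArgumentParser()
        # Accept "mps" alongside the existing device choices.
        parser.add_argument("--device", default="cuda",
                            choices=["cpu", "cuda", "mps"])
        args = parser.parse_args()
        # Fall back gracefully if the requested backend isn't available.
        if args.device == "mps" and not torch.backends.mps.is_available():
            print("MPS not available, falling back to CPU")
            args.device = "cpu"
        return args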


Thanks! I'll try it. I didn't bother before, believing that since this was developed heavily on CUDA, it was likely going to use kernels that were missing in MPS.


The article goes into the details of what the resize tool does:

> It uses generative AI to stretch the backgrounds of images to fit these required dimensions...


I have created a templating system in Java that uses the Vue syntax. Many of the common features are present. To minimize dependencies, I chose to use Java's built-in XML parser to parse the template, which means the template has to be valid XML. I've been using it mainly to generate email content.

https://github.com/bibhas2/Zippy


These "backdoors" are for court authorized wiretapping (The Communications Act of 1934). This is not some kind of a dark conspiracy. Our government argues that wiretapping is an essential tool for law enforcement. [1]

The NSA does have a program to intercept traffic [2]. But that's not what got hacked here.

I do blame CISA and NSA though for not protecting the system well enough.

[1] https://www.americanbar.org/content/dam/aba/publications/lit...

[2] https://en.wikipedia.org/wiki/PRISM


The system could never be protected well enough. It is flawed by design, and now (or however long it has been) those flaws have been exploited by others.


While I agree with you, this is the release of a research paper [1] and some accompanying demos on GitHub [2]. It is not a finished product fine-tuned for high-quality output.

[1] https://d1qx31qr3h6wln.cloudfront.net/publications/FUGATTO.p...

[2] https://fugatto.github.io/

