
Came to make the same comment. I got through like 12 paragraphs, and it still hadn't explained what a Semantic Layer is, so I gave up


The landing gear lever is shaped like a wheel as a design affordance. It would be VERY hard to confuse.


Very hard to confuse if you are thinking about it. Doesn't say anything about the possibility of an action slip.


I think you jumped to the wrong conclusion. Another news source suggests properly groundbreaking 10C charging for a standard EV:

> the BYD’s pile supports the 10C charging. It can charge 400 km in 5 minutes. It is two kilometers in one second! During the live test, this station reached the 1 MW level of power in 10 seconds (while charging Han L EV and Tang L EV). The car’s charging time from 7% to 50% was just 4.5 minutes.


A nitpick on the math you cite: 400 km in 5 min (300 s) works out to about 1.33 km per second, not 2.

The bigger open question is the degradation penalty for 10C charging. It doesn't matter for a demonstrator, but it is critical for consumer use.


Looks interesting. Just wondering, how did you decide to use NATS as the transport, instead of Kafka?


Hi! Thanks for checking it out. Mostly based on my own familiarity and interest at the time, especially with NATS JetStream being out now. That said, depending on how the feedback and roadmap evolve, I have thought about supporting different intermediaries where it makes sense. The API interface is more or less left open from that perspective. I am also planning on making NATS optional for smaller and simpler setups.


In the article, they are using a PPS output into a GPIO IRQ. I don't think they're using serial/NMEA for the timestamping


I’m not familiar with Chrony. With NTPsec, the PPS driver docs say [0]:

> While this driver can discipline the time and frequency relative to the PPS source, it cannot number the seconds. For this purpose an auxiliary source is required;

And so (with NTPsec) you need to define two sources even though both come from the same device: one for the PPS signal for clock discipline, the other for the clock value.

> refclock pps ppspath /dev/gpspps0 prefer

> refclock nmea baud 57600 prefer

0: https://docs.ntpsec.org/latest/driver_pps.html
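
From a quick look at the chrony docs, it appears the equivalent there is also two refclock lines, e.g. something like the following (untested; assumes gpsd feeding NMEA time via SHM and the PPS device at /dev/pps0):

> refclock SHM 0 refid NMEA noselect

> refclock PPS /dev/pps0 lock NMEA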


Sure, but that aux data should not be used for any sub-second accuracy information. The PPS is the end-all, be-all definition of the start of the second. Improving the performance of the other (NMEA) channel should never affect the sub-second jitter.

They should hook up a scope to that PPS output and compare it to a solid reference. I suspect that if they're experiencing intermittent dropouts on a poor GPS module, the PPS signal is not a high-quality reference either. Those ublox counterfeits might be okay, but I've been really impressed with Navspark's pin-compatible ublox "knockoffs". Super cheap, super performant.


Obviously this is subjective, I just wanted to say that I personally found its production to be incredibly beautiful.


Why are you projecting Google's internal architecture onto AWS? Your Google mental model is not correct here.


I think this con is very real:

> Related tables and indexes are not necessarily stored together, meaning typical operations such as joins and evaluating foreign keys or even simple index lookups might incur an excessive number of internal network hops. The relatively strong transactional guarantees that involve additional locks and coordination can also become a drag on performance.

You handwaved this away saying you can just store an entire table on a single node, but that defeats many of the benefits of these sharded SQL databases.

Edit: Also, before attacking the author's biases, it seems fair to disclose you appear to work at Yugabyte


In the case of YugabyteDB, here is how we avoid an "excessive number" of network hops:

- True Index Only Scan. PostgreSQL doesn't store MVCC visibility information in indexes and has to look at the table even in the case of an Index Only Scan. YugabyteDB has a different implementation of MVCC with no bloat, no vacuum, and a true Index Only Scan. Here is an example: https://dev.to/yugabyte/boosts-secondary-index-queries-with-... This is also used for reference tables (duplicate covering indexes in each region).

- Batching reads and writes. It is not a problem to add 10ms because you join two tables or check a foreign key. What would be problematic is doing that for each row. YugabyteDB batches the read/write operations as much as possible. Here are two examples: https://dev.to/franckpachot/series/25365

- Pushdowns to avoid sending rows that are discarded later. Each node can apply PostgreSQL expressions to offload filtering to the storage nodes. Examples: https://dev.to/yugabyte/yugabytedb-predicate-push-down-pbb

- Loose index scan. With YugabyteDB LSM-Tree indexes, one index scan can read multiple ranges, which avoids multiple roundtrips. An example: https://dev.to/yugabyte/select-distinct-pushdown-to-do-a-loo...

- Locality of the transaction table. If a transaction touches only one node, zone, or region, a local transaction table is used and is promoted to the right level depending on what the transaction reads and writes.

Most of the time when I've seen people asking to store tables together, it was premature optimization, based on opinions rather than facts. When they try (with the right indexes, of course) they appreciate that the distribution is an implementation detail the application doesn't have to know about. Of course, there are more and more optimizations in each release. If you have a PostgreSQL application and see low performance, please open a GitHub issue.

I'm also working for Yugabyte as a Developer Advocate. I don't always feel the need to mention it, as I'm writing about facts, not marketing opinions, and who pays my salary has no influence on the response time I see in execution plans ;)


Hey Franck, just wanted to say I appreciate your database writings. I read a whole bunch of them over the years, and always found them interesting and educational.


> You handwaved this away, saying you can just store an entire table on a single node, but that defeats many of the benefits of these sharded SQL databases.

I just clarified one-liners listed under the closing "Cons" section. My intention was not to say that the author is utterly wrong. Marco is a recognized expert in the Postgres community. It only feels like he was too opinionated about distributed SQL DBs while wearing his Citus hat.

> Also, before attacking the author's biases, it seems fair to disclose that you appear to work at Yugabyte.

I'm sorry if I sounded biased in my response. I'm with the YugabyteDB team right now, but it's not my first database company, and I bet it won't be my last. Thus, when I respond from my personal accounts, I try to be as objective as possible and don't bother mentioning my current employer.

Anyway, I'm very glad to see that this article got traction on HN. As a Postgres community member, I truly love what's happening with the database and its ecosystem. The healthy competition within the Postgres ecosystem is a big driver of the growth of a database that's becoming the Linux of databases.


What is RAG? That's hard to search for


This one seems like a good summary

Retrieval-Augmented Generation for Large Language Models: A Survey

https://arxiv.org/abs/2312.10997

The photos in this post are also good for a high-level look:

https://twitter.com/dotey/status/1738400607336120573/photo/2

From the various posts I have seen, people claim that phi-2 is a good model to start from.

If you just want to do embeddings, there are various tutorials to use pgvector for that.


Retrieval Augmented Generation - in brief, using some kind of search to find documents relevant to the user's question (often vector DB search, which can search by "meaning", but also other forms of more traditional search), then injecting those into the prompt to the LLM alongside the question, so it hopefully has facts to refer to (and its "generation" can be "augmented" by documents you've "retrieved", I guess!)
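
To make that concrete, here's a rough sketch of the flow in Python - embed(), vector_search() and llm() are made-up helper names standing in for whatever embedding model, vector DB client and LLM API you actually use:

    # Minimal RAG flow (sketch; embed(), vector_search() and llm() are
    # hypothetical helpers, not any particular library)
    def answer(question: str) -> str:
        query_vec = embed(question)                  # turn the question into a vector
        chunks = vector_search(query_vec, top_k=3)   # "retrieval": nearest stored documents
        context = "\n".join(chunks)
        prompt = (
            "Answer using only the information below.\n\n"
            f"Information:\n{context}\n\n"
            f"Question: {question}"
        )
        return llm(prompt)                           # "generation", augmented by the retrieved text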


So, as a contrived example, with RAG you make some queries, in some format, like “Who is Sauron?” And then start feeding in what books he’s mentioned in, paragraphs describing him from Tolkien books, things he has done.

Then you start making more specific queries? How old is he, how tall is he, etc.

And the game is you run a “questionnaire AI” that can look at a blob of text, and you ask it “what kind of questions might this paragraph answer”, and then turn around and feed those questions and text back into the system.

Is that a 30,000 foot view really of how this works?


The 3rd paragraph missed the mark, but the previous ones are in the right ballpark.

You take the user's question and either embed it directly or augment it for embedding (you can, for example, use an LLM to extract keywords from the question), query the vector db containing the data related to the question, and then feed it all to the LLM as: here is a question from the user and here is some data that might be related to it.


Essentially you take any decent model trained on factual information regurgitation, or, well, any decently well-rounded model - a Llama 2 variant or something.

Then you craft a prompt for the model along the lines of "you are a helpful assistant, you will provide an answer based on the provided information. If no information matches simply respond with 'I don't know that'".

Then, you take all of your documents and divide them into meaningful chunks, e.g. by paragraph or something. Then you take these chunks and create embeddings for them. An embedding model is another type of model (not an LLM) that generates vectors for strings of text, largely based on how similar the words are in _meaning_. E.g. if I generate embeddings for the phrase "I have a dog" it might (simplified) be a vector like [0.1,0.2,0.3,0.4]. This vector can be seen as representing a point in a multidimensional space.

What the embedding model does with word meaning is something like this: if I want to search for "cat", that might embed as a vector [0.42] (again, simplified). Now, say we want to search for the query "which pets do I have". First we generate embeddings for this phrase; the word "pet" might be embedded as [0.41] in the vector. Because it's based on trained meaning, the vectors for "pet", "cat", and "dog" will all end up close together in our multidimensional space. We can choose how strict we want to be with this search (basically a limit on how close the vectors need to be in space to count as a match).
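
To make the "close together in space" idea concrete, here's a toy sketch with numpy and made-up 4-dimensional vectors (real embedding models output hundreds or thousands of dimensions, but the comparison works the same way):

    import numpy as np

    # Made-up toy embeddings; the numbers are invented for illustration.
    vectors = {
        "I have a dog":  np.array([0.10, 0.20, 0.30, 0.40]),
        "my cat is old": np.array([0.12, 0.21, 0.28, 0.35]),
        "tax forms":     np.array([0.90, 0.05, 0.01, 0.02]),
    }

    def cosine_similarity(a, b):
        # ~1.0 = pointing the same way (similar meaning), ~0 = unrelated
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Pretend this is the embedding of "which pets do I have"
    query = np.array([0.11, 0.19, 0.31, 0.38])

    for text, vec in vectors.items():
        print(text, round(cosine_similarity(query, vec), 3))
    # The pet-related sentences score far higher than "tax forms",
    # so they are the chunks that would get retrieved.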

The next step is to put this into a vector database, a db designed with vector search operations in mind. We store each chunk, the part of the file it's from, and that chunk's embedding vector in the database.

Then, when the LLM is queried with, say, "which pets do I have?", we first generate embeddings for the query, then use the embedding vector to query our database for things that match closely enough in space to be relevant but loosely enough that we get "connected" words. This gives us a bunch of our chunks ranked by how close each chunk's vector is to our query vector in the multidimensional space. We can then take the n highest-ranked chunks, concatenate their original text, and prepend this to our original LLM query. The LLM then digests this information and responds in natural language.

So the query sent to the LLM might be something like: "you are a helpful assistant, you will provide an answer based on the provided information. If no information matches simply respond with 'I don't know that'

Information:I have a dog,my dog likes steak,my dog's name is Fenrir

User query: which pets do I have?"

All under "information" is passed in from the chunked text returned from the vector db. And the response from that LLM query would ofc be something like "You have a dog, its name is Fenrir and it likes steak."


Stupid question: ELI5; can/does/would it make sense to 'cache' (for lack of a better term) a 'memory' of having answered that question... so that if that question is asked again, it knows it has answered it in the past and can do better?

(Seems like this is what reinforcement training is, but I am just not sure? Everything seems to mush together when talking about GPT logic.)


You can decide to store whatever you like in the vector database.

For example you can have a table of "knowledge" as I described earlier, but you can just as easily have a table of the conversation history, or have both.

In fact it's quite popular afaik to store the conversation this way, because then if you query on a topic you've queried before, even if the conversation history has grown beyond the size of the context, it can still retrieve that history. So yes, what you describe is a good idea / would work / is being done.

It really all comes down to the non model logic/regular programming of how your vector db is queried and how you mix those query results in with the user's query to the LLM.

For example you could embed their query as I described, then search the conversation history + general information storage in the vector db and mix the results. You can even feed it back into itself in a multi-step process a la "agents", where your "thought process" takes the user query and breaks it down further by querying the LLM with a different prompt; instead of "you are a helpful assistant" it can be "you have x categories of information in the database, given query {query} specify what data should be extracted for further processing". Obv that's a fake general-idea prompt, but I hope you understand.
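
As a sketch of the "both tables" idea (in-memory stand-in; embed() and cosine() are hypothetical placeholders for a real embedding model and vector db):

    # Store knowledge chunks and conversation turns side by side, tagged by
    # source, then mix the top hits from each into the prompt context.
    store = []  # each entry: {"text": ..., "source": ..., "vec": ...}

    def add(text: str, source: str) -> None:
        store.append({"text": text, "source": source, "vec": embed(text)})

    def retrieve(query: str, per_source: int = 2) -> list[str]:
        q = embed(query)
        hits = []
        for source in ("knowledge", "conversation"):
            ranked = sorted(
                (e for e in store if e["source"] == source),
                key=lambda e: cosine(e["vec"], q),
                reverse=True,
            )
            hits += [e["text"] for e in ranked[:per_source]]
        return hits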

Well, there's technically no model training involved here, but I guess you could consider the corpus of conversation data a kind of training, and yeah, that would be RLHF-based, which LLMs lean on pretty heavily afaik (I've not fine-tuned my own yet).

You can fine-tune models to be better at certain things or to respond in certain ways. This is usually done via a kind of reinforcement learning (with human feedback... idk why it's called this, any human feedback is surely just supervised learning, right?). This is useful, for example, for taking a model trained on all kinds of text from everywhere and then fine-tuning it on text from sci-fi novels, to make it particularly good at writing sci-fi fiction.

A fine-tune, I would say, is more the "personality" of the underlying LLM. That said, you can ask an LLM to play a character, but the underlying "personality" of the LLM is still manufacturing said character.

Vector databases are more for knowledge storage, as if your LLM personality had a table full of open books in front of them: world atlases, a notebook of the conversation you've been having, etc.

E.g., personality: an LLM fine-tuned on all of David Attenborough's narration = a personality like a biologist/natural historian.

Knowledge base = chunks of text from scientific papers on chemistry + chunks of the current conversation

Which, with some clever vector db queries and feeding back into the model, = a bot that talks Attenborough-ish but knows about chemistry.

Tbf, for the feedback model it's better to use something strict, i.e. an instruct-based model, bc your internal thought-process steps are heavily goal-oriented; all of the personality can be added in the final step using your fine-tune.


Off Topic;

It fascinates me how much variance there is in people's searching skills.

Some people think they are talking to a person when searching, e.g. 'what is the best way that i can {action}'. I think the number one trick is to forget grammar and other language niceties and just enter concepts, e.g. 'clean car best'.


I used to do this. Then when Google's search results started declining in quality, I often found it better to search by what the average user would probably write.


and what would an average user write?


An entire question instead of a bunch of keywords.


Over the last couple of years, at least with Google, I've found that no strategy really seems to work all that well - Google just 'interprets' my request and assumes that I'm searching for a similar thing that has a lot more answers than what I was actually searching for, and shows me the results for that.


Some concepts seem to be permanently classified as a spelling error and will just be impossible to search for.


I found something very annoying while looking for technical data (a service manual for an ancient medical device, built around 2001).

The search term was the name of the device + something about the power source.

The results from the client network (my phone / a client computer): nothing related to the search for 4-5 pages.

The same search from work: the second result was what I was looking for.

So it seems there is a relation with your search history, but it is somehow also connected with the search history from the same IP/network.


Same experience. I'm generally getting better results on a client's (VPN) network; we are all googling for the same stuff, I guess.

It must be possible to create a fixed set of Google searches and rate a location based on the results. So you could physically travel to a Starbucks 20 miles away to get the best results for 'best USB-C dongle reddit'.


Unfortunately search engines have learned to, well, basically ignore user input.

Amazon is the worst.

I used "" and + and - for terms to get what I want, and its search engine still gives you the sponsored results and an endless list of matches based on what you might buy instead of what you searched for.

ugh.


That's why they will love ChatGPT.


Retrieval-augmented generation. Searching for "RAG + LLM" will turn up more results.


Seems fairly easy to search for to me - top results are all relevant:

https://kagi.com/search?q=ml+rag

https://www.google.com/search?q=ml+rag


I had the same query, and instead of just scrolling down, I copied and pasted the paragraph into Bing Chat and asked it what it meant. It got it right, but I probably should have scrolled farther first lol.

It's retrieval augmented generation


"Retrieval augmented generation". I found success from "rag llm tutorial" as a search input to better explain the process.


RAG: having an LLM spew search queries for you because your search-fu is worse than a chatbot's hallucinations.

Or because you want to charge your client the "AI fee".

Or because your indexing is so bad that you hide it from your users and blame the LLM assistant dept.


Ask ChatGPT next time: "What is RAG in the context of AI?"


Or just use a traditional search engine: "rag" plus literally any ML/AI/LLM term will yield a half dozen results at the top with "Retrieval-augmented generation" in the page title.


Or people could just not use obscure acronyms when discussing specialised topics on an open forum?


Where do you draw the line though?


Or if GGP can't think of an AI-related term they can use HN search. Searching 'rag' shows the term on the first page of results:

https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...


Searching for "RAG" on Kagi and Google give some AI-related results fairly high up, including results that explain it and say what it stands for.


Right? How does someone who browses this forum not know how to find knowledge online?


What percentage of people could you fool if you told them it was AI and replayed standard search results, but with the "karaoke-like" prompt that highlights each word (as if we're 2nd graders in special ed learning how to string more than 2 sentences together)?


I am looking at this while sitting on Amtrak. It is about 14mi behind my current position, but still very cool!


Yeah, I'm going to add a disclaimer about this. I've been watching the GO trains from my window this morning, and they lag about a minute behind. My site grabs data on a one-minute interval, and I know some of the APIs say they purposefully add GPS lag.


I've made something similar in the past and my experience with the specific API was that it was surprisingly well considered but the actual data it returned was unreliable at best. Not just "slightly obfuscated for paranoid physical security reasons" but actually missing trains and reporting incorrect names.


For Amtrak only, you can also use the official source https://www.amtrak.com/track-your-train.html (it shows the route and also the delay).

